Item Selection & Grouping

Grouping items is useful if we are interested in properties (feature values) that items of the same group have in common, or in features that distinguish items of different groups. At the same time, we can exclude items from further analysis by not including them in any of the groups that we define.

If items are grouped, they get the same group ID. A group has a specific group label (e.g., “Saxophone solos”) and a unique group ID (e.g., 1).

For the item grouping, we can distinguish between simple grouping and complex grouping settings.

Example 1 - Simple grouping

Here’s an example of a simple grouping of items into two groups: solos played on saxophone and solos played on trumpet:

# Item grouping
grouping: [{"groupLabel":"Saxophone Players","select":["GENERAL_SOLO_METADATA.instrument", "in", ["ts","as"]] },
           {"groupLabel":"Trumpet Players","select":["GENERAL_SOLO_METADATA.instrument", "=", "tp"] }]

In this example, we use the feature GENERAL_SOLO_METADATA.instrument, a metadata feature holding the instrument for each item, to group the items. The keyword groupLabel labels the two groups as “Saxophone Players” and “Trumpet Players”. The keyword select defines the distinctive property shared by all items of a group. This is done by comparing the value of a specific feature (here: GENERAL_SOLO_METADATA.instrument) for every item in the dataset with one or multiple reference values. All items that fulfill this comparison are assigned to the group.

For the second group (“Trumpet Players”), we explicitly require the instrument label to be tp, using the equality comparator =. For the first group (“Saxophone Players”), we allow the instrument label to be either ts (tenor saxophone) or as (alto saxophone); hence, we require it to be contained in the list ["ts","as"], using the comparator in.
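The selection logic described above can be sketched in a few lines of Python. This is only an illustration and not the tool's actual implementation: the function names matches and assign_groups, the sample items (including the label tb), and the assumption that group IDs are numbered from 1 in order of definition are all hypothetical.

```python
# Hypothetical sketch of how a simple grouping could be evaluated.
grouping = [
    {"groupLabel": "Saxophone Players",
     "select": ["GENERAL_SOLO_METADATA.instrument", "in", ["ts", "as"]]},
    {"groupLabel": "Trumpet Players",
     "select": ["GENERAL_SOLO_METADATA.instrument", "=", "tp"]},
]


def matches(item, condition):
    """Check one [feature, comparator, reference] triple against an item."""
    feature, comparator, reference = condition
    value = item.get(feature)
    if comparator == "=":
        return value == reference
    if comparator == "in":
        return value in reference
    raise ValueError(f"unsupported comparator: {comparator!r}")


def assign_groups(items, grouping):
    """Return one group ID per item, or None if the item matches no group.

    Assumption: IDs start at 1 and follow the order of the group definitions.
    Items mapped to None are the ones excluded from further analysis.
    """
    group_ids = []
    for item in items:
        group_id = None
        for idx, group in enumerate(grouping, start=1):
            if matches(item, group["select"]):
                group_id = idx
                break
        group_ids.append(group_id)
    return group_ids


# Made-up items for illustration only
items = [
    {"GENERAL_SOLO_METADATA.instrument": "ts"},
    {"GENERAL_SOLO_METADATA.instrument": "tp"},
    {"GENERAL_SOLO_METADATA.instrument": "tb"},
    {"GENERAL_SOLO_METADATA.instrument": "as"},
]
group_ids = assign_groups(items, grouping)
print(group_ids)
```

In this sketch the third item matches neither group and is therefore left out, which mirrors how items can be excluded simply by not being covered by any group definition.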

Note

As can be seen in the example above, the syntax for the item grouping is modeled on typical Python data types such as dictionaries and lists.

Note

See the following demo files for further examples:

  • test_melvis_feature_selection_simple_grouping_hardbop_vs_postbop.yml

  • test_melvis_feature_selection_simple_grouping_tp_vs_sax.yml

Example 2 - Complex grouping

Here’s a more complex example that illustrates how multiple features can be used to define groups:

# Item grouping
grouping: [{"groupLabel":"Berg",
            "select":
            {"op":"|",
             "set1":{"op":"&",
               "set1": ["GENERAL_SOLO_METADATA.performer", "=", "Bob Berg"],
               "set2": ["GENERAL_SOLO_METADATA.title",     "in",["Angles","Blues For Bela","Second Sight"]]},
            }},
           {"groupLabel":"Adderley & Parker & Brown",
            "select":
            {"op":"|",
             "set1":
              {"op":"&",
               "set1": ["GENERAL_SOLO_METADATA.performer", "=",  "Cannonball Adderley"],
               "set2": ["GENERAL_SOLO_METADATA.title",     "in", ["High Fly","This Here"] ]
              },
             "set2":
              {"op":"&",
               "set1": ["GENERAL_SOLO_METADATA.performer", "=",  "Charlie Parker"],
               "set2": ["GENERAL_SOLO_METADATA.title",     "in", ["Donna Lee","Scrapple From The Apple"] ]
              },
             "set3":
              {"op":"&",
               "set1": ["GENERAL_SOLO_METADATA.performer", "=", "Clifford Brown"],
               "set2":["GENERAL_SOLO_METADATA.title",      "in", ["Joy Spring","Sandu"]]
              }
            }
            }]

In this example, two groups, namely “Berg” and “Adderley & Parker & Brown”, are defined, which include one and three musicians, respectively. Furthermore, each group contains not all solos of the given artists but only a specific selection of solos.

For the first group, we select all solos whose performer is Bob Berg and whose title is either Angles, Blues For Bela, or Second Sight (the “and” is realized by "op":"&"). The = and in comparators are used in the same way as in the first example.

What’s new here is that the group “Adderley & Parker & Brown” defines three sub-groups, namely “set1”, “set2”, and “set3”, and combines them with the | operator: every item that belongs to set1, set2, or set3 is assigned to this group. Feature-value relationships such as ["GENERAL_SOLO_METADATA.performer", "=", "Clifford Brown"] are defined in the same way as explained in the first example.
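The nesting of & and | can be read as a small recursive evaluation. The following Python sketch shows one plausible way to interpret such a select expression; it is not the tool's actual implementation, and the function name matches, the rule that every key starting with "set" is a sub-group, and the sample items are assumptions made for illustration.

```python
# Hypothetical sketch of evaluating a nested "select" expression.
def matches(item, select):
    """Evaluate a select expression against one item.

    A list is a leaf condition [feature, comparator, reference];
    a dict combines its "set..." entries with "&" (all) or "|" (any).
    """
    if isinstance(select, list):
        feature, comparator, reference = select
        value = item.get(feature)
        if comparator == "=":
            return value == reference
        if comparator == "in":
            return value in reference
        raise ValueError(f"unsupported comparator: {comparator!r}")

    # Assumption: every key starting with "set" names a sub-group.
    subsets = [sub for key, sub in select.items() if key.startswith("set")]
    results = (matches(item, sub) for sub in subsets)
    if select["op"] == "&":
        return all(results)
    if select["op"] == "|":
        return any(results)
    raise ValueError(f"unsupported operator: {select['op']!r}")


# Part of the second group from the example above (two of its three sub-groups)
select = {
    "op": "|",
    "set1": {"op": "&",
             "set1": ["GENERAL_SOLO_METADATA.performer", "=", "Charlie Parker"],
             "set2": ["GENERAL_SOLO_METADATA.title", "in",
                      ["Donna Lee", "Scrapple From The Apple"]]},
    "set2": {"op": "&",
             "set1": ["GENERAL_SOLO_METADATA.performer", "=", "Clifford Brown"],
             "set2": ["GENERAL_SOLO_METADATA.title", "in",
                      ["Joy Spring", "Sandu"]]},
}

# Made-up items for illustration only
included = {"GENERAL_SOLO_METADATA.performer": "Charlie Parker",
            "GENERAL_SOLO_METADATA.title": "Donna Lee"}
excluded = {"GENERAL_SOLO_METADATA.performer": "Charlie Parker",
            "GENERAL_SOLO_METADATA.title": "Some Other Title"}

print(matches(included, select), matches(excluded, select))
```

The second item is rejected because the performer matches but the title does not, so no & sub-group is satisfied and the outer | has nothing to accept.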

This way, even complicated groupings can be realized, allowing for a multitude of different research approaches.

Note

See the following demo file for further examples:

  • test_melvis_feature_selection_complex_grouping_artists.yml