Writing your own Feature Definition Files
In melfeature, the feature configuration files are written in YAML, which keeps them readable and well structured. The first example feature simply computes the pitch range (ambitus) in semitones of a given note sequence:
    label: Pitch range in semitones
    description: Pitch range in semitones
    feature:
      source:
        PITCH:
          param: pitch
      process:
        STAT:
          type: stat
          measure: range
          inputVec: PITCH.outputVec
      sink:
        PITCH_RANGE_FEATURE:
          input: STAT.outputVec
          label: pitch_range
The configuration file has three main categories:
label - used for headlines and table entries in this documentation; it can be any text, even empty.
description - a more detailed description of the feature's functionality, used in the feature explanations in this documentation; it can be any text, even empty.
feature - the actual feature definition.
Within the feature environment, three groups need to be defined:
source - defines the basic transformation that serves as the starting point. A separate source module must be defined for each transformation used.
process - here, all processing modules are defined and connected.
sink - sink modules receive the output data of process modules and store it as features. melfeature saves this data to a CSV file for further analysis or visualization.
For each module, a type must be defined.
Examples of frequently used module types are:
arithmetic - provides simple arithmetic operations.
hist - computes nominal, metrical, and ordinal histograms.
stat - provides simple statistical operations such as minimum, maximum, or mean.
logic - computes logical operations, such as or, to compare two vectors.
ngram - computes n-grams from arbitrary input vectors.
For the source module(s) and the sink module(s), however, no type needs to be specified.
Each module has a specific set of input parameters and output parameters, which are explained in detail in the corresponding section of this document. Some of these parameters are mandatory and must be defined; others are optional and fall back to a default value when they are omitted.
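The difference between mandatory and optional parameters can be pictured with a small Python sketch. This is not melfeature's code: the function resolve_params and the layout of the specification are hypothetical, treating inputVec and measure as mandatory is an assumption, and the optional parameter normalize with its default is invented purely for illustration.

```python
# Illustrative only: a hypothetical parameter specification, not melfeature's.
MANDATORY = ("inputVec", "measure")        # assumed mandatory for this sketch
OPTIONAL_DEFAULTS = {"normalize": False}   # invented optional parameter


def resolve_params(user_params):
    """Return the effective parameters of a module.

    Raises ValueError if a mandatory parameter is missing; optional
    parameters that are not given fall back to their default value.
    """
    missing = [name for name in MANDATORY if name not in user_params]
    if missing:
        raise ValueError("missing mandatory parameter(s): " + ", ".join(missing))
    effective = dict(OPTIONAL_DEFAULTS)   # start from the defaults
    effective.update(user_params)         # user-supplied values take precedence
    return effective


params = resolve_params({"inputVec": "PITCH.outputVec", "measure": "range"})
print(params)
```

Leaving out inputVec in this sketch raises an error, whereas leaving out the invented normalize parameter silently selects its default.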
For example, the stat module used in the example shown above has an input parameter named inputVec. By writing

    inputVec: PITCH.outputVec

in the config file, we connect the output parameter outputVec of the source module named PITCH (which provides all note pitch values in one vector) to the inputVec of the stat module. This allows us to further process the vector containing the pitches of all note events.
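One way to think about such a connection is as a lookup of a "MODULE.parameter" reference in the outputs produced so far. The following Python sketch illustrates this idea under stated assumptions: resolve_reference and the outputs dictionary are hypothetical names, the pitch values are made up, and nothing here is taken from melfeature's actual implementation.

```python
def resolve_reference(reference, outputs):
    """Look up a 'MODULE.parameter' reference in a dict of module outputs."""
    module_name, _, param_name = reference.partition(".")
    if not param_name:
        raise ValueError(f"expected 'MODULE.parameter', got {reference!r}")
    try:
        return outputs[module_name][param_name]
    except KeyError:
        raise KeyError(f"unknown reference {reference!r}") from None


# Outputs produced so far, keyed by module name (made-up pitch values).
outputs = {"PITCH": {"outputVec": [60, 62, 67, 64]}}

# What the stat module would receive as its inputVec in this sketch.
stat_input = resolve_reference("PITCH.outputVec", outputs)
print(stat_input)  # [60, 62, 67, 64]
```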
The second parameter, measure, defines the actual statistical measure we want to compute. If we are only interested in the range of the input vector, we define:

    measure: range
The computed pitch range is then stored in the parameter outputVec of the stat module, which we connect, in a similar way, to the final sink module by:

    input: STAT.outputVec
This configuration allows melfeature to compute the pitch range for one or multiple given transcriptions and, finally, to store it using the feature label pitch_range.
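To make the flow from source to process to sink concrete, the Python sketch below evaluates the example configuration for a single made-up melody. It is a conceptual model only, not melfeature's implementation: the functions lookup and run are hypothetical, the melody is represented as a plain dictionary by assumption, only the stat type with the range measure is handled, and the range is assumed to be the difference between the highest and the lowest pitch.

```python
import csv
import io

# The example configuration, written as a Python dict instead of YAML.
config = {
    "source": {"PITCH": {"param": "pitch"}},
    "process": {"STAT": {"type": "stat", "measure": "range",
                         "inputVec": "PITCH.outputVec"}},
    "sink": {"PITCH_RANGE_FEATURE": {"input": "STAT.outputVec",
                                     "label": "pitch_range"}},
}


def lookup(reference, outputs):
    """Resolve a 'MODULE.parameter' reference against earlier outputs."""
    module, param = reference.split(".")
    return outputs[module][param]


def run(config, melody):
    """Evaluate the config for one melody (a dict of note attributes)."""
    outputs = {}
    # Source modules expose one attribute of the melody as outputVec.
    for name, spec in config["source"].items():
        outputs[name] = {"outputVec": melody[spec["param"]]}
    # Process modules: only type 'stat' with measure 'range' is sketched.
    for name, spec in config["process"].items():
        if spec["type"] != "stat" or spec["measure"] != "range":
            raise NotImplementedError(f"module {name!r} is not covered here")
        values = lookup(spec["inputVec"], outputs)
        outputs[name] = {"outputVec": max(values) - min(values)}
    # Sink modules store a value under its feature label.
    return {spec["label"]: lookup(spec["input"], outputs)
            for spec in config["sink"].values()}


melody = {"pitch": [60, 62, 67, 64]}   # made-up pitch values
features = run(config, melody)
print(features)  # {'pitch_range': 7}

# Write the features as one CSV row, here into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(features))
writer.writeheader()
writer.writerow(features)
print(buffer.getvalue())
```

With several transcriptions, the same run function would be called once per melody and each result written as a further CSV row.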