Transformations

Introduction

Transformations, also known as viewpoints [Conklin2001] or abstractions, are one of the core concepts of MeloSpyLib. Starting from the basic representation of a melody, a sequence of (onset, duration, pitch) triplets, a transformation maps the melody either to a sequence or to an object of fixed dimension (scalar, vector, matrix). The first kind produces sequences whose length depends on the length of the original melody. For example, the duration class transformation first ‘forgets’ pitch and onsets and then classifies each duration into one of five classes. The second kind produces objects of fixed dimension, e.g., a scalar (single value) or a vector (multiple values). For instance, the number of tones in a melody is a scalar, whereas the relative frequencies of the tonal pitch classes form a 12-dimensional vector. Transformations of the second kind are usually called ‘features’ in the literature.

However, the distinction between features and transformations cannot and should not be made fully rigid, mostly for practical reasons. Some of the transformations provided by MeloSpyLib are actually features in the above sense, while some of the pre-defined features in the Feature Definition Files are more accurately classified as transformations (of the first kind). This documentation therefore uses a practical definition: transformations (of the first or second kind) provided by MeloSpyLib at the library level are called transformations, whereas features are those defined in Feature Definition Files. A transformation always forms the first element (the source) of a feature.
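
The two kinds of transformation can be illustrated with a short, self-contained Python sketch. It does not use the MeloSpyLib API: the function names and the toy melody are made up for illustration, and absolute pitch classes stand in for tonal pitch classes so that no key annotation is needed.

```python
from collections import Counter

# Toy melody as (onset in s, duration in s, MIDI pitch) triplets.
melody = [(0.0, 0.4, 60), (0.5, 0.4, 64), (1.0, 0.9, 67), (2.0, 0.5, 60)]


def pitch_sequence(events):
    """First kind: a sequence with one value per event (length N)."""
    return [pitch for _, _, pitch in events]


def number_of_tones(events):
    """Second kind: a scalar, independent of how the melody unfolds."""
    return len(events)


def pitch_class_frequencies(events):
    """Second kind: a fixed 12-dimensional vector of relative frequencies."""
    if not events:
        return [0.0] * 12
    counts = Counter(pitch % 12 for _, _, pitch in events)
    return [counts[pc] / len(events) for pc in range(12)]


print(pitch_sequence(melody))              # [60, 64, 67, 60]
print(number_of_tones(melody))             # 4
print(pitch_class_frequencies(melody)[0])  # 0.5
```

The sequence grows with the melody, while the scalar and the 12-element vector keep their dimension regardless of the melody's length.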

A large share of the transformations are based on a single dimension of the original core representation, e.g., they use only the pitch or the onset dimension. Other transformations are based not only on the core representation of a melody but also rely on annotations or metadata, and are therefore not fully general. For example, chordal pitch class requires a chord context for each tone. For the jazz solo data in the Weimar Jazz Database, all annotations are guaranteed to be present. For the EsAC database, only phrase and metrical annotations as well as metadata are provided, whereas chord, chorus, form, and beat annotations are missing.

In this document, transformations that are based entirely on annotations are called pseudo-transformations, whereas transformations of fixed length are called degenerated.

Available transformations

The first column in the table holds the name of the transformation. The second column indicates the length of the result: N always means the length of the original melody; ‘Var’ means that the length of the resulting sequence is a non-trivial function of the length of the original sequence; concrete integers denote degenerated transformations, which yield a fixed scalar, vector or matrix of the indicated length. The third column gives the type of the elements of the resulting sequence, e.g., Real, Integer, Char, String, along with possible constraints on the value set. The fourth column gives the abbreviation(s) of the transformation to be used in a feature definition file. The fifth column lists optional parameters, and the sixth column indicates the requirements, i.e., the necessary annotations.


Transformation | Length | Type | Abbreviation(s) | Optional Parameter | Requirements
MIDI Pitch | N | Real (Integer) | pitch | – | None
Absolute Pitch Class | N | Integer [0:11] | pc, pitch-class | – | None
Chordal Pitch Class | N | Integer [0:11] | cpc, chordal-pitch-class | KeepNC | Chord annotation
Chordal Diatonic Pitch Class | N | Char (1,2,3,4,5,6,7,<,>,T,L) | cdpc, chord-dpc, chordal-dpc, chordal-diatonic-pitch-class | KeepNC | Chord annotation
Extended Chordal Diatonic Pitch Class | N | Char (1, 2, 3, 4, 5, 6, 7, <, >, T, L, %, \) | cdpcx, chord-dpc-x, chordal-dpc-ext, chordal-diatonic-pitch-class-ext | KeepNC | Chord annotation
Tonal Pitch Class | N | Integer [0:11] | tpc, tonal-pitch-class | – | Key annotation
Tonal Diatonic Pitch Class | N | Char (1,2,3,4,5,6,7,<,>,T,L) | tdpc, tonal-dpc, tonal-diatonic-pitch-class | – | Key annotation
Semitone Intervals | N-1 | Integer | interval | – | None
Fuzzy Intervals (Refined Contour) | N-1 | Integer [-4:4] | fuzzyinterval | – | None
Parsons Code (Contour) | N-1 | Integer [-1:+1] | parsons | – | None
Huron Contour | Var | Integer [0:8] or labels hor, hor-desc, hor-asc etc. | huroncontour | code, redcode | None
Beat Track | Var | Real | beats, beattrack | – | Beat annotation
Chords | Var | String (Chord label) | ch, chords | types | Chord annotation
Chord Events | N | String (Chord label) | che, chord-events | – | Chord annotation
Phrase IDs | N | Integer | phrid, phrase-ids, phraseids | – | Phrase annotation
Phrase Boundary Markers | N | Integer [0, 1] | phrbd, phrase-boundaries, phrasebounds | – | Phrase annotation
Chorus IDs | N | Integer | chorusid, chorus-ids, chorusids | – | Chorus annotation
Metadata | 1 | String | meta | – | Metadata
Onsets | N | Real | onsets | – | None
Inter-onset Intervals | N-1 | Real | ioi | – | None
Duration | N | Real | duration | – | None
Duration Tatum | N | Integer | durtatum | – | Metrical annotation
Duration Classes | N | Integer [-2:2] | durclass | abs, rel | None
Inter-onset Interval Classes | N | Integer [-2:2] | ioiclass | abs, rel | None
Total Duration | 1 | Float | total-duration | – | None
Total Metrical Duration | 1 | Float | total-metrical-duration | – | Metrical annotation
Metrical Position | N | String (Metrical position notation) | meter | – | Metrical annotation
Metrical Position Decimal | N | Real | meter-decimal | – | Metrical annotation
Metrical Circle Map | N | Integer | mcm | <N> | Metrical annotation
Bar Numbers | N | Integer | bars | – | Metrical annotation
Beat Indices | N | Integer | beats | – | Metrical annotation
Tatum Indices | N | Integer | tatums | – | Metrical annotation
Metrical Weights | N | Integer [0:2] | metricalweights, weights, mw | – | Metrical annotation
Swing Ratio | Var | Real [0,1] | swing-ratios | – | Metrical annotation
Accents | N | Integer 0,1/Real [0,1] | accent<-TYPE> | <TYPE> | Var

MIDI Pitch

  • Abbreviation: pitch
  • Length: N
  • Type: Integer (Real) [0:127]
  • Optional Parameter: –
  • Requires: –

Raw MIDI pitches of a melody (integers). All rhythm information is discarded.

Absolute Pitch Class

  • Abbreviations: pc, pitch-class
  • Length: N
  • Type: Integer [0:11]
  • Optional Parameter: –
  • Requires: –

MIDI pitch disregarding octave position (mod 12), with C = 0, C# = 1, D = 2, …, B = 11.

Chordal Pitch Class

  • Abbreviation: cpc, chord-pc, chordal-pc, chordal-pitch-class
  • Length: N
  • Type: Integer [-1:11]
  • Optional Parameter: keepNC
  • Requires: Chord annotation

As Absolute Pitch Class, but with the root of the current chord context mapped to 0. Events without chord annotation are left out unless the optional parameter keepNC is set, in which case they are represented by the value -1.
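
A minimal sketch of this mapping in plain Python follows. It is not the MeloSpyLib implementation: the function name, the input format (one chord root pitch class per event, or None where no chord is annotated) and the keyword keep_nc, which mirrors keepNC, are assumptions made for illustration.

```python
def chordal_pitch_classes(pitches, chord_roots, keep_nc=False):
    """Pitch classes relative to the root of the chord sounding at each event.

    pitches     -- MIDI pitches, one per event
    chord_roots -- root pitch class (0-11) per event, or None if the event
                   has no chord annotation (hypothetical input format)
    keep_nc     -- if True, keep unannotated events and mark them with -1
    """
    result = []
    for pitch, root in zip(pitches, chord_roots):
        if root is None:
            if keep_nc:
                result.append(-1)
            continue  # without keep_nc the event is dropped
        result.append((pitch - root) % 12)
    return result


pitches = [60, 64, 67, 70]
roots = [0, 0, None, 7]  # C, C, no chord, G
print(chordal_pitch_classes(pitches, roots))                # [0, 4, 3]
print(chordal_pitch_classes(pitches, roots, keep_nc=True))  # [0, 4, -1, 3]
```

With a root of 0 for every event, the result coincides with the absolute pitch class.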

Chordal Diatonic Pitch Class

  • Abbreviation: cdpc, chordal-diatonic-pitch-class
  • Length: N
  • Type: Char (1,2,3,4,5,6,7,<,>,T,L)
  • Optional Parameter: keepNC
  • Requires: Chord annotation

As Chordal Pitch Class, but with a diatonic naming scheme:

Symbol Chord tone
1 root
2 major/minor second
3 third:
B minor third over major chord
> major third over minor chord
4 fourth
5 fifth
T tritone
7 seventh
6 major/minor sixth
< minor seventh over major-seventh chord
L major seventh over minor-seventh chord

Events without chord annotation are left out unless the optional parameter keepNC is set, in which case they are represented by their raw MIDI pitch.

Extended Chordal Diatonic Pitch Class

  • Abbreviation: cdpcx, chord-dpc-x, chordal-dpc-ext, chordal-diatonic-pitch-class-ext
  • Length: N
  • Type: Char (1,2,\,3,4,5,6,%,7,<,>,T,L)
  • Optional Parameter: keepNC
  • Requires: Chord annotation

As Chordal Diatonic Pitch Class, but with two more classes:

Symbol Chord tone
1 root
\ minor second
2 major second
3 third:
B minor third over major chord
> major third over minor chord
4 fourth
5 fifth
T tritone
% minor sixth
6 major sixth
7 seventh
< minor seventh over maj-seventh chord
L major seventh over minor-seventh chord

Events without chord annotation are left out unless the optional parameter keepNC is set, in which case they are represented by their raw MIDI pitch.

Tonal Pitch Class

  • Abbreviation: tpc, tonal-pc, tonal-pitch-class
  • Length: N
  • Type: Integer [0:11]
  • Optional Parameter: –
  • Requires: Key annotation

As Chordal Pitch Class, but with the local key as reference instead of local chords.

Tonal Diatonic Pitch Class

  • Abbreviation: tdpc, tonal-dpc, tonal-diatonic-pitch-class
  • Length: N
  • Type: Char (1,2,3,4,5,6,7,<,>,T,L)
  • Optional Parameter: –
  • Requires: Key annotation

As Chordal Diatonic Pitch Class, but with the local key as reference instead of local chords.

Semitone Intervals

  • Abbreviation: interval
  • Length: N-1
  • Type: Integer
  • Optional Parameter: –
  • Requires: –

Semitone intervals between subsequent pitches. All rhythm information is discarded.
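
As a minimal illustration in plain Python (the function name is made up and not part of MeloSpyLib), the intervals are the differences between consecutive MIDI pitches, so a melody with N tones yields N-1 values:

```python
def semitone_intervals(pitches):
    """Signed semitone differences between consecutive MIDI pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


print(semitone_intervals([60, 62, 62, 57]))  # [2, 0, -5]
```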

Fuzzy Intervals (Refined Contour)

  • Abbreviation: fuzzy-interval
  • Length: N-1
  • Type: Integer [-4:+4]
  • Optional Parameter: –
  • Requires: –

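Fuzzy intervals between subsequent pitches. All rhythm information is discarded. Fuzzy interval (or refined contour) values come from a simple classification of interval sizes into five magnitude classes (repetition, step, leap, jump, big jump); each non-zero class occurs in an upward and a downward variant, giving nine values in total: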

Interval class name | Class borders | Numerical representation
big jump down | < -7 | -4
jump down | [-7:-5] | -3
leap down | [-4:-3] | -2
step down | [-2:-1] | -1
repetition/unison | 0 | 0
step up | [+1:+2] | +1
leap up | [+3:+4] | +2
jump up | [+5:+7] | +3
big jump up | > +7 | +4
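
The class borders in the table translate directly into a small classifier. The following plain-Python sketch is for illustration only; the function names are assumptions and not part of MeloSpyLib, and integer semitone intervals are assumed.

```python
def fuzzy_interval(interval):
    """Map a signed semitone interval to its fuzzy interval class (-4..+4)."""
    size = abs(interval)
    if size == 0:
        magnitude = 0  # repetition/unison
    elif size <= 2:
        magnitude = 1  # step
    elif size <= 4:
        magnitude = 2  # leap
    elif size <= 7:
        magnitude = 3  # jump
    else:
        magnitude = 4  # big jump
    return magnitude if interval >= 0 else -magnitude


def fuzzy_intervals(pitches):
    """Fuzzy interval classes between consecutive MIDI pitches (length N-1)."""
    return [fuzzy_interval(b - a) for a, b in zip(pitches, pitches[1:])]


# Intervals are 2, 0, -5, 12, -3 semitones.
print(fuzzy_intervals([60, 62, 62, 57, 69, 66]))  # [1, 0, -3, 4, -2]
```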

Parsons Code (Contour)

  • Abbreviation: parsons
  • Length: N-1
  • Type: Integer [-1:+1]
  • Optional Parameter: –
  • Requires: –

Parsons code (often also called contour) encodes the interval direction in three distinct classes:

Interval class name | Class borders | Numerical representation
down (D) | <0 | -1
repetition/unison (R) | 0 | 0
up (U) | >0 | 1
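
In other words, Parsons code is the sign of each semitone interval. A minimal plain-Python sketch follows; the function name and the as_letters option, which emits the D/R/U letters from the table instead of numbers, are assumptions for illustration and not MeloSpyLib parameters.

```python
def parsons_code(pitches, as_letters=False):
    """Direction of each interval between consecutive pitches: -1, 0 or 1."""
    signs = [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]
    if as_letters:
        # Index 0 -> D (down), 1 -> R (repetition), 2 -> U (up).
        return "".join("DRU"[s + 1] for s in signs)
    return signs


print(parsons_code([60, 62, 62, 57]))                   # [1, 0, -1]
print(parsons_code([60, 62, 62, 57], as_letters=True))  # URD
```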

Huron Contour

  • Abbreviation: huroncontour
  • Length: 1
  • Type: Integer [0:8] or labels
  • Optional Parameter: code, redcode
  • Requires: –

Huron contour is defined for sequences of pitch values and is determined from three reference pitches: the first pitch, the last pitch, and the average of the pitches in between. The first pitch is compared with the average, and the average with the last pitch. Each comparison has three possible outcomes (greater (+), lesser (-) or equal (0)), which yields nine possible combinations. These are coded as nine different contour values, as listed in the table Huron Contour Codes. (The numerical values are calculated using the formula 3c_1 + c_2, where the c_i are the comparison values coded as - -> -1, 0 -> 0, + -> 1.) By default, numerical values are emitted; if the optional parameter code is set, labels are emitted instead. There is also a reduced version of the code, which maps the mixed horizontal classes to their ascending or descending part and can be selected with the optional parameter redcode.

Huron Contour Codes
Relation | Full name | Contour code | Reduced code | Numerical value
(0,0) | Horizontal | hor | hor | 0
(0,+) | Horizontal-ascending | hor-asc | asc | 1
(0,-) | Horizontal-descending | hor-desc | desc | -1
(+,0) | Ascending-horizontal | asc-hor | asc | 3
(-,0) | Descending-horizontal | desc-hor | desc | -3
(+,+) | Ascending | asc | asc | 4
(-,-) | Descending | desc | desc | -4
(+,-) | Convex | convex | convex | 2
(-,+) | Concave | concave | concave | -2
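
The computation can be sketched in plain Python as follows. This is an illustration, not the MeloSpyLib implementation, and it makes several assumptions. The comparison directions are taken as c_1 = sign(average - first) and c_2 = sign(last - average), chosen so that (+,+) comes out as ascending. The keyword arguments code and redcode only mimic the optional parameters. Sequences with fewer than three pitches are rejected, because the text does not say how they are handled.

```python
from statistics import mean

CODES = {
    (0, 0): "hor", (0, 1): "hor-asc", (0, -1): "hor-desc",
    (1, 0): "asc-hor", (-1, 0): "desc-hor",
    (1, 1): "asc", (-1, -1): "desc",
    (1, -1): "convex", (-1, 1): "concave",
}
# Mixed horizontal classes collapse to their ascending/descending part.
REDUCED = {"hor-asc": "asc", "asc-hor": "asc",
           "hor-desc": "desc", "desc-hor": "desc"}


def sign(x):
    return (x > 0) - (x < 0)


def huron_contour(pitches, code=False, redcode=False):
    """Huron contour of a pitch sequence as a number (3*c1 + c2) or label."""
    if len(pitches) < 3:
        raise ValueError("this sketch needs at least three pitches")
    first, last = pitches[0], pitches[-1]
    middle = mean(pitches[1:-1])  # average of the pitches in between
    c1 = sign(middle - first)
    c2 = sign(last - middle)
    if code or redcode:
        label = CODES[(c1, c2)]
        return REDUCED.get(label, label) if redcode else label
    return 3 * c1 + c2


print(huron_contour([60, 64, 67, 62]))             # 2
print(huron_contour([60, 64, 67, 62], code=True))  # convex
print(huron_contour([60, 60, 64], redcode=True))   # asc
```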

Beat Track

  • Abbreviation: beats, beattrack
  • Length: Var
  • Type: Real
  • Optional Parameter: –
  • Requires: Beat annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from an annotated beat track, if available.

Chords

  • Abbreviation: ch, chords
  • Length: Var
  • Type: String (Chord labels, see Chords)
  • Optional Parameter: type
  • Requires: Chord annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated chord symbols. This is the raw list of chords with no obvious relation to the events. If the optional parameter type is set, only the basic chord type will be used.

Chord Events

  • Abbreviation: che, chord-events
  • Length: N
  • Type: String (Chord labels)
  • Optional Parameter: –
  • Requires: Chord annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated chord symbols. This is a list of chords of the same length as the melody, in which the chord context of each event is returned.

Phrase IDs

  • Abbreviation: phrid, phrase-ids, phraseids
  • Length: N
  • Type: Integer
  • Optional Parameter: –
  • Requires: Phrase annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated phrase boundaries. Phrases are enumerated starting with index 1. For each event the ID of the phrase containing the event is returned.

Phrase Boundary Markers

  • Abbreviation: phrbd, phrase-boundaries, phrasebounds
  • Length: N
  • Type: Integer [0, 1] (Boolean)
  • Optional Parameter: –
  • Requires: Phrase annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated phrase boundaries. Each event is mapped to 1 if it starts a phrase, or to 0 if it is an inner or final element of a phrase.
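
The relation between phrase IDs and phrase boundary markers can be sketched in plain Python. The function below is hypothetical and not part of MeloSpyLib; it assumes that a per-event list of phrase IDs, as described for the Phrase IDs transformation, is already available.

```python
def phrase_boundary_markers(phrase_ids):
    """Return 1 for each event that starts a new phrase, otherwise 0."""
    markers = []
    previous = None
    for phrase_id in phrase_ids:
        markers.append(1 if phrase_id != previous else 0)
        previous = phrase_id
    return markers


print(phrase_boundary_markers([1, 1, 1, 2, 2, 3]))  # [1, 0, 0, 1, 0, 1]
```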

Chorus IDs

  • Abbreviation: chorusid, chorus-ids, chorusids
  • Length: N
  • Type: Integer
  • Optional Parameter: –
  • Requires: Chorus annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated choruses. Choruses are enumerated starting with index 1. For each event the ID of the chorus containing the event is returned.

Metadata

  • Abbreviation: meta.<FIELD>
  • Length: 1
  • Type: String
  • Optional Parameter: –
  • Requires: Metadata

If metadata is present in the SQL database, individual fields can be accessed with the syntax meta.<FIELD>; e.g., meta.title returns the title as stored in the database.

Onsets

  • Abbreviation: onsets
  • Length: N
  • Type: Real
  • Optional Parameter: –
  • Requires: –

This transformation belongs to the projections, i.e. it is the raw list of onsets of the melody events.

Inter-onset Intervals

  • Abbreviation: ioi
  • Length: N-1
  • Type: Real
  • Optional Parameter: –
  • Requires: –

This transformation returns inter-onset intervals, i.e. the differences between consecutive onsets of the melody events.

Duration

  • Abbreviation: duration
  • Length: N
  • Type: Real
  • Optional Parameter: –
  • Requires: –

This transformation belongs to the projections, i.e. it is the raw list of durations of the melody events.

Duration Tatum

  • Abbreviation: durtatum
  • Length: N
  • Type: Integer
  • Optional Parameter: –
  • Requires: Metrical annotation

Duration Classes

  • Abbreviation: durclass
  • Length: N
  • Type: Integer [-2:+2]
  • Optional parameter: abs or rel
  • Requires: –

Durations are mapped to duration classes by comparing them to a reference duration. There are two modes, selected by an optional parameter: relative (rel) and absolute (abs). In relative mode, the reference duration is the duration of the surrounding beat; in absolute mode, a fixed value of 500 ms is used, which corresponds to the beat duration at a tempo of 120 bpm. The class definitions are given in the table Duration/IOI classes, expressed as percentages of the reference duration.

Duration/IOI classes
Class name | Class borders | Numerical value
very short | <35% | -2
short | 35%-70% | -1
medium | 70%-140% | 0
long | 140%-280% | 1
very long | >280% | 2

Inter-onset Interval Classes

  • Abbreviation: ioiclass
  • Length: N-1
  • Type: Integer [-2:+2]
  • Optional parameter: abs or rel
  • Requires: –

Inter-onset intervals are mapped to IOI classes by comparing them to a reference duration. There are two modes, selected by an optional parameter: relative (rel) and absolute (abs). In relative mode, the reference duration is the duration of the surrounding beat; in absolute mode, a fixed value of 500 ms is used, which corresponds to the beat duration at a tempo of 120 bpm. The class definitions are given in the table Duration/IOI classes.
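
The classification used for both duration and IOI classes can be sketched in plain Python. The function names are assumptions and not part of MeloSpyLib. The table does not state which class a value exactly on a border belongs to, so the border handling below (35% and 70% go to the higher class, 140% and 280% to the lower class) is an arbitrary choice for this sketch. The default reference of 0.5 s corresponds to absolute mode; for relative mode the local beat duration would be passed instead.

```python
def duration_class(duration, reference=0.5):
    """Classify a duration or IOI (in seconds) relative to a reference."""
    ratio = duration / reference
    if ratio < 0.35:
        return -2  # very short
    if ratio < 0.70:
        return -1  # short
    if ratio <= 1.40:
        return 0   # medium
    if ratio <= 2.80:
        return 1   # long
    return 2       # very long


def ioi_classes(onsets, reference=0.5):
    """Classes of the inter-onset intervals of a list of onsets (in seconds)."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    return [duration_class(ioi, reference) for ioi in iois]


print([duration_class(d) for d in (0.1, 0.25, 0.5, 1.0, 2.0)])  # [-2, -1, 0, 1, 2]
print(ioi_classes([0.0, 0.5, 0.625, 2.5]))                      # [0, -2, 2]
```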

Total Duration

  • Abbreviation: total-duration
  • Length: 1
  • Type: Float
  • Optional parameter: –
  • Requires: –

This degenerated transformation retrieves the duration of a melody (in seconds).

Total Metrical Duration

  • Abbreviation: total-metrical-duration
  • Length: 1
  • Type: Float
  • Optional parameter: –
  • Requires: Metrical annotation

This degenerated transformation retrieves the duration of a melody (in fractional metrical units, where each numerical unit represents one bar).

Metrical Position

  • Abbreviation: meter
  • Length: N
  • Type: String (Metrical position notation)
  • Optional parameter: –
  • Requires: Metrical annotation

Bar Numbers

  • Abbreviation: bars
  • Length: N
  • Type: Integer
  • Optional parameter: –
  • Requires: Metrical annotation

Beat Indices

  • Abbreviation: beats
  • Length: N
  • Type: Integer
  • Optional parameter: –
  • Requires: Metrical annotation

Tatum Indices

  • Abbreviation: tatum
  • Length: N
  • Type: Integer
  • Optional parameter: –
  • Requires: Metrical annotation

Metrical Position Decimal

  • Abbreviation: meter-decimal
  • Length: N
  • Type: Real
  • Optional parameter: –
  • Requires: Metrical annotation

Metrical Inter-onset Intervals Decimal

  • Abbreviation: meter-ioi-decimal
  • Length: N-1
  • Type: Integer
  • Optional parameter: –
  • Requires: Metrical annotation

Calculates inter-onset intervals in decimal metrical units, where each numerical unit corresponds to one bar.

Metrical Circle Map

  • Abbreviation: mcm
  • Length: N
  • Type: Integer
  • Optional parameter: <N>
  • Requires: Metrical annotation

The optional parameter <N> sets the number of segments into which the circle representing one metrical bar is divided (not to be confused with the melody length N); the default is 48.

Metrical Weights

  • Abbreviation: metricalweights, weights, mw
  • Length: N
  • Type: Integer [0:2]
  • Optional parameter: –
  • Requires: Metrical annotation

Gives 2 for primary and secondary accents in the bar, 1 for all other beat positions, and 0 for every sub-beat position.

Swing Ratio

  • Abbreviation: swing-ratios
  • Length: Var
  • Type: Real
  • Optional parameter: –
  • Requires: Metrical annotation

Accents

  • Abbreviation: accent-<TYPE> or accent with optional parameter
  • Length: N
  • Type: Integer, Real [0,1]
  • Optional parameter: <TYPE>
  • Requires: Variable

Accents are mostly equivalent to boolean structural markers, which are True (1) if a structural condition is met and False (0) if not. The sole exception is accent-thom, which yields a probability value between 0 and 1. The labels in the following list need to be substituted for <TYPE> in accent-<TYPE>, e.g., accent-longpr or accent-long2pr.

Label | Type | Length | Condition | Requirement
longpr | Integer [0:1] | N | True for tones with a longer IOI than the previous tone. | –
long2pr | Integer [0:1] | N | True for tones with an IOI at least twice as long as that of the previous tone. | –
longmod | Integer [0:1] | N | True for tones with a longer IOI than the mode IOI of all tones. | –
long2mod | Integer [0:1] | N | True for tones with an IOI at least twice as long as the mode IOI of all tones. | –
longmod_win5 | Integer [0:1] | N | True for tones with an IOI that is at least 41% longer than the mean of the past 5 IOIs. | –
long2mod_win5 | Integer [0:1] | N | True for tones with an IOI that is at least twice as long as the mean of the past 5 IOIs. | –
longpr_abs | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the previous one. | –
long2pr_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes away from that of the previous one. | –
longmod_abs | Integer [0:1] | N | True for tones with a higher/longer IOI class than the most frequent (mode) IOI class. | –
long2mod_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes away from the most frequent (mode) IOI class. | –
longmod_win5_abs | Integer [0:1] | N | True for tones with an IOI class longer than the mode IOI class of the past 5 IOIs. | –
long2mod_win5_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes longer than the mode IOI class of the past 5 IOIs. | –
longpr_rel | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the previous one. | –
long2pr_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes away from that of the previous one. | –
longmod_rel | Integer [0:1] | N | True for tones with a higher/longer IOI class than the most frequent (mode) IOI class. | –
long2mod_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes away from the most frequent (mode) IOI class. | –
longmod_win5_rel | Integer [0:1] | N | True for tones with an IOI class longer than the mode IOI class of the past 5 IOIs. | –
long2mod_win5_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes longer than the mode IOI class of the past 5 IOIs. | –
triad | Integer [0:1] | Var | True for chord tones (excluding upper structures). | Chord annotation
inchord | Integer [0:1] | Var | True for chord tones (including upper structures). | Chord annotation
outchord | Integer [0:1] | Var | True for non-chord tones. | Chord annotation
jumpaft3 | Integer [0:1] | N | True for tones following a large pitch jump of at least 3 semitones (either direction). | –
jumpaft4 | Integer [0:1] | N | True for tones following a large pitch jump of at least 4 semitones (either direction). | –
jumpaft5 | Integer [0:1] | N | True for tones following a large pitch jump of at least 5 semitones (either direction). | –
jumpbef3 | Integer [0:1] | N | True for tones before a large pitch jump of at least 3 semitones (either direction). | –
jumpbef4 | Integer [0:1] | N | True for tones before a large pitch jump of at least 4 semitones (either direction). | –
jumpbef5 | Integer [0:1] | N | True for tones before a large pitch jump of at least 5 semitones (either direction). | –
jumpbea3 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 3 semitones (either direction). | –
jumpbea4 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 4 semitones (either direction). | –
jumpbea5 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 5 semitones (either direction). | –
jumploc | Integer [0:1] | N | True for tones after a pitch interval that is at least 1 semitone larger than the previous interval. | –
jumploc2 | Integer [0:1] | N | True for tones after a pitch interval that is at least 2 semitones larger than the previous interval. | –
thom | Real [0,1] | N | Value according to Thomassen’s algorithm (1982), which is based on the seven possible pitch direction patterns that can be formed by 2-interval chains (3-note patterns). Values are not binary, but probabilities. | –
thom_thr | Integer [0:1] | N | Thresholded version of Thomassen accents, True if accent-thom > 0.75. | –
beat1 | Integer [0:1] | N | True for the primary accent (first beat) of a bar. | Metrical annotation
beat3 | Integer [0:1] | N | True for the secondary accent of a bar, if present, e.g., on the 3rd beat of a 4/4 measure. | Metrical annotation
beat13 | Integer [0:1] | N | True for primary and secondary accents of a bar. | Metrical annotation
beatall | Integer [0:1] | N | True for all beat positions in a bar. | Metrical annotation
sync1 | Integer [0:1] | N | True for all syncopations right before the primary accent in a bar (‘anticipated 1’). | Metrical annotation
sync3 | Integer [0:1] | N | True for all syncopations right before the secondary accent in a bar (‘anticipated 3’). | Metrical annotation
sync13 | Integer [0:1] | N | True for all syncopations right before primary and secondary accents in a bar (‘anticipated 1s and 3s’). | Metrical annotation
sync1234 | Integer [0:1] | N | True for all syncopations right before all beats in a bar. | Metrical annotation
syncall | Integer [0:1] | N | True for all syncopations on all sub-beat metrical levels (i.e., excluding the half-beat level). | Metrical annotation
pextrem | Integer [0:1] | N | True for pitch extrema (no restrictions). | –
pextrmf | Integer [0:1] | Var | True for pitch extrema (excluding proper cambiata). | –
pextrst | Integer [0:1] | N | True for pitch extrema (sensu Steinbeck, with at least two intervals before and after the extremum leading strictly to it). | –
phrasbeg | Integer [0:1] | N | True for the first note in a phrase. | Phrase annotation
phrasend | Integer [0:1] | N | True for the last note in a phrase. | Phrase annotation
phrasbor | Integer [0:1] | N | True for the first and last note in a phrase. | Phrase annotation
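
To make the idea of a boolean structural marker concrete, the following plain-Python sketch computes jump accents in the spirit of jumpaft<k> and jumpbef<k>. It is an illustration, not the MeloSpyLib implementation: the function name and its parameters are assumptions, and tones without a preceding (or following) interval are simply marked 0.

```python
def jump_accents(pitches, threshold=3, position="after"):
    """Mark tones (0/1) adjacent to a pitch jump of at least `threshold` semitones.

    position -- "after" marks the tone following the jump (cf. jumpaft),
                "before" marks the tone preceding it (cf. jumpbef)
    """
    if position not in ("after", "before"):
        raise ValueError("position must be 'after' or 'before'")
    accents = [0] * len(pitches)
    for i in range(1, len(pitches)):
        if abs(pitches[i] - pitches[i - 1]) >= threshold:
            accents[i if position == "after" else i - 1] = 1
    return accents


pitches = [60, 62, 67, 66, 60]  # intervals: 2, 5, -1, -6
print(jump_accents(pitches, threshold=3))                     # [0, 0, 1, 0, 1]
print(jump_accents(pitches, threshold=3, position="before"))  # [0, 1, 0, 1, 0]
```

The result always has the same length N as the melody, matching the Length column of the table.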

References

[Conklin2001] Conklin, Darrell & Christina Anagnostopoulou (2001). Representation and discovery of multiple viewpoint patterns. Proceedings of the 2001 International Computer Music Conference. San Francisco: ICMA.