Transformations

Introduction

Transformations, also known as viewpoints [Conklin2001] or abstractions, are one of the core concepts in MeloSpyLib. Starting from the basic representation of a melody, which is a sequence of (onset, duration, pitch) triplets, transformations are mappings either to sequences or to objects of fixed dimension (scalars, vectors, matrices). The first kind produces sequences of variable length, depending on the length of the original melody. For example, the duration class transformation first ‘forgets’ about pitch and onsets and then classifies each duration into one of five classes. The second kind produces objects of fixed dimension, e.g., a scalar (single value) or a vector (multiple values). For instance, the number of tones in a melody is a scalar feature, whereas the relative frequency of tonal pitch classes can be seen as a fixed 12-dimensional vector.

Transformations of the second kind are usually called ‘features’ in the literature. However, the distinction between features and transformations cannot and should not be made fully rigid, mostly for practical reasons. Some of the transformations provided by MeloSpyLib are actually features in the above sense; conversely, some of the predefined features in the Feature Definition Files are more accurately classified as transformations (of the first kind). This documentation therefore uses a more practical definition: transformations (of the first or second kind) provided by MeloSpyLib at the library level are called transformations, whereas features are those defined in Feature Definition Files. A transformation always forms the first element (the source) of a feature.

A large share of the transformations are based on a single dimension of the core representation, e.g., they use only the pitch or the onset dimension. Other transformations are based not only on the core representation of a melody but also rely on additional annotations or metadata; hence, they are not fully general. For example, chordal pitch class requires a chord context for each tone. For the jazz solo data in the Weimar Jazz Database, all annotations are guaranteed to be present. For the EsAC database, only phrase and metrical annotations as well as metadata are provided, whereas chord, chorus, form, and beat annotations are missing.

In this document, transformations that are entirely based on annotations are called pseudo-transformations, whereas transformations of fixed length are called degenerate.

Available transformations

The first column of the table holds the name of the transformation. The second column indicates the length of the result. N always denotes the length of the original melody. ‘Var’ means that the length of the resulting sequence is a non-trivial function of the length of the original sequence. Concrete integers indicate degenerate transformations, which result in a fixed scalar, vector, or matrix of the indicated length. The third column gives the type of the elements of the resulting sequence, e.g., Real, Integer, Char, or String, along with possible constraints on the value set. The fourth column gives the abbreviation(s) of the transformation to be used in a feature definition file. The fifth column lists optional parameters, if any. The sixth column indicates the necessary requirements, i.e., annotations.


| Transformation | Length | Type | Abbreviation(s) | Optional Parameter | Requirements |
|---|---|---|---|---|---|
| MIDI Pitch | N | Real (Integer) | pitch | – | None |
| Absolute Pitch Class | N | Integer [0:11] | pc, pitch-class | – | None |
| Chordal Pitch Class | N | Integer [0:11] | cpc, chordal-pitch-class | KeepNC | Chord annotation |
| Chordal Diatonic Pitch Class | N | Char (1,2,3,4,5,6,7,<,>,T,L) | cdpc, chord-dpc, chordal-dpc, chordal-diatonic-pitch-class | KeepNC | Chord annotation |
| Extended Chordal Diatonic Pitch Class | N | Char (1, 2, 3, 4, 5, 6, 7, <, >, T, L, %, \) | cdpcx, chord-dpc-x, chordal-dpc-ext, chordal-diatonic-pitch-class-ext | KeepNC | Chord annotation |
| Tonal Pitch Class | N | Integer [0:11] | tpc, tonal-pitch-class | – | Key annotation |
| Tonal Diatonic Pitch Class | N | Char (1,2,3,4,5,6,7,<,>,T,L) | tdpc, tonal-dpc, tonal-diatonic-pitch-class | – | Key annotation |
| Semitone Intervals | N-1 | Integer | interval | – | None |
| Fuzzy Intervals (Refined Contour) | N-1 | Integer [-4:4] | fuzzyinterval | – | None |
| Parsons Code (Contour) | N-1 | Integer [-1:+1] | parsons | – | None |
| Huron Contour | Var | Integer [0:8] or labels hor, hor-desc, hor-asc etc. | huroncontour | code, redcode | None |
| Beat Track | Var | Real | beats, beattrack | – | Beat annotation |
| Chords | Var | String (Chord label) | ch, chords | types | Chord annotation |
| Chord Events | N | String (Chord label) | che, chord-events | – | Chord annotation |
| Phrase IDs | N | Integer | phrid, phrase-ids, phraseids | – | Phrase annotation |
| Phrase Boundary Markers | N | Integer [0, 1] | phrbd, phrase-boundaries, phrasebounds | – | Phrase annotation |
| Chorus IDs | N | Integer | chorusid, chorus-ids, chorusids | – | Chorus annotation |
| Metadata | 1 | String | meta | – | Metadata |
| Onsets | N | Real | onsets | – | None |
| Inter-onset Intervals | N-1 | Real | ioi | – | None |
| Duration | N | Real | duration | – | None |
| Duration Tatum | N | Integer | durtatum | – | Metrical annotation |
| Duration Classes | N | Integer [-2:2] | durclass | abs, rel | None |
| Inter-onset Interval Classes | N-1 | Integer [-2:2] | ioiclass | abs, rel | None |
| Total Duration | 1 | Float | total-duration | – | None |
| Total Metrical Duration | 1 | Float | total-metrical-duration | – | Metrical annotation |
| Metrical Position | N | String (Metrical position notation) | meter | – | Metrical annotation |
| Metrical Position Decimal | N | Real | meter-decimal | – | Metrical annotation |
| Metrical Circle Map | N | Integer | mcm | <N> | Metrical annotation |
| Bar Numbers | N | Integer | bars | – | Metrical annotation |
| Beat Indices | N | Integer | beats | – | Metrical annotation |
| Tatum Indices | N | Integer | tatums | – | Metrical annotation |
| Metrical Weights | N | Integer [0:2] | metricalweights, weights, mw | – | Metrical annotation |
| Swing Ratio | Var | Real [0,1] | swing-ratios | – | Metrical annotation |
| Accents | N | Integer 0,1/Real [0,1] | accent<-TYPE> | <TYPE> | Var |

MIDI Pitch

  • Abbreviation: pitch

  • Length: N

  • Type: Integer (Real) [0:127]

  • Optional Parameter: –

  • Requires: –

Raw MIDI pitches of a melody (integers). All rhythm information is discarded.

Absolute Pitch Class

  • Abbreviations: pc, pitch-class

  • Length: N

  • Type: Integer [0:11]

  • Optional Parameter: –

  • Requires: –

MIDI pitch disregarding octave position (mod 12), with C = 0, C# = 1, D = 2, …, B = 11.
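As a minimal sketch in plain Python (not the MeloSpyLib API; the function name `pitch_classes` is hypothetical), the mapping amounts to a modulo operation on MIDI pitch numbers:

```python
def pitch_classes(midi_pitches):
    """Map MIDI pitches to absolute pitch classes (C = 0, ..., B = 11)."""
    # int() truncates real-valued pitches; this is an assumption of the sketch.
    return [int(p) % 12 for p in midi_pitches]


# C4, C#4, D4, B4, C5
print(pitch_classes([60, 61, 62, 71, 72]))  # [0, 1, 2, 11, 0]
```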

Chordal Pitch Class

  • Abbreviation: cpc, chord-pc, chordal-pc, chordal-pitch-class

  • Length: N

  • Type: Integer [-1:11]

  • Optional Parameter: keepNC

  • Requires: Chord annotation

As Absolute Pitch Class, but with the root of the current chord context mapped to 0. Events without chord annotation will be left out, unless the optional parameter keepNC is set, in which case they will be represented by the value -1.
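The following sketch illustrates the idea under stated assumptions. It is not the MeloSpyLib implementation: the function name, the `keep_nc` argument, and the representation of the chord context as one root pitch class per event (or `None` where no chord is annotated) are all hypothetical.

```python
def chordal_pitch_classes(midi_pitches, chord_roots, keep_nc=False):
    """Pitch classes relative to the root of the current chord.

    chord_roots holds one root pitch class (0-11) per event,
    or None for events without chord annotation (assumed layout).
    """
    result = []
    for pitch, root in zip(midi_pitches, chord_roots):
        if root is None:
            # Unannotated events are dropped unless keep_nc is set.
            if keep_nc:
                result.append(-1)
            continue
        result.append((pitch - root) % 12)
    return result


pitches = [60, 64, 67, 62]
roots = [0, 0, None, 7]  # C, C, no chord, G
print(chordal_pitch_classes(pitches, roots))                # [0, 4, 7]
print(chordal_pitch_classes(pitches, roots, keep_nc=True))  # [0, 4, -1, 7]
```

Note that without `keep_nc` the result can be shorter than the melody, because unannotated events are skipped.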

Chordal Diatonic Pitch Class

  • Abbreviation: cdpc, chordal-diatonic-pitch-class

  • Length: N

  • Type: Char (1,2,3,4,5,6,7,<,>,T,L)

  • Optional Parameter: keepNC

  • Requires: Chord annotation

As Chordal Pitch Class, but with a diatonic naming scheme:

| Symbol | Chord tone |
|---|---|
| 1 | root |
| 2 | major/minor second |
| 3 | third |
| B | minor third over major chord |
| > | major third over minor chord |
| 4 | fourth |
| 5 | fifth |
| 6 | major/minor sixth |
| T | tritone |
| 7 | seventh |
| < | minor seventh over major-seventh chord |
| L | major seventh over minor-seventh chord |

Events without chord annotation are left out unless the optional parameter keepNC is set, in which case they are represented by their raw MIDI pitch.

Extended Chordal Diatonic Pitch Class

  • Abbreviation: cdpcx, chord-dpc-x, chordal-dpc-ext, chordal-diatonic-pitch-class-ext

  • Length: N

  • Type: Char (1,\,2,3,B,>,4,5,%,6,T,7,<,L)

  • Optional Parameter: keepNC

  • Requires: Chord annotation

As Chordal Diatonic Pitch Class, but with two more classes:

| Symbol | Chord tone |
|---|---|
| 1 | root |
| \ | minor second |
| 2 | major second |
| 3 | third |
| B | minor third over major chord |
| > | major third over minor chord |
| 4 | fourth |
| 5 | fifth |
| % | minor sixth |
| 6 | major sixth |
| T | tritone |
| 7 | seventh |
| < | minor seventh over major-seventh chord |
| L | major seventh over minor-seventh chord |

Events without chord annotation are left out unless the optional parameter keepNC is set, in which case they are represented by their raw MIDI pitch.

Tonal Pitch Class

  • Abbreviation: tpc, tonal-pc, tonal-pitch-class

  • Length: N

  • Type: Integer [0:11]

  • Optional Parameter: –

  • Requires: Key annotation

As Chordal Pitch Class, but with the local key as reference instead of local chords.

Tonal Diatonic Pitch Class

  • Abbreviation: tdpc, tonal-dpc, tonal-diatonic-pitch-class

  • Length: N

  • Type: Char (1,2,3,4,5,6,7,<,>,T,L)

  • Optional Parameter: –

  • Requires: Key annotation

As Chordal Diatonic Pitch Class, but with the local key as reference instead of local chords.

Semitone Intervals

  • Abbreviation: interval

  • Length: N-1

  • Type: Integer

  • Optional Parameter: –

  • Requires: –

Semitone intervals between subsequent pitches. All rhythm information is discarded.
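A minimal sketch of this computation in plain Python (the function name is hypothetical and not part of MeloSpyLib): each interval is the difference between a pitch and its predecessor, so a melody of N tones yields N-1 intervals.

```python
def semitone_intervals(midi_pitches):
    """Signed semitone differences between consecutive pitches."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]


print(semitone_intervals([60, 62, 59, 59]))  # [2, -3, 0]
```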

Fuzzy Intervals (Refined Contour)

  • Abbreviation: fuzzy-interval

  • Length: N-1

  • Type: Integer [-4:+4]

  • Optional Parameter: –

  • Requires: –

Fuzzy intervals between subsequent pitches. All rhythm information is discarded. Fuzzy interval (or refined contour) values come from a simple classification of interval sizes into five magnitude classes, which together with the interval direction yields the nine values below:

| Interval class name | Class borders (semitones) | Numerical representation |
|---|---|---|
| big jump down | <-7 | -4 |
| jump down | [-7:-5] | -3 |
| leap down | [-4:-3] | -2 |
| step down | [-2:-1] | -1 |
| repetition/unison | 0 | 0 |
| step up | [+1:+2] | +1 |
| leap up | [+3:+4] | +2 |
| jump up | [+5:+7] | +3 |
| big jump up | >+7 | +4 |
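The class borders above can be sketched as follows. This is an illustrative plain-Python version with hypothetical function names, assuming integer semitone intervals; it is not the MeloSpyLib code.

```python
def fuzzy_interval(semitones):
    """Classify one semitone interval into a fuzzy interval value in [-4, 4]."""
    size = abs(semitones)
    if size == 0:
        magnitude = 0  # repetition
    elif size <= 2:
        magnitude = 1  # step
    elif size <= 4:
        magnitude = 2  # leap
    elif size <= 7:
        magnitude = 3  # jump
    else:
        magnitude = 4  # big jump
    # The sign of the interval carries over to the class value.
    return magnitude if semitones >= 0 else -magnitude


def fuzzy_intervals(midi_pitches):
    """Fuzzy interval values for all consecutive pitch pairs."""
    return [fuzzy_interval(b - a) for a, b in zip(midi_pitches, midi_pitches[1:])]


# Intervals: +2, +3, -5, +12, 0, -9
print(fuzzy_intervals([60, 62, 65, 60, 72, 72, 63]))  # [1, 2, -3, 4, 0, -4]
```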

Parsons Code (Contour)

  • Abbreviation: parsons

  • Length: N-1

  • Type: Integer [-1:+1]

  • Optional Parameter: –

  • Requires: –

Parsons code (often also called contour) encodes the interval direction in three distinct classes:

| Interval class name | Class borders (semitones) | Numerical representation |
|---|---|---|
| down (D) | <0 | -1 |
| repetition/unison (R) | 0 | 0 |
| up (U) | >0 | 1 |
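In other words, Parsons code is the sign of each semitone interval. A minimal sketch in plain Python (hypothetical function name, not the MeloSpyLib API):

```python
def parsons_code(midi_pitches):
    """Direction of each interval: -1 (down), 0 (repetition), 1 (up)."""
    # (b > a) - (b < a) evaluates to the sign of b - a as an integer.
    return [(b > a) - (b < a) for a, b in zip(midi_pitches, midi_pitches[1:])]


print(parsons_code([60, 62, 62, 59]))  # [1, 0, -1]
```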

Huron Contour

  • Abbreviation: huroncontour

  • Length: 1

  • Type: Integer [0:8] or labels

  • Optional Parameter: code, redcode

  • Requires: –

Huron contour is defined for sequences of pitch values and is determined by three reference pitches: the first pitch, the last pitch, and the average of the pitches in between. The first pitch is compared with the average, and the average with the last pitch. Each comparison has three possible outcomes (greater (+), lesser (-), or equal (0)), which yields nine possible combinations. These are coded as nine different contour values, as listed in the table Huron Contour Codes. (The numerical values are calculated using the formula 3c_1 + c_2, where the c_i are the comparison results coded as - → -1, 0 → 0, + → +1.) By default, numerical values are output; if the optional parameter code is set, labels are emitted instead. There is also a reduced version of the code, which maps the mixed horizontal classes to their ascending or descending part; it can be accessed with the optional parameter redcode.

Huron Contour Codes

| Relation | Full name | Contour code | Reduced code | Numerical value |
|---|---|---|---|---|
| (0,0) | Horizontal | hor | hor | 0 |
| (0,+) | Horizontal-ascending | hor-asc | asc | 1 |
| (0,-) | Horizontal-descending | hor-desc | desc | -1 |
| (+,0) | Ascending-horizontal | asc-hor | asc | 3 |
| (-,0) | Descending-horizontal | desc-hor | desc | -3 |
| (+,+) | Ascending | asc | asc | 4 |
| (-,-) | Descending | desc | desc | -4 |
| (+,-) | Convex | convex | convex | 2 |
| (-,+) | Concave | concave | concave | -2 |
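The computation can be sketched in plain Python as follows. This is a hedged illustration, not the MeloSpyLib implementation: the function and parameter names are hypothetical, the sign convention (c_1 is positive when the average lies above the first pitch, c_2 is positive when the last pitch lies above the average) is inferred from the labels in the table, and the sketch simply rejects melodies with fewer than three pitches.

```python
from statistics import mean

# Labels keyed by the numerical value 3*c1 + c2, taken from the table.
HURON_LABELS = {
    0: "hor", 1: "hor-asc", -1: "hor-desc",
    3: "asc-hor", -3: "desc-hor",
    4: "asc", -4: "desc",
    2: "convex", -2: "concave",
}


def _sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def huron_contour(pitches, labels=False):
    """Huron contour of a pitch sequence as a number or a label."""
    if len(pitches) < 3:
        # Assumption of this sketch: at least one pitch in between is needed.
        raise ValueError("need at least three pitches")
    first, last = pitches[0], pitches[-1]
    middle = mean(pitches[1:-1])
    c1 = _sign(middle - first)  # first vs. average
    c2 = _sign(last - middle)   # average vs. last
    value = 3 * c1 + c2
    return HURON_LABELS[value] if labels else value


print(huron_contour([60, 64, 67, 60]))               # 2
print(huron_contour([60, 64, 67, 60], labels=True))  # convex
```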

Beat Track

  • Abbreviation: beats, beattrack

  • Length: Var

  • Type: Real

  • Optional Parameter: –

  • Requires: Beat annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from an annotated beat track - if available.

Chords

  • Abbreviation: ch, chords

  • Length: Var

  • Type: String (chord labels, see Chords)

  • Optional Parameter: type

  • Requires: Chord annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated chord symbols. This is the raw list of chords with no obvious relation to the events. If the optional parameter type is set, only the basic chord type will be used.

Chord Events

  • Abbreviation: che, chord-events

  • Length: N

  • Type: String (Chord labels)

  • Optional Parameter: –

  • Requires: Chord annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated chord symbols. This is a list of chords of the same length as the melody, in which the chord context of each event is returned.

Phrase IDs

  • Abbreviation: phrid, phrase-ids, phraseids

  • Length: N

  • Type: Integer

  • Optional Parameter: –

  • Requires: Phrase annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated phrase boundaries. Phrases are enumerated starting with index 1. For each event the ID of the phrase containing the event is returned.

Phrase Boundary Markers

  • Abbreviation: phrbd, phrase-boundaries, phrasebounds

  • Length: N

  • Type: Integer [0, 1] (Boolean)

  • Optional Parameter: –

  • Requires: Phrase annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated phrase boundaries. Each element is mapped to 1 if it starts a phrase, or to 0 if it is an inner or end element of a phrase.
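The relation between Phrase IDs and Phrase Boundary Markers can be illustrated with a small sketch. It assumes that a per-event list of phrase IDs is already available; the function name is hypothetical and the code is not taken from MeloSpyLib.

```python
def phrase_boundary_markers(phrase_ids):
    """Mark the first event of each phrase with 1, all others with 0."""
    markers = []
    previous = None
    for pid in phrase_ids:
        # A change of phrase ID means this event starts a new phrase.
        markers.append(1 if pid != previous else 0)
        previous = pid
    return markers


print(phrase_boundary_markers([1, 1, 1, 2, 2, 3]))  # [1, 0, 0, 1, 0, 1]
```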

Chorus IDs

  • Abbreviation: chorusid, chorus-ids, chorusids

  • Length: N

  • Type: Integer

  • Optional Parameter: –

  • Requires: Chorus annotation

This transformation is a pseudo-transformation, not generated from the melody events themselves but from annotated choruses. Choruses are enumerated starting with index 1. For each event the ID of the chorus containing the event is returned.

Metadata

  • Abbreviation: meta.<FIELD>

  • Length: 1

  • Type: String

  • Optional Parameter: –

  • Requires: Metadata

If metadata is present in the SQL database, you can access the fields by using the above syntax meta.<FIELD>, e.g., meta.title will return the title as stored in the database.

Onsets

  • Abbreviation: onsets

  • Length: N

  • Type: Real

  • Optional Parameter: –

  • Requires: –

This transformation belongs to the projections, i.e., it is the raw list of onsets of the melody events.

Inter-onset Intervals

  • Abbreviation: ioi

  • Length: N-1

  • Type: Real

  • Optional Parameter: –

  • Requires: –

This transformation returns inter-onset intervals, i.e., the differences between consecutive onsets of the melody events.
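A minimal sketch in plain Python (hypothetical function name, onsets assumed to be given in seconds):

```python
def inter_onset_intervals(onsets):
    """Differences between consecutive onsets; N onsets give N-1 values."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


print(inter_onset_intervals([0.0, 0.5, 1.0, 2.0]))  # [0.5, 0.5, 1.0]
```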

Duration

  • Abbreviation: duration

  • Length: N

  • Type: Real

  • Optional Parameter: –

  • Requires: –

This transformation belongs to the projections, i.e., it is the raw list of durations of the melody events.

Duration Tatum

  • Abbreviation: durtatum

  • Length: N

  • Type: Integer

  • Optional Parameter: –

  • Requires: Metrical annotation

Duration Classes

  • Abbreviation: durclass

  • Length: N

  • Type: Integer [-2:+2]

  • Optional parameter: abs or rel

  • Requires: –

Durations are mapped to duration classes by comparing them to a reference duration. There are two possibilities, specified by an optional parameter: relative (rel) and absolute (abs) mode. In relative mode, the reference duration is the duration of the surrounding beat; in absolute mode, a value of 500 ms is used, which corresponds to the beat duration at a tempo of 120 bpm. The class definitions are listed in the table Duration/IOI classes, expressed as percentages of the reference duration.

Duration/IOI classes

| Class name | Class borders (% of reference duration) | Numerical value |
|---|---|---|
| very short | <35% | -2 |
| short | 35%-70% | -1 |
| medium | 70%-140% | 0 |
| long | 140%-280% | 1 |
| very long | >280% | 2 |
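The classification can be sketched as follows for the absolute mode, where the reference duration is 0.5 s. The function name is hypothetical and the code is not the MeloSpyLib implementation. The table does not say which class a value exactly on a border belongs to; this sketch assigns border values to the longer class, which is an assumption.

```python
def duration_class(duration, reference=0.5):
    """Map a duration (seconds) to a class in [-2, 2] relative to a reference."""
    ratio = duration / reference
    # Border values fall into the longer class (assumption of this sketch).
    if ratio < 0.35:
        return -2  # very short
    if ratio < 0.70:
        return -1  # short
    if ratio < 1.40:
        return 0   # medium
    if ratio < 2.80:
        return 1   # long
    return 2       # very long


durations = [0.1, 0.25, 0.5, 1.0, 2.0]
print([duration_class(d) for d in durations])  # [-2, -1, 0, 1, 2]
```

For the relative mode, the same function could be called with the duration of the surrounding beat passed as `reference`. The same borders apply to inter-onset intervals.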

Inter-onset Interval Classes

  • Abbreviation: ioiclass

  • Length: N-1

  • Type: Integer [-2:+2]

  • Optional parameter: abs or rel

  • Requires: –

Inter-onset intervals are mapped to IOI classes by comparing them to a reference duration. There are two possibilities, specified by an optional parameter: relative (rel) and absolute (abs) mode. In relative mode, the reference duration is the duration of the surrounding beat; in absolute mode, a value of 500 ms is used, which corresponds to the beat duration at a tempo of 120 bpm. The class definitions are listed in the table Duration/IOI classes.

Total Duration

  • Abbreviation: total-duration

  • Length: 1

  • Type: Float

  • Optional parameter: –

  • Requires: –

This degenerate transformation retrieves the duration of a melody (in seconds).

Total Metrical Duration

  • Abbreviation: total-metrical-duration

  • Length: 1

  • Type: Float

  • Optional parameter: –

  • Requires: Metrical annotation

This degenerate transformation retrieves the duration of a melody (in fractional metrical units, where each numerical unit represents one bar).

Metrical Position

Bar Numbers

  • Abbreviation: bars

  • Length: N

  • Type: Integer

  • Optional parameter: –

  • Requires: Metrical annotation

Beat Indices

  • Abbreviation: beats

  • Length: N

  • Type: Integer

  • Optional parameter: –

  • Requires: Metrical annotation

Tatum Indices

  • Abbreviation: tatum

  • Length: N

  • Type: Integer

  • Optional parameter: –

  • Requires: Metrical annotation

Metrical Position Decimal

  • Abbreviation: meter-decimal

  • Length: N

  • Type: Real

  • Optional parameter: –

  • Requires: Metrical annotation

Metrical Inter-onset Intervals Decimal

  • Abbreviation: meter-ioi-decimal

  • Length: N-1

  • Type: Integer

  • Optional parameter: –

  • Requires: Metrical annotation

Calculates inter-onset intervals in decimal metrical units, where each numerical unit corresponds to one bar.

Metrical Circle Map

  • Abbreviation: mcm

  • Length: N

  • Type: Integer

  • Optional parameter: <N>

  • Requires: Metrical annotation

The optional parameter N defines how many segments in the circle representing a metrical bar will be used, defaults to N=48.

Metrical Weights

  • Abbreviation: metricalweights , weights, mw

  • Length: N

  • Type: Integer [0:2]

  • Optional parameter: –

  • Requires: Metrical annotation

Gives 2 for primary and secondary accents in the bar, 1 for other beat positions, and 0 for every sub-beat position.

Swing Ratio

  • Abbreviation: swing-ratios

  • Length: Var

  • Type: Real

  • Optional parameter: –

  • Requires: Metrical annotation

Accents

  • Abbreviation: accent-<TYPE> or accent with optional parameter

  • Length: N

  • Type: Integer, Real [0,1]

  • Optional parameter: <TYPE>

  • Requires: Variable

Accents are mostly equivalent to boolean structural markers, which are True (1) if a structural condition is met and False (0) if not. The sole exception is accent-thom, which is a probability value between 0 and 1. The labels in the following list need to be substituted for <TYPE> in accent-<TYPE>, e.g., accent-longpr or accent-long2pr.

| Label | Type | Length | Condition | Requirement |
|---|---|---|---|---|
| longpr | Integer [0:1] | N | True for tones with a longer IOI than the previous tone. | – |
| long2pr | Integer [0:1] | N | True for tones with an IOI at least twice as long as that of the previous tone. | – |
| longmod | Integer [0:1] | N | True for tones with a longer IOI than the mode IOI of all tones. | – |
| long2mod | Integer [0:1] | N | True for tones with an IOI at least twice as long as the mode IOI of all tones. | – |
| longmod_win5 | Integer [0:1] | N | True for tones with an IOI at least 41% longer than the mean of the past 5 IOIs. | – |
| long2mod_win5 | Integer [0:1] | N | True for tones with an IOI at least twice as long as the mean of the past 5 IOIs. | – |
| longpr_abs | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the previous one. | – |
| long2pr_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes away from that of the previous one. | – |
| longmod_abs | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the most frequent (mode) IOI class. | – |
| long2mod_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes away from the most frequent (mode) IOI class. | – |
| longmod_win5_abs | Integer [0:1] | N | True for tones with an IOI class longer than the mode IOI class of the past 5 IOIs. | – |
| long2mod_win5_abs | Integer [0:1] | N | True for tones with an IOI class at least two classes longer than the mode IOI class of the past 5 IOIs. | – |
| longpr_rel | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the previous one. | – |
| long2pr_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes away from that of the previous one. | – |
| longmod_rel | Integer [0:1] | N | True for tones with a higher (longer) IOI class than the most frequent (mode) IOI class. | – |
| long2mod_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes away from the most frequent (mode) IOI class. | – |
| longmod_win5_rel | Integer [0:1] | N | True for tones with an IOI class longer than the mode IOI class of the past 5 IOIs. | – |
| long2mod_win5_rel | Integer [0:1] | N | True for tones with an IOI class at least two classes longer than the mode IOI class of the past 5 IOIs. | – |
| triad | Integer [0:1] | Var | True for chord tones (excluding upper structures). | Chord annotation |
| inchord | Integer [0:1] | Var | True for chord tones (including upper structures). | Chord annotation |
| outchord | Integer [0:1] | Var | True for non-chord tones. | Chord annotation |
| jumpaft3 | Integer [0:1] | N | True for tones following a large pitch jump of at least 3 semitones (either direction). | – |
| jumpaft4 | Integer [0:1] | N | True for tones following a large pitch jump of at least 4 semitones (either direction). | – |
| jumpaft5 | Integer [0:1] | N | True for tones following a large pitch jump of at least 5 semitones (either direction). | – |
| jumpbef3 | Integer [0:1] | N | True for tones before a large pitch jump of at least 3 semitones (either direction). | – |
| jumpbef4 | Integer [0:1] | N | True for tones before a large pitch jump of at least 4 semitones (either direction). | – |
| jumpbef5 | Integer [0:1] | N | True for tones before a large pitch jump of at least 5 semitones (either direction). | – |
| jumpbea3 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 3 semitones (either direction). | – |
| jumpbea4 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 4 semitones (either direction). | – |
| jumpbea5 | Integer [0:1] | N | True for tones before and after a large pitch jump of at least 5 semitones (either direction). | – |
| jumploc | Integer [0:1] | N | True for tones after a pitch interval that is at least 1 semitone larger than the previous interval. | – |
| jumploc2 | Integer [0:1] | N | True for tones after a pitch interval that is at least 2 semitones larger than the previous interval. | – |
| thom | Real [0,1] | N | Value according to Thomassen’s algorithm (1982), which is based on the seven possible pitch direction patterns that can be formed by 2-interval chains (3-note patterns). Values are not binary, but probabilities. | – |
| thom_thr | Integer [0:1] | N | Thresholded version of the Thomassen accent; True if accent-thom > 0.75. | – |
| beat1 | Integer [0:1] | N | True for the primary accent (first beat) of a bar. | – |
| beat3 | Integer [0:1] | N | True for the secondary accent of a bar, if present, e.g., on the 3rd beat of a 4/4 measure. | Metrical annotation |
| beat13 | Integer [0:1] | N | True for primary and secondary accents of a bar. | Metrical annotation |
| beatall | Integer [0:1] | N | True for all beat positions in a bar. | Metrical annotation |
| sync1 | Integer [0:1] | N | True for all syncopations right before the primary accent in a bar (‘anticipated 1’). | Metrical annotation |
| sync3 | Integer [0:1] | N | True for all syncopations right before the secondary accent in a bar (‘anticipated 3’). | Metrical annotation |
| sync13 | Integer [0:1] | N | True for all syncopations right before primary and secondary accents in a bar (‘anticipated 1s and 3s’). | Metrical annotation |
| sync1234 | Integer [0:1] | N | True for all syncopations right before all beats in a bar. | Metrical annotation |
| syncall | Integer [0:1] | N | True for all syncopations on all sub-beat metrical levels (i.e., excluding half-beat level). | Metrical annotation |
| pextrem | Integer [0:1] | N | True for pitch extrema (no restrictions). | – |
| pextrmf | Integer [0:1] | Var | True for pitch extrema (excluding proper cambiata). | – |
| pextrst | Integer [0:1] | N | True for pitch extrema (sensu Steinbeck, with at least two intervals before and after the extremum leading strictly to it). | – |
| phrasbeg | Integer [0:1] | N | True for the first note in a phrase. | Phrase annotation |
| phrasend | Integer [0:1] | N | True for the last note in a phrase. | Phrase annotation |
| phrasbor | Integer [0:1] | N | True for the first and last note in a phrase. | Phrase annotation |
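To make the idea of boolean structural markers concrete, the following sketch shows how two of the simpler accents could be computed. It is an illustration under stated assumptions, not the MeloSpyLib implementation: the function names are hypothetical, and the first tone (which has no preceding interval) is assumed to receive the value 0.

```python
def accent_jumpaft(midi_pitches, min_semitones=3):
    """Mark tones that follow a pitch jump of at least min_semitones."""
    # Assumption: the first tone has no preceding interval and is set to 0.
    markers = [0] if midi_pitches else []
    for a, b in zip(midi_pitches, midi_pitches[1:]):
        markers.append(1 if abs(b - a) >= min_semitones else 0)
    return markers


def accent_thom_thr(thom_values, threshold=0.75):
    """Threshold Thomassen accent values: 1 if the value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in thom_values]


pitches = [60, 62, 67, 64, 64]  # intervals: +2, +5, -3, 0
print(accent_jumpaft(pitches, 3))  # [0, 0, 1, 1, 0]
print(accent_jumpaft(pitches, 4))  # [0, 0, 1, 0, 0]
print(accent_thom_thr([0.1, 0.75, 0.9]))  # [0, 0, 1]
```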

References

[Conklin2001] Conklin, Darrell & Christina Anagnostopoulou (2001). Representation and discovery of multiple viewpoint patterns. Proceedings of the 2001 International Computer Music Conference. San Francisco: ICMA.