melpat¶
Introduction¶
Note
This document pertains to melpat v1.3 and higher, in which some significant changes have been made to previous version v1.1 by which backwards compatibility is partly broken. Particularly the field searches
is now called requests
and display types csv
and raw
are no longer valid and should be list
. Partitions can be requested by partition
or all
, max
became invalid.
The snippet
display mode is now absorbed in the list
mode. Besides these changes, melpat has some new powerful features such as database
mode and search with very fast regular expressions,
secondary searches as well as a larger set of available transformations. So, please update your MeloSpySuite to gain the full new melpat experience.
Note
From the 1.0 version on, the MeloSpyGUI contains (nearly) all functionality of melpat with an easier to use interface. All explanations below hold accordingly for the pattern part of MeloSpyGUI as well, though some small differences exist, will be highlighted if necessary.
melpat is a command-line tool for extracting of and searching for patterns in monophonic melodies, which is a basic task in computational musicology and particularly interesting for the investigation
of creative processes in jazz improvisation. The term “pattern” here basically refers to N-grams, i.e., subsequences, of melodic abstractions. Processing is done on a (user-)defined set of
melodies (“the repository”) and certain abstractions (transformations, viewpoints) such as raw pitch, chordal pitch class, semitone intervals, duration classes etc.
derived thereof. melpat can process an arbitrarily long list of pattern requests (MeloSpyGUI: only one request at a time) and will write results to specified output files or to the console.
There are three different types of requests: search
, partition
and database
. A pattern search searches the melodic abstractions of all occurrence of a search pattern.
A partition finds all N-Grams in a melody according to user-specified criteria, whereby the full repository serves as kind of background for a specific partition, e.g., users can demand to
include only N-grams which occur at least in a certain number different melodies. A database request calculates all N-grams up to a certain maximum N in the repository, which are typically
a lot, but output can be pruned to manage the abundance (cf. below). This mode essentially allows to extract Markov transition probabilities
of arbitrary order from the specified repository, which might be useful for other purposes as well.
melpat can be operated from the commandline alone, but because of the extensive number of parameters needed to define requests the normal mode of operation will most likely employ configuration files written in the YAML language. (MeloSpyGUI: All parameters are handled via the UI). The format of which will be explained below.
N-grams, repositories and patterns: A short introduction¶
A melody is represented as list of events which certain properties (onset, pitch, duration etc.). Sequences of abstraction, e.g., semitone-intervals, are derived from this basic representation for pattern analysis. A melody repository is a set of melodies, a transformation repository is a corresponding set of abstracted sequences. N-grams are then subsequences of length N of the transformed sequences. More specifically, one can define a N-pattern (pattern for short, sometimes also called m-type) as a set of N-grams having the same value, with instances occurring at certain positions in the repository. E.g., if we have sequences s1= “abcd” and s2= “cdab” in the repository, the the 2-pattern “ab” can be found in at position 0 in s1 and at position 2 in s2 (sequence elements are numbered starting with 0). The (absolute) frequency of this specific 2-pattern is 2, the relative frequency is the absolute frequency divided by the total number of N-grams in the repository, which is the combined length of all sequences minus the number of elements multiplied by N-1, since a sequence of length M can have at most M-(N-1) N-grams. For our example, this means that there are 4+4 - 2*(2-1) = 8-2 =6 possible N-grams, hence the relative frequency (or probability for that matter) is 2/6 = 1/3. On the other hand, the bi-pattern “bc” is only occurring once in the repository, hence its probability is 1/6. A partition of s1 with respect to our mini-repository using only patterns that occur at least twice would consist of “ab”, “cd”. The pattern database for this repository with maximum N of 4 (tetragrams) is “a” (frequency: 2), “b” (2), “c” (2), “d” (2) for unigrams and “ab” (freq. 2), “bc” (1), “cd” (2), “da” (1) for bigrams, “abc” (1), “bcd” (1), “cda” (1), “dab” (1) for trigrams and “abcd” (1) “cdab” (1) for 4-grams.
Input formats¶
melpat reads the following input formats:
MIDI (monophonic)
MCSV1 (a CSV-based melody format)
MCSV2 (a CSV-based melody format)
EsAC (one melody per file)
EsAC Database (SQLITE3)
Tony/yYIN note layers as CSV exports
Output formats¶
There are three different output formats available (selected by the parameter display
, cf. below): list
, stats
and midi
, where the last one is only availabe for
searches but not in partition
and database
mode. In list
mode , all pattern instances will be written to a CSV formatted file comprising several coordinates
(indices, onset, metrical position) and lengths (number of elements, duration) as well as the actual values and absolute and relative frequencies. In stats
mode each pattern is listed
along with it absolute and relative frequencies, no position information is given. The midi
format is kind of special. For each occurrence of the specified pattern, the corresponding
melody snippet will be rendered as MIDI and these snippets will be concatenated with 2 sec breaks in between into a single MIDI file. The purpose of this output format is to allow
quick auditory checks of the found patterns. The naming scheme for different combinations of requests and labels can be found here.
Usage¶
Here is the basic call to melpat:
melpat [-h] [--version] [-d DIR] [-o OUTDIR] [-f FILE] [-c CONFIG]
[-s SEARCH] [-m MAXN] [--verbose] [--dbtype {SQLITE3}]
[--dbpath DBPATH] [--dbuser DBUSER] [--dbpwd DBPWD]
[positional]
Options and arguments¶
-
-h
,
--help
¶
Show a help message and exits.
-
--version
¶
Shows program’s version number and exit
-
-d
<DIR>
,
--dir
<DIR>
¶ Use <DIR> as working directory for reading the project files. Default is the current directory.
-
-o
<OUTDIR>
,
--outdir
<OUTDIR>
¶ Write result file(s) to <OUTDIR>. Default is the current directory.
-
-f
FILE
,
--file
FILE
¶ Input file (wildcards allowed). If the melodies are retrieved from a database
<FILE>
must have the Query block syntax, wrapped up in a Python- style dictionary and embraced by single or double quotes.
-
-c
<CONFIG>
,
--config
<CONFIG>
¶ Use the configuration file <CONFIG>, see below. The arguments <DIR> and <OUTDIR> override the settings in the config file, but the positional argument takes precedence over the one specified in the configuration file.
-
-s
SEARCH
,
--search
SEARCH
¶ This string specificies the search (or operation) to be performed. It has the same syntax as the Search pattern syntax used in the configuration file.
-
-m
MAXN
,
--maxN
MAXN
¶ Specifies the maximal N-gram length to be used.
-
--verbose
¶
Be more talkative.
-
--dbtype
DBTYPE
¶ To read melodies from a database, the required database options need to be specified and the file option must be a query. This option indicates the type of database to use. Currently only
SQLITE3
is supported (also default). Has precedence over according value in configuration file.
-
--dbpath
DBPATH
¶ To read melodies from a database, the required database options need to be specified and the file option must be a query. Absolute or relative path to output database are allowed. Has precedence over according value in configuration file.
-
--dbuser
DBUSER
¶ If credentials are needed for database access, here the user name can be specified. Has precedence over according value in configuration file.
-
--dbpwd
DBPWD
¶ If credentials are needed for database access, here the password can be specified. Has precedence over according value in configuration file.
-
positional
¶
The last positional parameter specifies the output file basename. It is possible to use
stdout
here, in which case, the output will not be written to files but printed to stdout, i.e., the console window.
Configuration files¶
Here is a typical example:
---
dir: /My/Data/
outdir: /My/Data/Patterns/
outfile: melpat_example.csv
maxN: 30
tunes:
#will be ignored since we are working in database mode
- file: 'Miles*_PREFINAL.sv'
#all solos from the database
- query:
conditions:
solo_info:
performer: '%'
title: '%'
display:
transcription_info: filename_sv
type: sv
#use the WJazzD
database:
type: sqlite3
path: wjazzd.db
password: None
use: True
requests:
#Interval pattern partition of "So What" by Miles Davis with minimal length of 8 intervals,
#that occur at least twice in two different solos. Write only partition statistics
-
transform: interval
pattern: max
minN: 8
minOccur: 2
minSource: 2
display: stats
trillfilter: 2,2
arpeggiofilter: d
scalefilter: d
simul: False
items:
- title: So What
#Search for (strictly) rising of falling scales beginnings with respect to chord context,
#Listing of all instances
-
transform: cpc
pattern: [0, 2, '(', 3, '|', 4, ')', 5, '(', 7, ')+?(', 8, '|', 9, ')+?']
secondary:
transform: parsons
pattern: ['(', +1, '|', -1, ')', '+']
operation: match
display: list
label: scale_test
#All uni- and bigrams with frequencies and probabilities from the Metrical Circle Map (N=48)
-
transform: mcm-48
pattern: database
minN: 1
minOccur: 1
maxN: 2
display: stats
Explanation¶
The beginning of the configuration file contains global settings for the working directory dir
, the output directory outdir
and the basic output filename outfile
. This is
followed by maxN
which is global parameter for maximum N-Gram size used in the partition and database mode, but this value can (and will be) overwritten for database base mode
(cf. Database mode). The tunes
section specifies the input set of melodies to be used. See Data selection for an detailed explanation.
If the melodies are taken from a SQL-database, the very last section database
comes into play, otherwise it is ignored.
(Of course the order of sections is arbitrary and can be freely shuffled). The most important section is the request
section. It can contain a YAML list (items starting with
a hyphen) of pattern requests. As mentioned above, there are three different kinds of requests: Partition mode, Search mode and Database mode,
which will be explained in full detail in following.
Pattern requests¶
For each type of requests a different set of parameters is used, which will be explained below.
Partition mode¶
Partition mode is designed for finding pattern partitions of single pieces, i.e., complete lists of all real patterns occurring in a melody with respect to all other melodies in the repository as specified by user-definable criteria, which are are minimal and maximum length (N), minimal number of occurrences (total frequency in the whole repository) and minimum number of different sources (i.e., number of different melodies where the pattern must occur). Sub-patterns are filtered out, unless they do not occur somewhere else, hence, having a “life” on their own. This filter feature reduces the result set vastly. Furthermore, options exists to filter out patterns of lesser interest, i.e., trills, scales sections and arpeggios. Partition can be stored as Lists (every pattern listed) or just with a set of Stats for each partition.
Field |
Values |
Description |
---|---|---|
|
The transformation to be applied to the melodies. Currently, only a certain subset of transformations are available. |
|
|
|
Both values indicate the same operation (and exist only because of the indecisiveness of the programmer). |
|
positive integer > 0 |
Specifies the minimum length of N-grams that should be included in the partitioning. |
|
positive integer > 0 |
Specifies the minimum number of occurrences a pattern must have in order to be included. |
|
positive integer > 0 |
Specifies the minimum number of different source sequences in the repository in which a pattern must occur in order to be included. If |
|
|
If |
|
|
If specified, a trill filter will be applied. A trill is defined here as a pattern that consist solely of a repetition of a single sub-pattern, which is a slight generalization from the standard
musical definition. Only trills with a sub-pattern length of at least |
|
|
If the transformation is |
|
|
If the transformation is |
|
|
If set to |
|
Integer span OR list of field-value pairs specifying melodies. |
Partitions will only be calculated for the specified items. If this parameter is missing, all possible items will be considered.
Items can be specified either by an integer span, or by a list of metadata specifiations. An integer span has the form |
Search mode¶
In search mode, a specific pattern is searched across the repository of melodic sequences. Patterns are defined with respect to some Suitable transformations and can be searched using a special Search pattern syntax which includes the option to use regular expressions. The search can be complemented with a secondary search, mostly using another transformation, with allow to filter the search result, e.g, with respect to certain constraints such as patterns starting or ending on a strong beat in a bar on at the begin or end of musical phrase (Accents might be helpful.).
Field |
Values |
Description |
---|---|---|
|
The transformation to be used on the melodies. Currently, only a certain subset of transformations are usable(cf. Suitable transformations. |
|
|
A valid search pattern, other than |
|
|
YAML dictionary |
Specifies a secondary search, i.e., a search for a pattern (possibly from a different transformation) in the result set of the primary search. See Secondary search specification below for details. |
|
|
If |
|
String |
An arbitrary label for the given search. If a label is specified, the resulting patterns for a single search are written to different files each, which names are the
value of |
As explained above, secondary searches can be defined which act as a filter on the result set of the primary search. The parameters are quite similar.
Field |
Values |
Description |
---|---|---|
|
The transformation to be used on the melodies. Currently, only a certain subset of transformations are usable(cf. Suitable transformations. |
|
|
A valid search pattern, other than |
|
|
|
Specifies how the secondary search should act on the primary result set (basically a filter). For |
Database mode¶
In database mode, all information about all patterns occurring in the selected pieces using the specified transformation can be evaluated either as complete dump (list
) or as
statistics (stats
mode). The last option is specifically useful for estimating Markov transition probabilities. In database mode, the global maximum N can be overridden, which might be
useful, since there will be a lot of N-grams. Additionally, there are options to specify minimal N and minimal frequencies to prune the output.
Field |
Values |
Description |
---|---|---|
|
The transformation to be used on the melodies. Currently, only a certain meaningful subset of transformations are usable. |
|
|
|
|
|
positive integer > 0 |
Specifies the minimum length of N-grams. |
|
positive integer > 0 |
Specifies the minimum number of occurrences a N-Gram to be written. Helpful to prune the output. |
|
|
If |
|
|
If set to |
Pattern output formats¶
There are three possible output formats list
, stats
, and mid
for pattern request modes, whereas the mid
mode is only available for pattern search, and the actual format
of the output files in stats
do differ for the different mode.
list: In this mode, the patterns occurrences are written to a CSV formatted output file. Each line contains an ID of the containing melody as well as starting positions (index in the melody, raw onset in secs, and metrical position), absolute and relative frequency, length (number of elements and absolute duration in sec) as well as the value of the pattern. If a label is specified for the search, the results will be written in a file with name derived from the specified basic output file name and the label. If the label is missing, all results will be written into a single file.
stats: In this mode, only global information of the retrieved patterns are written, e.g., length, value and absolute and relative frequencies.
midi: In this mode, the occurrences of the patterns are written to a single MIDI file with a 2 seconds pause in between pattern instances. A label allows to write the result sets of each search to a different MIDI file.
Sample output in list
mode: for a search request:
id;start;N;onset;dur;metricalposition;value;freq;prob100
SonnyRollins_TheEverywhereCalypso-2_PREFINAL.sv;870;6;176.11465;0.45279;4.6.142.4.6;[0, 2, 4, 5, 7, 8];1;0.011
SonnyStitt_Elora_PREFINAL.sv;108;6;27.03469;0.94286;4.2.17.1.1;[0, 2, 4, 5, 7, 9];2;0.023
SteveTurre_Steve'sBlues_PREFINAL.sv;124;6;30.44286;1.13918;4.2.22.2.1;[0, 2, 4, 5, 7, 9];2;0.023
Sample output in stats
mode for the same search request:
value;N;freq;prob100
[0, 2, 4, 5, 7, 8];6;1;0.011
[0, 2, 4, 5, 7, 9];6;2;0.023
Sample output of stats
mode for a partition of two solos by Zoo Sims:
id;note_count;min_N;max_N;min_occur;min_source;pattern_count;coverage;avg_N;avg_overlap;over_coverage;log_excess_prob
ZootSims_DancingInTheDark-1_PREFINAL.sv;109;5;30;2;2;8;0.303;6.375;2.571;0.545;8.949
ZootSims_DancingInTheDark-2_PREFINAL.sv;168;5;30;2;2;7;0.22;6.0;0.833;0.135;8.698
Field |
Description |
Mode |
---|---|---|
|
Identifier of the melody. For WJazzD solos, |
|
|
Start ID of pattern in the melody (zero-based). |
|
|
Length of the pattern. |
|
|
Absolute frequency of the pattern in the repository. |
|
|
Relative frequency of the pattern in percent. Baseline is the number of all possible N-grams with the same length as the pattern. |
|
|
Actual value of the pattern with respect to the underlying transformation. Numerical transformation will be represented as Python arrays, hence, the with square brackets around (might be changed in the future). |
|
|
Onset of pattern in the melody in seconds (for EsAC songs this might be based on an arbitrary tempo of 120 bpm). |
|
|
Real duration of pattern in seconds in the melody (for EsAC songs this might be based on an arbitrary tempo of 120 bpm). |
|
|
Metrical start position of pattern in the melody (cf Metrical position notation for description of the syntax). |
|
|
Number of notes in melody. |
partition: |
|
Mininum pattern length as specified by the user. |
partition: |
|
Maximum pattern length as specified by the user. |
partition: |
|
Minimum number of occurrence as specified by the user. |
partition: |
|
Minimum number of different sources as specified by the user. |
partition: |
|
Total number of patterns found by the partition algorithm. |
partition: |
|
Share of note in the solo contained in a least one pattern (“Coverage”) |
partition: |
|
Average length of patterns in the partition. |
partition: |
|
Average overlap between patterns in the partition. |
partition: |
|
Indicator, in how many patterns notes are contained on the average. Equals 0 if every note of the melody is contained in exactly one pattern. |
partition: |
|
Average value of logarithm of excess probability of all patterns. Excess probability in this sense is the quotient of relative frequency of the pattern divided by the product of probabilities of each single element in the pattern. Low excess probability indicates that a pattern might have occurred just by chance from a 0-th order Markov process. |
partition: |
Suitable transformations¶
Here is a list of suitable transformations, since not all available transformations make sense for pattern retrieval.
Transformation |
Abbreviation(s) |
Type |
---|---|---|
|
Integer [0:127] |
|
|
Integer [0:11] |
|
trans_intervals |
|
Integer |
trans_fuzzy_intervals |
|
Integer [-4:4] |
trans_contour |
|
Integer [-1:1] |
|
Integer [0:11] |
|
|
String of symbols “1234567TL<>B” |
|
|
String of symbols “1234567TL<>B%” |
|
|
Integer [-2:+2] |
|
|
Integer [-2:+2] |
|
trans_mw |
|
Integer [0:2] |
trans_meter_mcm |
|
Integer [0:N-1], default N=48 |
|
Integer, Real [0:1] |
Search pattern syntax¶
The search pattern syntax is derived from corresponding Python constructs, which are basically strings and arrays, mixed and enhanced with syntactical constructs from Pythons regular expressions. Python arrays are written as comma-separated list, embraced by square brackets, strings are list of character embraced by single- or double-quotes. The third column in table Pattern transformations indicates which syntax is to use with each transformation. If a search pattern contains values outside of the indicated range, no error will issued, but melpat will return an empty result set. The regular expression symbols are added in the case of integer arrays as string elements in the array, i.e., elements surrounded by single- or double quotes. For string based transformation, this is not necessary.
Note
The syntax used in the MeloSpyGUI is little bit simplified in comparison to the melpat syntax (for convenience). First, surrounding square brackets ([
and ]
) can be left out. Second, instead of comma separated lists,
single spaces can be used as separators (but not arbitrary whitespace, just single spaces!). Regular expression bits still have to be embraced by quotes (except for string type transformations).
Let’s consider the slightly simplified example from the configuration file above:
transform: cpc
pattern: [0, 2, '(', 3, '|', 4, ')', 5]
secondary:
transform: parsons
pattern: [+1, '+|', -1, '+']
operation: match
The transformation of the primary search is cpc
, i.e., chordal pitch classes which take values in the range from 0 to 11. The specified pattern starts with two simple elements, 0 and 2,
i.e., the root and the major ninth of chords. Then follows an opening bracket (
as string element surrounded by single quotes. Next is the regular element 3
, which means the minor
third of a chord. Next up is the pipe symbol |
which means “OR” in regular expression parlance, readily followed by a 4, which stands for the major third of a chord, and the closing
bracket )
. Hence, we have the partial expression (3|4)
which means “either a minor or a major third”. Finally, there is a regular 5, which matches the fourth of a chord. Note,
that each element has to occur exactly once. Taken together, this pattern will match any cpc
-subsequence of the form 0235
or 0245
, something like the lower tetrachord. However,
because cpc
’s are pitch classes without octave, information about interval direction is lost. A sequence 0234
consisting of a jump of minor seventh down, a jump a ninth up and a jump
a major seventh down will also match the specified pattern. Now have a look at the secondary search. The transformation is Parson’s code, or interval direction, with possible values of -1, 0, and
+1. Using D
, R
, and U
for simplicity, the pattern for the secondary search translates to the regular expressions (U+|D+)
, which will match any sequence of only ups or only
downs. The operation of the secondary search is match
in this case, hence, it will find cpc
sequences of the desired primary form, which are also strictly ascending or descending.
If the operation would have been find
, any Parson’s sequence containing at least one U or one D would be matched, which most likely would not change the result set at all (since a
0234
sequence that only consists of note repetitions would be quite special). In the case of the secondary operation exclude
, the final result set would probably be empty for the same
reason.
Regular Expressions: A short introduction
Regular expression are a powerful albeit a bit cryptical tool which is an integral part of nearly all important programming languages. They provide means for pattern matching and searching
for substrings in strings or texts (sequences of alphanumeric and interpunctuations signs). Regexes, as they are frequently called, are around in the computing world since the 50’s and can be
viewed as specialised mini-programming language. There are basically three different kind of elements in a regular expression, all of which are expressed with cryptical character sequences, which
exactly makes regular expression often hard to read and digest.
The three kinds of information are “What”, “Where” and “How many”. Elements from the “What” category specify which kind of character should be matched, e.g.,
the set of lower case characters, which are in many regular expression dialects (unfortunately, there are many different dialects, which contributes to the confusion) expressed as a so-called character set
[a-z]
. The simplest case is just a single character, e.g., “A”. This is already a valid regular expression and will match any occurrence of the character “A” in a string. There are also
wildcards, such as the dot .
, which stands for any character (except the newline, probably). The “Where” category offers the least options, basically only whether a pattern should be
located at the very beginning (the caret ^
) or the very end of a string (the dollar sign $
). Finally, the “How many” symbols, also called quantifiers,
allow to be more specific about occurrence frequency, i.e., repetitions of characters. In Python’s regular expressions, these are the special symbols *
(arbitrary number, 0 to N),
+
(at least 1, 1 to N), ?
(one or none, 0-1). Furthermore, one can be even more specific with the constructs {m}
and {n,m}
which translate to “exactly m” and
“between n and m repetitions”. All quantifiers are written directly following a “What” specification and pertain only to the immediately preceding part. The quantifiers in their pure
and raw form are “greedy little suckers”, which means, that they will try to match as many characters as possible. This is often not desired. To make them non-greedy, hence, to make them match
as less as possible, one can add a ?
sign, e.g., +?
, *?
, ??
, which are the non-greedy versions.
“What” specifications can be constructed from atomic “What” specifications by means of the pipe symbol |
which means “this or that”, e.g., “a|b” matches an “a” or a “b”, and
parantheses, e.g., “(a|b)|c” will match either an “a” or a “b” or a “c”, which means that the brackets are actually redundant in this case. But combined with quantifiers they might become
significant, e.g., “(a|b)??|c+?” will match zero or one occurrence of “a” or “b” or at least one occurrence of “c” (non-greedy!). Hence, “c” will match, “a” and “b” will match, in “ac” it will find the “a”,
the leftmost options, but the empty string “” will not match. There are some more features of Python regular expressions, but most of them are not useful in the case of melodic patterns, e.g., special character
classes. But see here and there for more detailed information on Python’s regular expressions.
Output file naming scheme¶
The following table summarize the different output file namings done by melpat. <outfile> is the name of the specified output file name, <transformation> is the transformation and <label> the label of the pattern request. Optional parts are shown in square brackets, where “tf” stands for trill filter, “sf” for scale filter, “af” for arpeggio filter, and “sim<N>” for simulated N-gram databases using a Markov model of order N.
Pattern Mode |
Display Mode |
Result without label |
Result with label |
---|---|---|---|
Search |
|
<outfile>.csv |
<outfile>_<label>.csv |
Search |
|
<outfile>_stats.csv |
<outfile>_<label>_stats.csv |
Search |
|
<outfile>.midi |
<outfile>_<label>.midi |
Partition |
|
<outfile>_<transformation>_<min_N>_<min_occur>_<min_source>[_tf][_sf]_[af][_sim<N>].csv |
<outfile>_<label>.csv |
Partition |
|
<outfile>_<transformation>_<min_N>_<min_occur>_<min_source>[_tf][_sf]_[af]_stats[_sim<N>].csv |
<outfile>_<label>.csv |
Database |
|
<outfile>_<transformation>_db.csv |
<outfile>_<label>_db.csv |
Database |
|
<outfile>_<transformation>_db_stats.csv |
<outfile>_<label>_stats.csv |