Tune sources
Here we demonstrate the utilities included for loading tune data.
from pyabc2 import Tune
from pyabc2.sources import load_example, norbeck, the_session, eskin, bill_black, hardy
A few examples are included in the package, accessible with pyabc2.sources.load_example() (returns a Tune) and pyabc2.sources.load_example_abc() (returns an ABC string).
load_example("For the Love of Music")
Tune(title='For The Love Of Music', key=Gmaj, type='slip jig')
The tune source modules, demonstrated below, download tune data from the internet.
Norbeck
norbeck.load() gives us a list of Tunes for one of Norbeck’s tune type groups (e.g. ‘jigs’, ‘reels’, ‘slip jigs’).
tunes = norbeck.load("jigs")
print(len(tunes), "jigs loaded")
tunes[0]
downloading...
done
556 jigs loaded
Tune(title="Bride's Favourite, The", key=Gmaj, type='jig')
tunes[-1]
Tune(title='Stone Step, The', key=Gmaj, type='jig')
The Session
the_session.load() gives us a list of Tunes loaded from a (frequently updated) archive of all of the tunes in The Session. This is a large dataset, so here we cap the number of tunes processed with n=500.
tunes = the_session.load(n=500)
tunes[0]
downloading...
done
/tmp/ipykernel_628/93315575.py:1: UserWarning: 22 out of 500 The Session tune(s) failed to load. Enable logging debug messages to see more info.
tunes = the_session.load(n=500)
Tune(title="'S Ann An Ìle", key=Gmaj, type='strathspey')
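The warning above suggests enabling debug log messages to see why individual tunes failed to load. A minimal sketch with the standard library follows. The logger name "pyabc2" is an assumption (libraries conventionally name loggers after their package) and is not confirmed on this page.

```python
import logging

# Attach a basic handler to the root logger so records are printed to stderr.
# Other libraries stay at INFO.
logging.basicConfig(level=logging.INFO)

# ASSUMPTION: the library logs under a logger named after the package.
logger = logging.getLogger("pyabc2")
logger.setLevel(logging.DEBUG)
```

Run this before calling the_session.load(). If the library uses a different logger name, lowering the root logger to DEBUG would be the blunt alternative.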
tunes[-1]
Tune(title='A Trip To Galloway', key=Dmaj, type='waltz')
the_session.load_url() loads a single tune setting from its URL on The Session.
tune = the_session.load_url("https://thesession.org/tunes/21799#setting43712")
tune
Tune(title='The Cherrytree', key=Gmaj, type='jig')
tune.print_measures()
01: G2 d c B G
02: G F G A F D
03: G2 d A G G
04: d e e f g g
05: G2 d c B G
06: G F G A F D
07: B c c g2 B
08: B c g a f d
09: G2 d c B G
10: G F G A F D
11: G2 d A G G
12: d e e f g g
13: G2 d c B G
14: G F G A F D
15: B c c e2 B
16: A G E G A F
17: E B e d B e
18: d e g d B e
19: f B g f B f
20: f g g g a a
21: g a f f g e
22: f g d e d B
23: B d e f g A
24: A B d A G F
25: E B e d B e
26: d e g d B e
27: f B g f B f
28: f g g g a a
29: b2 a a g e
30: f g e e d e
31: d f a e d e
32: d B A A G F
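To make the numbered-measure listing above more concrete, here is a rough standard-library sketch that splits a line of ABC on bar lines and numbers the pieces. It is not how print_measures() is implemented. It handles only plain bar lines and repeat colons, and it counts a pickup as measure 1. The input is the opening line of Cooley's as it appears in The Session archive.

```python
import re

# Opening line of "Cooley's" (tune 1, setting 1) from The Session archive.
abc_line = "|:D2|EBBA B2 EB|B2 AB dBAG|FDAD BDAD|FDAD dAFD|"

# Bar lines, optionally with repeat colons attached ("|:", ":|", "||").
BAR = re.compile(r":?\|+:?")

# Split on bar lines and drop the empty pieces at the ends.
measures = [m.strip() for m in BAR.split(abc_line) if m.strip()]

for i, measure in enumerate(measures, start=1):
    print(f"{i:02d}: {measure}")
```

Unlike the output above, this keeps the raw ABC text of each measure rather than separating individual notes.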
Data archive
The Session data archive (https://github.com/adactio/TheSession-data) contains many datasets, which can be loaded with pyabc2.sources.the_session.load_meta()
and used in other ways besides parsing ABCs to Tunes.
For example, we can look for the most common ABC notes in the corpus.
%%time
df = the_session.load_meta("tunes", convert_dtypes=True)
df
CPU times: user 846 ms, sys: 127 ms, total: 973 ms
Wall time: 993 ms
| | tune_id | setting_id | name | type | meter | mode | abc | date | username | composer |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15326 | 28560 | 'S Ann An Ìle | strathspey | 4/4 | Gmajor | |:G>A B>G c>A B>G|E<E A>G F<D D2|G>A B>G c>A B... | 2016-03-31 15:34:45 | danninagh | <NA> |
| 1 | 15326 | 28582 | 'S Ann An Ìle | strathspey | 4/4 | Gmajor | uD2|:{F}v[G,2G2]uB>ud c>A B>G|{D}E2 uA>uG F<D ... | 2016-04-03 09:15:08 | DonaldK | <NA> |
| 2 | 14625 | 26955 | 'S Daor An Tabac | reel | 4/4 | Bminor | |:eAAB eABB|eAAB gedB|eAAB eABB|G2AB gedB:|\r\... | 2015-07-31 02:47:47 | Charles Mackenzie | <NA> |
| 3 | 5478 | 5478 | 'S Iomadh Rud A Chunnaic Mi | reel | 4/4 | Gmajor | ABBA GEDE|G2AG EGDG|ABBA GEDE|GEDE G2GA|\r\nAB... | 2006-02-03 04:45:46 | Andy F | <NA> |
| 4 | 5478 | 11429 | 'S Iomadh Rud A Chunnaic Mi | reel | 4/4 | Dmajor | |:e|f2fe dBAB|dded B2A2|f2fe dBAB|dBAB d2d:|\r... | 2011-08-17 00:57:48 | malcombpiper | <NA> |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 54297 | 11995 | 11995 | Zoidberg's | jig | 6/8 | Dmajor | |:f2f f2a|fed B2A|A2A ABd|e2e ~e2e|\r\nf2f f2a... | 2012-06-19 12:12:46 | DrugCrazed | Patrick Rose |
| 54298 | 22155 | 44612 | Zolloko San Martinak | polka | 2/4 | Fmajor | K:F\r\na|:"A" ad' e'f'|"Dm" (3e'e'e' d'a|ac' b... | 2022-08-21 12:24:32 | Fernando Durbán Galnares | Kepa Junkera |
| 54299 | 11584 | 11584 | Zonaradikos | jig | 6/8 | Dmajor | A|:d3 e3|f3f2e|fga gfe|d2c B2A|\r\nd3 e3|f3f2e... | 2011-11-14 16:06:26 | gian marco | <NA> |
| 54300 | 9013 | 9013 | Zucchini Reel, The | reel | 4/4 | Dmajor | Ac|d2d=c ADFA|G2BG =cGBG|Add=c ADFD|GBAG FDAc|... | 2008-10-18 06:05:43 | Shelley | <NA> |
| 54301 | 13875 | 24924 | Zuppa Inglese | jig | 6/8 | Gmajor | |:GAG GAB|cBc cde|d2B GAB|A2G E2D|\r\nGAG GAB|... | 2014-10-07 22:55:26 | Edward Nunn | <NA> |
54302 rows × 10 columns
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 54302 entries, 0 to 54301
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 tune_id 54302 non-null Int64
1 setting_id 54302 non-null Int64
2 name 54302 non-null string
3 type 54302 non-null category
4 meter 54302 non-null category
5 mode 54302 non-null category
6 abc 54302 non-null string
7 date 54302 non-null datetime64[ns]
8 username 54302 non-null string
9 composer 18288 non-null string
dtypes: Int64(2), category(3), datetime64[ns](1), string(4)
memory usage: 3.2 MB
from pyabc2.note import RE_NOTE as rx
rx
re.compile(r"(?P<acc>\^|\^\^|=|_|__)?(?P<note>[a-gA-G])(?P<oct>[,']*)(?P<num>[0-9]+)?(?P<slash>/+)?(?P<den>[0-9]+)?",
re.UNICODE)
Note that this regular expression also matches letters in ordinary text, such as tune titles.
["".join(tup) for tup in rx.findall("the quick brown fox jumps over the lazy dog")]
['e', 'c', 'b', 'f', 'e', 'e', 'a', 'd', 'g']
But The Session stores the tune body separately (in the abc field) and encourages a bare-bones melody-focused approach, so we can expect to mostly be matching actual notes.
from pprint import pprint
cool = df.query("tune_id == 1 and setting_id == 1")
display(cool.T)
abc = cool.abc.iloc[0]
print(abc, "\n")
pprint([m.group() for m in rx.finditer(abc)], compact=True)
| | 10236 |
|---|---|
| tune_id | 1 |
| setting_id | 1 |
| name | Cooley's |
| type | reel |
| meter | 4/4 |
| mode | Edorian |
| abc | |:D2|EBBA B2 EB|B2 AB dBAG|FDAD BDAD|FDAD dAFD... |
| date | 2001-05-14 18:45:18 |
| username | Jeremy |
| composer | <NA> |
|:D2|EBBA B2 EB|B2 AB dBAG|FDAD BDAD|FDAD dAFD|
EBBA B2 EB|B2 AB defg|afec dBAF|DEFD E2:|
|:gf|eB B2 efge|eB B2 gedB|A2 FA DAFA|A2 FA defg|
eB B2 eBgB|eB B2 defg|afec dBAF|DEFD E2:|
['D2', 'E', 'B', 'B', 'A', 'B2', 'E', 'B', 'B2', 'A', 'B', 'd', 'B', 'A', 'G',
'F', 'D', 'A', 'D', 'B', 'D', 'A', 'D', 'F', 'D', 'A', 'D', 'd', 'A', 'F', 'D',
'E', 'B', 'B', 'A', 'B2', 'E', 'B', 'B2', 'A', 'B', 'd', 'e', 'f', 'g', 'a',
'f', 'e', 'c', 'd', 'B', 'A', 'F', 'D', 'E', 'F', 'D', 'E2', 'g', 'f', 'e',
'B', 'B2', 'e', 'f', 'g', 'e', 'e', 'B', 'B2', 'g', 'e', 'd', 'B', 'A2', 'F',
'A', 'D', 'A', 'F', 'A', 'A2', 'F', 'A', 'd', 'e', 'f', 'g', 'e', 'B', 'B2',
'e', 'B', 'g', 'B', 'e', 'B', 'B2', 'd', 'e', 'f', 'g', 'a', 'f', 'e', 'c',
'd', 'B', 'A', 'F', 'D', 'E', 'F', 'D', 'E2']
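To see what each named group in the pattern captures, here is a small sketch that uses a copy of the pattern printed above, so it runs without pyabc2. The three note specs are illustrative inputs chosen to exercise different groups.

```python
import re

# Copy of the pattern shown above for pyabc2.note.RE_NOTE.
NOTE = re.compile(
    r"(?P<acc>\^|\^\^|=|_|__)?(?P<note>[a-gA-G])(?P<oct>[,']*)"
    r"(?P<num>[0-9]+)?(?P<slash>/+)?(?P<den>[0-9]+)?"
)

for spec in ["D2", "^c'3/2", "_B,/"]:
    m = NOTE.fullmatch(spec)
    # Keep only the groups that actually matched something.
    parts = {name: value for name, value in m.groupdict().items() if value}
    print(spec, parts)
```

The groups are: acc (accidental), note (letter), oct (octave marks), and num, slash, den (the duration multiplier).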
%%time
note_counts = (
    df.abc
    .str.findall(rx)
    .explode()
    .str.join("")
    .value_counts()
)
note_counts
CPU times: user 26.6 s, sys: 485 ms, total: 27.1 s
Wall time: 27 s
abc
A 733688
d 670685
B 664951
e 558934
c 453372
...
D'' 1
f9/ 1
A2/4 1
e2// 1
^b4 1
Name: count, Length: 1031, dtype: int64
note_counts[:20]
abc
A 733688
d 670685
B 664951
e 558934
c 453372
G 449712
f 393893
F 318347
g 305356
E 269217
D 231414
a 204906
A2 97313
d2 94422
B2 80246
G2 75557
b 60225
e2 60202
c2 47538
C 44820
Name: count, dtype: int64
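The same counting idea can be sketched for a single tune without pandas, using collections.Counter. The pattern is a copy of the one printed earlier for RE_NOTE, and the input is the opening line of Cooley's from the archive. This illustrates the technique only and does not replace the corpus-wide pipeline above.

```python
import re
from collections import Counter

# Copy of the pattern shown earlier for pyabc2.note.RE_NOTE.
NOTE = re.compile(
    r"(?P<acc>\^|\^\^|=|_|__)?(?P<note>[a-gA-G])(?P<oct>[,']*)"
    r"(?P<num>[0-9]+)?(?P<slash>/+)?(?P<den>[0-9]+)?"
)

abc = "|:D2|EBBA B2 EB|B2 AB dBAG|FDAD BDAD|FDAD dAFD|"

# Count each full note spec (letter plus any accidental, octave, and duration).
counts = Counter(m.group() for m in NOTE.finditer(abc))
print(counts.most_common(3))
```

Because the whole match is counted, B and B2 are tallied separately, just as in the table above.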
👆 We can see that A (unit duration) is the leader, which makes sense, since it is a prominent pitch in many of the common keys. Its scale degree is:
5 in Dmaj
2 in Gmaj
1 in Ador, Amin, Amix, Amaj
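The scale degrees listed above can be checked with a short sketch. It builds each mode by rotating the major-scale step pattern and supports only natural-note tonics. The function and table names are made up for this illustration and are not part of pyabc2.

```python
# Semitone offsets of the natural notes above C.
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Whole/half-step pattern of the major scale; the other modes are rotations of it.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]
MODE_ROTATION = {"maj": 0, "dor": 1, "mix": 4, "min": 5}


def scale_degree(note, tonic, mode):
    """Return the 1-based degree of `note` in the given key, or None if absent."""
    rot = MODE_ROTATION[mode]
    steps = MAJOR_STEPS[rot:] + MAJOR_STEPS[:rot]
    pc = PITCH_CLASS[tonic]
    scale = [pc]
    for step in steps[:-1]:  # the last step just returns to the tonic
        pc = (pc + step) % 12
        scale.append(pc)
    target = PITCH_CLASS[note]
    return scale.index(target) + 1 if target in scale else None


keys = [("D", "maj"), ("G", "maj"), ("A", "dor"), ("A", "min"), ("A", "mix"), ("A", "maj")]
for tonic, mode in keys:
    print(f"A is degree {scale_degree('A', tonic, mode)} in {tonic}{mode}")
```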
Note
A implies A₄, the A above middle C, the A string on a violin, the lower register on the flute, etc.
Note
In general we don’t know the duration of A without context (L: header field, or based on M: if L: is not set).
However, in this case, we know that The Session presets the unit duration to 1/8,
so A is an eighth note.
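As a worked example of the duration rule, the sketch below turns a note spec into an absolute duration given a unit length. It also includes the default unit length the ABC standard derives from M: when L: is absent (1/16 if the meter is below 0.75, otherwise 1/8). The helper names are hypothetical, and this is not pyabc2's implementation. Meters such as C or C| are not handled.

```python
import re
from fractions import Fraction

# Trailing duration part of a note spec: multiplier, slashes, divisor.
NOTE_LEN = re.compile(r"(?P<num>[0-9]+)?(?P<slash>/+)?(?P<den>[0-9]+)?$")


def default_unit_length(meter):
    """Default L: for a numeric meter such as '6/8' when no L: field is given."""
    return Fraction(1, 16) if Fraction(meter) < Fraction(3, 4) else Fraction(1, 8)


def relative_length(spec):
    """Length multiplier encoded in a note spec such as 'A2', 'A/', or 'A3/2'."""
    m = NOTE_LEN.search(spec)
    num = int(m["num"]) if m["num"] else 1
    if m["den"]:
        den = int(m["den"])
    elif m["slash"]:
        den = 2 ** len(m["slash"])  # "/" halves, "//" quarters, ...
    else:
        den = 1
    return Fraction(num, den)


unit = Fraction(1, 8)  # The Session's preset unit duration
for spec in ["A", "A2", "A/", "A3/2"]:
    print(spec, relative_length(spec) * unit)
```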
from textwrap import wrap
print("\n".join(wrap(" ".join(note_counts[note_counts == 1].index))))
f'/2 _f4 b,,3 =G8 F5/4 c9 b'2 _A4 _G4 B4/ =A/4 g,/ d16
=c16 E24 _d/ E33 A6// b5/ A,12 ^c'4 ^c'3 d'/4 ^A3/4 ^B3/4
A'/ a'4 ^G,, ^A,, e,,3 e,,2 f,,3 ^f2/3 ^c2/ A11 e2/4 e4/
f4/ =e// ^g/2 ^A// b,/2 D/8 E/8 B/8 B,' f6/2 e6/2 f,3 ^b/
D11 C6/ ^D/2 _a3/2 G,12 _e8 ^c,2 ^B,2 c'3/4 d5/2 e'' ^A10
^a8 _A,6 d6/ a'/ ^G10 ^D6 ^D10 ^D8 d1/3 a8/3 c'1/2 C,12
=B5 A7/9 =F,4 f,8 =D6 e8/3 a/3 B16 G16 =a3 E23 A'4 D22 c7
f5/2 =G6 ^G' B,9 ^A5 c13 G'4 F'3 ^e/2 c3// D'/ C'/ B'/2
F,6 E,8 A/1 =B/4 G/1 B,// _a4 e,2 ^g/8 _e'/ __d d14 _c'2
e23 ^E,2 =E,2 B32 =c'4 _c' E75 =C, ^F,6 =F,2 =B,6 B,1 D,1
^C,2 ^C, _g/ G,,4 e'3/2 ^f'2 D,'3 _d'3 =F,3 =F,,3 E,,3
_B,3/2 =c1 _d6 B7/4 _c3 A,9 =c2/3 =A4 D2/3 E6/ G6/ A4/3
^a// F//// F/// =c'3/2 =e6 B11 e'1 D'' f9/ A2/4 e2// ^b4
👆 A variety of ABC note specs appear only once. Many of these have unusual durations or accidentals.
What if we ignore everything except the natural note name?
nat_cased_counts = (
    note_counts
    .reset_index(drop=False)
    .rename(columns={"abc": "note"})
    .assign(nat=lambda df: df.note.str.extract(r"([a-gA-G])"))
    .groupby("nat")
    .aggregate({"count": "sum"})["count"]
    .sort_values(ascending=False)
)
nat_cased_counts
nat
A 925044
B 846010
d 827624
e 665590
G 590607
c 577271
f 474448
g 383882
F 382747
E 342141
D 300375
a 251176
b 71519
C 54919
Name: count, dtype: int64
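The collapse from full note specs to bare letters can be sketched without pandas. The input below is a small subset of the counts from the top-20 table earlier, so the totals are smaller than the corpus-wide figures just shown.

```python
import re
from collections import Counter

# A few (spec, count) pairs copied from the top-20 table; not the full corpus.
note_counts = {"A": 733688, "d": 670685, "A2": 97313, "d2": 94422}

nat_counts = Counter()
for spec, count in note_counts.items():
    # First note letter in the spec, with case (and thus octave) preserved.
    letter = re.search(r"[a-gA-G]", spec).group()
    nat_counts[letter] += count

print(nat_counts)
```

Case is kept on purpose: A and a sit an octave apart, so they remain separate buckets.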
👆 A is still our leader, but otherwise things have shifted a bit.
The note C, which generally implies a pitch outside of the range of most whistles and flutes,
has the lowest count.
Although b is inside that range, many tunes don’t have one.
from pyabc2 import Note
(
    nat_cased_counts
    .to_frame()
    .assign(value=lambda df: df.index.map(lambda x: Note.from_abc(x).value))
    .sort_values("value")["count"]
    .plot.bar(
        xlabel="ABC letters\n(accidentals, octave indicators, and context in key ignored)",
        rot=0,
        ylabel="Count",
        title="ABC prevalence in The Session",
    )
);
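The bar chart orders the letters by pitch rather than by count. A stand-alone sketch of that ordering follows. It assumes the usual ABC convention that uppercase C is middle C and lowercase letters are one octave higher. The helper is hypothetical and may not match how Note.value is defined in pyabc2. It only reproduces the ordering.

```python
# Semitones above C for each natural note.
SEMITONES = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}


def abc_letter_value(letter):
    """Semitones above middle C for a bare ABC letter (no accidentals or octave marks)."""
    octave_shift = 12 if letter.islower() else 0  # lowercase is one octave up
    return SEMITONES[letter.lower()] + octave_shift


# Letters in descending order of count, as in the table above.
letters = list("ABdeGcfgFEDabC")
by_pitch = sorted(letters, key=abc_letter_value)
print(" ".join(by_pitch))
```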
Eskin
Michael Eskin has tunebooks available at https://michaeleskin.com/tunebooks.html, viewable with his ABC Transcription Tools.
We can load selected tunebooks from there, e.g. the King Street Sessions:
df = eskin.load_meta("kss")
df
downloading...
done
| | name | abc | group |
|---|---|---|---|
| 0 | The bonniest lass in the world | X: 938\nT:The bonniest lass in the world\nR:Re... | airs_songs |
| 1 | The Brae's of Lochiel | X: 939\nT:The Braes of Lochiel\nT:Braigh Loch ... | airs_songs |
| 2 | Farewell to whiskey | X: 940\nT:Farewell to whiskey\nR:Air\nO:Scotla... | airs_songs |
| 3 | Galen's arrival | X: 941\nT:Galen's arrival\nR:Reel\nO:Scotland\... | airs_songs |
| 4 | Give me your hand | X: 942\nT:Give me your hand\nR:Air\nQ:180\nC:R... | airs_songs |
| ... | ... | ... | ... |
| 1001 | Waiting for Peter | X: 933\nT:Waiting for Peter\nR:Waltz\nC:Lee An... | waltzes |
| 1002 | Waltz of the toys | X: 934\nT:Waltz of the toys\nR:Waltz\nC:Michel... | waltzes |
| 1003 | West of the River Shannon | X: 935\nT:West of the River Shannon\nR:Waltz\n... | waltzes |
| 1004 | Westphalia waltz | X: 936\nT:Westphalia waltz\nR:Waltz\nB:The Wal... | waltzes |
| 1005 | Wind on the heath | X: 937\nT:Wind on the heath\nR:Waltz\nO:Scotla... | waltzes |
1006 rows × 3 columns
df.group.value_counts()
group
reels 353
jigs 260
hornpipes 75
scotchreels 64
polkas 46
slipjigs 37
strathspeys 33
waltzes 29
ocarolan 25
misc_tunes 21
marches 21
slides 19
airs_songs 16
long_dances 7
Name: count, dtype: int64
Tune(df.query("group == 'jigs'").iloc[0].abc)
Tune(title='The academy jig', key=Gmaj, type='Jig')
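The title and type shown above appear to come from the T: and R: header fields of the ABC block. As a rough illustration of how such fields can be read, here is a standard-library sketch. The tune block is hypothetical (made up for this example, not taken from any tunebook), and this is not how Tune parses its input.

```python
# HYPOTHETICAL tune block, made up for illustration.
abc = """X: 1
T:Example jig
R:Jig
M:6/8
L:1/8
K:Gmaj
GAB cBA|G3 G3|]
"""


def header_fields(abc_text):
    """Collect information fields up to and including K:, which ends the tune header."""
    fields = {}
    for line in abc_text.splitlines():
        # An information field is a single letter followed by a colon.
        if len(line) > 1 and line[0].isalpha() and line[1] == ":":
            key, value = line[0], line[2:].strip()
            fields.setdefault(key, []).append(value)
            if key == "K":
                break
    return fields


fields = header_fields(abc)
print(fields["T"][0], "|", fields["R"][0], "|", fields["K"][0])
```

Values are kept in lists because a field such as T: can appear more than once, as in the Eskin table above.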
from IPython.display import display, Markdown
url = "https://michaeleskin.com/abctools/abctools.html?lzw=BoLgUAKiBiD2BOACCALApogMrAbhg8gGaICyArgM4CWAxmAEogUA2VADogFZUDmYAwiExUAXon4BDePFjNmYEiACcAegAcYTCACM6sAGkQAcTBGAogBFEFs0cQBBIwCFEAHwdG7zgCaI0333dzKxs7Rxo3RCc0DCd7F3MzRBBXMB5-PxVCFR4EpxUaFUDEdN80HgAjRAkAJmJ3Uszs3Id8wuL-F28nMKdAtIy0LJy8gqLIxvKq2olIipimnIxankjOxG7e+zdUoA"
display(Markdown(f"<{url}>"))
eskin.load_url(url)
Tune(title='For The Love Of Music', key=Gmaj, type='slip jig')
Bill Black
Bill Black has an extensive ABC library, available at http://www.capeirish.com/ittl/.
We can load all of the tune blocks (strings) with pyabc2.sources.bill_black.load_meta().
abcs = bill_black.load_meta()
len(abcs)
downloading...
done
10249
Tune(abcs[0])
Tune(title='A-D POLKA', key=Dmaj, type='?')
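A "tune block" is one tune's ABC text, which conventionally starts at an X: reference-number line. The sketch below shows one way to split a multi-tune ABC file into blocks with the standard library. The sample file is hypothetical, and this is not necessarily how bill_black.load_meta() does it.

```python
# HYPOTHETICAL two-tune ABC file, made up for illustration.
abc_file = """X:1
T:First tune
K:D
DFA dAF|

X:2
T:Second tune
K:G
GBd gdB|
"""


def split_tune_blocks(text):
    """Split ABC text into tune blocks, each starting at an X: line."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.startswith("X:") and current:
            # A new tune begins: flush the block collected so far.
            blocks.append("\n".join(current).strip())
            current = []
        if line.startswith("X:") or current:
            current.append(line)  # anything before the first X: is skipped
    if current:
        blocks.append("\n".join(current).strip())
    return blocks


blocks = split_tune_blocks(abc_file)
print(len(blocks), "blocks")
```

Each resulting string could then be passed to Tune(), as done with abcs[0] above.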
Hardy
Paul Hardy has a tunebook collection available at https://pghardy.net/tunebooks/. We can load selected tunebooks as a list of tune blocks (strings) with pyabc2.sources.hardy.load_meta().
abcs = hardy.load_meta("basic")
len(abcs)
downloading...
done
58
Tune(abcs[0])
Tune(title='Ash Grove, The', key=Gmaj, type='Waltz')