AOTM-2011

This page contains the data and examples described in the paper cited below. The playlist collection is derived from the Art of the Mix playlist database.

AOTM-2011 spans the period from 1998-01-22 to 2011-06-17. In all, it contains 101,343 unique playlists, each of which has had its songs matched to the Million Song Dataset (MSD). Approximately 98,000 songs were matched to MSD; please see the publication below for details.

If you have any questions or comments, feel free to email the author at brian.mcfee@nyu.edu

Playlist data

aotm2011_playlists.json.gz
101,343 playlists (50M) [md5]

We provide the data in JSON format. The following Python code can be used to import the data.

import gzip
import json

# Open the gzip archive in text mode and parse the JSON document it contains.
with gzip.open('aotm2011_playlists.json.gz', 'rt', encoding='utf-8') as file_desc:
    playlists = json.load(file_desc)
                    

The variable playlists will then contain a list of playlists, complete with song identifiers, categorical annotation, and various other metadata. For example, writing P for playlists:

P[0] = {'category': 'Mixed Genre',
        'filtered_lists': [['SOFDPDC12A58A7D198'],
                           ['SOPIEQP12A8C13F268', 'SOKMCJK12A6D4F6105'],
                           ['SOGTGJR12A6310E08D',
                            'SOLTBYJ12A6310F2BB',
                            'SOBOXXN12A6D4FA1A2',
                            'SOUQUFO12B0B80778E']],
        'mix_id': 89567,
        'playlist': [[['peter murphy', "marlene dietrich's favourite poem"], None],
                     [['the walker brothers', "the sun ain't gonna shine anymore"],
                      'SOFDPDC12A58A7D198'],
                     [['marc almond', 'jacky'], None],
                     [['tindersticks', 'dying slowly'], None],
                     [['tori amos', 'me and a gun'], 'SOPIEQP12A8C13F268'],
                     [['suzanne vega', 'luka'], 'SOKMCJK12A6D4F6105'],
                     [['madonna', 'spanish eyes'], None],
                     [['the angels of light', 'praise your name'], None],
                     [['eurythmics', 'sex crime'], None],
                     [['tom waits', 'drunk on the moon'], None],
                     [['kate bush', 'wuthering heights'], 'SOGTGJR12A6310E08D'],
                     [['david bowie', "new york's in love"], 'SOLTBYJ12A6310F2BB'],
                     [['echo & the bunnymen', 'crocodiles'], 'SOBOXXN12A6D4FA1A2'],
                     [['peter murphy', "i'll fall with your knife"], 'SOUQUFO12B0B80778E']],
        'timestamp': '2005-03-27T10:53:00',
        'user': {'member_since': '2004-03-21T00:00:00',
                 'mixes_posted': '23',
                 'name': 'pulmotor'}}
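
In the example above, the timestamps appear as ISO 8601-style strings and mixes_posted appears as a string rather than a number. The following is a minimal sketch of how these values might be converted to native Python types. The helper name parse_playlist_metadata and the format string are assumptions based only on the single record shown here; other records may need more lenient handling.

```python
from datetime import datetime

# Assumed format, inferred from the example record shown above.
ISO_FORMAT = '%Y-%m-%dT%H:%M:%S'


def parse_playlist_metadata(entry):
    """Convert the string-valued date and count fields of one playlist record."""
    user = entry['user']
    return {
        'uploaded': datetime.strptime(entry['timestamp'], ISO_FORMAT),
        'member_since': datetime.strptime(user['member_since'], ISO_FORMAT),
        'mixes_posted': int(user['mixes_posted']),
    }


# The metadata fields of the example record, copied from above.
example = {'timestamp': '2005-03-27T10:53:00',
           'user': {'member_since': '2004-03-21T00:00:00',
                    'mixes_posted': '23',
                    'name': 'pulmotor'}}

meta = parse_playlist_metadata(example)
# Whole days between the user joining the site and uploading this playlist.
days_as_member = (meta['uploaded'] - meta['member_since']).days
print(meta['mixes_posted'], days_as_member)
```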
                    

The fields of P[0] are as follows.

category
A string describing the category of the playlist. There are about 40 unique categories.
filtered_lists
An array of contiguous segments of the playlist, where each song in the segment could be matched to MSD. This field is provided for convenience, and could be reconstructed from the playlist field described below.
mix_id
A unique numeric identifier for the playlist.
playlist
An array containing the original playlist data. Each element of this array is an array of length 2. The first element contains the artist name and song title, and the second contains the MSD song identifier, or None if the song could not be matched to MSD.
timestamp
The time and date when the user uploaded the playlist.
user
A dictionary containing information about the playlist's author, such as the user name, the date of joining the site, and the number of playlists posted by that user (at time of upload).
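
As noted for the filtered_lists field, the segments can be rebuilt from the playlist field. The sketch below shows one way this might be done, by grouping consecutive entries according to whether they carry an MSD identifier. The function name contiguous_matched_segments is hypothetical, and the approach is an assumption that agrees with the example record above; the published data may have been produced with additional filtering that is not described here.

```python
from itertools import groupby


def contiguous_matched_segments(playlist):
    """Split a playlist into runs of consecutive songs that have an MSD identifier."""
    segments = []
    # Each entry is [[artist, title], msd_id]; msd_id is None for unmatched songs.
    for matched, run in groupby(playlist, key=lambda entry: entry[1] is not None):
        if matched:
            segments.append([entry[1] for entry in run])
    return segments


# Abbreviated version of the 'playlist' field from the example record.
example_playlist = [
    [['peter murphy', "marlene dietrich's favourite poem"], None],
    [['the walker brothers', "the sun ain't gonna shine anymore"],
     'SOFDPDC12A58A7D198'],
    [['marc almond', 'jacky'], None],
    [['tori amos', 'me and a gun'], 'SOPIEQP12A8C13F268'],
    [['suzanne vega', 'luka'], 'SOKMCJK12A6D4F6105'],
    [['madonna', 'spanish eyes'], None],
]

segments = contiguous_matched_segments(example_playlist)
print(segments)
```

Applied to the full playlist field of a record, the result can be compared against that record's filtered_lists field to check whether the assumption holds.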

References

If you use this data, please cite the following paper:
2011
Hypergraph models of playlist dialects
13th International Society for Music Information Retrieval conference (ISMIR).