Mining Your Own Apple Music Data

PUBLISHED ON OCT 2, 2020 — DATA SCIENCE

Streaming services like Pandora, Apple Music, and Spotify have changed how we consume music. In the olden days of purchasing music, whether on a physical CD or through iTunes, users had to seek out music and make deliberate choices about what to buy. You also had a physical or digital copy of the music you were consuming. If you wanted to listen to something, you had to look through your CD collection or scroll through your iTunes library and select what you wanted.

Now we have access to a seemingly endless amount of music at our fingertips. Streaming services have simplified the process of finding and selecting music, but with curated playlists and recommended content we have become more passive users. You just open your app of choice and there are suggestions or recently played options waiting. Personally, without having to go through the effort of finding and buying albums or songs, I find that I am a less active participant in what I consume. These days, when I choose what to play, I tend to play the same set of artists, albums, and curated playlists. While having the streaming service curate playlists for me provides some diversity in what I listen to, I find my selection is not as eclectic as it was when I scrolled through my iTunes library and manually selected music or created playlists.

With that in mind, I decided to use my data science skills to create a music library. I use Apple Music to stream content on an iPhone and a Mac. Apple records my music history, such as the songs I “Love” or “Dislike” and the content I manually select to play. That data is my own personal data, and I recently learned that you can request a copy of what Apple tracks.

If you go to this blog post you can follow the instructions to download your own Apple Music data. Apple will have you sign in to your account, and then you can request your data. I only requested my music data, but you can also request your location history or other data Apple collects about you. Follow the prompts and eventually they’ll send you a confirmation email noting that your data is being prepared. After a few days, I got a follow-up email saying my data was ready for download. I downloaded my data, saved it on my computer, and will clean it here so that I can mimic, as closely as possible, a library that tracks the artists and radio stations I have played.

The data you download about yourself from Apple is rich. Much of your media consumption data through Apple Music is saved as .json files. I have never worked with these and simply want a unique list of artists and playlists, so I am going to use the .csv music data provided in “Apple Media Services information/Apple_Media_Services/Apple Music Activity”.

Here are the packages we will use for this analysis:

First, let’s load the data.
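As a rough illustration of the loading step, here is a minimal sketch in Python using only the standard library. It is not the code from the original analysis: the helper name `load_csv`, the `utf-8-sig` encoding, and the "Example Column" in the stand-in file are all assumptions made for the example.

```python
import csv
import tempfile
from pathlib import Path


def load_csv(path):
    """Read a CSV export into a list of dicts keyed by column header."""
    # utf-8-sig drops a byte-order mark if one is present (an assumption
    # about the export; plain UTF-8 files are read the same way).
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


# Stand-in file with made-up contents so the sketch runs on its own.
# With a real export, point load_csv at the files inside the
# "Apple Music Activity" folder instead.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "Music - Favorite Stations.csv"
    sample.write_text(
        "Station Description,Example Column\nChill Radio,x\n",
        encoding="utf-8",
    )
    favorite_stations = load_csv(sample)

print(favorite_stations)
```

The same helper could be called once per file to produce `favorite_stations`, `likes`, and `activity`.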

Now let’s retain only the columns from each .csv that we need to create a library. From “Music - Favorite Stations.csv”, keep only “Station Description”. From “Apple Music Likes and Dislikes.csv”, keep only “Item Description”. From “Apple Music Play Activity.csv”, keep two columns: “Artist Name” and “Content Specific Type”. We will also rename every retained column except “Content Specific Type” to “Name”.
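The column selection and renaming could look something like the sketch below, which works on rows held as Python dicts. The helper `keep_columns` and the sample rows (including the extra "Example Column") are hypothetical; only the retained column names come from the text above.

```python
def keep_columns(rows, rename):
    """Keep only the columns listed in `rename`, mapping old name -> new name."""
    return [{new: row[old] for old, new in rename.items()} for row in rows]


# Made-up sample rows standing in for the loaded .csv files.
stations_raw = [{"Station Description": "Chill Radio", "Example Column": "x"}]
likes_raw = [{"Item Description": "Artist A - Song 1", "Example Column": "y"}]
activity_raw = [
    {"Artist Name": "Artist B", "Content Specific Type": "Song", "Example Column": "z"}
]

favorite_stations = keep_columns(stations_raw, {"Station Description": "Name"})
likes = keep_columns(likes_raw, {"Item Description": "Name"})
# "Content Specific Type" keeps its name because it is needed for filtering later.
activity = keep_columns(
    activity_raw,
    {"Artist Name": "Name", "Content Specific Type": "Content Specific Type"},
)

print(favorite_stations, likes, activity)
```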

To the “Music - Favorite Stations.csv” data we will add a column “Tag” and set it to “Radio” for all rows.
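Tagging every row is a one-liner in most tools. A small Python sketch, with a hypothetical helper `add_tag` and made-up station names:

```python
def add_tag(rows, tag):
    """Return copies of the rows with a constant "Tag" column added."""
    return [{**row, "Tag": tag} for row in rows]


# Made-up sample rows; the real ones would come from the favorite stations file.
favorite_stations = add_tag(
    [{"Name": "Chill Radio"}, {"Name": "Example Station"}], "Radio"
)
print(favorite_stations)
```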

In the “Apple Music Likes and Dislikes.csv” data we will replace the “N/A” values with NA and then drop those rows. The items are named using the convention “Artist - Item”, so we also want to extract the artist name by splitting the string at " - " and keeping the first piece. Lastly, we will retain only distinct artist names and create a column “Tag” that is assigned “Artist”.
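The likes cleanup described above can be sketched as follows. The function name `artists_from_likes` and the sample rows are hypothetical, and the sketch assumes the item column has already been renamed to “Name”. Splitting only on the first " - " is a design choice here, so an item title that itself contains " - " does not affect the extracted artist.

```python
def artists_from_likes(rows):
    """Drop "N/A" items, take the artist before " - ", and keep distinct names."""
    seen = set()
    artists = []
    for row in rows:
        item = row["Name"]
        # Treat the literal string "N/A" (or a missing value) as missing data.
        if item is None or item == "N/A":
            continue
        # "Artist - Item" -> "Artist"; maxsplit=1 splits at the first " - " only.
        artist = item.split(" - ", 1)[0]
        if artist not in seen:
            seen.add(artist)
            artists.append({"Name": artist, "Tag": "Artist"})
    return artists


# Made-up sample rows for illustration.
likes = artists_from_likes(
    [
        {"Name": "Artist A - Song 1"},
        {"Name": "N/A"},
        {"Name": "Artist A - Song 2"},
        {"Name": "Artist B - Album - Deluxe"},
    ]
)
print(likes)
```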

From the “Apple Music Play Activity.csv” data we will filter to songs only and then remove the “Content Specific Type” column. We will then retain only unique artist names, with the “Artist Name” column renamed to “Name”. Lastly, let’s add a column “Tag” and set it to “Artist” for all rows.
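A comparable sketch for the play activity follows. It assumes that song rows carry the value "Song" in “Content Specific Type”; check your own export, since that value, the helper name `artists_from_activity`, and the sample rows are assumptions rather than something confirmed above.

```python
def artists_from_activity(rows, song_type="Song"):
    """Keep song rows only, then return distinct artists tagged "Artist"."""
    seen = set()
    artists = []
    for row in rows:
        # Assumed label for songs; adjust song_type if your export differs.
        if row["Content Specific Type"] != song_type:
            continue
        artist = row["Name"]
        if artist not in seen:
            seen.add(artist)
            # The "Content Specific Type" column is dropped by not copying it.
            artists.append({"Name": artist, "Tag": "Artist"})
    return artists


# Made-up sample rows for illustration.
activity = artists_from_activity(
    [
        {"Name": "Artist A", "Content Specific Type": "Song"},
        {"Name": "Example Station", "Content Specific Type": "Radio Station"},
        {"Name": "Artist A", "Content Specific Type": "Song"},
        {"Name": "Artist C", "Content Specific Type": "Song"},
    ]
)
print(activity)
```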

Finally, we will bind all the music data (favorite_stations, likes, activity), arrange it alphabetically by “Name” and then by “Tag”, and save it as a .csv.
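The final bind, sort, and save step might be sketched like this. The helpers `build_library` and `write_library` are hypothetical, the three input tables are made-up samples, and sorting case-insensitively with `str.casefold` is my own choice rather than something the text specifies.

```python
import csv
import io


def build_library(*tables):
    """Stack the tables and sort by Name, then Tag (case-insensitive)."""
    combined = [row for table in tables for row in table]
    return sorted(combined, key=lambda r: (r["Name"].casefold(), r["Tag"]))


def write_library(rows, file_obj):
    """Write the library as CSV with a header row."""
    writer = csv.DictWriter(file_obj, fieldnames=["Name", "Tag"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample tables standing in for the cleaned data.
favorite_stations = [{"Name": "Chill Radio", "Tag": "Radio"}]
likes = [{"Name": "Artist B", "Tag": "Artist"}]
activity = [{"Name": "Artist A", "Tag": "Artist"}]

library = build_library(favorite_stations, likes, activity)

# An in-memory buffer keeps the sketch self-contained; to save a real file,
# pass open("music_library.csv", "w", newline="") instead (hypothetical name).
buffer = io.StringIO()
write_library(library, buffer)
print(buffer.getvalue())
```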

Check out your new music library and enjoy!