DIY Spotify Wrapped in Python & Tableau

Jolie Fong
4 min read · Dec 8, 2021

When your Spotify Wrapped writes you off as a Basic House Music Bitch, you make your own discount version.

https://tinyurl.com/spotify-viz-jol

The Spotify Wrapped feature, which provides highlights from your listening history throughout the year, was released last week, and it was the only thing that everyone was posting about for a solid 12 hours. When I saw mine, I immediately thought “this can’t be the full extent of my embarrassing song choices”, so for the memes, I decided to tap into the 0.5 hours of prior Tableau experience I had under my belt and dig into my listening habits.

I chose a timeframe of August to December 2021 (the school semester) and was able to pull 5000 songs by requesting a download of my Spotify account data (following the instructions from here).

Part 1: Prep

I just used a simple Python script to prepare the data here.

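import pandas as pd

# Load the streaming history JSON from the Spotify data download.
# Forward slashes work on every OS and avoid backslash-escape surprises.
df = pd.read_json('MyData/StreamingHistory1.json')
# Convert listening time from milliseconds to minutes.
df['minsPlayed'] = df['msPlayed'] / 60000
df.head()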

Spotify emailed me the streaming history in a JSON file. It didn’t need too much cleaning for my purpose so the only change I made was converting listening time to minutes.
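
If the export covers more than the period you care about, the same DataFrame can be trimmed to a date window before converting. This is only a sketch on made-up rows: the `endTime` column name and its timestamp format are assumptions about the export, and the window bounds are illustrative.

```python
import pandas as pd

# Toy rows standing in for the Spotify export (values are made up).
# "endTime" is an assumed column name; "msPlayed" is the one used above.
sample = pd.DataFrame({
    "endTime": ["2021-07-30 10:00", "2021-08-02 09:15", "2021-12-01 22:40"],
    "artistName": ["Artist A", "Artist B", "Artist C"],
    "msPlayed": [60000, 90000, 210000],
})

# Parse the timestamp strings so they can be compared as dates.
sample["endTime"] = pd.to_datetime(sample["endTime"])

# Keep only streams that ended inside the chosen window (illustrative bounds).
start, end = pd.Timestamp("2021-08-01"), pd.Timestamp("2022-01-01")
in_window = sample[(sample["endTime"] >= start) & (sample["endTime"] < end)].copy()

# Same conversion as in the script: milliseconds to minutes.
in_window["minsPlayed"] = in_window["msPlayed"] / 60000
print(in_window)
```

The `.copy()` keeps pandas from warning about assigning a new column to a slice of the original frame.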

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

client_id = 'client-id'
client_secret = 'client-secret'
client_credentials_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

Spotipy is a Python library that lets you access the Spotify API more easily (without writing all the requests yourself). You have to key in your Spotify Developer client ID and secret, which can be generated by creating a new project in Spotify for Developers.

Next, we need each artist’s genres, which this function retrieves. It searches for the name, takes the first track in the results, and returns the genre(s) of that track’s first artist, if any exist.

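def get_artist_genre(name):
    # Search Spotify for the name; by default this returns matching tracks.
    result = sp.search(name)
    if not result["tracks"]["items"]:
        return []
    # Take the first matching track and look up its first credited artist.
    track = result['tracks']['items'][0]
    artist = sp.artist(track["artists"][0]["external_urls"]["spotify"])
    # The artist object carries the list of genres (possibly empty).
    return artist["genres"]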

Now we can run the function for each artist in my listening history. There are more sophisticated ways of doing this like giving each artist a UID or using dictionaries but I didn’t have *that* much data so I left it as is.

gen_list = []
for name in df['artistName']:
    gen_list.append(get_artist_genre(name))
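
The loop above makes one lookup per row, so an artist who appears a hundred times is looked up a hundred times. One of the "dictionary" approaches mentioned earlier might look like the sketch below. The helper `get_genres_cached` and the stand-in `fake_lookup` are hypothetical names invented for illustration; in the real script you would pass `get_artist_genre` as the lookup.

```python
def get_genres_cached(names, lookup):
    """Call lookup once per distinct name and reuse the result for repeats."""
    cache = {}
    genres = []
    for name in names:
        if name not in cache:
            cache[name] = lookup(name)
        genres.append(cache[name])
    return genres

# Stand-in for get_artist_genre so the sketch runs without API credentials.
calls = []
def fake_lookup(name):
    calls.append(name)
    return ["genre of " + name]

gen_list = get_genres_cached(["Artist A", "Artist B", "Artist A"], fake_lookup)
print(gen_list)
print(len(calls))  # number of lookups actually made
```

The output list still has one entry per row, so it can be fed into the rest of the script unchanged.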

Finally, we expand the DataFrame of genres (I wanted to create a bubble chart of genre counts) and save everything to CSV.

df_genre = pd.Series(gen_list)
df_genre = pd.DataFrame(df_genre,columns=["genre"])
df_genre_expanded = df_genre.explode("genre")
df_genre_expanded.head()

We Will Rock You is classified as nu metal, which doesn’t seem right, but that’s because Spotify only has artist genres, not song genres. Not that I know anything about nu metal to begin with.

df.to_csv('MySpotify.csv')
df_genre_expanded.to_csv('GenresExpanded.csv')
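
For a quick sanity check of the genre counts before opening Tableau, the exploded column can be tallied directly in pandas. A minimal sketch on a made-up `gen_list` (the genre values are placeholders, not my real data):

```python
import pandas as pd

# Made-up genre lists, one per stream; an empty list means no genre was found.
gen_list = [["touhou", "nu metal"], [], ["touhou"]]

df_genre = pd.DataFrame({"genre": gen_list})

# explode() gives each genre its own row; empty lists become NaN,
# which value_counts() ignores by default.
genre_counts = df_genre.explode("genre")["genre"].value_counts()
print(genre_counts)
```

Each count here is the number of streams tagged with that genre, which is the kind of number a bubble size would be based on.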

Also, the data comes in UTC, so I had to add 8 hours to convert it to Singapore time (done in Tableau).
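
I did the shift in Tableau, but the same 8-hour offset could be applied in the Python script instead. A rough sketch on made-up timestamps, assuming the timestamp column is called `endTime` (the new column names are invented for illustration):

```python
import pandas as pd

# Made-up UTC timestamps; "endTime" is an assumed column name.
sample = pd.DataFrame({"endTime": ["2021-11-30 17:30", "2021-12-01 02:00"]})

# Parse the strings, then shift by the fixed UTC+8 offset for Singapore.
utc = pd.to_datetime(sample["endTime"])
sample["endTimeSGT"] = utc + pd.Timedelta(hours=8)

# The local hour is what a "listens by time of day" chart would group on.
sample["hourSGT"] = sample["endTimeSGT"].dt.hour
print(sample)
```

A fixed offset is enough here because Singapore does not observe daylight saving time.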

Part 2: Visualisation

The dashboard with all its numbers is public at https://tinyurl.com/spotify-viz-jol. I chose not to rank by number of times listened, but instead by duration. This is because I skip through songs a lot, and frequently listen to songs that are 7–8 minutes in length.
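
To see why the two rankings can disagree, here is a small pandas sketch on made-up rows. The `trackName` column is an assumption about the export, and the song names are placeholders.

```python
import pandas as pd

# Made-up streams: one song skipped through three times, one long song played once.
sample = pd.DataFrame({
    "trackName": ["Short Song", "Short Song", "Short Song", "Long Song"],
    "minsPlayed": [0.5, 0.4, 0.6, 7.5],
})

# For each track, count the plays and total the minutes listened.
by_track = sample.groupby("trackName")["minsPlayed"].agg(plays="count", minutes="sum")

top_by_plays = by_track["plays"].idxmax()
top_by_minutes = by_track["minutes"].idxmax()
print(by_track)
print(top_by_plays, "|", top_by_minutes)
```

Ranking by play count rewards the song that got skipped a lot, while ranking by minutes rewards the one that was actually listened to.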

Part 3: Review

On immediate inspection,

  • Genres: Touhou is this period’s mainstay! I went crazy after finding out the official game OST + my favourite doujin circles started uploading their albums on the platform. I totaled 1269 streams, making it the largest genre bubble. (Interestingly, my Spotify Wrapped didn’t really pick up on Touhou. Maybe because they gave equal weight to the first half of 2021?)
  • Top Songs/Artists: The data confirms my Wrapped report that I’m a huge fan of motivational, game battle and high tempo music in general. I mean, when you’re running on intrinsic motivation 80% of the time, you need that kind of music to keep on upping the ante.
  • Listens by Week: I was averaging 263 songs and 400 minutes of music a week, both of which were gradually rising as I decided to buckle down and prepare for projects and finals closer to the end of the term. Also, about 1.5 mins per song on average (400 ÷ 263) means that I skip through songs pretty often, lol…
  • Listens by Time of Day: Somehow, I successfully protected the sanctity of my sleep schedule, with very few streams between 11PM and 7AM. Given that I am a college student, I think this is worth mentioning.
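
The weekly figures above came out of Tableau, but a similar aggregation could be sketched in pandas. The rows below are made up, and `endTime` is again an assumed column name.

```python
import pandas as pd

# Made-up streams spread over two calendar weeks.
sample = pd.DataFrame({
    "endTime": pd.to_datetime([
        "2021-08-02 09:00", "2021-08-03 21:00",
        "2021-08-10 08:00", "2021-08-11 08:30",
    ]),
    "minsPlayed": [1.0, 2.0, 3.0, 0.5],
})

# Label each stream with its week, then count songs and total minutes per week.
week = sample["endTime"].dt.to_period("W")
weekly = sample.groupby(week)["minsPlayed"].agg(songs="count", minutes="sum")

# Overall average listening time per stream, in minutes.
mins_per_song = sample["minsPlayed"].sum() / len(sample)
print(weekly)
print(mins_per_song)
```

Averaging the `songs` and `minutes` columns of `weekly` gives per-week figures like the ones quoted above.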

Other interesting takeaways:

  • I listened to the “Dream SMP” genre 219 times. I have no explanation for this and frankly, I’m afraid of myself now.
  • Some genres I didn’t know existed: “afrofuturismo brasileiro”, “cybergrind”, “manguebeat”.
  • Pink Guy ranks higher than the Financial Times News Briefing.

Closing Thoughts

This was a quick project that I did for fun + to get comfy with Tableau. I’ve barely scratched the surface; there are so many combinations, aggregations and inferences you can make with this dataset alone. Off the top of my head, I can already think of checking which genres/songs are most listened to at each time of day, the average % of a track listened to, and adding to the mix some of the Spotify embeds that Tableau supports. But that’s for another day!
