Expected Value: An Old Friend Returns

July 3, 2023

“Yer a (math) wizard, Mike!”

Don’t you just love when ideas you learned long, long ago pop back into your head when you need them most?

While working on my current “top secret, yet I’m about to reveal it” Python project, I found myself stuck. But when I was desperate to find my mathematical savior, it hit me.

Right now, I’m using Spotify’s  API to scrap data from my account and create an algorithm that scores how trendy my taste in music is. I finished scraping my data the other day, so yesterday I dove into a JupyterNotebook and started analyzing the data.

One metric I wanted to measure was the average number of artists per song on my playlist. At first, this seemed like child’s play. I mean, take a look at the formula for finding an average below. That ain’t Complex Analysis.

average = (sum of terms) divided by (number of terms)

So I wrote the following code to find how many times each number of artists occurred…

Python code to find number of songs per number of artist

Here’s what my playlist contains:

  • 439 songs with at least one artist
  • 282 songs with at least two artists
  • 73 songs with at least three artists
  • 25 songs with at least four artists
  • 13 songs with at least five artists
  • 5 songs with six artists

But here’s the problemo…

There aren’t 837 songs in my playlist like the above numbers sum to. I only have 437 songs. This means I made one crucial mistake – I didn’t account for overlapping values.

With this in mind, I went to work to fix my error. And that led to the code you see below.

Fixing error from my first code

Okay, now all I needed to do was use the simple average formula to find my answer. But at this point, problemo number two entered the room and slapped me silly.

You see, I couldn’t just add up all the numbers and divide them by six. That would give me a number outside the expected range of (1,6). So I thought over what to do, until an old friend whizzed through my mind…

Expected Value.

The formula for Expected Value

I studied probability/statistics in college and most of the course revolved around using expected values to predict outcomes. But the mini-genius inside of me remembered that the expected value is basically calculating the average of outcomes.

So I whipped up the code you see below to calculate the expected value for the number of artists in any given song in my playlist.

My Python code that calculates EV

And I discovered my answer sat just below 2 artists.

How lovely.

The reason I’m getting into programming is that I want to find cool ways to apply advanced math topics to the real world. So seeing my “expertise” come in handy here makes the process more fun.