3 Ways to Visualize Data Distribution

July 10, 2023

Don’t you just love drooling over kick-ass visuals? Or is that just me?

When analyzing data, it’s inevitable that you’ll look at how your data is distributed overall. And no matter how much you love reading, a clean graph/chart will always turn you on more than a messy, hard-to-read table.

That’s why understanding different ways to visualize data matters.

And that’s what I’m showing you today.

Here are 3 Important Visuals to Explore Data Distribution…

Box Plot

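Box Plots use percentiles to show you the range where the middle of your data lives. The box spans the 25th to the 75th percentile, so it holds the middle half of your values, with a line marking the median. They also show you the outliers that could skew central tendency metrics like the mean.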

I’ve talked a lot about my Spotify Python project at the start of this 66 Days of Math and Programming experiment. So I’ll keep that trend alive and show you examples using the popularity values I have for the 439 songs in my playlist.

A Box Plot and the Python code that created it
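
If you want to see the numbers hiding behind a box plot, here’s a minimal sketch using only Python’s standard library. It is not the code or data from the figure: the popularity values are made up for illustration, and the 1.5 × IQR rule for flagging outliers is an assumed (but common) convention.

```python
import statistics

# Hypothetical popularity scores (0-100); not the actual playlist data.
popularity = [2, 35, 38, 41, 44, 47, 50, 52, 55, 58, 61, 64, 98]

# Quartiles: the box spans q1 to q3, with a line at the median.
q1, median, q3 = statistics.quantiles(popularity, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range = height of the box

# Assumed convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in popularity if x < lower_fence or x > upper_fence]

print(f"box: {q1} to {q3}, median: {median}")
print(f"outliers: {outliers}")
```

With these made-up values, the two extreme songs (2 and 98) land outside the fences. A plotting library such as matplotlib can draw the same summary with `plt.boxplot(popularity)`.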

Histogram

Histograms are a way to visualize frequency tables.

But you probably don’t know what frequency tables are, right?

It’s a table that takes a variable’s range and divides it into equally spaced intervals called bins. Then it shows you how many data values fall within each bin.

Sometimes you’ll have empty bins, and histograms don’t omit those. Those empty bins actually tell interesting stories about your data.

A Histogram and the Python code that created it
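
To make the frequency table idea concrete, here’s a rough sketch that builds one by hand and prints it as a text histogram. Again, this is not the code behind the figure: the popularity values, the 0 to 100 range, and the choice of 5 bins are all assumptions for illustration.

```python
# Hypothetical popularity scores (0-100); not the actual playlist data.
popularity = [5, 8, 12, 41, 44, 47, 52, 55, 58, 63, 66, 95]

num_bins = 5                      # assumed bin count, chosen for illustration
low, high = 0, 100                # assumed range of the variable
width = (high - low) / num_bins   # every bin has the same width

counts = [0] * num_bins
for value in popularity:
    # Find the bin index; a value on the top edge (100) goes in the last bin.
    index = min(int((value - low) / width), num_bins - 1)
    counts[index] += 1

# Print the frequency table as a text histogram; empty bins still get a row.
for i, count in enumerate(counts):
    start = low + i * width
    print(f"{start:>5.0f}-{start + width:<5.0f} {'#' * count} ({count})")
```

Notice the 20-40 bin stays in the output even though nothing falls in it. That gap is the kind of story an empty bin can tell. With matplotlib, `plt.hist(popularity, bins=5)` draws the graphical version.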

Density Plot

Last but not least, we have Density Plots. They’re like histograms on steroids.

Ehh, not really.

But they are like smooth histograms.

You see, Density Plots show the distribution of your data as a continuous curve. For my fellow math dorks, if you integrate that curve over its full range (aka find the area under the curve), it’ll equal 1.

A Density Plot and the Python code that created it
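
For the math dorks who want to check the “area equals 1” claim, here’s a small sketch. It assumes the smooth curve comes from a Gaussian kernel density estimate, which is one common way to build a density plot but not necessarily how the figure was made. The popularity values and the bandwidth are made up.

```python
import math

# Hypothetical popularity scores; not the actual playlist data.
popularity = [5, 8, 12, 41, 44, 47, 52, 55, 58, 63, 66, 95]
bandwidth = 8.0  # assumed smoothing width; larger values give a smoother curve

def density(x):
    """Average of Gaussian bumps centered on each data point."""
    total = 0.0
    for value in popularity:
        z = (x - value) / bandwidth
        total += math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2 * math.pi))
    return total / len(popularity)

# Evaluate the curve on a grid wide enough to cover the tails.
step = 0.5
grid = [-60 + i * step for i in range(int(220 / step) + 1)]  # -60 to 160
heights = [density(x) for x in grid]

# Trapezoid rule: approximate the area under the curve.
area = sum((heights[i] + heights[i + 1]) / 2 * step for i in range(len(grid) - 1))
print(f"area under the curve: {area:.4f}")
```

The printed area should come out very close to 1. The grid runs past 0 and 100 on purpose, because the smoothed curve spills a little beyond the raw data’s range.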