Hi Crawbears,
In my previous analysis of the NBA 3-pointer’s dramatic evolution, I walked through data subsetting, time series analysis, density plots, correlation heatmaps, and box and scatter plots. Today, I’ll dive deeper into creating even more powerful, customized data visualizations, including violin and bubble plots, and into creating subgroups using pd.cut.
The late, magnificent Hans Rosling first entered the zeitgeist with his widely watched TED talk, which brought to life his organization Gapminder’s data. In this post, I walk through an exploratory analysis of the Gapminder 2007 global development data, including summary statistics, box and violin plots, histograms, heatmaps, and high-level observations. I sought to answer three burning questions:
- What is the distribution and spread of GDP per capita and life expectancy around the world?
- What does that look like by continent and by GDP per capita range?
- What is the relationship between GDP per capita, population, and life expectancy?
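The second question depends on grouping countries into GDP per capita ranges, which is where pd.cut comes in. Here is a minimal sketch of that step. The column names, bin edges, labels, and all values are assumptions invented for illustration; they are not Gapminder figures and not necessarily the bins used in my notebook.

```python
import pandas as pd

# Toy values invented for illustration only; these are NOT Gapminder figures.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "gdp_per_capita": [450.0, 1800.0, 4200.0, 9500.0, 23000.0, 41000.0],
    "life_expectancy": [48.0, 56.0, 64.0, 71.0, 77.0, 81.0],
})

# Bin edges and labels are assumptions chosen for this sketch.
# By default pd.cut uses right-closed intervals, e.g. (1000, 5000].
bins = [0, 1000, 5000, 20000, float("inf")]
labels = ["<1k", "1k-5k", "5k-20k", "20k+"]

# Assign each country to a GDP per capita range (an ordered categorical).
df["gdp_range"] = pd.cut(df["gdp_per_capita"], bins=bins, labels=labels)

# Summarize life expectancy within each range.
summary = (
    df.groupby("gdp_range", observed=False)["life_expectancy"]
    .agg(["count", "median"])
)
print(summary)
```

Because the result is an ordered categorical, the new gdp_range column can be passed straight to a box or violin plot as the grouping variable, and the ranges will appear in order from lowest to highest.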
For the grand finale, I recreate Rosling’s colorful multivariate bubble chart using seaborn. In a future post, I’ll recreate Rosling’s animated time-series bubble plot using plotly.
My code, explanatory notes, observations, and visualizations are below if you’d like to explore the data yourself. As you’ll see, seaborn, which is built on top of matplotlib, makes customized, complex visuals much easier to create than matplotlib alone.
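To give a flavor of the bubble chart idea, here is a rough sketch using seaborn's scatterplot, with bubble size mapped to population and color mapped to a group. The column names, the region labels, the output file name, and every value are hypothetical placeholders made up for this example, not Gapminder data, and the styling choices are just one reasonable option.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values invented for illustration only; these are NOT Gapminder figures.
df = pd.DataFrame({
    "region": ["Region X", "Region X", "Region Y", "Region Y", "Region Z", "Region Z"],
    "gdp_per_capita": [600.0, 2500.0, 7000.0, 12000.0, 30000.0, 45000.0],
    "life_expectancy": [50.0, 58.0, 68.0, 72.0, 79.0, 82.0],
    "population_millions": [12.0, 140.0, 35.0, 900.0, 60.0, 8.0],
})

fig, ax = plt.subplots(figsize=(8, 5))

# size= scales each marker by population; hue= colors markers by region.
sns.scatterplot(
    data=df,
    x="gdp_per_capita",
    y="life_expectancy",
    size="population_millions",
    hue="region",
    sizes=(40, 800),  # smallest and largest marker areas
    alpha=0.6,        # transparency so overlapping bubbles stay visible
    ax=ax,
)

# GDP per capita spans orders of magnitude, so a log x-axis spreads points out.
ax.set_xscale("log")
ax.set_xlabel("GDP per capita (log scale)")
ax.set_ylabel("Life expectancy (years)")
ax.legend(bbox_to_anchor=(1.02, 1), loc="upper left")

fig.tight_layout()
fig.savefig("bubble_sketch.png")  # hypothetical output file name
```

The log scale on the x-axis is a design choice rather than a requirement: on a linear axis, the lower-income points would bunch up against the left edge.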







Is there anything you’d like me to deep-dive on? If you’d like to explore yourself, check out all of Gapminder’s data sets for a range of indicators here.
Next time, I’ll explore a national health dataset, including crosstabs and multivariate visualizations, and generating confidence intervals on proportions and means so you can make statistical inferences with confidence yourself.
Keep exploring,
Rish