A Day at the (Bar Chart) Races

Sometime back in 2019 someone coined the term "bar chart race." It's essentially an animated bar chart showing how the "top ten" (for example) of something changes over time. Some people say it's a great way to visualize data. Others say it's gratuitous animation for data that would be better shown in a line graph. I was intrigued, so I decided to investigate further. Here is my report.

The first place I remember seeing a bar chart race was in this El País article from November 2019, showing the popularity of Spanish political parties from 1977 up to today. I was mesmerized: A visualization of post-Franco Spanish politics! I don't remember how many times I watched it.

I noticed it was a Flourish data visualization. On their website you can upload your data as an Excel or CSV file, set some preferences, and generate such an animation. Voilà. "No coding required." I wanted to try to create something similar, but with coding. (I'm a programmer, remember?)

Exhibit A: Best Global Brands

At about the same time, a JavaScript data-driven graphics library called D3 caught my attention.

The D3 library is the creation of Mike Bostock (and contributors). By coincidence (or not?), Mike had created a bar chart race using D3, and he also created a "pedagogical" version for others to learn from. (Thanks Mike!) So I studied what he had done, and as a learning exercise I wrote my own bar chart race; more on that later.

Mike's example shows the most valuable global brands, as valued by Interbrand, from 2000 to 2019. He supplied the prepared data he used.

Here's his data with my version of the bar chart race. Press the Start button below the chart to let 'er rip (i.e. to start it).

NOTE: At the moment the bar chart race works best on a laptop or larger display.

Noteworthy

  • Halfway through the animation, Apple and Google appear for the first time and shoot immediately to the top, never to surrender their positions. Surprise.
  • Five of the top six brands in 2019 were tech firms. Another surprise.

Exhibit B: Most Populous World Cities

Never satisfied with just one example, I kept looking. I found this example by John Burn-Murdoch, charting the most populous world cities from 1500 to 2018. Here is my bar chart race using his prepared data.

Noteworthy

  • In the mid-1500s Vijayanagar suddenly vanishes from 3rd place. John explains why.
  • In the early 1800s London shoots to the top and remains the undisputed leader for a century.

Exhibit C: Most Popular Baby Names

The two charts above were done using prepared data that was supplied with Mike's and John's demos. I decided to make a chart with data that I hadn't seen used in such a race before.

Not long before I discovered D3, I had been reading Python for Data Analysis, where I was introduced to useful tools for filtering and analyzing raw data, like JupyterLab and the pandas library. One of the examples in the book used data on baby names in the US from 1880 to 2010. This gave me an opportunity to do a little bit of very basic data wrangling, which I hadn't needed for the prior two charts. It involved: loading the 131 data files (one per year) into a Jupyter Notebook; combining them; sorting them by year and births to find the "top ten"; and exporting the result as a single CSV file.
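For readers who want to try the same steps, here is a minimal sketch. It is not the notebook described above: it swaps pandas for Python's standard library so it runs anywhere, and it assumes each yearly file is named like yob1880.txt and holds header-less rows of name, sex, births. The function names and paths are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path


def read_year(text, year):
    """Parse one year's file. Assumes header-less rows of `name,sex,births`."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        name, sex, births = record
        rows.append({"year": year, "name": name, "sex": sex, "births": int(births)})
    return rows


def top_n_per_year(rows, n=10):
    """Sort by year, then by births (descending), and keep the first n rows per year."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row)
    top = []
    for year in sorted(by_year):
        ranked = sorted(by_year[year], key=lambda r: r["births"], reverse=True)
        top.extend(ranked[:n])
    return top


def write_csv(rows, path):
    """Export the combined rows as a single CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["year", "name", "sex", "births"])
        writer.writeheader()
        writer.writerows(rows)


def build_top_ten(data_dir, out_path, first_year=1880, last_year=2010):
    """Load every yearly file, combine them, and export the per-year top ten."""
    rows = []
    for year in range(first_year, last_year + 1):
        # Assumed file naming; adjust to match the actual data set.
        text = Path(data_dir, f"yob{year}.txt").read_text()
        rows.extend(read_year(text, year))
    write_csv(top_n_per_year(rows), out_path)
```

With the yearly files in a (hypothetical) folder called names, calling build_top_ten("names", "top_ten.csv") would produce one CSV with ten rows per year. Boys' and girls' names are ranked together here, which is an assumption based on the chart mixing both in a single top ten.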

The chart is below. Relax and get comfortable, because the complete animation runs for over three minutes 😳, a veritable eternity in the Internet age. But it's worth it.

Noteworthy

  • Mary was the undisputed top girls' name, averaging twice the popularity of the second-ranked girls' name, until the mid-1940s.
  • Linda shot to the top after World War II, but then slowly sank into obscurity.
  • There were long periods when the top ten contained two or fewer girls' names. This doesn't mean there were fewer girls born, but possibly that there was more diversity in girls' names.
  • The decline in births shown in the chart going into the 21st century could have (at least) two explanations: lower birth rates, and/or more diversity in baby names for both genders.

My Version

As I wrote above, my goal was to code the thing. I studied Mike's pedagogical demo, primarily to learn about D3, and then I started from scratch.

I created a "reusable" chunk of code (a JavaScript class) to contain all the implementation details. (Just to be clear: The chunk of code I wrote relies very heavily on the very powerful D3 library. For example: That gracefully expanding X axis? D3 does it (almost) automatically. All I did was rewrite Mike's demo from scratch.) The same code is used for each of the three charts on this page. The only differences between the charts are the chart data and a few options, such as the delay between frames and the bar colors.

I also decided to make my race a bit more "sedate." Having numbers that change so fast is impressive, but not very readable. So I don't constantly update the bar values, I have longer inter-year delays, and for the cities chart I only show every 10th year.
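The "every 10th year" idea is easy to sketch on its own. The snippet below is only an illustration in Python, not the JavaScript behind the charts on this page; the function name is hypothetical, and keeping the final year even when it falls off the 10-year grid is an assumption.

```python
def keyframe_years(years, step=10):
    """Keep the first year, every `step`-th year after it, and always the last year."""
    ordered = sorted(set(years))
    if not ordered:
        return []
    first = ordered[0]
    kept = [y for y in ordered if (y - first) % step == 0]
    # Assumption: end the race on the most recent year, even if it is off the grid.
    if kept[-1] != ordered[-1]:
        kept.append(ordered[-1])
    return kept


# Yearly data from 1500 to 2018 is reduced to 1500, 1510, ..., 2010, 2018.
frames = keyframe_years(range(1500, 2019))
```

Fewer frames means each one can stay on screen longer without making the whole animation longer, which is the trade-off behind the more "sedate" pacing.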

Published: 3 Feb 2020