Saturday, August 1, 2009

Infographic in Python using Chaco

This week I ran across a blog post about this New York Times infographic, which explains one of the measures of the "business cycle" based on industrial production (the data originally comes from the OECD). [Update: I also saw the post over at Juice Analytics in which they implemented this in Excel.] Never one to pass up an opportunity to re-invent the wheel, I thought it would be a good exercise to implement this in Python using the excellent Chaco plotting toolkit, which comes bundled with some Python distributions. So, here's the beginning of that effort, after a few hours of digging around the docs and hacking together a GUI:

This is a good start. I've posted the code for this on GitHub.

To really flesh it out, you'd need to add in the Composite Leading Indicator data and make some of the elements update based on the selected range. It would also be cool to dynamically switch out the data for various countries, or view them concurrently. Any takers?

Information Density

I think what makes such a simple interface so compelling is that you are able to see the relationship between three pieces of data. Cross-plots are a great mechanism for visualizing relationships between two data sets, but they're made even more useful when you can highlight a range in a common index (e.g. time, in the case of time-series data, or depth, in the case of depth-indexed data in the geophysics arena). Even with a lot of information presented, the display is very clean, even sparse.
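To make the range-highlighting idea concrete, here is a minimal sketch in plain Python (it is not the Chaco code from the repository). It assumes the shared index is sorted and that both series are sampled at the same index values. The function name `select_range` and the sample numbers are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def select_range(index, x, y, lo, hi):
    """Return the (x, y) pairs whose index value lies in [lo, hi].

    `index` must be sorted ascending (e.g. time or depth samples),
    and `x` and `y` must have one value per index entry.
    """
    start = bisect_left(index, lo)   # first position with index >= lo
    stop = bisect_right(index, hi)   # one past the last position with index <= hi
    return list(zip(x[start:stop], y[start:stop]))


# Hypothetical sample data: a yearly index and two series to cross-plot.
years = [2004, 2005, 2006, 2007, 2008, 2009]
level = [99.5, 100.4, 101.2, 101.0, 99.1, 96.8]
change = [0.6, 0.9, 0.8, -0.2, -1.9, -2.3]

# The points a cross-plot would highlight for the selected years.
highlighted = select_range(years, level, change, 2007, 2008)
print(highlighted)
```

In a GUI, the `lo` and `hi` values would come from a range-selection tool on the time-series plot, and the returned pairs would be drawn in a highlight color on the cross-plot.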

State and State-Transition

This particular graphic also reveals the "state" of the business cycle by partitioning the graph into quadrants. This data set has a very straightforward state inherent in its construction, but one might imagine more sophisticated calculations of state decorating time-series data such as this. I'm beginning to investigate the application of this to stock price streams and some derived state that can be displayed in ways that can be "replayed" and analyzed. Whether real "information" can be teased out of the data remains to be seen, but I'll try to leverage the visual cortex to gain intuition about the data.
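As a rough sketch of what a quadrant-based state might look like in code, the following assumes the two axes are a level measured against a baseline and a period-over-period change. The baseline of 100, the four state names, the function names, and the sample numbers are all assumptions for illustration, not values taken from the graphic or from the posted code.

```python
def classify(level, change, baseline=100.0):
    """Map one (level, change) point to a quadrant label.

    The labels and the baseline are illustrative assumptions.
    """
    above = level >= baseline
    rising = change >= 0
    if above and rising:
        return "expansion"
    if above:
        return "downturn"
    if rising:
        return "recovery"
    return "slowdown"


def transitions(index, states):
    """Return (index_value, old_state, new_state) wherever the state changes."""
    return [
        (index[i], states[i - 1], states[i])
        for i in range(1, len(states))
        if states[i] != states[i - 1]
    ]


# Hypothetical sample data, one value per year.
years = [2004, 2005, 2006, 2007, 2008, 2009]
level = [99.5, 100.4, 101.2, 101.0, 99.1, 96.8]
change = [0.6, 0.9, 0.8, -0.2, -1.9, -2.3]

states = [classify(lv, ch) for lv, ch in zip(level, change)]
for when, old, new in transitions(years, states):
    print(when, old, "->", new)
```

A more sophisticated state calculation would only need to replace `classify`; the transition list is what a "replay" view would step through.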

Any comments/suggestions about the approach are welcome.

2 comments:

  1. Hi Travis. This is really nice. I've been wanting to try Chaco and this is a great jumping off point.

  2. Thanks, Chris. Let me know if you have any trouble running it on your system. I'm on OS X, but it should run everywhere. Please let me know if you have any feedback/suggestions.
