— In the News —
There's a lot of data-related inspiration coming out of the White House lately. Last week, President Obama named Dr. DJ Patil as the first-ever U.S. Chief Data Scientist. Don't miss these:
In a paper published in Nature this week, Google researchers revealed AI techniques that can figure out how to play video games. This is a big deal because the software figures out the game rules on its own. Articles about the paper sprang up in the New York Times, Wired, The New Yorker, and Technology Review. There was a lot of enthusiasm being flung around, but this was the only article that also discussed Ms. Pac-Man, the showstopper.
It seems like it shouldn't be difficult to come up with a set of rules that determine how ingredients should be used together. But by treating large collections of recipes as networks, researchers have discovered some interesting anomalies.
— Tools and Techniques —
Waterfall plots are effectively bar charts designed to be read left to right, so that you can see how a series of intermediate steps leads to a final result. This is a good description of the technique and includes a link to sample code.
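To make the idea concrete, here is a minimal sketch (not the sample code from the linked article) of the arithmetic behind a waterfall plot: each bar floats between the running total before a step and the running total after it, and a final bar shows the total. The function name `waterfall_bars` and the figures are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def waterfall_bars(steps):
    """Return (label, bottom, top) for each step, plus a final "Total" bar.

    `steps` is a list of (label, delta) pairs read left to right.
    """
    if not steps:
        return []
    labels = [label for label, _ in steps]
    deltas = [delta for _, delta in steps]
    running = list(accumulate(deltas))   # running total after each step
    starts = [0] + running[:-1]          # running total before each step
    # Each bar spans from the lower to the higher of its start and end values.
    bars = [
        (label, min(start, end), max(start, end))
        for label, start, end in zip(labels, starts, running)
    ]
    total = running[-1]
    bars.append(("Total", min(0, total), max(0, total)))
    return bars


# Hypothetical figures, for illustration only.
steps = [("Revenue", 100), ("Costs", -40), ("Tax", -15), ("Other income", 5)]
for label, bottom, top in waterfall_bars(steps):
    print(f"{label:<14}{bottom:>6}{top:>6}")
```

The bottom and top values can be handed to any bar-chart routine that supports a per-bar baseline; the sketch stops short of drawing so that it stays independent of a particular plotting library.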
Base R plots may not be as slick as those made with ggplot, but they are fast and work on just about anything. This is a nice overview of what you can do with Base R plots and how to use them.
— Resources —
Looking for data? Sebastian Raschka maintains this curated collection of free and open-source datasets. Many are particularly well suited to data science research. He's interested in expanding the list so if you have a favorite dataset that's not listed, let him know.
Several people have written to let me know about their favorite data podcasts. It's true that there are some really good ones that are worth paying attention to. With the exception of Talking Machines, this post has everything on my list.
— Data Viz —
Beautiful set of data visualizations that compares the strength of a particular Texas hold'em hand against all other hands. There are some surprises here but the explanations are great and may lead you to scratch your head a bit.
Great talk by Jeffrey Heer about trends in data visualization. This is a worthwhile 10 minutes, and if you aren't already familiar with it, be sure to check out his interactive ACM article, A Tour through the Visualization Zoo.
Half art. Half data viz. There are some real beauties here. This is a great article by the folks at Accurat Studio that describes their design process. Highly recommended.