— In the News —
Bayesian statistics are rippling through everything from physics to cancer research, ecology to psychology. Enthusiasts say they are allowing scientists to solve problems that would have been considered impossible just 20 years ago. This article from the New York Times explains how.
Big Data is only as good as the data it contains. This article explores the dark side of Big Data. Think you know its blind spots? Read on to find out.
— Tools and Techniques —
A detailed case study of how to tackle a Kaggle competition. With sample code and data, this is a very worthwhile introduction to machine learning.
Version 1.0 of Beaker was just released. This is an open-source notebook-style development environment for working interactively with large and complex datasets. Beaker works with a variety of languages and is definitely worth exploring.
This new tool can compute stress-majorization-based layouts for graphs with several hundred thousand nodes. That's well beyond the limits of standard stress majorization layout algorithms. The technical details at the bottom of the project's page explain how that's done.
— Resources —
Interested in learning about machine learning? Check out this video series, presented by two Stanford professors. If 15 hours of video is a bit daunting, the text that goes along with this series is freely available and is a well-organized reference: "An Introduction to Statistical Learning."
— Data Viz —
This is an article about design, and specifically about the little things in design that can make or break a data visualization. It is very well done, with lots of examples.
Here's one more great resource for learning about machine learning. MLDemos is an open-source visualization tool for studying how these algorithms work. Fun to play with!
D3 has become the standard for building data visualizations on the web. Scott Murray's book "Interactive Data Visualization" offers a great introduction to D3 and has just been made available as a free online resource with interactive examples. If you have any interest in building data visualizations for the web, this online book is a must-read and well worth playing with.