— In the News —
Insightful interview with DJ Patil. Along with describing what he'll do as the U.S. Chief Data Scientist, he reflects on the key things he looks for when hiring for his team. Perhaps not surprisingly, technical skills like Python aren't on his shortlist.
According to this article in Wired, businesses are struggling to realize the promise of big data. If that's true, can data scientists still justify their salaries? This is an interesting read that concludes with an answer that's sure to fuel the debate.
— Tools and Techniques —
If this article is any indication, Sebastian Raschka's upcoming book about machine learning is going to be awesome. This is an easy-to-follow introduction with lots of diagrams and code snippets. Both this article and his blog are highly recommended.
If you're interested in learning about how to approach image classification problems, don't miss this tutorial! This is a great breakdown of the National Data Science Bowl's winning solution.
Streamgraphs can be used to make compelling visualizations, especially when displaying large datasets. Here's a tutorial and R package for making streamgraphs via an htmlwidget.
Python and R are both popular for data exploration, but which should you use and why? This article is by no means comprehensive, but it's a well-reasoned contribution to the Python versus R discussion.
— Resources —
Julia is a relatively new language that could become the go-to choice for scientific computing, machine learning, data mining, large-scale linear algebra, and distributed and parallel computing. This is a great overview of what there is to like about Julia, useful packages, what's missing, tutorials, videos, resources, and examples.
— Inspiration —
Very effective approach for making static data useful and engaging. Click through the narrative and prepare to be amazed.
Fantastic idea. Dear Data is a year-long drawing project by visualization masters Giorgia Lupi and Stefanie Posavec. Every week, they exchange postcards and get to know each other through the data they draw. Highly recommended.