— In the News —
The White House put data science on a pedestal this week. Not only did it name Dr. DJ Patil as the first U.S. Chief Data Scientist, but Patil was also introduced at the Strata Conference by President Obama. It’s not entirely clear what this new position will entail, but this briefing from the White House is a good rundown of what's known so far.
Few people have been more closely associated with deep learning than Yann LeCun. LeCun has been instrumental in developing the convolutional neural network technique that promises big improvements in things like computer vision, speech recognition, and natural language processing. Since 2013, LeCun has been working as the head of Facebook's Artificial Intelligence Research Lab. This is a great interview by Lee Gomes that covers a broad range of topics in AI and why it's important.
Using DNA to store arbitrary data has been under investigation for the past few years. Potentially, this approach could lead to data storage that persists for a million years or more. At the least, this is a fascinating data archival strategy. Here's a quick recap of recent progress and remaining issues.
— Sponsored Link —
MLconf gathers communities to discuss the recent research and application of Algorithms, Tools, and Platforms to solve the hard problems that exist within organizing and analyzing massive and noisy data sets. Join us for one of four events in 2015:
Mention "DataElixir" when registering and save 15%!
— Tools and Techniques —
Fantastic overview of Go and why Go AI is hard. This 25-minute video is very well done. It starts with a short history of the game and quickly gets into the rules and strategies before diving into AI challenges and approaches. Highly recommended.
Netflix recently open-sourced its outlier detection function for big data, called Robust Anomaly Detection (RAD). This post from the Netflix engineering blog describes how it works, where to get it, and why you should care.
— Resources —
Sebastian Raschka’s mlxtend library has a number of tools to extend Python's data analysis and machine learning libraries. This is definitely worth checking out.
— Data Viz —
Bokeh is a visualization library for creating interactive, web-based plots. Bokeh has interfaces for Python, Scala, Julia, and, as of this week, R. This is a great tutorial for working with Bokeh’s new R interface.
Nice language-agnostic tutorial for creating bivariate choropleth maps. Not sure what that means? That's where the tutorial begins. This is very well done.