— In the News —
Naftali Tishby is a researcher with an important new idea about how deep learning might work. He proposes that the most important part of learning is actually forgetting, and his ideas about "information bottlenecks" are getting a lot of attention around the web this week.
— Profiles —
Hilary Mason speaks to audiences around the world about data, machine learning, AI, and how to build real, functional, and robust products. She's the Founder of Fast Forward Labs and the Data Scientist in Residence at Accel. In this interview, Hilary discusses a variety of topics, including careers, data products, and black-box deep learning algorithms.
— Tools and Techniques —
This is a great idea! This post uses text mining techniques to explore the color themes across a "corpus" of LEGO sets. It's a fun read with great visualizations along the way.
Here's Part 2 of Tom Augspurger's new series on Scalable Machine Learning. In this part, Tom shows how to fit a model on a dataset that doesn't fit in RAM using dask and scikit-learn in a pipeline.
There's a gap between the idealized data science projects that students are exposed to and what actually happens in the real world. This article explores ten of the most important surprises.
After years of being left for dead, SQL today is making a comeback. How come? And what effect will this have on the data community?
Franchise is an open-source SQL tool with a notebook interface. It supports CSV, JSON, and XLSX files and offers a variety of ways to explore your data.
— Data Viz —
We use maps to understand physical space, but increasingly our computers and devices make understanding that space less important. These days, we're often more interested in the time that separates places than in the geography. What would a map of time look like?
Nice tutorial for people starting out with ggplot2. Includes useful descriptions, code snippets, and lots of screenshots.
— In Case You Missed It —
Be sure to catch the most popular links from last week's issue...
— About —
Data Elixir is curated and maintained by @lonriesberg. If some awesome person forwarded this issue to you, subscribe for free at dataelixir.com and get it delivered every week.