— In the News —
Just about every AI advance you’ve heard of depends on a breakthrough that’s three decades old. Keeping up the pace of progress will require confronting AI’s serious limitations.
In this final part of a series, FiveThirtyEight explores how the media’s demand for certainty, combined with its lack of statistical rigor, is a bad match for our complex world.
Using code and the web, a data scientist follows two unnamed people and learns just how much our anonymous location data can say about who we are.
— Sponsored Link —
Springboard ensures you will get a data science job with personalized mentorship, career coaching, a curated curriculum and portfolio-worthy projects. Get a data science job within six months of graduating, or get your money back. Monthly payments as low as $27/month. Graduates have gotten senior data science roles at Nielsen, Verizon, and more.
— Tools and Techniques —
Great overview of how R is used at Airbnb. Includes insights regarding their daily workflow, predictive modeling, experimentation, scaling, data visualization, the tools they use, and practical insights for incorporating R into your own organization.
xgboostExplainer makes your XGBoost model as transparent and 'white-box' as a single decision tree. Here's a very clear description of why you might want to use xgboostExplainer and how it works.
If you've ever taken a taxi in New York City, you may have wondered if you would have been better off riding a bike. That turns out to be a smart question. This is a great post by Todd Schneider that shows how to think through the issues with data.
K-Means is a simple algorithm that partitions data into K clusters. It's an unsupervised learning method, so it's often used when data isn't labeled. Here's how it works, how to implement it from scratch, common issues, and links to useful resources.
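For a rough sense of what "from scratch" can look like, here's a minimal sketch of the standard K-Means loop (Lloyd's algorithm) in plain Python. It isn't taken from the linked post. The function name `kmeans`, its parameters, and the sample points are all made up for illustration, and the random initialization is just one simple choice among several.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster equal-length numeric tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points picked at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the loop has converged.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Keep the old centroid if a cluster ends up empty.
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

The result depends on the starting centroids, which is one of the common issues with K-Means. In practice, people often run it several times with different seeds or use a smarter initialization.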
Here are some practical considerations for maintaining your machine learning models in a production environment. This is far from complete but is a good entry point for thinking through some of the issues.
— Resources —
This machine learning reference from the Google Developers Blog is definitely worth bookmarking.
— In Case You Missed It —
Be sure to catch the most popular links from last week's issue...
— About —
Data Elixir is curated and maintained by @lonriesberg. If some awesome person forwarded this issue to you, subscribe for free at dataelixir.com and get it delivered every week.