Issue 173
— In the News —
When an AI finally kills someone, who will be responsible?
Legal scholars are furiously debating which laws should apply to AI crime. Could an AI system be held criminally liable for its actions? This is a super interesting debate.
The key to getting value from machine learning
The gap for many companies isn’t that machine learning doesn’t work, but that they struggle to actually put it to use. This article from the Harvard Business Review shows how getting value from machine learning is often less about cutting-edge models, and more about making deployment easier.
— Profiles —
A Statistical Search for Genomic Truths
Barbara Engelhardt develops machine learning models that search for the causes of disease in human genomes. The problems she faces are very different from those faced by data scientists working on business applications. In this interview, she explains why traditional machine learning techniques often fall short for genomic analysis and how researchers are overcoming the challenges.
— Sponsored Link —
Earn Your Data Science Degree Online. GRE Waivers Available.
Earn your Master's in Data Science online from Syracuse in as few as 18 months. GRE waivers available.
— Tools and Techniques —
Using Embeddings for Real-time Recommendations
Airbnb’s marketplace contains millions of listings that users explore through search results generated by a sophisticated machine learning model. Searches are personalized in real time and drive 99% of Airbnb's bookings. This post describes their "Listing Embedding" technique, which helps make the search results useful and is applicable to "any type of online marketplace on the Web."
Big, fast, easy data with KSQL
Stream processing enables organizations to access huge amounts of data in real time, but so far it has required expert software skills to use. Now, there's a new approach gaining ground. KSQL is the new streaming SQL engine for Apache Kafka, and it's dramatically lowering the bar.
Documents
Here's a gold mine of "documents" that demonstrate a variety of statistical concepts and programming. Many of these are interactive. Topics include Bayesian Basics, Mixed Models, Latent Variables, and tools for R. This is a great resource.
A 'Go-to-Market' Plan for Your Next Data Product
Have you ever developed a great solution that never got used? It may not be intuitive, but even internal tools need to be marketed. This is a great post that walks through 7 key steps for making sure your data products have the impact you're hoping for.
The Building Blocks of Interpretability
This new post on the Distill site might just blow your mind. It's an interactive tutorial that shows what each layer of a deep learning network "sees." It's long and amazing, so be prepared to spend some time with it. For a higher-level perspective, check out the New York Times article about the post, "Google Researchers Are Learning How Machines Learn."
— Data Viz —
Visualizing Outliers
Once you know why they're outliers, there are a variety of ways to handle data points that don't quite fit with the others. In his latest post, Nathan Yau explores ways to use visualization to provide meaning and context to data outliers.
— Career —
Women in Analytics Conference
This looks awesome. It's a fantastic lineup of speakers, it's free to attend, and it's hosted by Facebook, so it definitely has some force behind it. Apply if you qualify and, if not, do a colleague a favor and share the link.
— About —
Data Elixir is curated and maintained by @lonriesberg. For additional finds from around the web, follow Data Elixir on Twitter, Facebook, or Google Plus.
Sign up for free and join the thousands of data lovers who already start their week with us.
No spam, ever. We'll never share your email address and you can opt out at any time.