— In the News —
Legal scholars are furiously debating which laws should apply to AI crime. Could an AI system be held criminally liable for its actions? This is a super interesting debate.
The gap for many companies isn’t that machine learning doesn’t work, but that they struggle to actually put it to use. This article from the Harvard Business Review shows how getting value from machine learning is often less about cutting-edge models and more about making deployment easier.
— Profiles —
Barbara Engelhardt develops machine learning models that search for the causes of disease in human genomes. The problems she faces are very different from those faced by data scientists working on business applications. In this interview, she explains why traditional machine learning techniques often fall short for genomic analysis and how researchers are overcoming the challenges.
— Tools and Techniques —
Airbnb’s marketplace contains millions of listings that users explore through search results generated by a sophisticated machine learning model. Searches are personalized in real time and drive 99% of Airbnb's bookings. This post describes their "Listing Embedding" technique, which helps make the search results useful and is applicable to "any type of online marketplace on the Web."
Stream processing enables organizations to access huge amounts of data in real time, but so far it has required expert software skills to use. Now a new approach is gaining ground: KSQL, the new streaming SQL engine for Apache Kafka, is dramatically lowering the bar.
Here's a gold mine of "documents" that demonstrate a variety of statistical concepts and programming techniques. Many of these are interactive. Topics include Bayesian Basics, Mixed Models, Latent Variables, and tools for R. This is a great resource.
Have you ever developed a great solution that never got used? It may not be intuitive, but even internal tools need to be marketed. This post walks through seven key steps for making sure your data products have the impact you're hoping for.
This new post on the Distill site might just blow your mind. It's an interactive tutorial that shows what each layer of a deep learning network "sees." It's long and amazing, so be prepared to spend some time with it. For a higher-level perspective, check out the New York Times article about the post, "Google Researchers Are Learning How Machines Learn."
— Data Viz —
Once you know why certain data points are outliers, there are a variety of ways to handle them. In his latest post, Nathan Yau explores ways to use visualization to provide meaning and context to data outliers.
— Career —
This looks awesome. It has a fantastic lineup of speakers, it's free to attend, and it's hosted by Facebook, so it definitely has some force behind it. Apply if you qualify; if not, do a colleague a favor and share the link.