— In the News —
OpenAI is said to have trained an unsupervised language model that can read and write at a level that's never been seen before. It's called GPT-2, and they say it's so good that they're afraid to release it. This article in The Verge explores the claims and the presumed dangers, and includes samples of GPT-2's capabilities. Follow the links for more info, code, and related articles.
It seems that everyone wants to use AI in their business these days, but many, and maybe most, will fail. This article from the MIT Technology Review shows what many businesses get wrong and why deploying AI is often slower and more expensive than expected.
— Sponsored Link —
In Paco Nathan's latest column, he explores the role of curiosity in data science work as well as Rev 2, an upcoming summit for data science leaders. This installment unpacks curiosity as a core attribute of effective data science, looks at how that informs the data science process (in contrast to Agile, etc.), and digs into where science meets rhetoric in data science. Overall, these topics are among the themes you can expect at the next Rev!
— Tools and Techniques —
Lex Fridman's second lecture from his Deep Learning course at MIT is a great overview of the cutting edge in deep learning. It introduces a wide variety of applications in natural language processing, AutoML, use of synthetic data, image synthesis, semantic segmentation, etc. It includes a clickable outline that links to each section of the lecture.
SageMaker is a fully managed platform for building, training, and deploying machine learning models on AWS. It includes an API for working with a variety of common libraries, or you can connect a Docker image to its low-level API and do whatever you want inside that image. There are many reasons you might want to do that, and this article shows you how.
This open-source project from Microsoft is a collection of Jupyter notebooks with examples and best practices for building recommendation systems.
G. Elliott Morris is a data journalist for The Economist with a particular interest in politics. This tutorial shows how to import, wrangle, model, and visualize election results using his politicaldata package in R. If you're interested in more like this, check out his weekly newsletter, The Crosstab.
Realize the promise of data analytics and find the opportunity in the numbers. The Master of Management Analytics from Smith School of Business is essential training to unleash the potential of data and generate competitive advantage.
— Resources —
These summaries of key papers in natural language processing include links to the full papers, core ideas, key achievements, potential applications, and thoughts from the community. This is a nice overview of this rapidly evolving space.
I missed this when it came out last year, but the picks here are timeless.
— Career —
This post by Vicki Boykis has gotten a lot of attention around the web this week. It's a must-read about the realities and the hype of data science as a career path, especially for newcomers. If you're junior or just thinking about getting into the field, here's why you should not go into data science and what you should do instead.