— In the News —
A Czech startup called Semantic Visions takes an analytical approach to news commentary and, not surprisingly, discovers differences between Russian and Western media sources. What's interesting here is the level of insight they're able to get by essentially aggregating large quantities of simple metrics.
The datasets that these sports analytics devices will create are going to be amazing.
— Sponsored Link —
The AI Conference will be in SF on Friday, June 2nd. Don't miss talks from big thinkers like David Brin, author of The Postman and Earth, and from practitioners at Amazon's Alexa, NVIDIA, Baidu, Slack, and more. This event will also host a showcase of interesting AI startups and a panel discussion on chatbots. Mention "DataElixir" and save $50.
— Tools and Techniques —
The complexity of some of the most accurate classifiers, like neural networks, is what makes them perform so well. But that also makes it challenging to explain their output, which, in some domains, is a big problem. In this post, Shirin Glander explores an approach called "LIME" that helps to make complex models at least partly understandable.
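To give a rough feel for the idea behind local explanations, here is a minimal sketch in Python with NumPy. It is not the code from the post and not the API of any LIME package. The `black_box` function, the `explain_locally` helper, and all parameter values are hypothetical, chosen only for illustration. The sketch perturbs one input, weights the perturbed samples by how close they are to it, and fits a weighted linear model whose slopes act as a local explanation.

```python
import numpy as np

def black_box(X):
    # Hypothetical stand-in for a complex model: nonlinear in both features.
    return np.sin(X[:, 0]) + X[:, 1] ** 2

def explain_locally(predict, x, n_samples=2000, scale=0.1,
                    kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear surrogate around the instance x."""
    rng = np.random.default_rng(seed)
    # Perturb the instance with small Gaussian noise.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = predict(Z)
    # Weight each perturbed sample by its proximity to x.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Weighted least squares: scale rows by sqrt(w), then solve ordinary
    # least squares on [1, z - x].
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y[:, None] * sw, rcond=None)
    return coef.ravel()  # [intercept, slope for feature 0, slope for feature 1]

x0 = np.array([0.0, 1.0])
intercept, w0, w1 = explain_locally(black_box, x0)
print(f"local slopes near x0: feature 0 = {w0:.2f}, feature 1 = {w1:.2f}")
```

Near this particular point the slopes should land close to the true local gradient of the toy function, roughly 1 for the first feature and 2 for the second. Real LIME implementations do more than this, for example working with interpretable feature representations, so treat this only as an illustration of the core idea.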
David Robinson's latest post uses the tidytext package to explore a natural language dataset of story plots. This is an easy-to-follow tutorial that shows how to quickly gain insights from a large dataset of text.
The NBA is well-known for its awesome datasets. This is a fantastic tutorial that explores data from the NBA's Last Two Minute Report. It starts with basic data access and quickly moves on to develop a variety of models. Includes lots of code snippets and data visualization along the way. Highly recommended.
— Resources —
If you're actively engaged in an AI startup or just interested, definitely don't miss this article. This is a well-organized and thorough overview of accelerators and early-stage funding options. If you're interested in funding options for established startups, check out Part 1.
— Data Viz —
Matplotlib has a reputation for being difficult to work with, but it's super flexible and there's a big ecosystem of Python tools built around it. This post starts with key tips and shows how to easily get started with matplotlib.
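As a small taste of what getting started can look like, here is a minimal sketch that uses matplotlib's object-oriented interface (`plt.subplots` returning a figure and an axes). The data, labels, and in-memory output buffer are made up for illustration and are not taken from the post.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Illustrative data only.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Create a figure and a single axes, then draw and label on the axes object.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, marker="o", label="y = x squared")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A first matplotlib plot")
ax.legend()
fig.tight_layout()

# Render to an in-memory PNG; pass a filename instead to write a file.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150)
```

Working through the `ax` object rather than the global `plt` state tends to make it easier to customize individual plots later, especially once a figure holds more than one.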
t-SNE is great at capturing a combination of the local and global structure of a dataset in 2D or 3D. But when plotting points in 2D, there are often interesting patterns in the data that only come out as "texture" in the point cloud. When the plot is colored appropriately, these patterns become clearer.