Issue 129
— In the News —
Big Data Exposes Big Falsehoods
A Czech startup called Semantic Visions takes an analytical approach to news commentary and, not surprisingly, discovers differences between Russian and Western media sources. What's interesting here is the level of insight they're able to get by essentially aggregating large quantities of simple metrics.
Bringing IoT to sports analytics
The datasets that these sports analytics devices will create are going to be amazing.
— Tools and Techniques —
Explaining complex machine learning models with LIME
The complexity of some of the most accurate classifiers, like neural networks, is what makes them perform so well. But that also makes it challenging to explain their output, which, in some domains, is a big problem. In this post, Shirin Glander explores an approach called "LIME" that helps to make complex models at least partly understandable.
Examining the arc of 100,000 stories: a tidy analysis
David Robinson's latest post uses the tidytext package to explore a natural language dataset of story plots. This is an easy-to-follow tutorial that shows how to quickly gain insights from a large dataset of text.
NBA Foul Calls and Bayesian Item Response Theory
The NBA is well-known for its awesome datasets. This is a fantastic tutorial that explores data from the NBA's Last Two Minute Report. It starts with basic data access and quickly moves on to develop a variety of models. Includes lots of code snippets and data visualization along the way. Highly recommended.
— Resources —
Unsupervised Investments (II): A Guide to AI Accelerators and Incubators
If you're actively engaged in an AI startup or just interested, definitely don't miss this article. This is a well-organized and thorough overview of accelerators and early-stage funding options. If you're interested in funding options for established startups, check out Part 1.
— Data Viz —
Effectively Using Matplotlib
Matplotlib has a reputation for being difficult to work with but it's super flexible and there's a big ecosystem of Python tools built around it. This post starts with key tips and shows how to easily get started with matplotlib.
Coloring-t-SNE
t-SNE is great at capturing a combination of the local and global structure of a dataset in 2d or 3d. But when plotting points in 2d, there are often interesting patterns in the data that only come out as "texture" in the point cloud. When the plot is colored appropriately, these patterns can be made more clear.