— In the News —
A huge game changer: a machine that plays chess by evaluating the board rather than using brute force to work out every possible move. This is a great overview of how it was accomplished.
Companies brag about the size of their datasets the way fishermen brag about the size of their fish. The bigger, the better, right? Not necessarily...
— Tools and Techniques —
Fantastic slide deck by Jake VanderPlas. This deck replaces some of the theory and jargon of statistics with intuitive computational approaches. There are some fundamentals you need to know, but the overall theme is that if you can write a for-loop, you can do statistical analysis.
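To make the for-loop idea concrete, here is a minimal sketch of a simulation-based test. The scenario (22 heads in 30 tosses of a supposedly fair coin) and the function name `simulate_p_value` are illustrative assumptions, not taken from the deck itself. The loop simulates many fair-coin experiments and counts how often the result is at least as extreme as the observed one.

```python
import random


def simulate_p_value(n_tosses=30, observed_heads=22, n_trials=10_000, seed=0):
    """Estimate P(heads >= observed_heads) for a fair coin by simulation.

    All parameter defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)  # seeded so the result is reproducible
    extreme = 0
    for _ in range(n_trials):
        # Simulate one experiment: toss a fair coin n_tosses times.
        heads = 0
        for _ in range(n_tosses):
            if rng.random() < 0.5:
                heads += 1
        # Count experiments at least as extreme as what was observed.
        if heads >= observed_heads:
            extreme += 1
    return extreme / n_trials


if __name__ == "__main__":
    print(simulate_p_value())
```

The returned fraction estimates the p-value; for these inputs it should land near the exact binomial tail probability of roughly 0.008, so a fair coin would rarely produce a result this lopsided.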
Python/Pandas is a powerful combination, but getting non-techies to run your scripts from a command line can be a challenge. Here's an easy way to add native-looking GUIs.
Blaze is a Python library that provides a common interface to data stored in different storage systems, and Impala is a SQL engine for Hadoop. This is a great tutorial on exploring a large dataset with these technologies.
Nice introduction to building a simple AI that will make reasonable decisions for a variety of board games. This is well-written and includes lots of code snippets.
If you use a Mac, you'll definitely want to check this out. Pineapple is a standalone front-end for IPython that's completely self-contained, with native controls and an integrated viewer.
— Resources —
This is an updated version of a popular cheatsheet that was first released a few months ago. The cheatsheet is a 10-page PDF that summarizes important probability concepts, formulas, and distributions, and includes examples, stories, and solved problems. This version has lots of additional diagrams and has been reformatted for clarity.
A curated list of data engineering tools for software developers.
— Conferences —
Extract gathers 600 of the biggest and baddest minds in the dataverse to share cool stories about how they have used data to evolve products, grow teams and build global companies. Extract has incredible keynotes and data workshops from the likes of Reddit, Lyft, Tableau, Baidu, CartoDB, Import.io, Kaggle and more.
Data Elixir readers get a special 25% discount on tickets when they enter this promo code: Extract-Elixir
See you October 30 in San Francisco!
— Inspiration —
Pascal’s triangle, which may just look like a neatly arranged stack of numbers, is actually a mathematical treasure trove. In this 5-minute TED-Ed Lesson, Wajdi Mohamed Ratemi shows why. There are some great tricks here, and the Dig Deeper section is worth checking out too.
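For readers who want to play with the triangle themselves, here is a small sketch that builds its rows. The function name `pascal_rows` is a hypothetical helper, and the property checked at the end (each row sums to a power of two) is one well-known pattern, not necessarily the one the lesson emphasizes.

```python
def pascal_rows(n):
    """Return the first n rows of Pascal's triangle as lists of ints."""
    rows = []
    row = [1]
    for _ in range(n):
        rows.append(row)
        # Each interior entry is the sum of the two entries above it;
        # every row starts and ends with 1.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return rows


if __name__ == "__main__":
    for i, r in enumerate(pascal_rows(6)):
        # Row i (counting from 0) sums to 2**i.
        print(r, sum(r) == 2 ** i)
```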
— About —
Data Elixir is curated and maintained by @lonriesberg. If you find this newsletter worthwhile, please help spread the word! Forward to your colleagues or use the links below to share to your favorite network: