— In the News —
A very clear and fun read about how to get a group of computers to agree on the information they store. The article's analogies are excellent: between the imaginary islands, tyrants, and sheep, it will change how you think about your data.
AI is everywhere in the news these days and it’s advancing into the workplace at a dizzying speed. This article in the Harvard Business Review explores how companies are getting real value out of AI, where things are going, and some things to watch out for.
— Sponsored Link —
Python is the ideal language for data science, but getting set up with all the libraries you need can be time-consuming. ActivePython is pre-bundled with over 300 packages including NumPy, SciPy, scikit-learn, TensorFlow, Theano and Keras, and is integrated with the Intel Math Kernel Library (MKL) for optimized NumPy and SciPy computations. It’s free to use in development, so you can get started in minutes.
— Tools and Techniques —
This is a good introduction to "DVC," a tool for iterating on data science projects. The name stands for "data version control," and the tool is based on concepts that software engineering uses to facilitate ongoing development. DVC makes it easy to create versions of machine learning algorithms and to share the corresponding code, dependencies, and data in a single, reproducible environment.
How hard can it be to compute a conversion rate? Take the total number of users who converted and divide it by the total number of users, right? Not exactly...
A step-by-step tutorial that demonstrates how to approach common business questions. This is Part 1 of a multi-part tutorial by Shirin Glander. It starts with data exploration and continues with answering simple questions about transactions, customers, income, item purchases, etc.
— Deep Learning —
A great article about LSTMs, starting with the basics of how neural nets work. This is a long read, but it's easy to follow, with lots of diagrams, code snippets, and an interactive web app.
Here's how to build your own deep learning box, including considerations for purchasing components (e.g., GPUs, CPU, RAM, and motherboard), assembly, software installation, and system testing.
— Data Viz —
The graphic above is a collection of plots showing data for each U.S. state, arranged so that the relative geography of the states is preserved. For instance, plots for the West Coast states appear on the left edge of the graphic. As The New York Times keeps showing, this layout can be useful, and it is easy to produce for any given region with the R package "geofacet."
Here's a nice exploration of map projections with helpful interactives to play with.