— In the News —
This is an amazing project. The Venice Time Machine Project is digitizing over 1000 years of historical maps and documents. These are hand-drawn documents, and there are so many of them that just storing them requires over 80 km of shelving. Machine learning is being used to make sense of the handwriting, map social networks, track the city's infrastructure development, and more. Great read.
— Sponsored Link —
Mode is a SQL editor, Python notebook, and visualization builder all rolled into one. Explore data with SQL and pass results instantly into a Python notebook for further exploration and visualization. Pick and choose output cells to present to others, or send the whole notebook—you can even share with people who don't have a Python environment set up.
— Tools and Techniques —
David Robinson from Stack Overflow recently published an article that went viral: Developers Who Use Spaces Make More Money Than Those Who Use Tabs. In this post, Evelina Gabasova analyzes what's really going on with the Tabs vs Spaces debate. It turns out that people who use spaces instead of tabs really do make more money. Here's why.
Nice overview of the data infrastructure ecosystem, which can be daunting for newcomers. Includes guidelines for small, medium, and big data, and for when it makes sense to level up. This was written by Nate Kupp, an infrastructure pro at Thumbtack. This looks super useful.
This post by Kendrick Tan shows how to build a visual search engine for Instagram. It's a fun post with lots of code snippets along the way.
— Deep Learning —
Google recently released a paper that introduces MultiModel, a single model that does better when trained on a variety of tasks. It's not a master algorithm that can learn everything at once, but by being trained broadly it performs better on specific tasks, and with less data.
Nice collection of resources for building a foundation in deep learning.
— Data Viz —
This collection of basic plots is a great resource to add to your bookmarks. These aren't fancy from a UI perspective, but they're functional. The key thing is that each plot includes sample code from multiple libraries such as Seaborn, Matplotlib, plotnine, and ggplot2.
Several datashader projects have been making their way around the web recently. This post by Jeremy Stanley uses datashader to visualize grocery deliveries at Instacart. The technology here is impressive: datashader makes it possible to interact with millions or even billions of points. If you're interested after reading the article, definitely follow the links.
— Career —
If you wish to begin a career in data science, you can save yourself days, weeks, or even months of frustration by avoiding these 9 costly beginner mistakes.