— In the News —
@realDonaldTrump went head-to-head with the media this week in a dispute over the size of his inauguration crowd. You wouldn't know it from the rhetoric, but counting crowds turns out to be complicated.
Data engineering is a rapidly growing discipline that overlaps both data science and software engineering. This is a great article about what data engineering is exactly and what it's evolving into.
Here are six areas of AI that are particularly noteworthy in their ability to impact the future of digital products and services. This article describes what they are, why they are important, and how they are being used today.
— Tools and Techniques —
This is a great post that chronicles the evolution of a recommendation engine from a minimum viable product to a large-scale, production-ready solution.
The JOIN operation is one of SQL's most powerful features. It is the envy of non-relational databases because it lets you easily combine two data sets on a shared key. This deep dive describes a variety of JOIN techniques with example uses and sample code.
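For a quick feel of what combining two data sets with a JOIN looks like, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their contents are made up for illustration and are not taken from the linked article.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor'), (12, 2, 'mouse');
""")

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name, o.item
""").fetchall()

# LEFT JOIN keeps every customer; missing order columns come back as NULL (None).
left_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name, o.item
""").fetchall()

print(inner_rows)
print(left_rows)
conn.close()
```

The only difference between the two queries is the join type: the LEFT JOIN result carries one extra row for the customer with no orders.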
Build a recommendation system for exploring food and the world's cuisines with food2vec. Includes algorithm descriptions, live demos, and a GitHub repo.
NumPy is a key Python package that provides support for large, multi-dimensional arrays and matrices. This tutorial shows how to get started with NumPy and explores some of its most important features.
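As a small taste of what multi-dimensional array support means in practice, the sketch below builds a 3x4 array and applies a few core operations. The array and variable names are illustrative only and do not come from the linked tutorial.

```python
import numpy as np

# A 3x4 array holding the integers 0..11.
grid = np.arange(12).reshape(3, 4)

# Reduce along axis 0 to get one mean per column.
col_means = grid.mean(axis=0)

# Broadcasting: the 1-D column means are subtracted from every row.
centered = grid - col_means

# Boolean masking: select only the even entries, returned as a flat array.
evens = grid[grid % 2 == 0]

print(grid.shape)
print(col_means)
print(centered)
print(evens)
```

None of these steps needs an explicit Python loop; the operations apply to whole arrays at once, which is the main reason NumPy code tends to be both shorter and faster than the pure-Python equivalent.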
— Data Viz —
Fantastic study by Nathan Yau. "You must help the data focus and get to the point. Otherwise, it just ends up rambling..."
Here's the latest in curated guides for visualization tools and resources. There are several of these collections out there and many are very good. What's new here is the inclusion of commercial products, in addition to open source tools, which gives it a bigger universe to select from.
— Career —
This collection of interview questions is well-organized and includes background information and linked references.