— Insight —
In this annual review, Matt Turck of FirstMark Capital offers an in-depth look at the rapidly evolving data ecosystem. It includes an updated map of the landscape, a link to a spreadsheet with hundreds of additional companies, and perspectives on privacy and regulation in 2019. This year, the review is split into two parts; the second part covers trends in infrastructure, analytics, and AI/ML. Both are very worthwhile long reads.
"Long live the revolution..." 🤣 / 😟
GDPR has been in effect for over a year now. It was an expensive hassle for a lot of businesses. Here's what we've gained.
— Sponsored Link —
How to choose the right ML approach for your business goals and how to determine the best data labeling technique for your use cases.
— Tools and Techniques —
This new paper surveys current methods for Monte Carlo gradient estimation in machine learning and across the statistical sciences. It covers a lot of ground.
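For readers new to the topic, here is a minimal sketch of two standard Monte Carlo gradient estimators, the score-function estimator and the pathwise (reparameterization) estimator. The toy objective E[x^2] with x ~ N(mu, 1), the function names, and the sample sizes are assumptions chosen for illustration; none of them come from the paper. The exact gradient with respect to mu is 2 * mu, so both estimates can be checked against it.

```python
import random


def score_function_grad(mu, n, rng):
    """Score-function estimate of d/dmu E[x^2] for x ~ N(mu, 1).

    Averages f(x) * d/dmu log p(x; mu), which here is x^2 * (x - mu).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)
        total += (x * x) * (x - mu)
    return total / n


def pathwise_grad(mu, n, rng):
    """Pathwise estimate of the same gradient.

    Writes x = mu + eps with eps ~ N(0, 1) and averages d/dmu f(x) = 2 * x.
    """
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        total += 2.0 * (mu + eps)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    mu = 1.5
    print("exact gradient:", 2 * mu)
    print("score-function estimate:", score_function_grad(mu, 200_000, rng))
    print("pathwise estimate:", pathwise_grad(mu, 200_000, rng))
```

On this toy problem both estimators are unbiased, but the pathwise one has much lower variance, so it gets close to the exact value with far fewer samples.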
This is a great visual introduction to NumPy that shows how data is represented for a broad range of common use cases. Nice reference.
This post on Lyft's Engineering blog walks through the machine learning system that enables Lyft's marketing at scale. It's fairly high-level, but it's a good read and includes worthwhile details along the way.
Paco Nathan's latest post covers AutoPandas, program synthesis, and model-driven data queries. Program synthesis could disrupt how software is written and, at the least, it promises to be a hot new buzzword that's worth being familiar with. Paco's posts are conversational in style, with useful links along the way.
Vettery specializes in tech roles and is completely free for job seekers. Interested? Submit your profile, and if accepted onto the platform, you can receive interview requests directly from top companies growing their data science teams.
— Data Viz —
Google's Data Visualization team recently released detailed guidelines for creating visualizations. The guidelines distill the team's top principles and considerations and were released as a public resource to help everyone create visualizations. This post by Manuel Lima describes the thinking behind the guidelines and offers six key principles for designing any chart. The complete guidelines are available here >>
kepler.gl is an advanced geospatial visualization tool that was open sourced by Uber's Visualization team in 2018. At Uber, kepler.gl is the de facto tool for geospatial data analysis, and it's now available for Jupyter as a widget! Here's what it offers and how to get started.