ISSUE 320 · January 26, 2021

Insight

"Very few techies are trying to do the big hard things."
Many of the large tech companies have started "AI for Good" programs and claim they want to help solve big problems like climate change, sustainability, and social issues. Are they for real? And if so, what challenges do they face besides who's actually going to do the work?

Seven Legal Questions for Data Scientists
From privacy and algorithmic bias to security and third-party dependencies, there are a lot of ways that data can get you in trouble. Thinking through the questions discussed in this article will help ensure that you and your organization stay aligned with the law.

Sponsored Link

What is a Feature Store?
Feature Stores manage the lifecycle of features that power ML applications. They enable data scientists to build features quickly and reliably, and provision them to production instantly. This post explains what a feature store is and describes its main components.

Tutorials, Projects & Opinions

Unsupervised Data Monitoring
In this first post in a new series, Jeremy Stanley shows how relying on rules to monitor data quality doesn't scale and why you should use unsupervised learning instead. The examples here are great and, if you're interested in data quality, this will definitely be a series to watch.

Building a team of internal R packages
If your organization has started building its own packages, this post by Emily Riederer offers a great approach for designing and tying together an ecosystem of internal tools. Using the jobs-to-be-done framework as a guide, she explores strategies for API design, docs, testing, and more.

ML Theory with bad drawings
Nice introduction to machine learning concepts with lots of helpful diagrams along the way.

Recommending Articles With Help From Our Friends
It seems like a simple problem. Let readers select interests from a list of categories and then send them personalized content. But at scale, that quickly gets complicated.
Here's how the New York Times took a different approach using machine learning, and what it learned.

AMAX: Experience the Fastest Solutions for AI
Take a test drive on AMAX’s GPU-accelerated server solutions with the latest NVIDIA Ampere GPU architecture to kick-start your most demanding analytics, inference, or training workloads, with access to AI expertise for seamless testing integration. Experience the most powerful end-to-end AI & HPC data center platform now!

Career

Building a data science startup
Interested in turning a data science side project into a small startup? Listen to this discussion with four current founders to learn about some ripe opportunities and the key skills you need.

Wanted: Data Scientists with Technical Brilliance AND Business Sense
Technical skills are important, but when it comes to making big impacts, business intuition is key. In this post, Alok Gupta describes how entrepreneurial data scientists at DoorDash approach problems differently and get more done, faster.

Data Viz

Density Plots
Scatterplots and line charts often encounter performance and overplotting issues as the amount of data increases. Density plots are an alternative that can work better for larger volumes of data. Here's how they work, including a library and interactive examples to play with.

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.

Sign up to get Data Elixir's data science newsletter in your Inbox >>