Data Elixir

ISSUE 426 · February 28, 2023

 

Resources

Awesome Polars

Nice selection of Polars posts, tutorials, talks, tools, a cheatsheet, and examples from around the web. This is an evolving collection and contributions are welcome.
GitHub | Damien Dotta

 

Introduction to Data-Centric AI

Data-Centric AI is an emerging discipline that studies techniques to improve datasets for ML applications. The link goes to a new online MIT course that introduces the field and explores algorithms to find and fix common issues in ML data. The course is intended to be highly practical and is focused on impactful aspects of real-world ML applications. Free.
MIT CSAIL

 

Sponsored Link

Webinar: Small Team, Big Impact

Join us for a panel webinar on unlocking the power of data for startups, with insights on growth from early-stage leaders. Register Now.

 
 
 


 

Tutorials, Projects & Opinions

Setting up a new machine for data science

New machine? This guide walks through a typical data science setup and how to get everything installed correctly. It's written with macOS Ventura in mind, but most of it is OS-agnostic. Covers R, Julia, Python, related IDEs, terminal settings, command line tools, shortcuts, Git, Docker, Postgres, and more.
Rami Krispin

 

Content Moderation - Patterns in Industry

Great post that explores techniques used in industry to learn and infer the quality of human-generated content such as product reviews, social media posts, and ads. It draws on industry papers and tech blogs from a wide variety of businesses that rely on content moderation. The post covers a lot of ground but is easy to follow and makes a nice survey of content moderation techniques in the real world.
Eugene Yan

 

pandas 2.0 and the Arrow revolution

pandas 2.0 will be released soon. This first post in a series describes one of the most important changes to look forward to.
Marc Garcia 

 
 

If you have 3+ years of data science experience, join the Data Elixir Talent Collective, where top companies apply to you. For details, check out the Collective.

 

Data Visualization

Data Visualization Fundamentals and Best Practices

Learn the fundamentals of data visualization in this online course by Robert Kosara. This course will walk through basic chart types and show how to decide which to use; how to use aggregations, such as binning and smoothing; the difference between using charts for exploration and analysis vs. presentations; and more. Starts on March 7.
Observable | Robert Kosara

 

Tomorrow's weather

This step-by-step tutorial shows how to create orthographic weather maps using model data from the Global Forecast System (GFS) and ggplot2. There's a lot of detail here, making it easy to follow and modify for your own specific interests.
Dr. Dominic Royé

 
 


 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions, send a note!