
ISSUE 413  ·   November 15, 2022

 

Insights

Data’s day of reckoning

Data is sometimes called the "new oil," but for most businesses that analogy doesn't hold. In the most successful data organizations, data might burn bright; in most businesses, as Benn Stancil puts it, data burns more like peat moss. He lays out some reasons why, along with some ideas for all the median businesses.
 Benn Stancil

 

Sponsored Link

Web scraping datasets made easy - ScrapFly.io


The web is full of quality data, though scraping it can be difficult. The ScrapFly API can retrieve any web page or simplify the web scraping process through cloud web browsers: click buttons, fill in forms, and retrieve the data. ScrapFly comes with a Python SDK that makes scraping in notebooks a breeze. Try ScrapFly for free!

 

Reach Data Elixir readers by sponsoring an issue. Click here for details.

 

Tutorials, Projects & Opinions

How Federated Learning Protects Privacy

Most machine learning models are trained by collecting vast amounts of data on a central server. This is a great visual explainer that shows how federated learning makes it possible to train models without any user's raw data leaving their device.
PAIR Explorables | Nicole Mitchell and Adam Pearce
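
As a rough illustration of the idea in this explainer, here is a minimal sketch of federated averaging in plain Python. It is not taken from the linked article: the one-parameter model (y = w * x), the function names, the learning rate, and the toy client datasets are all assumptions chosen to keep the example small. Each simulated device trains on its own data and reports back only an updated weight and a sample count, which the server then averages.

```python
from typing import List, Tuple

# One client's private dataset: (x, y) pairs that never leave the "device".
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient steps on one client's data for the model y = w * x."""
    for _ in range(epochs):
        # Gradient of mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(updates: List[Tuple[float, int]]) -> float:
    """Combine client weights, weighting each by its number of samples."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(clients: List[Dataset], rounds: int = 50) -> float:
    """Server loop: send the global weight out, average what comes back."""
    w = 0.0
    for _ in range(rounds):
        # Only (weight, sample count) pairs are shared with the server.
        updates = [(local_update(w, data), len(data)) for data in clients]
        w = federated_average(updates)
    return w


# Hypothetical toy data: three devices, all drawn from y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
global_w = train(clients)
```

In this sketch the server only ever sees weights and sample counts, never the (x, y) pairs themselves. Real deployments typically add further protections on top of this basic loop.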

 
 

Forecasting with Structural AR Timeseries

The strength of a Bayesian model is largely the flexibility it offers for different modeling tasks. In this tutorial, Nathaniel Forde shows how to fit a range of auto-regressive structural timeseries models and use them to predict future observations.
PyMC | Nathaniel Forde

 
 

Method Chaining in Pandas: Bad Form Or a Recipe For Success?

Matt Harrison has written books on pandas and Python and regularly trains data science teams at top companies. And yet, his code is sometimes met with derision online. In this interview, he explores his approach to code, how to think about method chaining, and what separates naive code from good code.
David Amos

 

Using Functional Analysis to Model Air Pollution Data

Functional analysis is one approach to understanding how your data changes within a given timeframe, such as a day, or between timeframes, such as across many days. This easy-to-follow tutorial shows how to apply functional analysis to some messy air pollution data using R.
Nicola Rennie

 
 

How I learn machine learning

In a rapidly evolving field like machine learning, you need to figure out what works for you to navigate the never-ending task of staying up to date. In her latest post, Vicki Boykis shares her own process, including lots of links and resources along the way.
Vicki Boykis

 

Tools & Code

Debirdify

This is a great tool if you're looking for Mastodon accounts to follow and want something more nuanced than a haphazard list of user handles. Debirdify searches a specific Twitter user's Lists and/or Followed Accounts for associated Mastodon handles and returns a Mastodon-friendly CSV file.
Pruvisto

 

Resources

Advanced NLP - Carnegie Mellon 2022

Graham Neubig's "Advanced NLP" is one of the best resources you'll find for current state-of-the-art techniques and algorithms in modern NLP. Follow the links for the slides and an awesome collection of readings and resources. Lecture videos are also available.
Carnegie Mellon University | Graham Neubig

 

Career

Looking for Ambitious Machine Learning Engineers

Ratio is a revenue-generating startup that's looking for ambitious machine learning engineers to help automate a big part of the advertising space. The product is "like a self-driving car of marketing" and largely uses existing models from OpenAI. Remote OK.
Data Elixir Talent Collective

 

New Opportunities

In addition to office-based positions around the world, Data Elixir's Job Board currently has 35+ listings for remote positions, including roles for data scientists, data analysts, researchers, data architects, machine learning engineers, and more. The roles cover a variety of job levels, from Junior to Senior.
Data Elixir Talent Collective

 

If you're HIRING, join the Data Elixir Talent Collective and get regular drops of outstanding data practitioners and leaders who are open to new opportunities.

 

Data Visualization

Images by Daniel Coe / CC BY-NC-ND 2.0

Visualizing Rivers and Floodplains with USGS Data

An awesome tutorial that shows how to visualize the flow of water through rivers and floodplains using publicly available USGS data and open source tools. Includes links to tools, data, key resources, and a gallery of stunning visualizations.
Beautiful Public Data | Jon Keegan

 

Galileo’s Telescopic Discoveries: Thinking Visually in the History of Science

People who are trained in visual thinking are able to see things that others miss. When Galileo discovered Jupiter's moons, he did so because he was seeing with the eyes of an artist. This is a great capstone presentation from the recent VIS2022 conference that explores how the process of visualization gives superpowers to the sciences.
IEEE Vis 2022 | Dr. Kerry V. Magruder

 
 


 
 

Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions, just reply to this email.
