ISSUE 414 · November 29, 2022
The important purple people outside the data team
It's not just "data people" who work with data. These days, data is commonly used by people throughout an organization, and many of them are highly skilled. In this post, Mikkel Dengsøe takes a look at bringing these people into the data team and what to consider before doing it.
Look ahead to the future of BI and a new way to Mode
Join Mode founder, Benn Stancil, and Adam Smith, Analytics Manager at Imperfect Foods, at this special event to learn how Imperfect Foods gained the trust of stakeholders with data teams at the center. We'll talk about the future of Mode and reveal exciting new product updates focused on modern business intelligence in 2023.
Tutorials, Projects & Opinions
Demystifying Fourier analysis
Great explainer that takes a bottom-up approach to show how the Fourier transform works. Includes several interactive demos to play with.
The Betting Equation
In this step-by-step guide, David Sumpter walks through the maths and coding needed to build a betting model for the World Cup. This is excerpted from his book, The Ten Equations That Rule The World.
First impressions of DataFrames.jl and accessories
DataFrames.jl is a Julia package for data wrangling. If you're curious about Julia and/or are interested in how Julia does things differently, this post explores DataFrames.jl from an R-user perspective.
Tools & Code
SkyPilot is an open-source framework that lets you run ML and Data Science jobs on any cloud through a unified interface. It's designed to save you money by choosing whichever cloud service is currently the most cost-effective. This post is a good overview of how it works, use cases in the wild, and how to get started.
CausalPy - causal inference for quasi-experiments
CausalPy is a Python package for causal inference in quasi-experimental settings. The package allows for sophisticated Bayesian model fitting, in addition to traditional Ordinary Least Squares (OLS).
Trying to link records that refer to the same person? Name Match is an open-source Python package that uses probabilistic linking to identify and match those records, both within and across datasets.
The Turing Way
The Turing Way is a handbook to reproducible, ethical and collaborative data science. The goal is to provide all the information that researchers and data scientists in academia, industry and the public sector need at the start of their projects to ensure that they are easy to reproduce at the end. There's a community here too and the online book is free.
Goodbye, Data Science
"Data Science" has been widely hyped over the past several years but it's not without its issues. This post is a rant from the trenches that sure has struck a nerve in the community. It's a rant but there are useful insights here if you've been thinking about career moves.
Looking for Ambitious Machine Learning Engineers
Ratio is a revenue-generating startup that's looking for ambitious machine learning engineers to help automate a big part of the advertising space. The product is "like a self-driving car of marketing" and largely uses existing models from OpenAI. Remote OK.
If you're interested in new opportunities, join the Data Elixir Talent Collective and get reach-outs from vetted companies. You can join anonymously and leave anytime.