Data Elixir logo

ISSUE 418  ·   January 3, 2023

 

Resources

Soccer Analytics 2022 Review

Awesome roundup of ⚽ analytics content from 2022. Covers research papers, blog posts, podcasts, books, code repos and more!
Jan Van Haaren

 

Statistical Rethinking 2023

The latest version of Richard McElreath's Statistical Rethinking course starts this week. This popular online course teaches data analysis with a focus on scientific models. It prioritizes conceptual, causal models and uses Bayesian data analysis to connect scientific models to evidence. Lectures are posted online each week.
Richard McElreath

 

Sponsored Link

Webinar: Don’t Get Lost in the Semantics

Join us for a dynamic conversation between Anna Filippova, Director of Community & Data at dbt Labs, and Benn Stancil, co-founder and Chief Analytics Officer at Mode, about how data teams can make the most of the Semantic Layer. RSVP Now.

 

Reach Data Elixir readers by sponsoring an issue.

 

Tutorials, Projects & Opinions

Computing the Eigendecomposition and the Singular Value Decomposition

In part 5 of this series on Principal Component Analysis, Peter Bloem walks through three methods for computing the eigendecomposition and the singular value decomposition. Building these simple algorithms from scratch can teach a lot about what PCA actually does. 
Peter Bloem
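For a feel of what "from scratch" can look like, here is a minimal sketch of power iteration, a standard textbook way to find the dominant eigenvector of a symmetric matrix (in PCA terms, the first principal component of a covariance matrix). This is an illustration only: the blurb above doesn't name the three methods in the post, so this may or may not match any of them, and the matrix `cov` and the function names are made up for the example.

```python
import math


def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, num_iters=200):
    """Approximate the dominant eigenpair of a symmetric matrix A."""
    n = len(A)
    # Start from a uniform unit vector (assumed not orthogonal
    # to the dominant eigenvector).
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(num_iters):
        w = mat_vec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # re-normalize to unit length
    # Rayleigh quotient v^T A v gives the eigenvalue for unit-length v.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(A, v)))
    return eigenvalue, v


# Hypothetical 2x2 covariance matrix, chosen only for illustration.
cov = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, eigenvector = power_iteration(cov)
print(eigenvalue, eigenvector)
```

Repeated multiplication stretches the vector most along the direction of largest variance, which is why the iterate settles on the first principal component.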

 

How Shapley Values Work

Shapley values - and their popular extension, SHAP - are machine learning explainability techniques that are easy to use and interpret, but the theory can be intimidating to learn. This post explores how Shapley values work - not by using cryptic formulae, but through code and simplified explanations.
Aidan Cooper
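As a rough illustration of the idea (not the code from the post), the sketch below computes exact Shapley values for a tiny two-player game by averaging each player's marginal contribution over every possible ordering. The player names and the `payoffs` table are hypothetical, and this brute-force approach is only practical for a handful of players.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical payoff for each coalition in a two-player game.
payoffs = {
    frozenset(): 0.0,
    frozenset({"a"}): 10.0,
    frozenset({"b"}): 20.0,
    frozenset({"a", "b"}): 50.0,
}

result = shapley_values(["a", "b"], lambda coalition: payoffs[coalition])
print(result)  # {'a': 20.0, 'b': 30.0}
```

The two values sum to the full coalition's payoff of 50, which is the "efficiency" property that makes Shapley values attractive for splitting a model's prediction among its features.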

 

Code & Tools

Top Python libraries of 2022

Tryolabs' annual list of top Python libraries is consistently a must-read post. This well-researched list includes tools for distributed computing, putting notebooks in production, monitoring ML models, fast linting, profiling memory, interpretability, anomaly detection, and much more.
Tryolabs

 
 

RTutor - Talk to your data via AI

RTutor enables users to interact with data via natural language. After uploading a dataset, users can ask questions about it or request analyses in plain English. The app generates and runs R code to answer the questions with plots and numeric results. It can also explain statistical concepts and help users decide which tests to use. It's experimental but looks like a good tool for learning.
RTutor

 

Career

Why Business Data Science Irritates Me

A recent post called "Goodbye, Data Science" struck a nerve for a lot of people. In this post, another insider shares his own journey and frustrations. There's a lot of thought here, including practical ways to approach similar issues in your own career and workplace.
Metaheuristics | shakoist

 

Data Visualization

Annotated Forest Plots using ggplot2

Great tutorial that shows how to make annotated forest plots using ggplot2. This shows how to build them from scratch, without using packages like forester, forestplot, and ggforestplot. The approach outlined here gives you a lot of control and flexibility.
KHstats | Katherine Hoffman

 

Exploratory spatial data analysis with Python

Kyle Walker's book, Analyzing US Census Data, has been a popular free download among Data Elixir readers, but it's written specifically for R users. In this first post of a new series, Kyle shows how to recreate some of his favorite examples from the book using Python.
Kyle Walker

 
 


Data Elixir, LLC
P.O. Box 21255
Boulder, CO 80308

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions, send a note!