ISSUE 418 · January 3, 2023

Resources

Soccer Analytics 2022 Review
Awesome roundup of ⚽ analytics content from 2022. Covers research papers, blog posts, podcasts, books, code repos and more!

Statistical Rethinking 2023
The latest version of Richard McElreath's Statistical Rethinking course starts this week. This is a popular, online course that teaches data analysis, with a focus on scientific models. The course prioritizes conceptual, causal models and uses Bayesian data analysis to connect scientific models to evidence. Lectures are posted online each week.

Sponsored Link

Webinar: Don’t Get Lost in the Semantics
Join us for a dynamic conversation between Anna Filippova, Director of Community & Data at dbt Labs, and Benn Stancil, co-founder and Chief Analytics Officer at Mode about how data teams can make the most of the Semantic Layer. RSVP Now.

Tutorials, Projects & Opinions

Computing the Eigendecomposition and the Singular Value Decomposition
In part 5 of this series on Principal Component Analysis, Peter Bloem walks through three methods for computing the eigendecomposition and the singular value decomposition. Building these simple algorithms from scratch can teach a lot about what PCA actually does.

How Shapley Values Work
Shapley values - and their popular extension, SHAP - are machine learning explainability techniques that are easy to use and interpret but the theory can be intimidating to learn. This post explores how Shapley values work - not by using cryptic formulae, but through code and simplified explanations.

Code & Tools

Top Python libraries of 2022
Tryolabs' annual list of top Python libraries is consistently a must-read post. This well-researched list includes tools for distributed computing, putting notebooks in production, monitoring ML models, fast linting, profiling memory, interpretability, anomaly detection, and much more.

RTutor - Talk to your data via AI
RTutor enables users to interact with data via natural language.
After uploading a dataset, users can ask questions about it or request analyses in plain English. The app generates and runs R code to answer the questions with plots and numeric results. It can also explain statistical concepts and help users decide which tests to use. It's experimental but looks like a good tool for learning.

Career

Why Business Data Science Irritates Me
A recent post called "Goodbye, Data Science" struck a nerve for a lot of people. In this post, another insider shares his own journey and frustrations. There's a lot of thought here, including practical ways to approach similar issues in your own career and workplace.

Data Visualization

Annotated Forest Plots using ggplot2
Great tutorial that shows how to make annotated forest plots using ggplot2. It builds them from scratch, without using packages like forester, forestplot, and ggforestplot. The approach outlined here gives you a lot of control and flexibility.

Exploratory spatial data analysis with Python
Kyle Walker's book, Analyzing US Census Data, has been a popular free download amongst Data Elixir readers, but it's specifically for R users. In this first post of a new series, Kyle shows how to create some of his favorite examples from the book using Python.