— In the News —
Great interview with Hadley Wickham about the current state of R and where it's going, including insights about the different cultures of R and Python users.
DeepMind, likely the world’s largest research-focused artificial intelligence operation, is losing a lot of money fast. Does this mean that AI is falling apart? "Probably not," says Gary Marcus. But that money might be better spent elsewhere.
— Tools and Techniques —
Dagster is an open-source Python library for building data applications like ETL processes and ML pipelines. This introduction covers the library's origins, why it matters, and how to get started, and it's worth spending some time with. For tutorials and examples, see the GitHub repo.
The inspection paradox is a statistical illusion you’ve probably never heard of. It’s a common source of confusion, an occasional cause of error, and an opportunity for clever experimental design. And once you know about it, you see it everywhere.
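A minimal sketch of the idea, using the classic class-size example (not taken from the linked post; the class sizes below are made up for illustration): averaging over classes counts each class once, while averaging over students counts a class of size n a total of n times, so the size students experience is larger.

```python
def class_size_means(class_sizes):
    """Return (mean size per class, mean size experienced per student)."""
    # Averaging over classes: every class counts once.
    per_class_mean = sum(class_sizes) / len(class_sizes)
    # Averaging over students: a class of size n is reported by n students,
    # so it carries weight n in the student-level average.
    total_students = sum(class_sizes)
    per_student_mean = sum(n * n for n in class_sizes) / total_students
    return per_class_mean, per_student_mean


# Hypothetical data: three small classes and one large lecture.
sizes = [10, 10, 10, 100]
per_class, per_student = class_size_means(sizes)
print(f"Average class size (per class):   {per_class:.1f}")
print(f"Average class size (per student): {per_student:.1f}")
```

With these numbers the per-class average is 32.5, but a randomly chosen student sits in a class of about 79 on average, because most students are in the large lecture.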
Data Scientists learn to avoid loops and recursion because they make Python and R code slow. In this post, Daniel Moura shows how Julia sets you free.
Snorkel is a Python library for programmatically building and managing training datasets. In their latest update, the Snorkel Team walks through key new features, new tutorials, and the road ahead.
— Resources —
This is super niche but it's super interesting too: DNA interpretation, firearms analysis, fingerprints, shoe impressions, etc. Sections are easy to follow with case studies, code snippets, screenshots and linked references throughout.
— Data Viz —
This curriculum of notebooks on the Observable platform shows how to use Vega-Lite for an Interactive Grammar of Graphics in the browser. Covers visual encoding, data transformation, interaction, maps, and more. There's also a corresponding collection of notebooks for Python. This is top-notch work from the University of Washington's Interactive Data Lab.
PassSonar is a creative approach for displaying ⚽ analytics that's fairly new but gaining ground quickly. This introduction shows where it came from and how it's evolving.
— Career —
Great post that explores scraped LinkedIn data to identify the education and experience of people who work as "Data Scientists." The intent is to figure out what people who are successfully working in the field have actually done. This is well-written with useful insights along the way.