ISSUE 315 · December 8, 2020

Insight

Most important statistical ideas of the past 50 years?
This new paper explores eight major statistical ideas of the past 50 years, including overviews and discussion of what they have in common, how they differ, and what to expect over the next few decades.

Sponsored Link

Join the data science community at SIAM - new journal, book series, conference, and activity group!
The Society for Industrial and Applied Mathematics (SIAM) has launched a collection of new data science resources! The most recent additions are the SIAM Activity Group on Data Science (a place to network and collaborate!) and the SIAM Journal on Mathematics of Data Science. Learn more about SIAM’s data science offerings and stay connected to the community.

Tutorials, Projects & Opinions

Airbnb-quality data for all
Great follow-on article to the recent data quality series from Airbnb. In this article, Jeremy Stanley, founder of Anomalo, shows how to build and maintain high-quality data "without raising billions."

Machine Learning model governance at scale
The dynamic nature of machine learning makes model governance particularly challenging, especially at scale. This is a best-practices article from Microsoft that explores the issues and approaches.

Code & Tools

(Re-)introducing Distill for R Markdown
Distill is a package for R Markdown that makes it easy to create technical articles, websites, and blogs in the style of the Distill Machine Learning Journal. Output is clean, interactive and engaging. Here are the highlights for the new 1.0 release, including links to key resources.

RipTable
High-performance 64-bit Python analytics engine for NumPy arrays with multi-threaded support. Enhances or replaces NumPy or pandas and claims it can crunch numbers 1.5 to 10 times faster.

Looking to Annotate Your Video Data?
Alegion just opened up its powerful video annotation capabilities for self-serve. Label your video with complete end-to-end control.
Upload, configure, and label with speed, accuracy, rich annotation, ML efficiencies, and real-time playback, then download your annotated data, all on your time.

Resources

2020’s Top AI & Machine Learning Research Papers
These short summaries of recent AI and Machine Learning research papers cover a wide variety of authors, topics and venues. Includes key points, diagrams and links for each paper.

Data Visualization

A ggplot2 Tutorial for Beautiful Plotting in R
Awesome ggplot2 tutorial with lots of examples. Includes a linked Table of Contents and useful resources at the end. Worth bookmarking.

Why use a radial data visualization?
Radial visualizations include circular layouts like pie charts, circular trees, sunbursts and weather wheels. They're an efficient way to present data, but they can also be counter-productive. Here's a visual orientation of circular visualization options, why you might use them, and the cases where there are clearly better alternatives.

And Finally...

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply to this email.