ISSUE 316 · December 15, 2020

Phew. 2020 is almost over! I've never been so excited for a year to end. To celebrate and reflect on new directions for 2021, Data Elixir will be taking the next 2 weeks off. I hope you and all your families have a great holiday season!

Cheers,

Insight

You Can’t Escape Hyperparameters and Latent Variables: ML as a Software Engineering Enterprise
Great "keynote" from NeurIPS 2020 about the past, present and future of ML research, especially as it pertains to algorithmic bias. It's 45 minutes but don't let that deter you. This is a well-paced presentation that's full of insightful conversations with researchers, engineers, and data scientists. Starts at 29:04.

Sponsored Link

Learn R, Python, and SQL with Dataquest, the data science platform recommended by 97% of learners.
No video lectures, no fill-in-the-blank quizzes. Learn through high-quality, step-by-step courses developed by a team of data experts. Sign up for free to start analyzing data today, all in your browser.

Tutorials, Projects & Opinions

Easier Code Reviews For Jupyter Notebooks
The raw content of Jupyter Notebooks is a mix of dissimilar source code, Markdown, and HTML, making Jupyter notoriously challenging for code reviews. This tutorial shows how to use an open-source utility called nbautoexport to simplify the process.

Why Some Models Leak Data
Machine learning models use large amounts of data, some of which can be sensitive. If the models aren't trained correctly, sometimes that data is inadvertently revealed. This interactive essay shows how.

Data Catalogs Are Dead; Long Live Data Discovery
Data catalogs aren't cutting it anymore when it comes to metadata management and data governance. Here's how data discovery can help.

Financial Times Data Platform: From zero to hero
Great walk-through of the evolution of the Financial Times data platform. Includes detailed timelines and decisions along the way.

End-to-end Production ML Monitoring
This post bills itself as a "deep dive" into production machine learning monitoring, but it's organized in a way that also makes it easy to get a high-level view of what ML monitoring is all about. Covers outlier detectors, drift detectors, metrics servers, and explainers.

AMAX: Revolutionary GPU solutions for data insights at deskside
The fastest time-to-results for deep learning training and visualization: AMAX GPU workstations with the latest AMD EPYC™ Processors provide unprecedented data-transfer speed and performance for data- and graphics-intensive applications. Special pricing for academia/startups. Kick start your AI insights

Resources

Advanced Data Science 2020
This GitBook accompanies a semester-long course that focuses on the "hard" part of data science. The assumption is that you already have a background in statistics, programming, and the basics of reproducible research. The focus in this course is on synthesizing those tools into a data analysis and then communicating that analysis to an audience.

Papers with Code is Expanding
Papers with Code recently launched new sites for statistics, math, computer science, and more. Use these sites to discover new projects and/or sync your code to show on arXiv. See the post for details.

Project Pick

Blob Opera
Blob Opera is a surprisingly addictive machine learning experiment by David Li in collaboration with Google Arts and Culture. Nothing to learn here. Just a fun machine learning toy to brighten up your holidays! ✨

Data Elixir is curated and maintained by Lon Riesberg. For full-text search of prior issues, visit Data Elixir's Search Page. If you have suggestions or questions for the newsletter, just reply back to this email.

Sign up to get Data Elixir's data science newsletter in your Inbox.