ISSUE 329 · March 30, 2021

In the News

Matrix Multiplication Inches Closer to Mythic Goal
A recent paper set the fastest record yet for multiplying two matrices. But it also marks the end of the line for a method researchers have relied on for decades to make improvements.

Platforms vs. PhDs: How tech giants court and crush the people who study them
Although scraping public websites may be legal, Big Tech writes the rules for its own platforms. Some of those rules are being challenged in courts but meanwhile, the dynamic between Big Tech platforms and the people who study how those platforms use data is rife with tension.

Sponsored Link

How to get to a higher-performing model faster, with active learning
Curious what active learning could do for your model? Alegion breaks down the three major advantages to using active learning as part of your model development. Alegion helps customers with their model development when models need to get better and better at granular classification of low confidence instances to get to higher quality, faster. Learn more.

Tutorials, Projects & Opinions

How We Built a Context-Specific Bidding System for Etsy Ads
This is a great post that walks through the considerations and algorithms of Etsy's online ad platform. Here's how they moved from making batch predictions once per day to dynamic predictions up to 12,000 times per second. The post is easy to follow and gets into details that aren't often shared.

Unit testing Python code in Jupyter notebooks
Whether you are a unit testing purist or you just want to sprinkle a few tests into your notebooks, here are a few options to consider. Includes links and sample code for each approach.

The ghosts in the data
Implicit knowledge is knowledge that exists within expert communities but isn't written down. To be effective, it's key to understand what that knowledge is and how to access it. In her latest post, Vicki Boykis explores this "ghost" knowledge in the data world.

GPT-3 Powers the Next Generation of Apps
GPT-3 launched just nine months ago and already, it powers more than 300 applications with semantic search, summarization, sentiment analysis, content generation, translation, and more. Here's an update from the project including platform improvements and a selection of applications that show its range of capabilities.

R vs. Python vs. Julia
This article isn't about which is better. If you want to write code that's efficient, which language should you choose?

Code & Tools

Create Your Own VS Code Themes
Create your own or install a theme from the gallery with this theme builder for VS Code.

Data Visualization

Dimension reduction 1
This set of slides is an awesome introduction to Principal Components Analysis (PCA). The examples are easy to follow, there are worthwhile references at the end and, for R users, there's sample code too.

Elevation Scan
Elegant approach for visualizing max/mean elevation across the world. This is a work of art in code.

Data Elixir is curated and maintained by Lon Riesberg. If you have questions or suggestions for the newsletter, just reply back to this email. To find specific content from prior issues or to research topics, check out the catalogued Archives on Data Elixir's Search Page.