ISSUE 374 · February 15, 2022

In the News

ML Becomes a Mathematical Collaborator
Mathematicians often work together when they’re searching for insight into a hard problem. It’s a kind of freewheeling collaborative process that seems to require a uniquely human touch. But in two new results, the role of human collaborator has been replaced in part by a machine...

Sponsored Link

Delivering Accurate Ground Truth Data for AI/ML Models
30+ years experience working with leading data-centric AI/ML models. With 3,500+ global SMEs and experience with any data type, we accelerate operations and advance models in record time. Scale your AI with your model's new secret weapon, Innodata.

Tutorials, Projects & Opinions

Data Distribution Shifts and Monitoring
Deploying a model to production isn't the end of the process, because the model's performance will degrade over time. In this easy-to-follow deep dive, Chip Huyen explores the issues, including data distribution shifts, monitoring, and typical causes of ML failures. The post was written for a machine learning systems design course at Stanford.

Privacy-preserving insurance quotes
Concrete Numpy is an open-source Python package that compiles various NumPy functions into their Fully Homomorphic Encryption (FHE) equivalents. In other words, Concrete Numpy allows models to work with sensitive data while it stays encrypted. This tutorial walks through the basic concepts of how Concrete Numpy works and how to use it.

Faster Python calculations with Numba
If you're writing array-oriented Python code that uses for loops, NumPy's speed doesn't help: the loops themselves run in Python, so the code is slow. Here's how Numba can get you a 13x speed increase with just two lines of code.

Read: The Best Tools for ETL in 2022
Learn to efficiently integrate your data sources and get to analysis

Code & Tools

D-Tale
D-Tale is a visualization tool that makes it easy to view and analyze Pandas data structures.
D-Tale supports a variety of pandas objects and works seamlessly with Jupyter notebooks and Python terminals. There's a lot of info here, including links to demos, tutorials and articles.

Ask HN: Tools to visualize data in SQL databases?
A good discussion of the various tools available for visualizing data in a SQL table. It covers a variety of use cases and weighs the pros and cons of both off-the-shelf products and open-source options.

Resources

The Effect: An Intro to Research Design and Causality
This new book is a great introduction to design-based causal inference. The first half takes an intuitive approach to developing an understanding of research design. The second half is more technical and introduces a standard toolset for doing causal inference. The entire book is written in a conversational style that's easy to follow. Free to read online.

Data Visualization

This map went viral! Here's how to make it.
There's a lot of data presented in this state-by-state streamgraph representation of U.S. population data. It's part of a bigger project done for a law firm, and it's super effective. Here's a step-by-step tutorial showing how to make maps like this using R.
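
For readers curious about the "two lines of code" mentioned in the "Faster Python calculations with Numba" item, here is a minimal sketch of the general pattern. It is not taken from the linked article: the function `running_max` and the sample data are hypothetical, and the fallback decorator is only there so the snippet still runs where Numba is not installed. Any actual speedup depends on the workload and is not measured here.

```python
import numpy as np

try:
    from numba import njit  # line 1: import the JIT decorator
except ImportError:
    # Fallback so the sketch still runs without Numba (no speedup in that case).
    def njit(func):
        return func


@njit  # line 2: with Numba installed, the function is compiled on first call
def running_max(values):
    # Hypothetical example: out[i] holds the maximum of values[0..i].
    out = np.empty_like(values)
    current = values[0]
    for i in range(values.shape[0]):
        if values[i] > current:
            current = values[i]
        out[i] = current
    return out


data = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
result = running_max(data)
```

The loop body is ordinary Python. The only changes from a plain NumPy version are the import and the decorator, which is what makes this approach appealing for loop-heavy array code.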