ISSUE 415 · December 6, 2022
Extreme numbers get new names
Prolific generation of data has led to the creation of new prefixes. Guess how many yottabytes are in a quettabyte...
Python and the Future of Programming
Guido van Rossum is the creator of the Python programming language. In this interview with Lex Fridman, he discusses the current state of Python, fads, IDEs, machine learning, and what to expect in Python 4.0.
Combine AI models with the Pinecone vector database to make your applications understand and act on what your users want, without making them spell it out. Pinecone provides the cloud infrastructure that makes vector search easy, fast, and scalable. Learn more about Pinecone and try it.
Tutorials, Projects & Opinions
Building Airbnb Categories with Machine Learning and Human-in-the-Loop
Airbnb recently flipped the travel search experience on its head by having its inventory dictate user destinations. In the new scheme, users aren't asked where they want to go. Instead, they're presented with categories of destination types. It may sound simple, but there's a lot going on behind the scenes to make it work. This first post in a series explores how ML is used to build the categories and solve related tasks.
The cloudy layers of modern-day programming
If you're building in the cloud these days, your work is hindered by millions of dependencies, in fragmented and proprietary environments that you can't introspect. Even smallish side projects aren't immune. This is a great post that peels back the layers to show how development has become so frustrating and how, ultimately, it's not all bad.
How to Lead from The Center: A Guide for Data Leaders
As an industry, we’ve published countless articles, papers, videos, and diagrams about the modern data stack—describing what it is, what it isn’t, where it’s going, and how it’s evolving. Download this guide to learn how to turn your data into your true competitive advantage in a new era of BI transformation.
Tools & Code
mvSQLite is an open-source distributed database with full SQLite compatibility. There are already many nice "multi-machine" SQLite flavors: rqlite, dqlite, and Litestream. What mvSQLite offers is unique: it is not just replicated but truly distributed, it scales writes as well as reads, and it provides the strictest consistency.
Kùzu is an in-process property graph database management system (GDBMS) built for query speed and scalability. Kùzu is optimized for handling complex join-heavy analytical workloads on very large graph databases. Kùzu is being actively developed at the University of Waterloo and is available under a permissive license.
Becoming An Analytics Engineer in 2023
Nice overview of the Analytics Engineer role, especially for data people who are thinking about a career move. Includes discussion of the specific skills and tools you need to know and links to key resources.
ChatGPT: Optimizing Language Models for Dialogue
Last week, OpenAI introduced a language model called ChatGPT that interacts in a conversational way and can carry on a dialogue. It already has more than a million users, and the examples being shared online are impressive. This announcement offers a high-level view of how it works. Follow the links for details and demos. For interesting examples, check these out: