ISSUE 424 · February 14, 2023

Insights

Big Data Is Dead
Most data isn't very big. And if it is, you probably don't query it all at once. And even if you do, the data probably still fits on a single machine. And if it doesn't, there's a good chance you don't actually use it all. Why are data tools built for the remaining 1%? There are a lot of gems here.

ChatGPT Is a Blurry JPEG of the Web
This essay by Ted Chiang offers a useful perspective for understanding the tech that drives ChatGPT. OpenAI’s chatbot offers paraphrases, whereas Google offers quotes. Which do you prefer?

Tutorials, Projects & Opinions

Data-Free Disneyland
Between browsers, cell phones, credit card tracking, facial recognition, and license plate readers, it's hard to be incognito in 2023, even at Disneyland! This is a great post that walks through a typical family's data exhaust and ways to obfuscate it.

GPT in 60 Lines of NumPy
This GPT-from-scratch tutorial is a technical introduction to GPT that's intended to be as simple, and as hackable, as possible. There's a lot here, including a code repo and clear descriptions of the GPT architecture and how everything works.

Data Mishaps Night - February 23, 7:00 pm CST
The third annual Data Mishaps Night is coming up in just two weeks! This free online event features a lineup of data mistake stories that focus on the human aspect of data work, with lessons learned along the way. It looks like it will be worthwhile; check out the lineup!

Tools & Code

PySports
PySports is a curated collection of sports-related open-source analytics projects. Includes Python libraries and R packages for accessing, analyzing, modeling, and visualizing data from a wide variety of sports, such as baseball, basketball, football, hockey, cycling, and more.

Data Visualization

Population Around a Point
This site calculates the human population within a selected radius of any point in the world. Includes a detailed rundown of where the data comes from and how the site works.
There are ~95K people within a 5 km radius of me. How about you?

The mapping software you didn't know you needed
A great introduction to QGIS, an open-source geospatial data system. It starts with a real-world use case that shows why QGIS is worth knowing about, then continues with a detailed tutorial that covers exploring available datasets, accessing that data, building maps, adding layers, customizing maps, and taking your maps into the field.