ISSUE 447 · August 8, 2023

Posts & Tutorials

Do ML Models Memorize or Generalize?
In 2021, researchers discovered "grokking," where tiny models suddenly shift from memorizing their training data to generalizing to unseen inputs. This interactive article explores this phenomenon and the emerging field of mechanistic interpretability, seeking insights into whether large language models generalize or merely memorize.

LLMs, explained with a minimum of math and jargon
If you're new to large language models or looking for a good explainer to share with colleagues, here's an easy-to-follow, gentle primer.

Sponsored Link
Generative AI Skills Challenge
Great ideas wanted! 💡 data.org is looking for innovative proposals on training and upskilling in generative AI to drive social impact. The Generative AI Skills Challenge will award funding and technical assistance to awardees -- click here to learn more and apply by August 15, 2023 (7:00 PM ET).

Functions are Vectors
Conceptualizing functions as infinite-dimensional vectors lets you apply the tools of linear algebra to a vast landscape of new problems, from image and geometry processing to curve fitting, light transport, and machine learning. Great post!

Jazz up your ggplots!
Useful tricks for customizing ggplot design, with complete code examples to try on your own. Covers plot animation with gganimate, chart composition with cowplot, shapes with ggimage, annotations with geomtextpath, highlighting elements with gghighlight, special effects with ggfx, custom themes, and more.

Log transforms, geometric means and estimating population totals
While log transformation can create robust models with lower heteroskedasticity and better compliance with standard assumptions, it can distort population estimates. This post uses a practical example to show the possible consequences of log transformations, including diagnostic plots and estimations.

Tools & Code

Finance Toolkit
The FinanceToolkit is an open-source toolkit for stock market analysis.
It offers a comprehensive set of financial ratios, indicators, and performance ratios; all calculations are simple, clearly presented, and customizable. This is an awesome resource for anyone interested in either learning about or working with finance data.

Generative AI in Jupyter
Jupyter AI brings generative AI to Jupyter notebooks, giving users the power to explain and generate code, fix errors, summarize content, ask questions about their local files, and generate entire notebooks from a natural language prompt.

Resources

🕶️ Awesome Quarto
Awesome selection of Quarto docs, tutorials, talks, posts, tools and examples from around the web.

ML⇄DB Seminar Series
Databases and machine learning are inextricably linked. Databases provide the storage for the vast volumes of data that ML algorithms require and, in turn, the ML algorithms infuse the databases with new capabilities. In this free seminar series, speakers from industry explore this growing convergence.