ISSUE 447 · August 8, 2023
In 2021, researchers discovered "grokking," where tiny models suddenly shift from memorizing their training data to generalizing to unseen inputs. This interactive article explores the phenomenon and the emerging field of mechanistic interpretability, seeking insights into whether large language models generalize or merely memorize.
PAIR Explorables | Adam Pearce, et al.
If you're new to large language models or looking for a good explainer to share with colleagues, here's an easy-to-follow, gentle primer.
Understanding AI | Timothy B Lee and Sean Trott
Great ideas wanted! 💡 data.org is looking for innovative proposals on training and upskilling in generative AI to drive social impact. The Generative AI Skills Challenge will provide funding and technical assistance to awardees -- click here to learn more and apply by August 15, 2023 (7:00 PM ET).
Conceptualizing functions as infinite-dimensional vectors lets you apply the tools of linear algebra to a vast landscape of new problems, from image and geometry processing to curve fitting, light transport, and machine learning. Great post!
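To make the idea concrete, here is a minimal sketch (not taken from the linked post) of one linear-algebra tool carried over to functions: the inner product, which for functions becomes an integral of their product. The helper name `inner_product` and the midpoint-rule approximation are illustrative assumptions.

```python
import math

def inner_product(f, g, a, b, n=10_000):
    """Approximate the inner product of f and g on [a, b].

    For functions, the dot product's sum over components becomes an
    integral of f(x) * g(x); here it is estimated with the midpoint rule.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(x) * g(x)
    return h * total

# sin and cos behave like perpendicular vectors on [0, 2*pi]:
# their inner product is approximately zero.
sin_cos = inner_product(math.sin, math.cos, 0.0, 2 * math.pi)

# The inner product of sin with itself is its squared "length",
# which is approximately pi on this interval.
sin_sin = inner_product(math.sin, math.sin, 0.0, 2 * math.pi)

print(f"<sin, cos> = {sin_cos:.6f}")
print(f"<sin, sin> = {sin_sin:.6f}")
```

Orthogonality of this kind is what lets a function be decomposed into independent components, much as a vector is decomposed along perpendicular axes.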
Useful tricks for customizing ggplot design, with complete code examples to try on your own. Covers plot animation with gganimate, chart composition with cowplot, shapes with ggimage, annotations with geomtextpath, highlighting elements with gghighlight, special effects with ggfx, custom themes, and more.
While log transformation can produce more robust models with less heteroskedasticity and better compliance with standard assumptions, it can also distort population estimates. This post uses a practical example, with diagnostic plots and estimates, to show the possible consequences of log transformations.
free range statistics | Peter Ellis
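One well-known way a log transformation can distort an estimate, which may or may not be the exact mechanism discussed in the linked post, is naive back-transformation: exponentiating the mean of the logs gives the geometric mean, which sits below the arithmetic mean for skewed data. The sketch below uses simulated lognormal data and a hypothetical helper, `compare_means`, purely for illustration.

```python
import math
import random
import statistics

def compare_means(values):
    """Return (arithmetic mean, naively back-transformed mean of the logs)."""
    arithmetic = statistics.fmean(values)
    logs = [math.log(v) for v in values]
    # exp(mean(log x)) is the geometric mean, not the arithmetic mean.
    back_transformed = math.exp(statistics.fmean(logs))
    return arithmetic, back_transformed

# Simulated right-skewed data (an assumption for illustration only).
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

arithmetic, back_transformed = compare_means(sample)
print(f"arithmetic mean:        {arithmetic:.3f}")
print(f"back-transformed mean:  {back_transformed:.3f}")
print(f"ratio (back / arith):   {back_transformed / arithmetic:.3f}")
```

Multiplying the back-transformed mean by a population size would therefore understate the population total unless a bias correction is applied.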
The FinanceToolkit is an open-source toolkit for stock market analysis. It offers a comprehensive set of financial ratios, indicators, and performance measures, and all calculations are simple, clearly presented, and customizable. This is an awesome resource for anyone interested in learning about or working with finance data.
Jupyter AI brings generative AI to Jupyter notebooks, giving users the power to explain and generate code, fix errors, summarize content, ask questions about their local files, and generate entire notebooks from a natural language prompt.
Jupyter Blog | Jason Weill
Awesome selection of Quarto docs, tutorials, talks, posts, tools and examples from around the web.
GitHub | Mickaël Canouil
Databases and machine learning are inextricably linked. Databases store the vast volumes of data that ML algorithms require and, in turn, ML algorithms infuse databases with new capabilities. In this free seminar series, speakers from industry explore this growing convergence.
Carnegie Mellon University Database Research Group