TLDR AI 2023-08-09
Generative AI in big tech, Stability AI launches StableCode, LLMs as agents
Research & Innovation
Evaluating the Capabilities of Large Language Models as Agents (GitHub Repo)
This project introduces AgentBench, a benchmark for testing Large Language Models (LLMs) as agents in a range of interactive settings. Initial tests across 25 LLMs showed that commercial models outperformed open-source ones.
How do examples in training influence a final model (4 hour read)
This behemoth from Anthropic explores how individual training examples affect a final model's behavior, using a counterfactual technique called "influence functions". It is a fairly technical paper that outlines many challenges with this approach and reports some surprising findings. Notably, the influence of a training example decays to near zero when the order of key phrases is flipped. Language models are sensitive!
A New Way to Enhance Image Quality (GitHub Repo)
Researchers have created a new method called the Dual Aggregation Transformer (DAT) that combines spatial and channel attention to improve image super-resolution. DAT outperforms other current methods, relying on components such as an adaptive interaction module and a spatial-gate feed-forward network.
Engineering & Research
Mini SDXL implementation (GitHub Repo)
Ever wanted to peer into the implementation of the new SDXL model? This diffusers-compatible repo is just a few hundred lines of code. Perfect for learning.
Humanscript (GitHub Repo)
Humanscript is a script interpreter that infers the meaning behind commands written in natural language using large language models.
Sweep (GitHub Repo)
Sweep is an open-source AI junior developer that turns issues into pull requests. You make a GitHub Issue like "use os agnostic temp directory for windows" and Sweep writes a pull request to replace all occurrences of "/tmp" with "tempfile.gettempdir()".
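To make that example concrete, here is a minimal sketch of the kind of change such a pull request might contain. The helper name `scratch_path` and the file name `cache.json` are hypothetical, chosen only for illustration; they are not taken from Sweep or its output.

```python
import os
import tempfile


def scratch_path(filename: str) -> str:
    """Build a path inside the platform's temp directory.

    Hypothetical helper: it replaces a hardcoded "/tmp" prefix, which
    does not exist on Windows, with tempfile.gettempdir().
    """
    return os.path.join(tempfile.gettempdir(), filename)


# Before (POSIX only): path = "/tmp/cache.json"
# After (works on Windows, macOS, and Linux):
path = scratch_path("cache.json")
```

`tempfile.gettempdir()` is part of the Python standard library and returns whatever temp directory the operating system and environment variables point to.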