TLDR AI 2023-08-09

Generative AI in big tech 🧑‍💻, Stability AI launches StableCode 💻, LLMs as agents 🤖

🚀

Headlines & Launches

Stability AI Launches StableCode (4 minute read)

Stability AI announced the first public release of StableCode, its new open large language model designed to help users generate programming language code.

Roundup of Generative AI in Big Tech (10 minute read)

An awesome roundup of what the big tech companies are up to in generative AI, based on discussions from their Q2 earnings calls. Big tech's reaction to AI has been swift, demonstrating the near-term importance of capitalizing on this opportunity.

Train models on HuggingFace backed by AWS or Nvidia cloud (2 minute read)

HuggingFace is working with AWS and Nvidia to bring one-click training to the platform. You can fine-tune state-of-the-art models directly from the Hub just by uploading your data.

🧠

Research & Innovation

Evaluating the Capabilities of Large Language Models as Agents (GitHub Repo)

This project introduces AgentBench, a benchmark for testing large language models (LLMs) as agents in a variety of interactive settings. Initial tests across 25 LLMs showed that commercial models outperformed open-source ones.

How do examples in training influence a final model (4 hour read)

This behemoth from Anthropic explores how individual training examples affect final model behavior, using a counterfactual technique called "influence functions". It is a pretty technical paper that outlines many challenges with this approach and reports some surprising findings. Primarily, they find that an example's influence decays to zero when the order of key phrases is flipped. Language models are sensitive!

A New Way to Enhance Image Quality (GitHub Repo)

Researchers have created a new method called the Dual Aggregation Transformer (DAT) that combines spatial and channel attention to improve image super-resolution. DAT outperforms other current methods with the help of components such as the adaptive interaction module and the spatial-gate feed-forward network.

👨‍💻

Engineering & Resources

Mini SDXL implementation (GitHub Repo)

Ever wanted to peer into the implementation of the new SDXL model? This diffusers-compatible repo is just a few hundred lines of code. Perfect for learning.

Humanscript (GitHub Repo)

Humanscript is a script interpreter that infers the meaning behind commands written in natural language using large language models.

Sweep (GitHub Repo)

Sweep is an open-source AI junior developer that turns issues into pull requests. You make a GitHub Issue like "use os agnostic temp directory for windows" and Sweep writes a pull request to replace all occurrences of "/tmp" with "tempfile.gettempdir()".
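For a sense of what that kind of change looks like, here is a minimal Python sketch. It is an illustration only, not Sweep's actual output, and the helper name `scratch_path` is hypothetical. It builds a file path from the standard library's `tempfile.gettempdir()` instead of a hard-coded "/tmp", so the same code works on Windows, macOS, and Linux.

```python
import os
import tempfile


def scratch_path(filename: str) -> str:
    """Return a path for `filename` inside the platform's temp directory.

    Hypothetical helper for illustration; not taken from the Sweep repo.
    """
    # Before: os.path.join("/tmp", filename)  (breaks on Windows)
    # After: ask the standard library where temporary files belong.
    return os.path.join(tempfile.gettempdir(), filename)


if __name__ == "__main__":
    # The printed location depends on the operating system and environment.
    print(scratch_path("cache.json"))
```

The directory returned differs per machine, so the example prints the path without assuming a specific value.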

🎁

Miscellaneous

MeowLearning (Online Course)

A pick-your-problem style guide that was created to educate everyone, from the leadership team to the ML engineers in a startup, on how to work with AI in production settings. This is the stuff you won't learn in most ML/AI courses.

Glaive raises $3.5m seed round for small synthetic models (4 minute read)

Glaive is a platform that lets people train hyper-focused small language models for their use case with the help of a synthetic data generation system. These small models are 2-20x faster, at least 2x cheaper, and outperform general-purpose LLM APIs on the task they were trained for.

Swift Transformers for Apple devices (9 minute read)

A really nice tutorial on how to use Swift to deploy language models from the HuggingFace Hub.
