TLDR AI 2024-02-21

SoftBank’s $100B chip venture 🚀, GPT-4 new knowledge cutoff 🧠, fine-tuning with LoRA+ ➕

Headlines & Launches

GPT-4 Knowledge Cutoff is December 2023 (1 minute read)

According to the new documentation on the OpenAI platform, the training data for GPT-4 models has been updated to include information up to December 2023.

SoftBank Founder Masayoshi Son Aims To Raise $100 Billion For New Chip Venture (1 minute read)

Masayoshi Son's SoftBank Group Corp. is launching Izanagi, a $100 billion chip venture aimed at competing with Nvidia and focusing on AI applications.

Scribe $25M Series B (4 minute read)

Scribe has raised a $25M Series B round led by Redpoint Ventures to accelerate its AI-driven platform, which automates the creation of visual step-by-step guides and facilitates knowledge sharing within organizations. Over a million teams use Scribe, including teams at 97% of the Fortune 100. The company is expanding its AI features to make workplace information even more accessible.

Research & Innovation

Enhancing Fine-Tuning with LoRA+ (25 minute read)

This paper presents LoRA+, an advancement over the existing Low-Rank Adaptation (LoRA) method for fine-tuning large models. LoRA+ achieves better performance and faster fine-tuning, without increasing computational demands, by using different learning rates for key components in the process.
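
As a rough illustration of what "different learning rates for key components" can look like in practice, the sketch below splits adapter parameters into two groups and gives the second group a larger learning rate. The `lora_A` / `lora_B` naming, the function name, and the ratio of 16 are assumptions made for this example, not values taken from the blurb above. The returned structure mirrors the list-of-dicts "parameter groups" format that many optimizers accept.

```python
def loraplus_param_groups(param_names, base_lr=1e-4, lr_ratio=16.0):
    """Return two parameter groups with different learning rates.

    Assumptions (hypothetical, for illustration only):
    - adapter matrices are identified by "lora_A" / "lora_B" in their names
    - the B matrices train with base_lr * lr_ratio, the A matrices with base_lr
    - every other parameter (the frozen base weights) is left out entirely
    """
    group_a = [name for name in param_names if "lora_A" in name]
    group_b = [name for name in param_names if "lora_B" in name]
    return [
        {"params": group_a, "lr": base_lr},
        {"params": group_b, "lr": base_lr * lr_ratio},
    ]


# Usage: the frozen base weight is excluded from both groups.
names = ["layer0.attn.lora_A", "layer0.attn.lora_B", "layer0.attn.weight"]
for group in loraplus_param_groups(names):
    print(group["lr"], group["params"])
```

Because only the learning rates differ, the number of trainable parameters and the cost per step stay the same as in plain LoRA, which is consistent with the "without increasing computational demands" claim.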

Generative Representational Instruction Tuning (24 minute read)

The Contextual team has trained and released a model that generates both text and embeddings. It dramatically outperforms single specialist models. The model is an interesting take on the multi-modal trend, where the output modality is an embedding.

DeepDive: Mamba The Hard Way (30 minute read)

Sasha Rush has released an annotated tutorial on accelerating Mamba with custom Triton kernels. It doesn’t scale yet due to a bug in the Triton compiler, but it is an extreme illustration of the technology and a great resource for those looking to dive deeper into state space models, an alternative to Transformers.

Engineering & Resources

Generating Images at Any Resolution (GitHub Repo)

The Flexible Vision Transformer (FiT) is a novel architecture designed to create images at any resolution and aspect ratio. Unlike traditional models, FiT treats images as sequences of variable-sized tokens, allowing it to adapt to different image sizes more effectively during training and inference.
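
To make the "images as token sequences" idea concrete, here is a minimal sketch of how the number of tokens can vary with image size when an image is cut into fixed-size patches. This is a generic patch-counting example under stated assumptions (a hypothetical `token_grid` helper and a patch size of 16). It is not code from the FiT repository and may differ from how FiT tokenizes images.

```python
def token_grid(height, width, patch_size=16):
    """Return (rows, cols, token_count) for an image cut into square patches.

    Hypothetical helper for illustration: it assumes both sides of the image
    are exact multiples of patch_size and raises an error otherwise.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rows = height // patch_size
    cols = width // patch_size
    return rows, cols, rows * cols


# Different resolutions and aspect ratios give sequences of different lengths.
for h, w in [(256, 256), (512, 256), (192, 320)]:
    rows, cols, count = token_grid(h, w)
    print(f"{h}x{w} image: {rows}x{cols} patches, {count} tokens")
```

A model that accepts sequences of any length can then, in principle, handle all three images without resizing or cropping them to a single fixed shape.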

Boosting AI's Defense Against Adversarial Attacks (GitHub Repo)

This project introduces a new method to strengthen multi-modal models like OpenFlamingo and LLaVA against visual adversarial attacks. By fine-tuning the CLIP vision encoder in an unsupervised manner, the authors effectively shield these models from manipulative image attacks, enhancing their reliability and security in real-world applications without the need for retraining the entire model.

3D object from as few as 4 pictures (GitHub Repo)

This repository lets you turn as few as four images into a high-quality 3D representation using Gaussian Splatting.

How To Lose At Generative AI (7 minute read)

Generative AI, while hyped, is likely to disappoint most startups because it favors incumbents with data advantages, existing workflows, and the ability to integrate AI into these systems without major overhauls. Despite venture capital flowing into the GenAI space, startups focusing on prompt engineering and UX improvements at the workflow layer are essentially prepping the market for incumbents, who can easily adopt and integrate AI innovations into their dominant platforms. This suggests a challenging path ahead for startups aiming to capture significant value in generative AI.

New LLM Benchmark (12 minute read)

Preeminent researcher Nicholas Carlini has released the benchmark he uses for evaluating large language model performance. Interestingly, it shows GPT-4 leading by a wider margin than most other benchmarks do.

Strategies For An Accelerating Future (5 minute read)

Recent advances in AI signify a major leap forward in practical applications: Google's Gemini offers a context window of over a million tokens, and Groq's hardware enables almost instantaneous responses from GPT-3.5-class models. These developments underscore the urgency for leaders to understand and adapt to AI's rapidly evolving landscape.

Quick Links

AdGen AI (Product Launch)

AdGen AI confronts the chaos of traditional ad creation, offering a streamlined, AI-driven solution. Generate 100+ ad variations from a single URL in minutes.

BoCoEL (GitHub Repo)

Accurately evaluate LLMs with Bayesian optimization.

Amazon AGI Team Say Their AI Is Showing “Emergent Abilities” (2 minute read)

Amazon AGI researchers developed a text-to-speech model named "Big Adaptive Streamable TTS with Emergent Abilities" (BASE TTS) that shows "state-of-the-art naturalness" on conversational text and demonstrates language skills it wasn't specifically trained on.