OpenAI is planning to release a new AI product called "Strawberry" in the fall. It will feature advanced reasoning capabilities, such as the ability to solve previously unseen math problems, and will perform high-level tasks like developing market strategies.
The third part of a series, this paper explores scaling laws for how many bits of knowledge a model can store. The answer appears to be roughly 2 bits of knowledge per parameter.
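As a rough back-of-the-envelope illustration of what 2 bits per parameter implies (the 7B model size below is a hypothetical example, not a figure from the paper):

```python
# Rough capacity estimate under the ~2 bits of knowledge per parameter finding.
# The 7B-parameter model size is a hypothetical example for illustration.

def knowledge_capacity_bits(num_params: int, bits_per_param: float = 2.0) -> float:
    """Estimated knowledge capacity of a model, in bits."""
    return num_params * bits_per_param

params = 7_000_000_000  # a 7B-parameter model
bits = knowledge_capacity_bits(params)
gigabytes = bits / 8 / 1e9  # 8 bits per byte, 1e9 bytes per GB
print(f"~{bits:.2e} bits, about {gigabytes:.2f} GB of stored knowledge")
```

Under that assumption, a 7B model would hold on the order of 1.75 GB of distilled knowledge.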
Boson AI has introduced Higgs-Llama-3-70B-v2, a new model that excels on dialogue and comprehension benchmarks such as Arena-Hard and AlpacaEval 2.0. Compared to Claude 3.5 Sonnet, it reduces response regeneration rates by 21.6% and boosts day-1 retention by 5.3%. Enhanced by an in-house reward model, Higgs Judger, it ties with Google's Gemini 1.5 Pro on performance.
Pre-training hybrid (Mamba-style) models differs from pre-training standard Transformers. This post explores how to scale hyperparameters, data acquisition, and other factors to get the performance you want.
This framework fine-tunes small models to serve as a fallback when closed API models go down, and shows how to migrate smoothly from a large model to a small one.
LitServe is an easy-to-use, flexible serving engine for AI models built on FastAPI. Features like batching, streaming, and GPU autoscaling eliminate the need to rebuild a FastAPI server for each model.
Customers don't always use the words you expect, which tends to break keyword-based solutions for knowledge base and in-product search. Luckily, generative AI does much better. Make your help center actually helpful with Ask AI's Generative Search: minutes to deploy, no developer needed. Get 2 months free.
While large language models (LLMs) may not significantly improve in reasoning capability, their falling costs and rising speeds will make them increasingly useful for repetitive tasks. Even if these models lack true understanding, they can still handle straightforward tasks efficiently.
Llava BitNet is the first ternary-weight (-1, 0, 1) model trained on VLM tasks. The model, weights, and scripts are in the process of being fully open sourced. A forthcoming technical report suggests the model has promising performance.
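Ternary weights like these are typically produced by quantizing full-precision weights into {-1, 0, 1}. A minimal sketch of absmean quantization, the scheme popularized by BitNet b1.58 (illustrative plain Python, not the project's actual training code):

```python
# Sketch of absmean ternary quantization (as in BitNet b1.58):
# scale each weight by the mean absolute value of the tensor,
# round to the nearest integer, then clip into {-1, 0, 1}.
# Illustrative only; not the Llava BitNet implementation.

def ternarize(weights: list[float], eps: float = 1e-8) -> list[int]:
    """Quantize a list of weights to ternary values -1, 0, or 1."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    return [max(-1, min(1, round(w / scale))) for w in weights]

w = [0.9, -0.05, -1.2, 0.3, 0.0, -0.7]
print(ternarize(w))  # every value ends up in {-1, 0, 1}
```

Because each weight needs only ~1.58 bits (log2 of 3 states), matrix multiplies reduce to additions and subtractions, which is where the efficiency gains come from.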
By chaining several models in sequence, this group created an impressive system that generates fully playable 3D game scenes from a single input sketch.