TLDR AI 2024-02-09

Gemini Advanced 🚀, 1X robotics demonstration 🤖, transformer-based decision tree 🌲

Headlines & Launches

Gemini Advanced and a Mobile App (6 minute read)

Google launched Gemini Advanced, a new service that finally lets you interact with Gemini Ultra 1.0. Access to the model requires a monthly subscription. There is also a companion mobile app.

1X robotics demonstration (4 minute read)

1X is a robotics company that has made strides in video-to-control models. The company has showcased its robot performing many tasks, all driven by neural networks that emit 10 Hz control signals from video input.

Research & Innovation

Boosting AI Agents' Gameplay with Task Guidance (19 minute read)

This paper explores the development of a generalist AI agent capable of understanding and following gameplay instructions, a step towards "read-to-play" abilities. The researchers enhance the agent's multitasking and generalization skills by integrating multimodal game instructions into a decision transformer.

Transformer-based decision tree (GitHub Repo)

MetaTree is a transformer-based decision tree algorithm. It learns from classical decision tree algorithms for better generalization capabilities.

FunSearch: Making new discoveries in mathematical sciences using Large Language Models (7 minute read)

FunSearch, a new AI-powered method that combines Large Language Models with evaluative algorithms, has made verifiable discoveries in mathematical sciences, including solutions for the longstanding cap set problem and more efficient algorithms for the bin-packing problem. It takes an evolutionary approach to generating and evaluating code, and its outputs are human-interpretable, representing a significant leap in AI-driven scientific discovery.

Engineering & Resources

HuggingFace lighteval Library (GitHub Repo)

HuggingFace released a lightweight evaluation library for language model training based on HELM and EleutherAI's evaluation harness.

Local RAG Cookbook (GitHub Repo)

You can build a sophisticated and powerful RAG system that runs on your hardware using Ollama, pgvector, and local data.

A Vision-Language Model with Enhanced Visual Reasoning (GitHub Repo)

CogCoM, a new general vision-language model, incorporates a unique Chain of Manipulations mechanism. This allows it to handle multi-turn visual reasoning by actively adjusting input images.

How We Got Fine-Tuning Mistral-7B To Not Suck (5 minute read)

HelixML improved its Mistral-7B fine-tuning by implementing a suite of QA-pair prompts that extract content from many different perspectives and by generating a content-addressed hash for each document.
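
For readers unfamiliar with content addressing, here is a minimal sketch of the idea: a document's ID is derived from its own contents, so identical text always maps to the same ID. This is not HelixML's code. The function name `content_address`, the whitespace normalization, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def content_address(document_text: str) -> str:
    """Return a hex SHA-256 digest of the document text (hypothetical helper).

    Leading and trailing whitespace is stripped first so trivially
    different copies of the same text share one ID (an assumption).
    """
    normalized = document_text.strip().encode("utf-8")
    return hashlib.sha256(normalized).hexdigest()


# Toy documents: the first two are the same text, the third differs.
docs = [
    "Mistral-7B release notes",
    "Mistral-7B release notes",
    "A different document",
]
ids = [content_address(d) for d in docs]

# Identical content yields identical IDs; different content does not.
print(ids[0] == ids[1])
print(ids[0] == ids[2])
```

An ID like this could be attached to each generated QA pair so that answers can be traced back to the exact document they came from.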

Google’s Gemini Advanced: Tasting Notes And Implications (7 minute read)

Google's newly released Gemini Advanced is a GPT-4-class model, with capabilities comparable to OpenAI's GPT-4. Its strengths include explanations and the integration of images and search.

An angel investor comments on AI (2 minute read)

This investor's point of view adds context on where value currently sits in AI by breaking it into three layers: infrastructure, such as cloud providers and chip makers; modeling and core, such as OpenAI and Anthropic; and AI-enhanced products, such as those built by readers who use AI to improve their own products.

Quick Links

AR Glasses With Multimodal AI Nets Funding From Pokemon GO Creator (3 minute read)

Singapore-based Brilliant Labs has introduced Frame, lightweight AR glasses featuring a multimodal AI assistant, Noa, that performs visual processing, image generation, and more with integrated AI models like GPT-4 and Stable Diffusion.

Shortwave mobile assistant & instant summaries (Product Launch)

During its AI Launch Week last week, Shortwave released a mobile email assistant for iOS and Android that can create instant summaries.

Anime Bench (Hugging Face Hub)

A benchmark dataset that contains facts about various anime characters and quotes to assess language model performance.