TLDR AI 2024-05-23

OpenAI & News Corp 🤝, Stanford HAI Transparency Index 📖, Microsoft Phi-Silica 🤖

Headlines & Launches

OpenAI Partners with News Corp (2 minute read)

OpenAI and News Corp (which includes WSJ, NYP, The Times, and other publishers) have agreed to bring News Corp's news content to OpenAI's platform, which the companies claim will increase the usability and accuracy of generations.

Stanford HAI Releases Updated Foundation Model Transparency Index (7 minute read)

Stanford HAI released the latest version of its Foundation Model Transparency Index, which evaluates the transparency of 14 major AI developers, including OpenAI and Google. The companies disclosed information that was previously unavailable to the public, a significant improvement that signals a greater willingness to discuss their models openly. Despite this progress, the average transparency score was just 58 out of 100, highlighting significant gaps in areas like data access, model trustworthiness, and downstream impact.

Microsoft Releases Phi-Silica (1 minute read)

Microsoft has announced the general availability of its Phi-3 models and introduced Phi-Silica, a small language model optimized for the Neural Processing Units in Copilot+ PCs. With 3.3 billion parameters, Phi-Silica offers fast local inference with low power consumption, a significant step toward integrating advanced AI directly into Windows devices to enhance productivity and accessibility. It will be available in June.

Research & Innovation

Aurora Atmospheric Prediction Model (12 minute read)

Microsoft has trained a foundation model for atmospheric prediction that sets a new state of the art on 5- and 10-day global weather prediction tests.

Introducing MathBench: A Comprehensive Math Benchmark for LLMs (24 minute read)

MathBench is a new benchmark designed to provide a thorough assessment of large language models' mathematical abilities.

A Faster Neural Network Architecture (20 minute read)

Researchers have developed Wav-KAN, a neural network framework that uses wavelet functions to improve interpretability and performance. Unlike traditional models, Wav-KAN captures both high- and low-frequency data components, leading to faster training and increased robustness.

Engineering & Resources

Powerful Vision Models on Your Phone (GitHub Repo)

MiniCPM-V has a new version trained on top of Llama 3. This 8B model outperforms many proprietary models on a wide variety of tasks. It can handle 30 different languages and excels at OCR and visual question answering.

Improving Medical AI Accuracy (3 minute read)

MedLFQA is a new benchmark dataset designed to improve the factual accuracy of long-form responses from large language models in the medical field. OLAPH is a framework that trains LLMs to reduce inaccuracies by using automatic evaluations and preference optimization.

Do we really need Mamba for vision? (GitHub Repo)

Mamba is a strong Transformer alternative that has been gaining steam for its ability to use fewer FLOPs while maintaining performance. However, it may not be necessary for some applications. This work shows that a well-tuned CNN baseline outperforms Mamba on a set of vision tasks.

Chaos and tension at OpenAI (2 minute read)

Ilya Sutskever has departed from OpenAI amidst concerns over the company's commitment to AI safety, signaling a potentially worrying trend as three other key personnel have also recently resigned. These departures raise questions about the impact on the company's safety-focused mission and its nonprofit status as it pursues commercialization. These events may also reverberate through legal and regulatory landscapes, prompting scrutiny from stakeholders in Washington.

Tarsier (GitHub Repo)

Reworkd has released Tarsier, a new tool that enhances LLMs for web interaction tasks by visually tagging elements on webpages with brackets and IDs. Tarsier lets an LLM without vision understand a webpage's structure via OCR-generated text representations, an approach that outperforms vision-language models in benchmarks.

The Old-Fashioned Library at the Heart of the A.I. Boom (4 minute read)

OpenAI's headquarters, a renovated mayonnaise factory with a library-themed design, is emblematic of the company's language-focused success with ChatGPT. At the same time, the office serves as a reminder of the ongoing legal debates over the use of copyrighted material in AI training. Despite these disputes, OpenAI staff view the library as a space for inspiration, reinforcing their belief in the synergy between human and AI-driven creativity.

Quick Links

Mistral 7B Instruct V3 (Hugging Face Hub)

Mistral has released the next iteration of its 7B model, which has extended context length and improved performance.

Suno AI raises $125m (4 minute read)

Suno, a music generation platform, has raised $125 million to continue building a future where anyone can create music.

The ChatGPT desktop app is more helpful than I expected (4 minute read)

OpenAI has launched a ChatGPT desktop app for macOS, giving Plus subscribers immediate access, with plans for a broader rollout.