TLDR AI 2024-05-17

OpenAI & Reddit partnership 🤝, Hugging Face $10M compute 💰, Llama 3 in NumPy 💻

Headlines & Launches

OpenAI Partners With Reddit To Integrate Unique User-Generated Content Into ChatGPT (3 minute read)

Reddit and OpenAI have announced a strategic partnership that aims to revolutionize user interactions with online communities through AI-powered features. OpenAI will gain access to Reddit's Data API, enhancing ChatGPT with real-time, relevant information from Reddit's vast content.

Hugging Face Is Sharing $10 Million Worth Of Compute (3 minute read)

Hugging Face is committing $10 million in free shared GPUs to help developers, academics, and startups create new AI technologies, aiming to counteract the centralization of AI advancements dominated by tech giants.

Meta is reportedly working on camera-equipped AI earphones (2 minute read)

Meta is reportedly working on AI-powered earphones equipped with cameras. Internally codenamed 'Camerabuds', the earphones will leverage AI capabilities for real-time object identification and foreign language translation. Meta's leadership sees AI-powered earphones as the next logical step in the evolution of wearable technology. It has partnered with Kansas-based electronics company Ear Micro to explore the possibilities of this emerging technology.

Research & Innovation

Pose Estimation with Dense Correspondence (16 minute read)

A new approach to 6DoF pose estimation that uses single RGB-D images has been developed. It focuses on dense correspondence instead of traditional keypoint-based methods.

Cursor's instant full file edits with speculative editing (12 minute read)

Cursor's team used a custom Llama 3 70B model with a speculative prior to edit files at 1,000 tokens per second, giving near-instant rewrites. The model rewrites the full file rather than emitting diffs, aided by some clever output formatting.
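
For readers who want intuition for the speed-up, here is a toy sketch of speculative editing. It assumes the "speculative prior" is the unedited file itself, used as a draft that the model verifies in chunks. This is not Cursor's implementation: `next_token` is a hypothetical stand-in for the model, and the `resync` heuristic is an illustrative assumption.

```python
from typing import Callable, List, Optional, Tuple

Token = str
# Hypothetical model interface: next token for a prefix, or None at end of output.
NextToken = Callable[[List[Token]], Optional[Token]]


def resync(original: List[Token], token: Token, pos: int, window: int) -> int:
    """Heuristic (an assumption): if the model's token appears shortly ahead
    in the original, resume drafting right after it; otherwise stay put."""
    for j in range(pos, min(pos + window, len(original))):
        if original[j] == token:
            return j + 1
    return pos


def speculative_edit(original: List[Token], next_token: NextToken,
                     draft_len: int = 8) -> Tuple[List[Token], int]:
    out: List[Token] = []
    pos = 0      # where the next draft starts in the original file
    passes = 0   # simulated verification passes
    while True:
        passes += 1
        draft = original[pos:pos + draft_len]
        # Accept the longest draft prefix the model agrees with. A real system
        # would score all draft tokens in one batched forward pass.
        accepted = 0
        while (accepted < len(draft)
               and next_token(out + draft[:accepted]) == draft[accepted]):
            accepted += 1
        out.extend(draft[:accepted])
        pos += accepted
        # The same pass also yields the model's own token after the accepted prefix.
        token = next_token(out)
        if token is None:
            return out, passes
        out.append(token)
        pos = resync(original, token, pos, draft_len)


original = "def add ( a , b ) : return a + b".split()
target = "def add ( a , b , c ) : return a + b + c".split()


def oracle(prefix: List[Token]) -> Optional[Token]:
    # Stand-in for the model: it simply "knows" the edited file.
    return target[len(prefix)] if len(prefix) < len(target) else None


edited, passes = speculative_edit(original, oracle)
print(edited == target, passes, len(target))
```

Every emitted token is either verified or produced by the model, so the draft only affects speed, never the result. In this toy run the edit completes in fewer passes than there are output tokens because long unchanged stretches are accepted in one go.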

Improvements to Data Analysis (8 minute read)

OpenAI has launched several quality improvements to how GPT-4o reads and analyzes data in ChatGPT. Users can now view tables, see updates, visualize changes, and run general data analysis more quickly than before.

Engineering & Resources

Blind Image Super-Resolution (GitHub Repo)

CDFormer is a transformative approach to blind image super-resolution that integrates content and degradation understanding through a novel diffusion-based module.

A Lightweight Vision-Language Model (GitHub Repo)

Xmodel-VLM is a vision-language model optimized for consumer GPU servers. To address the high service costs that limit adoption of large-scale multimodal systems, this 1B-scale model uses the LLaVA paradigm for modal alignment.

Llama 3 in NumPy (GitHub Repo)

A pure NumPy implementation of Llama 3 models.
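
To give a flavor of what "pure NumPy" looks like, here is a minimal sketch of two building blocks used in Llama-style models: RMSNorm and a SwiGLU feed-forward layer. It is not taken from the repo. The function names, toy dimensions, and random weights are illustrative assumptions.

```python
import numpy as np


def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    # Divide by the root-mean-square over the hidden axis, then apply a learned scale.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight


def silu(x: np.ndarray) -> np.ndarray:
    # SiLU (swish) activation: x * sigmoid(x).
    return x / (1.0 + np.exp(-x))


def swiglu_ffn(x: np.ndarray, w_gate: np.ndarray, w_up: np.ndarray,
               w_down: np.ndarray) -> np.ndarray:
    # Gated feed-forward: the SiLU-activated gate multiplies the up-projection.
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down


# Toy sizes for illustration only (real models are far larger).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))            # (sequence, hidden)
h = rms_norm(x, np.ones(16))
y = swiglu_ffn(
    h,
    rng.standard_normal((16, 32)),          # gate projection
    rng.standard_normal((16, 32)),          # up projection
    rng.standard_normal((32, 16)),          # down projection
)
print(y.shape)
```

A full model would add rotary position embeddings, grouped-query attention, and a KV cache on top of blocks like these.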

ChatGPT-4o Vs Math (10 minute read)

This article tests GPT-4o's ability to solve a math problem involving a roll of tape, finding that text-only prompts with zero-shot Chain-of-Thought prompt engineering yield the most consistent and accurate results. Despite its capabilities, GPT-4o still struggles with understanding finer details in images.
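
Zero-shot Chain-of-Thought simply appends a reasoning cue to the question, with no worked examples. A minimal sketch follows, in which the helper name and the sample question are placeholders rather than the article's actual prompt:

```python
def zero_shot_cot_prompt(question: str) -> str:
    # Append the standard zero-shot CoT cue; no few-shot examples are included.
    return f"Q: {question.strip()}\nA: Let's think step by step."


# Placeholder question for illustration, not the problem from the article.
prompt = zero_shot_cot_prompt("What is 17 * 24?")
print(prompt)
```

The resulting string would be sent as a plain text-only message to whichever model is being tested.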

Ask HN: Disillusioned after AI? (Hacker News Thread)

In this thread, developers discuss how they feel about AI technology and its impact on their careers. Some feel disillusioned by recent advances, worrying that whatever they build will be gobbled up by a big tech company. Others argue that AI is de-democratizing tech, with a handful of giga-scale companies holding all the cards. Commenters also note that the field is still in its hype phase: the technology is impressive, but there is still a lot it cannot do.

OpenAI Rules The Changes But Meta Changes The Rules (7 minute read)

Meta's release of the Llama 3 AI models, along with a 405B-parameter version still in training, is transforming the AI landscape by making advanced models publicly accessible, challenging OpenAI's business model based on private, paid access. This strategic move by Meta leverages open-source innovation to potentially redefine industry standards and enhance its position in a market dominated by monetized AI services.

Quick Links

3D Scene Completion for Autonomous Vehicles (GitHub Repo)

Researchers have developed a new technique to fill in the gaps in 3D LiDAR scans, allowing autonomous vehicles to better understand their surroundings.

Stable Artisan (Product)

A multimodal generative AI Discord bot that utilizes the products on the Stability AI Platform API within the Discord ecosystem.

Personalized Image Generation (3 minute read)

MasterWeaver is a new method to improve personalized text-to-image generation models.