TLDR AI 2023-11-13
Google to invest in Character.AI, FigJam AI, adversarial attacks on language models
Google In Talks To Invest Hundreds Of Millions Into Character.AI (2 minute read)
Google is in talks to deepen its relationship with Character.AI by investing hundreds of millions of dollars.
Zapier AI Actions (2 minute read)
Zapier introduced AI Actions, a tool for developers to let any AI platform run Zapier's 20,000+ automation actions. AI Actions works by letting users send natural language commands to the AI platform, which then performs the desired action. The service supports several AI platforms, with simple setup and built-in API integrations.
Introducing AI to FigJam (3 minute read)
Figma has incorporated AI assistance into FigJam, its digital whiteboard tool, to simplify and enhance design collaboration. Utility-oriented enhancements, like those derived from the AI-powered project Jambot, help users collaborate more effectively on a virtual canvas. Figma's goal is to broaden FigJam's applicability across various user requirements by tapping machine learning capabilities for visual design.
Research & Innovation
Enhancing Audio-Visual Models with a New Attention (GitHub Repo)
This project introduces the Dual-Guided Spatial-Channel-Temporal (DG-SCT) attention mechanism that enhances pre-trained audio-visual models for multi-modal tasks.
Deep Dive: Adversarial attacks on language models (24 minute read)
This blog post surveys the attacks that are emerging against language model systems. It covers the different attack types in detail, along with some mitigations that teams have found to be effective.
Stylization of 3D Meshes Using 2D Diffusion Models (16 minute read)
This research presents the 3DStyle-Diffusion model, a novel approach for detailed stylization of 3D meshes, integrating 2D Diffusion models for added control over appearance and geometry. It works by first parameterizing a 3D mesh's texture into reflectance and lighting, using implicit MLP networks, and then using a pre-trained 2D Diffusion model to align the rendered images with the text prompt and ensure geometric consistency.
Engineering & Research
Building trustworthy AI through systematic testing (Sponsor)
Audio super resolution (GitHub Repo)
Audio super-resolution is the process of increasing the quality and fidelity of any audio, real or synthetic. Most super-resolution systems are task-specific, with single models trained for single audio data types (e.g., speech vs. music). This new work is a step forward: a single model increases audio quality across tasks.
Toolkit for web agents (GitHub Repo)
With the advent of powerful new vision models, many groups are attempting to build agents that use vision to interact with web elements. The Tarsier toolkit introduces a standard set of tools (e.g., element tagging), so you can use any vision system to understand the web page and take action. It also includes utilities that let non-vision language models browse.
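To give a rough feel for what element tagging can mean, here is a minimal sketch that assigns a numeric ID to each interactive element on a page so a model can refer to elements by number. It is not Tarsier's API: the names `ElementTagger`, `tag_elements`, `describe`, and the `INTERACTIVE_TAGS` list are all hypothetical, and it works on raw HTML with Python's standard library rather than on a live browser page.

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not Tarsier's list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class ElementTagger(HTMLParser):
    """Assigns a numeric ID to each interactive element, in document order."""

    def __init__(self):
        super().__init__()
        self.tagged = {}  # numeric ID -> (tag name, attribute dict)

    def handle_starttag(self, tag, attrs):
        if tag in INTERACTIVE_TAGS:
            self.tagged[len(self.tagged)] = (tag, dict(attrs))


def tag_elements(html):
    """Return a mapping from numeric ID to (tag, attributes)."""
    tagger = ElementTagger()
    tagger.feed(html)
    tagger.close()
    return tagger.tagged


def describe(tagged):
    """Render the mapping as text lines that a language model could read."""
    return [f"[{i}] <{tag}> {attrs}" for i, (tag, attrs) in tagged.items()]


if __name__ == "__main__":
    page = '<div><a href="/docs">Docs</a><input name="q"><button>Go</button></div>'
    for line in describe(tag_elements(page)):
        print(line)
```

An agent built this way could answer with something like "click 2" and the caller would look up ID 2 in the mapping. A real toolkit would more likely tag elements in the rendered page itself so the IDs also appear in screenshots.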
HuggingFace alignment handbook (12 minute read)
With the recent release of the excellent Zephyr language model, the HuggingFace team showcases how you can train personalized models built on top of the few powerful pre-trained open source models available.