Fly simulations, Nemotron 3, Gemini Embedding, MatAnyone, WorldFM, FishAudio: AI NEWS

Welcome to the AI Search newsletter. Here are the top highlights in AI this week.

Mar 15, 2026

Gemini Embedding 2 is a new multimodal embedding model that maps text, images, video, and audio into a single embedding space. Because every input type lands in the same space, content in one modality can be compared directly with content in another, which is useful for applications such as search and recommendation systems. Read more
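
For readers who want to see why a single embedding space matters for search, here is a minimal sketch of cross-modal retrieval using cosine similarity. It does not call the Gemini API: the file names and the three-number vectors are made up for illustration, and a real model would return much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors standing in for model output.
# In a shared embedding space, an image, a video, and an audio clip
# are all represented by vectors of the same length.
index = {
    "photo_of_dog.jpg": [0.9, 0.1, 0.0],
    "clip_of_traffic.mp4": [0.1, 0.9, 0.1],
    "birdsong.wav": [0.0, 0.2, 0.9],
}


def search(query_vector, index, top_k=1):
    """Rank indexed items by similarity to the query, most similar first."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Assumed embedding of a text query such as "a dog" (also made up).
text_query = [0.8, 0.2, 0.1]
print(search(text_query, index))
```

The same search function works whether the query is text, an image, or audio, because everything is compared as vectors in one space.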

Holi-Spatial is a new model for holistic spatial intelligence that can generate 3D scenes and videos. It uses a combination of 2D and 3D techniques to generate consistent and realistic scenes. The model can be used for a variety of applications, including video game development and architectural visualization. Read more

MatAnyone 2 is a new tool that cuts moving objects out of videos and improves its results by judging the quality of its own work. It uses a special "evaluator" to check whether the edges of an object are blurry or sharp, which helps it learn from millions of real-world video frames. This makes it much better at handling long videos and complex backgrounds than the previous version. Read more

NVIDIA’s Nemotron-3 Super is a new AI "brain" designed to help robots and digital agents reason through difficult tasks. It uses a hybrid design that combines two different types of AI architectures to be both fast and strong at problem solving. It is specifically built for "agentic reasoning," which means it is good at following multi-step plans. Read more

EffectMaker is a new model that generates realistic, consistent visual effects. It uses a combination of deep learning and computer vision techniques to produce high-quality results. The model can be used for a variety of applications, including film and video production. Read more

ComfyUI has introduced a new "App Mode" that lets creators turn their complex AI workflows into simple, user-friendly apps. Instead of seeing a messy web of wires and nodes, a user only sees a few simple buttons and sliders to generate their art. This makes it easy for anyone to use advanced AI tools without needing to be a technical expert. Read more

We’re partnering with NVIDIA to give away a GeForce RTX 5090 GPU! Simply register and attend any session at their upcoming GTC event, which runs from March 16 to 19, 2026. Free online sessions are available. Enter here

RL3DEdit is a smart system that lets you edit 3D scenes using simple text instructions while keeping everything looking realistic. It uses “reinforcement learning” to practice making changes that stay consistent from every angle, so a chair doesn’t look different when you walk around it. It is over 20 times faster than older methods, making 3D editing much more practical. Read more

InSpatio-WorldFM is an AI that turns a single photo into a 3D world you can actually walk through in real time. Unlike traditional video models, which can be slow and “hallucinate” weird details, this model uses 3D “anchors” to make sure the room stays exactly the same as you move the camera. It is designed to run fast even on normal home gaming computers. Read more

Fish Audio has open-sourced S2, a powerful model that can mimic almost any voice with just a few seconds of audio. It is designed to be highly realistic, capturing the unique emotions and tone of a speaker’s voice. By making it open-source, they are allowing developers to build new tools for gaming, translation, and accessibility. Read more

The CL1 biological computer has successfully trained 200,000 living human neurons to play the classic video game Doom. This "wetware" system works by turning game data into electrical pulses that stimulate the neurons, which then fire back to control movement and shooting. While the neurons currently play like a total beginner, they are showing clear signs of real-time learning and goal-directed behavior.

TADA is an open-source AI from Hume that can generate text and speech at the exact same time. Because it processes text and audio together, it avoids the "stuttering" or "hallucinating" errors where an AI says something different from what it writes. It is much faster than traditional systems and can run directly on a laptop. Read more

Researchers have built a digital simulation of the fly brain that can run on a laptop and accurately predict real insect behavior. By plugging this digital "brain" into a virtual fly body, the simulation correctly reacts to things like the taste of sugar or dirt on its antennae. This project, known as Eon, aims to eventually scale up to simulate more complex brains, like those of mice and humans. Read more

Mobile-GS is a new model for mobile graphics synthesis that can generate realistic graphics on mobile devices. It uses a combination of deep learning and graphics techniques to generate high-quality graphics. The model can be used for a variety of applications, including mobile gaming and virtual reality. Read more

AI Search

Also check out our YouTube channel for more AI news and reviews!
