New DeepSeek, image to realtime game, Seed-OSS, NVIDIA Nano2, Google AI mode, InfiniteTalk, Qoder
Welcome to the AI Search newsletter. Here are the top highlights in AI this week.
Mirage 2 is an AI-powered game engine that generates interactive worlds in real time. You upload an image, describe the game you want in text, and start playing right away. Read more
InfiniteTalk is a new AI model that creates realistic videos of people talking and moving in sync with audio, while keeping their identity, background, and camera movements intact. Unlike traditional video dubbing, which only edits the mouth region, it animates the whole person, and it can turn a single image into a long animation. Read more
RynnEC is a video multimodal large language model designed for embodied cognition tasks. Its applications include general object recognition and understanding, spatial understanding with 3D awareness, and video object segmentation driven by text instructions. Read more
Prefer watching to reading? Check out this video covering all the highlights in AI this week.
DeepSeek-V3.1 is a new AI model that introduces hybrid inference: a single model that can run in two modes, Think and Non-Think. It is faster and more efficient than its predecessor, DeepSeek-R1-0528, and has stronger agent skills for tasks like tool use and multi-step problem-solving. The release also includes updates to the API, tools, and pricing. Read more
Qwen-Image-Edit is a powerful image editing model that can perform semantic and appearance editing, as well as precise text editing, with state-of-the-art performance. It can be used to edit images in various ways, such as changing the color of objects, adding or removing elements, and modifying text, while preserving the original visual semantics. The model is available on the Hugging Face platform and can be used through the Qwen Chat interface. Read more
Google's AI Mode in Search is getting smarter by doing tasks for you, like finding restaurant reservations with your exact preferences, and it’s now available in 180 more countries. This new feature searches multiple platforms and presents real-time options so you can easily book without extra work. It will soon also help with service appointments and event tickets, making online searching more helpful and personalized. Read more
To celebrate reaching 500K subscribers on YouTube, we are giving away a DJI Mini 4 Pro! This small, versatile, powerful drone has a high-resolution 48MP sensor and can capture 4K/60fps HDR video. Enter for FREE!
ByteDance released the Seed-OSS family of open-source large language models, including a 36-billion-parameter model that can handle very long texts of up to 512,000 tokens, twice as long as OpenAI’s GPT-5. These models are designed for reasoning, coding, and multilingual tasks, and come with flexible features like adjustable reasoning length to improve efficiency. They are free to use and modify under the Apache-2.0 license. Read more
NVIDIA released the Nemotron Nano 2 family of AI models, designed for reasoning tasks and up to 6 times faster than similar-sized models without losing accuracy. The models can handle very long text inputs (up to 128,000 tokens) on a single midrange GPU, making them well suited to complex tasks like math, coding, and multilingual challenges. NVIDIA is also sharing most of the training data and model details openly for developers to use and improve. Read more
GeoSAM2 is a new framework for 3D part segmentation that allows users to control the segmentation process using simple 2D prompts, such as clicks or boxes. This approach enables fine-grained, part-specific control without requiring text prompts or full 3D labels, and achieves state-of-the-art performance on several benchmarks. GeoSAM2 has the potential to unlock controllability and precision in mesh-level part understanding. Read more
With Monica, you can use the top AI models, image generators, and video generators, all in one integrated platform. Use code AISEARCH10 to get 25% OFF 'Unlimited Annual Plan' within 24h of registration, or enjoy 10% OFF. Try it for free today!
Alibaba launched Qoder, an AI-powered coding platform that helps developers write, test, and manage software more efficiently using advanced context understanding and natural language commands. It offers two modes: Agent Mode for smart pair programming and Quest Mode for AI-driven, fully automated coding based on detailed specifications. Qoder also features long-term memory, automated documentation, and integration with popular AI models like Claude and GPT to boost coding productivity. Read more
TINKER is a new 3D editing framework that can edit 3D scenes from just one or a few images, without needing to optimize each scene individually. It does this by building on pre-trained diffusion models, which let it generate high-quality, multi-view consistent edits. TINKER also supports video reconstruction and can enhance the quality of 3D graphics. Read more