Grok 4, realtime AI videos, open-source robots, Kimi K2, Moonvalley, Phi-4
Welcome to the AI Search newsletter. Here are the top highlights in AI this week.
xAI launched Grok 4 and Grok 4 Heavy, which are now among the most advanced AI models available. Grok 4 scored nearly twice as high as the next-best model on tough reasoning tests, and Grok 4 Heavy uses multiple agents working together to solve problems, like a digital study group. Grok 4 Heavy requires the top-tier subscription at $300/month, which also gives early access to new features.
Moonvalley released Marey, a video AI model for professional filmmakers that is trained only on licensed, high-quality footage. Marey gives users precise control over things like camera movement and scene composition. Because of its licensed training data, it is designed to avoid copyright issues in commercial projects. It is available through a subscription.
Hugging Face launched SmolLM3, a small but powerful 3B-parameter language model. SmolLM3 beats other 3B models like Llama-3.2-3B and Qwen2.5-3B, and can even compete with some larger 4B models, all while being fully open-source and efficient. It supports long context and multilingual tasks, and is well suited to both research and real-world use.
Microsoft released Phi-4-mini-flash-reasoning, a fast, small AI model for logical reasoning. It offers up to 10 times higher throughput and 2-3 times lower latency than its predecessor without losing reasoning ability, making it well suited to resource-constrained hardware such as phones and edge devices. It uses a new hybrid architecture to boost speed and efficiency.
Kimi K2 is a mixture-of-experts (MoE) language model developed by Moonshot AI, with 1 trillion total parameters, of which 32 billion are activated. It performs strongly on coding, reasoning, and tool use, and is optimized for agentic capabilities. Kimi K2 is available for deployment and through an API, with examples and integration guides provided.
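As a rough illustration of what those parameter counts mean, the sketch below computes the share of Kimi K2's weights that are activated at once, using only the two figures quoted above. The function name is ours, chosen for illustration; it is not part of any Moonshot AI tooling.

```python
def active_fraction(activated_params: float, total_params: float) -> float:
    """Return the share of a mixture-of-experts model's parameters that are activated."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return activated_params / total_params


# Figures from the blurb above: 32 billion activated, 1 trillion total.
kimi_k2_share = active_fraction(32e9, 1e12)
print(f"{kimi_k2_share:.1%} of parameters activated")
```

In other words, only a small slice of the full model does the work at any given step, which is the main appeal of the MoE design.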
Perplexity introduced Comet, a web browser powered by an AI assistant that can handle tasks for you. Comet can schedule meetings, send emails, and automate workflows, making browsing and productivity much smarter and more efficient.
Hailuo 02 is a new state-of-the-art video model that excels at prompt understanding, physics, camera control, and coherence. It's based on a new architecture called Noise-aware Compute Redistribution (NCR), which boosts training and inference efficiency by 2.5x, allowing 1080p video generation at a competitive cost. Hailuo 02 is rated among the top video models in the world.
Hugging Face launched Reachy Mini, a small, open-source robot made for easy coding and creative projects. It’s fully programmable in Python (and soon JavaScript and Scratch), making it a good fit for learning robotics, experimenting with AI, or building interactive gadgets. Prices start at $299.
StreamDiT is a new AI model that generates high-quality video from text prompts in real time. Its streaming generation approach runs at 16 FPS on a single GPU and can produce continuous video streams with no length limit, opening up applications such as live content creation, gaming, and virtual reality.
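To put the 16 FPS figure in perspective, the sketch below converts a frame rate into the time available to produce each frame if the stream is to keep pace with playback. This is plain arithmetic on the number quoted above, not a description of how StreamDiT schedules its work, and the function name is hypothetical.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 16 FPS is the real-time rate quoted in the blurb above.
budget = frame_budget_ms(16)
print(f"{budget} ms per frame")  # prints "62.5 ms per frame"
```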
OmniPart is a new AI framework that generates 3D objects with explicit, editable part structures, making it easier to create interactive 3D content. It uses a two-stage process to achieve high-quality 3D generation, first planning the part structure and then generating the 3D parts. This approach enables a range of downstream applications, including animation, material editing, and geometry processing.
JarvisArt is an AI-powered photo retouching tool that turns user instructions into professional-grade image enhancements. It uses natural language processing to understand user intent, then autonomously applies the adjustments in Adobe Lightroom to achieve the desired look.
ThinkSound is a new AI framework that uses Chain-of-Thought reasoning to generate high-quality audio from videos, supporting step-by-step, interactive audio generation and editing. It outperforms existing video-to-audio models and achieves state-of-the-art results on both audio metrics and Chain-of-Thought metrics.
OmniVCus is an AI tool that customizes videos by combining different input signals, such as images, text, and depth sequences. It handles tasks including single-subject video customization, instructive editing, and even zero-shot multi-subject video customization. The model can also compose different control conditions to customize videos in challenging cases.
Researchers have developed a deep-learning system that teaches soft, bio-inspired robots to move using only a single camera. The system uses a deep neural network to reconstruct the shape and range of mobility of a robot from a single image, allowing it to achieve high-precision control. This approach eliminates the need for extensive sensors or custom models, making it a more practical solution for controlling soft robots.