OpenAI AgentKit, 3D modeling for blind, ChatGPT apps, Google Opal, Samsung TRM & more AI NEWS
Welcome to the AI Search newsletter. Here are the top highlights in AI this week.
OpenAI launched the ability for apps to run directly inside ChatGPT, allowing users to do things like book travel, create playlists, or design graphics without leaving the chat. Apps from companies like Spotify, Figma, Zillow, and Coursera are integrated, making ChatGPT more interactive and useful by blending conversation with app functions.
OpenAI introduced AgentKit, a toolkit designed to make building and deploying AI agents easier and faster for developers and enterprises. AgentKit provides visual tools like Agent Builder to create workflows without heavy coding, plus ChatKit for embedding chat interfaces in other apps. This platform helps automate complex tasks and optimize agent performance, targeting enterprise use cases.
A new AI tool called A11yShape has been developed to help blind and low-vision programmers create and refine 3D models independently. This tool combines a code-based 3D modeling editor with a large language model to provide plain-language descriptions of the model, allowing users to understand and edit their designs without relying on sight. A11yShape has been tested with positive results, but still has some limitations that need to be addressed.
Ling-1T is a powerful AI model with 1 trillion total parameters, designed for efficient reasoning and scalable cognition, achieving SOTA performance on multiple complex reasoning benchmarks. It excels in visual reasoning, front-end code generation, and emergent intelligence, making it a strong foundation for general, collaborative human-AI intelligence. Ling-1T is also highly efficient, with a 15%+ end-to-end speedup and improved memory efficiency.
Apriel-1.5-15b-Thinker is a new AI model, fine-tuned for advanced reasoning and general AI tasks. This model showcases strong capabilities in understanding and creating complex content, even outperforming SOTA models that are many times larger. The model is also highly memory-efficient, fitting on a single GPU.
Paper2Video is an AI system that automatically turns scientific papers into presentation videos with slides, speech, and talking-head animation. It tackles the challenge of creating research presentation videos by coordinating multiple content types and uses a multi-agent framework to automate the whole process efficiently. This technology aims to save researchers hours of work and improve how research is shared visually.
MIMIX is a video generation model that mixes multiple characters into a single video, allowing creative combinations in generated content. It focuses on combining different visual elements smoothly to produce realistic multi-character scenes. This makes it easier to generate complex, natural-looking videos that involve several characters.
Lumina-DiMOO is a powerful open-source image generator and editor. It uses a special diffusion method that makes generation faster and higher in quality than older models, and it also delivers strong results on image editing and image understanding. Lumina-DiMOO outperforms many other models in benchmarks for various multimodal tasks.
Google expanded access to Opal, a no-code AI mini-app builder that lets anyone create AI-powered apps just by using natural language. It now supports 15 more countries and includes new debugging tools and faster app performance, making it easier to build and fix complex workflows without coding. Opal is great for automating tasks, marketing, or creative projects simply by describing what you want.
ChronoEdit by NVIDIA is an AI image editor that adds a sense of time and physical consistency by treating editing as a video generation problem. It “reasons” about how edits should unfold over time, making sure changes look natural and realistic, especially for tasks related to physical AI like robotics or autonomous cars. The model shows how it thinks through edits step-by-step to create physically plausible results.
Tiny Recursive Models (TRM) is a small but smart AI approach that uses recursive reasoning to solve hard problems without needing huge models. TRM uses a tiny neural network to refine its answer over repeated passes, showing that smaller models can compete with big, expensive ones on tasks like the ARC-AGI challenge. This method handles complex reasoning with less computing power, offering a fresh direction for AI research.
Large language models (LLMs) can be compromised by as few as 250 malicious documents in their training data, regardless of the model’s size or the amount of clean data used. This means that even the largest and most advanced AI models can be vulnerable to data poisoning attacks, which can install secret backdoors and make the AI perform harmful actions. Researchers are calling for improved security measures to mitigate this risk.