Stable Hair Swapper • New image generator beats Midjourney • Robot adapts to injury • AI news this week
Welcome to the AI Search newsletter. Here are the top highlights in AI this week.
Researchers have developed a visual-linguistic framework that enables robots to grasp objects they've never seen before. The framework, called OVGNet, combines computer vision and natural language processing so a robot can recognize and pick up objects it was never explicitly trained on. Trained on a large dataset of images and language descriptions, the system generalizes to novel objects and scenarios. Read more
Flux is a new state-of-the-art text-to-image generation model developed by Black Forest Labs, a company founded by the original creators of Stable Diffusion. Early demonstrations suggest that Flux's output quality rivals or even surpasses that of popular closed-source models like Midjourney v6.0 and DALL-E 3. Read more
Google’s Gemini 1.5 Flash is now more accessible and affordable than ever: its price has been cut by up to 85%. The cheaper model makes it easier for businesses to build advanced AI applications and reach a wider global audience. Read more
Turbotype is a Chrome extension designed to boost productivity and save time by letting you type faster with customizable shortcuts. Easily create, save, and use keyboard shortcuts for frequently used phrases. It’s free forever - try it out today!
Virtual try-on just got easier and more efficient with CatVTON, a new diffusion model that achieves high-quality try-on results by simply concatenating the garment and person inputs. This design removes the need for additional network modules and sharply reduces the number of trainable parameters, making the model lightweight and efficient. Despite having fewer prerequisites and trainable parameters than baseline methods, CatVTON delivers superior qualitative and quantitative results. Read more
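For a sense of what "try-on by concatenation" means in practice, here is a minimal, hypothetical PyTorch sketch, not CatVTON's actual code: the toy denoiser, tensor shapes, and masking convention are all assumptions, but it shows the core idea of joining a person latent and a garment latent along the spatial axis and passing them through a single denoiser.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained diffusion UNet (assumption: a standard
# denoiser is reused as-is; this tiny conv net is just a placeholder).
class ToyDenoiser(nn.Module):
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

# Latent representations of the person (try-on region masked) and of the
# garment: batch x channels x height x width.
person_latent = torch.randn(1, 4, 64, 48)
garment_latent = torch.randn(1, 4, 64, 48)

# Key idea: no separate garment-encoder branch. The two latents are simply
# concatenated along the width dimension so the denoiser's own layers can
# relate person and garment features directly.
joint_latent = torch.cat([person_latent, garment_latent], dim=-1)  # 1 x 4 x 64 x 96

denoiser = ToyDenoiser()
noise_pred = denoiser(joint_latent)

# Only the person half of the output is kept as the try-on prediction.
person_pred = noise_pred[..., : person_latent.shape[-1]]
print(person_pred.shape)  # torch.Size([1, 4, 64, 48])
```

Because the concatenated input flows through one shared denoiser, no extra garment-encoding network has to be added or trained, which is where the parameter savings come from.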
Figure, an OpenAI-backed startup, is teasing its new humanoid robot, Figure 02, designed to work across industries such as manufacturing, logistics, and retail. The robot is intended to boost productivity and safety by taking on unsafe and undesirable jobs, contributing to a more automated and efficient future.
MIT researchers have developed a new approach, called RialTo, that trains robots in simulated digital twins of scanned real-world environments such as homes, letting them learn robust policies for specific tasks before those policies are deployed in the real world. The method eliminates the need for extensive reward engineering and lets robots learn quickly and efficiently. Read more
ChatLLM by Abacus AI is an integrated platform that lets enterprises use multiple LLMs, deploy custom agents, and collaborate with team members. Choose from state-of-the-art LLMs such as GPT-4o, Claude 3 Opus, and their new open-source model Smaug. Try it for free today!
Most large language models (LLMs) tend to lean left when asked politically charged questions, a new study finds. The study tested 24 LLMs, including popular ones like OpenAI's GPT-3.5 and Google's Gemini, and found that most produced responses classified as left-of-center by political orientation tests. It also found that LLMs can be fine-tuned to shift their political preferences, though the reasons behind their initial leanings remain unclear. Read more
Robots can adapt to injuries the way animals do, thanks to machine learning and bio-inspired design. Researchers at Caltech have developed a robotic flapper that swims efficiently and adapts to damage by changing its stroke mechanics, much as fish and insects compensate for injuries. This adaptability could make autonomous underwater vehicles (AUVs) and micro air vehicles (MAVs) more robust, allowing them to complete missions even when damaged. Read more
Stable-Hair is a novel hairstyle transfer method that can robustly transfer a diverse range of real-world hairstyles onto user-provided faces for virtual hair try-on. It uses a two-stage pipeline: the first stage removes the hair from the user's photo to produce a bald proxy image, and the second stage transfers the target hairstyle onto that bald image with high detail and fidelity. The approach achieves state-of-the-art results among existing hair transfer methods. Read more
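To make the two-stage flow concrete, here is a tiny, hypothetical Python sketch, not the authors' code: both function bodies are trivial placeholders that exist only to illustrate the data flow described above.

```python
from PIL import Image

def remove_hair(face: Image.Image) -> Image.Image:
    # Hypothetical stand-in for stage 1, which converts the user's photo into a
    # "bald" proxy image. Returning a copy keeps this sketch runnable end to end.
    return face.copy()

def transfer_hairstyle(bald: Image.Image, reference: Image.Image) -> Image.Image:
    # Hypothetical stand-in for stage 2, which paints the reference hairstyle
    # onto the bald proxy. A simple blend is used here purely as a placeholder.
    return Image.blend(bald, reference.resize(bald.size), alpha=0.5)

# Dummy inputs so the example is self-contained
# (real use: a user photo plus a photo of the target hairstyle).
user_face = Image.new("RGB", (512, 512), "beige")
target_hairstyle = Image.new("RGB", (512, 512), "brown")

bald_proxy = remove_hair(user_face)                        # stage 1: strip existing hair
try_on = transfer_hairstyle(bald_proxy, target_hairstyle)  # stage 2: apply target hairstyle
try_on.save("virtual_try_on.png")
```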