OpenAI’s voice cloning AI model only needs a 15-second sample to work (3 minute read)
OpenAI’s Voice Engine can create synthetic voices based on a 15-second clip of someone’s voice. The text-to-voice generation platform is currently in limited access. Developers testing the platform are required to get explicit and informed consent from speakers, not build ways for individual users to create their own voices, and disclose to listeners that the voices are AI-generated. Example clips generated by Voice Engine are available in the article.
xAI announced its next model, with 128k context length and improved reasoning capabilities. It excels at retrieval and programming.
Apple Researchers Boast On-Device Model That Outperforms GPT-4 (3 minute read)
Apple’s AI researchers have developed a new system called ReALM that improves Siri’s ability to understand context by considering on-screen, conversational, and background entities. It outperforms ChatGPT 4.0 in benchmarks.
DALL-E now lets you edit images in ChatGPT (2 minute read)
OpenAI’s DALL-E now offers image editing tools both on the web and on mobile. There are preset style suggestions to help inspire image creation. The image generation platform has been integrated with ChatGPT - users can now edit DALL-E images in ChatGPT across web, iOS, and Android. Videos from OpenAI showing off the new features are available in the article.
Researchers have developed an AI network where one AI can teach another to perform tasks using natural language processing, a capability not previously demonstrated. The system uses a model called S-Bert that allows AI to perform tasks given via instructions and then communicate that knowledge to another AI. This breakthrough has potential applications in robotics and could further understanding of human cognitive functions.
Elon Musk Boosts AI Engineer Pay in ‘Craziest Talent War’ (2 minute read)
Tesla is raising its compensation for artificial intelligence engineers to ward off poaching from other companies. Elon Musk says that the competition for AI engineers is the craziest talent war he has ever seen. Tech companies are offering million-dollar-a-year compensation packages and accelerated stock-vesting schedules. Layoffs are continuing in other areas of tech.
OpenAI Expands Its Custom Model Training Program (2 minute read)
OpenAI is expanding its Custom Model program with assisted fine-tuning and custom-trained models to help enterprise customers develop tailored generative AI models for specific use cases.
Google AI is helping to predict floods better, which could save lots of people and money around the world every year. Utilizing machine learning, particularly LSTM models, Google AI has significantly improved the accuracy and extended the reliability of flood predictions, especially in data-scarce regions like Africa and Asia. It advocates for open-source collaboration, targeted AI investments, and the development of a robust data infrastructure.
Jony Ive and Sam Altman Seek Funding for Personal AI Device (1 minute read)
Former Apple design chief Jony Ive and OpenAI CEO Sam Altman are partnering to create an AI device that won’t resemble a smartphone. Their unnamed startup is seeking up to $1 billion in funding from major venture capitalists, including OpenAI, SoftBank, Thrive Capital, and Emerson Collective.
Microsoft is confident Windows on Arm could finally beat Apple (5 minute read)
Microsoft will unveil its vision for AI PCs at an event in Seattle on May 20. The company is confident that its new Arm-powered Windows laptops will beat Apple’s M3-powered MacBook Air in both CPU performance and AI-accelerated tasks. The laptops will feature Snapdragon X Elite processors from Qualcomm. Microsoft is planning to ship consumer models of its Surface Pro 10 and Surface Laptop 6 with Snapdragon X Elite processors instead of the Intel Core Ultra processors that the business-focused versions have.
OpenAI makes GPT-4 Turbo with Vision available to developers to unlock new AI apps (2 minute read)
GPT-4 Turbo with Vision allows developers to call on a single model for both text and image processing.
Waymo self-driving cars are delivering UberEats orders for first time (2 minute read)
UberEats customers in the Phoenix metropolitan area will receive some food deliveries via Waymo’s Jaguar I-PACE electric vehicles.
Grok-1.5 Vision Preview (6 minute read)
xAI has announced that its latest flagship model has vision capabilities on par with (and in some cases exceeding) state-of-the-art models.
Google’s new chips look to challenge Nvidia, Microsoft, and Amazon (2 minute read)
Google’s new AI chip, Cloud TPU v5p, is now available. It boasts nearly triple the training speed for large language models compared to its predecessor, TPU v4. This release underscores Google’s position in the AI hardware race alongside competitors like Nvidia. Google has also introduced the Google Axion CPU, based on Arm’s chip infrastructure, promising better performance and energy efficiency.
Google launches Code Assist, its latest challenger to GitHub’s Copilot (3 minute read)
Google’s Gemini Code Assist, an AI code completion tool for enterprises, features a one million-token context window for more accurate code suggestions and supports on-premises codebases across multiple platforms.
ALOHA Unleashed(3 minute read)
ALOHA Unleashed by Google DeepMind is a new generation of AI-powered robots with impressive dexterity. DeepMind recently released a series of videos showing the robots hanging shirts, inserting precise gears, and even tying shoelaces. The robots can generalize to untrained objects. The videos are available in the link.
Google’s New Technique Gives LLMs Infinite Context (5 minute read)
Google researchers have introduced Infini-attention, a technique that enables LLMs to work with text of infinite length while keeping memory and compute requirements constant.
OpenAI Winds Down DALL-E 2 (6 minute read)
The launch of OpenAI’s DALL-E 2 in April 2022 marked a groundbreaking and tumultuous period in AI history, as a tight-knit group of artists and tech enthusiasts explored the intersection between language and visual arts using the technology. However, the amazement and exhilaration soon gave way to concerns about the ethics of training AI models on copyrighted creative work without permission or compensation, leading to a polarizing debate that continues to reverberate in the AI space as OpenAI moves on to DALL-E 3 and other AI image synthesis models emerge.
Meta is Going Open-Source (18 minute read)
Meta is releasing its Horizon OS operating system for virtual and mixed reality headsets and partnering with hardware companies to design optimized headsets, aiming to define the next generation of computing with an open model. This move is part of Meta’s broader strategy as a horizontal services company to capture user time and attention, commoditize AI models and headset hardware, and avoid being subject to the control of closed platforms like Apple or Google.
The Ray-Ban Meta Smart Glasses Have Multimodal AI Now (5 minute read)
Meta has rolled out multimodal AI to its Ray-Ban Meta Smart Glasses, allowing users to process photos, audio, and text through voice commands for tasks like identification and translation, although the AI’s capabilities are limited and sometimes inaccurate.
OpenAI Announces New Enterprise AI Features (4 minute read)
OpenAI has announced new enterprise-grade features for its API customers, including enhanced security measures, an upgraded Assistants API, a new Projects feature for granular access control, and cost management tools. These updates demonstrate OpenAI’s focus on offering a more “plug and play” experience for enterprises, countering the rise of competitors like Meta’s Llama 3 and open models from Mistral.