🎉 AI Takes the Director's Chair, Gemini Goes Galactic, Your New Dev Bot, Web Gets Agentic, AI in the Lab
Google Launches Creative Models, Google Unveils Gemini Upgrades, OpenAI Launches Codex Agent, Microsoft Expands Copilot Suite and Debuts Discovery Platform
Welcome to this week’s edition of AImpulse, a five-point summary of the most significant advancements in the world of Artificial Intelligence.
Easily our biggest week of the year, and it’s only Thursday, folks. Here’s the pulse on this week’s top stories (so far):
What’s Happening: Google just debuted a wave of creative AI—Veo 3, Imagen 4, an AI filmmaking suite, Lyria music upgrades, and more.
The details
Veo 3 can bind sound effects, ambience, and dialogue directly to generated video.
Veo 2 adds director tools: scene and character consistency, virtual camera moves, and inpainting/outpainting.
Imagen 4 boosts fidelity, nails fine detail and typography, and outputs at up to 2K resolution.
Flow weaves these models into a natural-language film-production workspace.
All arrive via the new Google AI Ultra plan ($250/mo) or through Vertex AI for enterprises.
Why it matters: Synced audio plus higher-fidelity visuals give creators sharper control—and open the door to a next wave of generative film and design workflows.
What’s Happening: Google also rolled out sweeping upgrades to its Gemini and Gemma families at I/O—plus fresh AI-powered search, shopping, and agent tooling.
Gemini / Models
Gemini 2.5 Pro and Flash were tuned up: Pro now tops public leaderboards, while Flash gains capability without sacrificing speed.
A tester-only “2.5 Deep Think” variant sets new marks in math, coding, and multimodal reasoning.
Gemma 3n entered preview as a phone-first open model that rivals larger systems like Claude 3.7 Sonnet yet runs efficiently on-device.
Gemini Live (camera + screen-share) is now free for everyone, with personalized add-ons coming soon.
Search / Agents
U.S. users get Gemini 2.5-powered AI Mode, plus new Deep Search and inline Gemini Live answers.
Extras: virtual try-on, an AI shopping concierge, and Search Live for real-time multimodal voice queries.
Google’s coding aide Jules entered public beta, quietly handling dev chores inside repos.
Agent Mode is landing in both Search and Gemini, letting the system juggle up to ten tasks at once.
Why it matters: Years of Google AI research are bursting into products. The new, live-personalized search experience could reshape how people use Google every day.
What’s Happening: OpenAI introduced Codex, a cloud-native engineering agent that can juggle multiple dev tasks on its own.
The details
Runs on codex-1, a flavor of OpenAI’s o3 tuned for software work.
Spins up isolated workspaces to ship features, squash bugs, answer architecture questions, and run tests.
Live now for ChatGPT Pro, Team, and Enterprise; usage will later be rate-limited with optional add-ons.
Why it matters: Codex nudges AI closer to “virtual coworker” status—freeing human devs for higher-level thinking while bots grind through the backlog.
What’s Happening: At Build 2025, Microsoft sketched its “open agentic web” vision, unveiling a revamped GitHub Copilot, Copilot Studio, Azure AI Foundry, a browser-level AI agent, and more.
The details
GitHub Copilot graduates from in-editor helper to background agent; Copilot Chat for VS Code is now open-source.
Azure AI Foundry adds xAI’s Grok 3 and Grok 3 Mini, topping 1,900 models.
New spec NLWeb aims to make conversational UI as plug-and-play as HTML (see the sketch after this list).
Copilot Studio gains enterprise fine-tuning and multi-agent orchestration for complex workflows.
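For the curious, here’s a rough idea of what the NLWeb pitch looks like in practice. This is a minimal sketch only: it assumes an NLWeb-style /ask endpoint, a query parameter, and a Schema.org-flavored JSON response as described in the project’s repo, so treat the exact paths, parameters, and field names as illustrative rather than a stable contract.

```typescript
// Hypothetical sketch: querying a site's NLWeb endpoint from TypeScript.
// Endpoint path, parameter name, and response shape are assumptions
// based on the NLWeb repo at the time of writing.

interface NLWebResult {
  url: string;
  name: string;
  description?: string;
  schema_object?: Record<string, unknown>; // Schema.org JSON-LD for the matched item
}

async function askSite(base: string, question: string): Promise<NLWebResult[]> {
  // NLWeb instances expose a simple conversational HTTP endpoint;
  // the same instance can also be addressed as an MCP server.
  const res = await fetch(`${base}/ask?query=${encodeURIComponent(question)}`);
  if (!res.ok) throw new Error(`NLWeb request failed: ${res.status}`);
  const body = await res.json();
  // Assumed response shape: { results: NLWebResult[] }
  return body.results ?? [];
}

// Usage: ask a hypothetical recipe site a natural-language question.
askSite("https://recipes.example.com", "vegetarian dinners under 30 minutes")
  .then((results) => results.forEach((r) => console.log(r.name, "->", r.url)))
  .catch(console.error);
```

Because every NLWeb endpoint is also an MCP server, the same interface can plug into agent frameworks without extra glue code.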
Why it matters: Microsoft’s barrage highlights a pivot toward open frameworks and genuine agent collaboration—bringing the long-promised age of AI assistants closer to reality.
Watch CEO Satya Nadella’s keynote
What’s Happening: Microsoft also unveiled Discovery, an enterprise platform that pairs scientists with AI agents to slash R&D timelines.
The details
Uses “AI postdoc” agents plus a graph knowledge engine to craft hypotheses, simulate experiments, and analyze results.
Demo: found a novel non-PFAS datacenter coolant in ~200 hours—a task that usually takes months.
Aims to put supercomputing-grade analysis behind a natural-language interface.
Early adopters—GSK, Estée Lauder, NVIDIA, Synopsys—plan to apply Discovery across pharma, materials, and chip design.
Why it matters: By blending agent intelligence with massive compute, Discovery could finally turn AI’s scientific-research promise into repeatable, real-world impact.