🎉 Surgical Robotics, AGI in 2025, AlphaFold 3 Protein Prediction, A Grammy Nod for The Beatles' AI-Enhanced Track, New Section: "AI Training"
AI Robots Perform Surgery, Sam Altman Predicts AGI in 2025, Google DeepMind Open-Sources AlphaFold 3, The Beatles' Latest Grammy Nom, New Section in AImpulse: "AI Training"
Welcome to this week’s edition of AImpulse, [usually] a five-point summary of the most significant advancements in the world of Artificial Intelligence.
Here’s the pulse on this week’s top stories, plus a new section called “AI Training,” where I’ll call out a cool, accessible tool you can try out today.
Enjoy!
What’s Happening: Researchers at Johns Hopkins University just achieved a breakthrough in surgical robotics, training a robot to perform complex medical procedures solely by having it watch videos of human surgeons at work.
The details:
The da Vinci Surgical System robot learned and performed critical surgical tasks, such as needle manipulation, tissue lifting, and suturing, with human-level skill.
Using a new imitation learning approach, the system trained with hundreds of surgical videos captured by da Vinci robot wrist cameras.
The AI model combines a ChatGPT-style architecture with kinematics data, essentially teaching the robot to "speak surgery" through mathematical movements (a rough sketch of the general recipe follows this list).
The system also showed unexpected adaptability, like automatically retrieving dropped needles — a skill it wasn't explicitly programmed to perform.
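For intuition, here is a minimal behavior-cloning sketch of the recipe described above: encode video frames, run them through a small transformer, and regress the surgeon's recorded kinematics. The architecture, dimensions, and dummy data are my own illustrative assumptions, not the actual da Vinci training code.

```python
import torch
import torch.nn as nn

class SurgicalPolicy(nn.Module):
    """Maps a short video clip to a sequence of kinematic actions."""

    def __init__(self, frame_dim=3 * 64 * 64, d_model=256, n_actions=7):
        super().__init__()
        # Per-frame encoder: project flattened pixels into an embedding.
        self.encoder = nn.Sequential(nn.Linear(frame_dim, d_model), nn.ReLU())
        # Transformer over the frame sequence (the "ChatGPT-style" backbone).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        # Decode each timestep into an action, e.g. a 7-DoF wrist pose delta.
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, frames):  # frames: (batch, time, channels*height*width)
        return self.head(self.temporal(self.encoder(frames)))

policy = SurgicalPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One behavior-cloning step on a dummy batch: video clips paired with
# the surgeon's recorded kinematics, which act as the supervision signal.
frames = torch.randn(8, 16, 3 * 64 * 64)   # 8 clips, 16 flattened frames each
expert_actions = torch.randn(8, 16, 7)     # recorded wrist kinematics
loss = nn.functional.mse_loss(policy(frames), expert_actions)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The key idea is that the supervision comes for free: every da Vinci video already has matching wrist kinematics recorded alongside it.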
Why it matters: The surge in robotic capabilities for both training and dexterity is opening up new use cases — and surgery is next on the list. This video-learning approach could do for surgical robotics what LLMs did for AI, allowing robots to rapidly learn and adapt to any procedure instead of being hand-coded for each individual movement.
What’s Happening: OpenAI CEO Sam Altman just predicted that artificial general intelligence will be achieved in 2025, a forecast that comes alongside conflicting reports of slowing progress in LLM development and scaling across the industry.
The details:
In an interview with Y Combinator president Garry Tan, Altman said the path to AGI is ‘basically clear’ and will require engineering, not new scientific breakthroughs.
A new report revealed that the rumored ‘Orion’ model shows a smaller improvement over GPT-4 than previous generational leaps delivered, especially in coding tasks.
The company also reportedly formed a new "Foundations Team" to tackle fundamental challenges, such as the scarcity of high-quality training data.
OpenAI researchers Noam Brown and Clive Chan backed Altman’s AGI confidence, pointing to the o1 reasoning model as a new axis for scaling (compute at inference time, not just at training time).
Why it matters: Altman’s prediction would mean a drastic leap up the company’s own AGI scale (OpenAI currently places itself at level 2 of 5) — but the CEO has remained consistent in his confidence. With OpenAI suddenly prioritizing o1 development, it makes sense that the reasoning model might have shown new potential to break through any scaling limits.
What’s Happening: Google DeepMind just open-sourced its groundbreaking AlphaFold 3 protein prediction model, giving academic researchers access to both the code and the model weights for the first time since its limited release in May.
The details:
The Nobel Prize-winning technology can predict interactions between proteins and other molecules like DNA, RNA, and potential drug compounds.
Academic researchers can access the model's full capabilities for non-commercial use, though commercial applications remain restricted (a minimal input-job sketch follows this list).
The AlphaFold project has already mapped over 200M protein structures, demonstrating unprecedented scale in structural biology.
Several companies, including Baidu and ByteDance, have already created their own versions based on the original paper's specifications.
DeepMind's spinoff, Isomorphic Labs, maintains exclusive commercial rights, having recently secured $3 billion in pharmaceutical partnerships.
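For researchers who want to try it, the open-source repo (github.com/google-deepmind/alphafold3) takes a JSON job description as input. Below is a sketch of preparing a minimal single-protein job in Python; the field names follow my reading of the repo's documented input format, so double-check them (and the run command) against the current docs, and note that the model weights must be requested separately.

```python
import json

# A minimal single-chain job: one toy protein sequence, one seed.
job = {
    "name": "example_protein",
    "modelSeeds": [1],
    "sequences": [
        {"protein": {"id": "A", "sequence": "MVLSPADKTNVKAAW"}}
    ],
    "dialect": "alphafold3",
    "version": 1,
}

with open("fold_input.json", "w") as f:
    json.dump(job, f, indent=2)

# Inference is then run via the repo's entry point, roughly:
#   python run_alphafold.py --json_path=fold_input.json \
#       --model_dir=<weights_dir> --output_dir=<out_dir>
# (flag names are approximate; see the repo's README for the exact CLI)
```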
Why it matters: Scientific research is one of the most exciting areas for AI, and the wider availability of AlphaFold via open-source should massively accelerate breakthroughs across biology and medicine – while also leveling the playing field beyond well-funded institutions or pharmaceutical companies.
What’s Happening: "Now and Then," The Beatles' AI-enhanced final song, released a year ago, just became the first AI-assisted track to receive Grammy nominations — marking a historical moment for AI's role in music production.
The details:
The song earned nominations for Record of the Year and Best Rock Performance, competing against artists like Beyoncé and Taylor Swift.
The track used AI "stem separation" technology to clean up and isolate John Lennon's vocals from an unreleased 1978 demo.
The AI technique mirrors the noise-canceling technology used in video calls, training models to identify and separate specific sounds (a sketch with an open-source separation tool follows this list).
The nomination follows the Grammys’ 2023 refusal to consider a track by viral AI creator Ghostwriter due to its unauthorized use of AI-cloned vocals.
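The underlying technique is openly available, even if the bespoke system used on the Lennon demo isn't. Here's a sketch using the open-source Demucs package (pip install demucs) to pull a vocal stem out of a track; the call follows Demucs' documented Python entry point, though the flags are worth verifying against its README.

```python
import demucs.separate

# Split a track into "vocals" and "no_vocals" stems. By default the
# separated audio lands under ./separated/<model_name>/<track_name>/.
demucs.separate.main(["--two-stems", "vocals", "my_old_demo.mp3"])
```

The model has been trained to recognize what a human voice "looks like" in audio, which is the same class of source-separation idea the engineers applied to Lennon's demo.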
Why it matters: The Beatles have been pioneers throughout music history, so it’s only fitting that they help carry the baton into this new era of AI-assisted production and creation. The coming wave of song generation will be an even bigger shift, but this technique shows how artists can also use AI as a tool for preservation and restoration.
AI Training: Claude’s new visual document feature lets AI analyze and understand complex PDFs containing charts, diagrams, and graphics, transforming how we extract information from technical documents.
Step-by-step:
Head over to Claude AI and enable "Visual PDFs" in your Feature Preview settings.
Upload your PDF containing charts, diagrams, or technical illustrations.
Ask specific questions about visual elements (e.g., "Explain the relationship shown in this diagram").
Combine visual analysis with text queries for comprehensive insights.
Pro tip: Frame your questions to specifically reference visual elements. This helps you get more precise and detailed answers about charts and diagrams. For API users, a sketch of the same workflow is below.
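If you'd rather script this than use the web UI, the same capability is exposed through Anthropic's Messages API. The sketch below assumes the PDF-support beta that was live at the time of writing; check Anthropic's docs for the current model name and beta flag before relying on it.

```python
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Base64-encode the PDF so it can be sent as a document content block.
with open("technical_report.pdf", "rb") as f:
    pdf_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    betas=["pdfs-2024-09-25"],  # PDF-support beta flag at time of writing
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_b64,
                },
            },
            # Per the pro tip: reference the visual element explicitly.
            {"type": "text",
             "text": "Explain the relationship shown in the diagram on page 3."},
        ],
    }],
)
print(message.content[0].text)
```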