AI Blog
· daily-digest · 5 min read

AI News on May 3: Cybersecurity, Grok, and a Voice Race

GPT-5.5 rises to the top in cybersecurity, xAI launches Grok 4.3 and voice cloning, and new open-source tools make AI workflows cheaper.


Today is once again one of those editions where research, product updates, and security issues keep tripping over each other. Especially interesting: AI systems are not only getting stronger, but also becoming practically useful for attacks, agent workflows, and voice features. In short: the models are maturing — and so are the risks.

🛡️ GPT-5.5 and Claude Mythos nearly tied in cybersecurity tests

According to the UK AI Security Institute, OpenAI’s GPT-5.5 has become only the second model ever to autonomously solve a full network attack simulation — putting it almost on par with Anthropic’s Claude Mythos. This matters because the discussion is no longer just about “helping with coding,” but about systems that can independently plan and act in a realistic attack scenario. For security research, that sends a very clear signal: frontier models are no longer just tools, but potential amplifiers of offensive cyber capabilities. At the same time, the finding shows how quickly model capabilities are converging. If a model like GPT-5.5 is already available in ChatGPT and via API, a lab result turns into a product problem very quickly. The good news: benchmarks like this help make risks more measurable. The less good news: the bar is now extremely high.
Source: The Decoder

🧠 HASE: New RL engine for multi-agent operations

The arXiv study on HASE (“High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine”) addresses a classic reinforcement learning problem: multi-agent environments are expensive, slow, and hard to scale. HASE aims to tackle exactly that and make Dec-POMDP workloads more efficient — that is, scenarios in which multiple agents each see only part of the situation. That sounds dry, but it is highly relevant in practice: whether in robotics, coordinated simulations, or the training of agent systems, as soon as multiple actors work together with incomplete information, things get tricky. Engines like this are a good example of how progress in AI is not just about ever-larger models. Often, it’s the infrastructure beneath them that makes it possible for research to iterate quickly enough in the first place. For ambitious newcomers, that means: if you want to understand agents, look not only at the models, but also at the training and simulation environment.
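To make the partial-observability idea concrete, here is a minimal toy sketch of a hide-and-seek environment in the Dec-POMDP spirit: the seeker only learns the hider’s position when it is within a limited view radius, and a policy must act on that incomplete observation. All names and the environment itself (`GridHideAndSeek`) are illustrative assumptions — this is not the HASE API.

```python
import random

class GridHideAndSeek:
    """Toy 1-D hide-and-seek: the seeker sees only cells within `view` of itself."""

    def __init__(self, size=10, view=2, seed=0):
        self.size, self.view = size, view
        self.rng = random.Random(seed)
        self.seeker = 0
        self.hider = self.rng.randrange(1, size)

    def observe(self):
        # Partial observability: the hider's position is revealed only
        # when it lies within the seeker's limited view radius.
        if abs(self.hider - self.seeker) <= self.view:
            return {"seeker": self.seeker, "hider": self.hider}
        return {"seeker": self.seeker, "hider": None}

    def step(self, action):
        # action: -1 (move left) or +1 (move right); returns (obs, done)
        self.seeker = max(0, min(self.size - 1, self.seeker + action))
        return self.observe(), self.seeker == self.hider

env = GridHideAndSeek()
obs, done, steps = env.observe(), False, 0
while not done and steps < 20:
    # Trivial policy: sweep right until the hider becomes visible, then home in.
    if obs["hider"] is None:
        action = 1
    else:
        action = 1 if obs["hider"] > obs["seeker"] else -1
    obs, done = env.step(action)
    steps += 1
print(done, steps)
```

A real engine like HASE is about running many such environments in parallel, efficiently, but the core loop — partial observation in, action out — looks just like this.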
Source: arXiv

🗣️ The best AI dictation apps tested

TechCrunch tested and ranked AI-powered dictation apps — and the topic is much more practical than it may seem at first glance. Good speech-to-text tools are no longer just for notes, but for emails, meeting minutes, and even coding by voice. The market is interesting because productivity here does not come from “yet another chat window,” but from a real input channel: speech. Anyone who writes a lot will quickly notice how much a good dictation setup can change daily work — provided the transcription is accurate and the workflow is not annoying. For the AI market, this is also a sign: while frontier models battle over benchmarks, specialized tools are quietly winning real usage. And yes, a good dictation tool is often sexier than the tenth chat UI.
Source: TechCrunch

⚖️ Musk in court: Terminator warning and OpenAI usage confirmed

Elon Musk testified for more than seven hours in the lawsuit against Sam Altman and once again delivered the full package: he called himself a “fool,” warned about a Terminator scenario, and at the same time confirmed that xAI uses OpenAI models for its own training. Substantively, that is a remarkable mix of dramatic safety rhetoric and very pragmatic model shopping. The context is telling: even the loudest warners rely in practice on the best systems available. It shows how dependent even new labs are on existing model ecosystems. For the industry, this is not a scandal, but rather a realistic look at the state of affairs: competition, yes — but please with access to the most useful ingredients. The lawsuit thus gains another layer — not only legal, but strategic as well.
Source: The Decoder

🚀 xAI launches Grok 4.3 and the “Imagine” agent

With Grok 4.3, xAI is apparently placing greater emphasis on lower-cost usage and better tool integration. According to the reports, the model improves mainly on practical tasks, but still lags behind OpenAI and Anthropic in the top tier. The particularly interesting part is the new “Imagine” multimedia agent: instead of generating only text, the agent becomes more broadly usable — another step toward systems that not only understand content, but also assemble it. For users, this is interesting because tool use often brings more day-to-day usefulness than a tiny benchmark jump. For the market, it means the competition is shifting from “who has the biggest model?” to “who can deliver the most useful agent?” That is exactly where things are becoming practical for product teams and developers now.
Source: The Decoder

🎙️ xAI lets you clone your own voice in under two minutes

xAI is expanding its voice APIs with Custom Voices — in other words, cloning your own voice in under two minutes. For developers, that sounds like a convenient feature for voice apps, assistants, or personalized interfaces. At the same time, of course, it is also a privacy and misuse issue with an alarm light built in. Voice clones are powerful because a voice carries not just information, but identity. That is exactly why the questions of consent, verification, and misuse prevention matter more here than the actual demo video. Technically, the step makes sense: once speech-to-text and text-to-speech are in place, the jump to custom voices is small. In practice, however, it also means voice AI is becoming faster, cheaper, and thus more mass-market — including all the usual side effects.
Source: The Decoder

🛠️ Tool tip of the day: Corvyn for low-cost AI routing workflows

Corvyn is an open-source AI routing proxy that automatically forwards requests to free models and displays the costs in your local currency. This is especially useful if you work with multiple tools such as OpenCode, Claude Code, Cursor, or Aider and do not want to take the most expensive route for every request right away. For solo developers, teams, and tinkerers who care about costs, a proxy like this can make the difference between “cool experiment” and “why is the bill so high?” It is also designed to be local-first — fitting for anyone who prefers to run AI workflows in a controlled rather than wildly distributed way. If you experiment, a setup like this often saves more than any prompt optimization ever could.
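The routing idea behind a proxy like this can be sketched in a few lines: prefer a free model when one can serve the request, otherwise fall back to the cheapest paid one that fits, and report the estimated cost in a local currency. The model names, prices, and the `route` helper below are made-up illustrations, not Corvyn’s actual configuration or API.

```python
# Hypothetical price table: (name, USD per 1k tokens, context limit).
MODELS = [
    {"name": "free-small",  "usd_per_1k_tokens": 0.0,   "max_tokens": 4_000},
    {"name": "cheap-mid",   "usd_per_1k_tokens": 0.002, "max_tokens": 32_000},
    {"name": "premium-big", "usd_per_1k_tokens": 0.03,  "max_tokens": 128_000},
]

def route(prompt_tokens, usd_to_eur=0.92):
    """Return (model_name, estimated_cost_eur) for a request of this size."""
    # Keep only models whose context window can hold the request,
    # then pick the cheapest of those.
    candidates = [m for m in MODELS if m["max_tokens"] >= prompt_tokens]
    choice = min(candidates, key=lambda m: m["usd_per_1k_tokens"])
    cost_eur = choice["usd_per_1k_tokens"] * prompt_tokens / 1000 * usd_to_eur
    return choice["name"], round(cost_eur, 6)

print(route(2_000))    # small request: a free model suffices
print(route(20_000))   # larger request: cheapest paid model that fits
```

The real value of such a proxy is that this decision happens transparently for every tool talking to it — OpenCode, Cursor, Aider, and friends all see one endpoint while the proxy does the price shopping.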
Source: GitHub

