AI is getting more expensive, smarter, and more local: Today’s news

Today is all about the question of what AI really costs in everyday use — financially, technically, and from a security perspective. Between a potentially absurd cloud bill, Windows PCs with local agents, and new security issues, one thing becomes very clear: the AI wave has long since arrived in day-to-day enterprise work. And it brings not only productivity, but also new problems to deal with.

💸 500 million dollars for Claude: When limits are missing

According to Axios, a company reportedly spent around 500 million dollars on Claude licenses in just one month because apparently no meaningful usage limits were set. The report, picked up by The Decoder, is a textbook example of how AI costs depend not only on model pricing, but above all on governance, context management, and discipline.

Why does this matter? Because many companies are only just beginning to integrate AI into workflows — and often prefer to look at the bill later. Without limits, monitoring, and clear guidelines, a “productivity boost” can very quickly turn into a budget hangover. Especially in code and agent use cases, the rule is: a model that works hard is not automatically a model that works efficiently. For companies, that means cost control is not a nice-to-have, but part of the AI architecture. Otherwise, the CFO will soon become a prompt engineer.
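To make "cost control as part of the architecture" a bit more concrete, here is a minimal sketch of a hard spending cap that sits in front of model calls. It is not taken from the report. The class name `BudgetGuard`, the exception `BudgetExceeded`, and the per-token prices are hypothetical placeholders, and a real setup would take prices and usage from the provider's billing data.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the configured cap."""


@dataclass
class BudgetGuard:
    # All values are illustrative assumptions, not real vendor prices.
    monthly_cap_usd: float
    usd_per_1k_input_tokens: float
    usd_per_1k_output_tokens: float
    spent_usd: float = 0.0

    def estimate(self, input_tokens: int, output_tokens: int) -> float:
        """Estimate the cost of one request from its token counts."""
        return (
            (input_tokens / 1000) * self.usd_per_1k_input_tokens
            + (output_tokens / 1000) * self.usd_per_1k_output_tokens
        )

    def charge(self, input_tokens: int, output_tokens: int) -> float:
        """Record a request, or refuse it if it would exceed the cap."""
        cost = self.estimate(input_tokens, output_tokens)
        if self.spent_usd + cost > self.monthly_cap_usd:
            raise BudgetExceeded(
                f"request of ${cost:.2f} would exceed cap of "
                f"${self.monthly_cap_usd:.2f} (spent: ${self.spent_usd:.2f})"
            )
        self.spent_usd += cost
        return cost
```

The idea is to call `charge` before (or right after) every model request, so a runaway agent hits an exception instead of the invoice. A softer variant would log a warning at, say, 80 percent of the cap rather than blocking outright.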

Source: The Decoder

🪟 OpenAI brings Computer Use to Windows 11

OpenAI is expanding the Codex app with Computer Use on Windows 11. This means the AI can not only write code, but also operate programs itself, test apps, and detect UI errors. According to The Decoder, work can even be started remotely via the ChatGPT app — handy if you want to trigger tasks from afar.

This is an important step because it frees AI agents from the pure text window and lets them actually interact with software. For developers, QA teams, and automation folks, this opens up exciting possibilities: tests, reproduction steps, routine clicks, and perhaps someday even reasonably reliable end-to-end workflows. At the same time, the question remains how robust such agents are in practice. GUI automation is notoriously finicky. But that is exactly why this move matters: OpenAI is not just bringing the agent into the chat window, but to the desktop.

Source: The Decoder

🧪 New theory for more robust simulators in reinforcement learning

A new paper has appeared on arXiv, “Theoretical Foundations and Effective Algorithms for Policy-Aware Simulator Learning”, which tackles an old problem in model-based reinforcement learning: good prediction is not enough if an agent can systematically game the simulator. The authors argue that learning objectives must be aligned not just with accuracy, but also with how the policy will later use the simulator.

That sounds abstract, but it is highly relevant in practice. Anyone using models for simulation, robotics, or planning knows the problem of the reality gap: everything works perfectly in simulation, and then only sort of in the real world. This paper addresses exactly that and tries to make simulators more robust against exploits. For research, it is another reminder that “better models” do not automatically mean “better agents.” You have to think about the whole system — not just the loss.

Source: arXiv

🛠️ Tool tip of the day: kabeuchi

The GitHub project kabeuchi is a small but very nerdily practical tool: it controls multiple AI chat models in parallel directly from the terminal via an existing browser session — no API keys required. It works via CDP and Playwright, i.e. right where the browser is already running.

Why is this interesting? Because it shows an alternative path for people who want to quickly compare multiple models or automate browser-based workflows without immediately building their own API integrations. That can be useful for experiments, prototyping, and model comparisons. For production setups, the usual caveat applies: browser automation is powerful, but also fragile. Still: definitely worth a look for tinkerers with a terminal fetish.

🖥️ Microsoft and Nvidia are planning local AI agents for Windows PCs

Microsoft and Nvidia are apparently preparing new Windows PCs with Nvidia chips and local AI execution, as reported by The Decoder. The focus is on Dell devices and Microsoft’s Surface line, as well as new software based on the OpenClaw framework, which is meant to let AI agents handle tasks directly on the PC.

This is strategically interesting because it shifts the focus from the cloud back to the client. Local processing potentially means better latency, more privacy, and lower ongoing API costs. At the same time, it is an indirect admission that the first big Copilot+ PC push did not solve everything. If Microsoft and Nvidia really deliver here, the Windows PC could evolve from a passive tool into an active work assistant. For companies, that would be especially interesting if sensitive tasks can be handled locally instead of in the cloud.

Source: The Decoder

🧠 Helpful language models become less “human”

A large-scale study with around 208,000 participants and 26 million reactions, reported by The Decoder, arrives at an interesting conclusion: the more language models are trained to be helpful, the worse they become at reflecting human behavior. According to the study, even persona prompting with demographic profiles helps only to a limited extent.

This is relevant for everyone who sees LLMs not just as productivity machines, but also as behavioral models. A chatbot that sounds “helpful” is apparently not automatically a good proxy for humans. For research, UX, and evaluation, this is a reminder that benchmarks and human impact can diverge. In short: a model can be nice without really resembling us. So AI is becoming more polite — but not necessarily more human.

Source: The Decoder

🚨 Shared ChatGPT and Claude chats are becoming a malware trap

Security researchers are warning that attackers are abusing the share feature of ChatGPT and Claude to distribute malware. According to The Decoder, the attackers disguise their content as harmless error messages or installation guides. Because the shared chats are hosted on trusted domains, they sometimes slip past security tools.

This is a great example of how quickly new product features create new attack surfaces. Sharing sounds useful — and it is — but once links from well-known AI domains are used for social engineering, trust becomes a risk. For teams, that means security policies should not only look at email and downloads, but also at AI share links, prompt content, and contextual disguise. The next phishing wave does not necessarily need a classic website. Sometimes a shared chat window is enough.
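As a rough illustration of what "looking at AI share links" could mean in a policy check, the following sketch flags URLs that look like shared chats so they can be reviewed rather than trusted by default. The host and path patterns in `SHARE_LINK_PATTERNS` are assumptions about the current URL schemes and should be verified before use. The function name `is_ai_share_link` is hypothetical.

```python
from urllib.parse import urlparse

# Assumed share-link patterns (host -> path prefix); verify against the
# vendors' current URL schemes before relying on them.
SHARE_LINK_PATTERNS = {
    "chatgpt.com": "/share/",
    "claude.ai": "/share/",
}


def is_ai_share_link(url: str) -> bool:
    """Return True if the URL looks like a shared AI chat on a known host."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Exact host match only, so look-alike hosts such as
    # "chatgpt.com.evil.example" are not treated as the real domain.
    prefix = SHARE_LINK_PATTERNS.get(host)
    return prefix is not None and parsed.path.startswith(prefix)
```

A check like this does not say whether a shared chat is malicious. It only marks the link as user-generated content on a trusted domain, which is exactly the category that deserves a second look.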

Source: The Decoder
