Dom.Vin
AI Design Journal

Les Barclays asks "Who Captures the Value When AI Inference Becomes Cheap?":

Currently, enterprise AI chatbots and platforms are marginally useful but largely disappointing, because they fall short of reliably delivering true business transformation. The issue is that they don't sufficiently understand specific business processes, and the wider transformation required to enable AI takes time. Based on what I’ve written, I’m led to believe that the larger direction of AI will be single-subject applications built by startups.

As subsidies end and true costs surface, AI services will become expensive before eventually being commoditised. The current race to the bottom on simple tasks will continue, but complex reasoning and giant contexts will carry premium prices that reflect real compute costs.

A long read, but a really interesting summary of the current state of the industry. One of my goals this year is to better understand the funding landscape around AI adoption.

Memory and continuous learning are perhaps the biggest bottlenecks holding back strong AI, among other constraints. Current systems are narrowly capable, but still brittle. Solving continuous learning and memory seems non-negotiable if the field wants to shift to high-level machine intelligence.

Amen.

Do Anything:

The most important feature of a Do Anything Agent is its independence.

Other agents take action AS you in YOUR accounts. Do Anything has its own identity. It takes action as ITSELF in ITS accounts.

This is why your agent has its own name and email address.

This is an awesome piece of design. Each agent acts as your employee, rather than you. It massively reduces the social blast radius when something goes wrong.

"Compound Engineering: How Every Codes With Agents" by Dan Shipper, CEO and cofounder of Every:

In traditional engineering, you expect each feature to make the next feature harder to build—more code means more edge cases, more interdependencies, and more issues that are hard to anticipate. By contrast, in compound engineering, you expect each feature to make the next feature easier to build. This is because compound engineering creates a learning loop for your agents and members of your team, so that each bug, failed test, or a-ha problem-solving insight gets documented and used by future agents. The complexity of your codebase still grows, but now so does the AI’s knowledge of it, which makes future development work faster.

You have two jobs as an AI-supported software engineer. Engineering software is one of them. Engineering the machine that engineers the software is the other. Equally important.

Today, if your AI is used right, a single developer can do the work of five developers a few years ago, based on our experience at Every.

I’ve been informally polling software engineers for a couple of years now on what they’d estimate is the AI multiplier on their work. 5x is on the high-end of the responses I get, but the mean is going up and up.

Cory Doctorow writes "AI companies will fail. We can salvage something from the wreckage":

The market’s bet on AI is that an AI salesman will visit the CEO of Kaiser and make this pitch: “Look, you fire nine out of 10 of your radiologists, saving $20m a year. You give us $10m a year, and you net $10m a year, and the remaining radiologists’ job will be to oversee the diagnoses the AI makes at superhuman speed – and somehow remain vigilant as they do so, despite the fact that the AI is usually right, except when it’s catastrophically wrong.

“And if the AI misses a tumor, this will be the human radiologist’s fault, because they are the ‘human in the loop’. It’s their signature on the diagnosis.”

It is what Dan Davies calls an “accountability sink”. The radiologist’s job is not really to oversee the AI’s work, it is to take the blame for the AI’s mistakes.

Interesting take, which concludes that AI is a bubble waiting to burst. I’m not sure if it’s a bubble or a CapEx cycle, but lots of the “follow the money” concerns raised here are totally valid.

I do, however, disagree with one core premise: that AI is not capable of providing an incredible amount of value across a wide array of roles and sectors. Case in point: the example he gives of ‘hallucinating libraries’ in AI-assisted software engineering doesn’t strengthen his argument, because the problem can be almost entirely mitigated by static verification.
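To make the static-verification point concrete, here is a minimal sketch of the idea: before running AI-generated Python, parse it and check that every top-level import actually resolves on the current system. The function name and the example module names are my own, for illustration; real tooling (linters, type checkers, CI) does this more thoroughly.

```python
# Minimal sketch: statically catch "hallucinated libraries" in generated code
# by checking that each imported top-level module actually exists.
import ast
import importlib.util

def unresolved_imports(source: str) -> list[str]:
    """Return imported module names in `source` that cannot be found."""
    tree = ast.parse(source)  # parse only; the generated code is never executed
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # resolve the top-level package only
            if importlib.util.find_spec(root) is None:
                missing.append(name)
    return missing

generated = "import os\nimport totally_made_up_lib_xyz\n"
print(unresolved_imports(generated))  # → ['totally_made_up_lib_xyz']
```

Because the check is purely static (the generated code is parsed, never executed), it can run safely on untrusted agent output before anything else happens.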

Regardless, the broader concerns he raises are definitely worth understanding and considering. Whether you’re an AI denier, doomer, or accelerationist, you know the impact will be felt. The disagreements are mostly around who will feel it the most, and who will gain or lose in the transition.

Anthropic Cowork:

When we released Claude Code, we expected developers to use it for coding. They did—and then quickly began using it for almost everything else. This prompted us to build Cowork: a simpler way for anyone—not just developers—to work with Claude in the very same way.

I spent some time with Cowork. It’s pretty cool. Some obvious UX bugs aside, I’m broadly impressed. Browser use is as solid as I’ve seen it. If they figure out a decent memory paradigm for between-session accumulation of knowledge, I think this could quickly become transformational software.

These agents are getting really powerful, and the pace of change is astonishing. And Opus 4.5 is an absolute beast.

Super interesting open-source repo that collects Agent Skills for Context Engineering:

A comprehensive, open collection of Agent Skills focused on context engineering principles for building production-grade AI agent systems. These skills teach the art and science of curating context to maximize agent effectiveness across any agent platform.

Building the machine that builds the machine that builds the machine. It’s context all the way down.

I heard about this Ralph thing when it launched, but it seemed like a toy:

Geoffrey Huntley, who popularised the technique, describes it plainly: “Ralph is a Bash loop.” The plugin packages that idea into Claude Code so you can run it safely in a session.
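The loop Huntley describes can be sketched in a few lines. This is a Python rendering of the pattern, not the plugin itself; the agent command, prompt file, and `DONE` sentinel are all assumptions for illustration, and the iteration cap is my own safety addition.

```python
# Hypothetical sketch of the "Ralph" pattern ("Ralph is a Bash loop"):
# re-invoke a coding agent on the same prompt until it signals completion.
import subprocess
from pathlib import Path

def ralph_loop(agent_cmd: list[str], status_file: str = "STATUS.md",
               max_runs: int = 10) -> bool:
    """Run agent_cmd repeatedly until status_file contains 'DONE' or the cap hits."""
    for _ in range(max_runs):  # safety cap so the loop cannot run forever
        subprocess.run(agent_cmd, check=False)  # one agent session per iteration
        status = Path(status_file)
        if status.exists() and "DONE" in status.read_text():
            return True
    return False

# e.g. ralph_loop(["claude", "-p", Path("PROMPT.md").read_text()])
```

The whole trick is that each iteration starts a fresh agent session against the accumulated state on disk, so progress compounds across runs even though each session is stateless.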

I might have been wrong; it seems like it could be an oddly named little breakthrough.

Is Ralph Wiggum the thing that finally makes me switch from Cursor Agent to Claude Code?