We’re releasing our new Open Auto Builder today, an open-source harness for autonomously building, testing, and maintaining web apps. The design is simple: an agent runs within a container (optionally using Fly.io), breaks its work down into a list of tasks, and iterates on those tasks until they are complete.
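The task loop described above can be sketched roughly as follows. This is a minimal illustration, not the builder's actual code: `Task`, `run_tasks`, the `attempt` callback, and the `max_attempts` limit are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """One unit of work (hypothetical structure, for illustration only)."""
    name: str
    attempts: int = 0


def run_tasks(
    tasks: List[Task],
    attempt: Callable[[Task], bool],
    max_attempts: int = 5,
) -> List[str]:
    """Work through tasks in order, retrying each until it succeeds.

    `attempt` returns True when the task is complete. The retry limit is
    an assumption made here so the sketch always terminates.
    """
    completed: List[str] = []
    for task in tasks:
        while True:
            task.attempts += 1
            if attempt(task):
                completed.append(task.name)
                break
            if task.attempts >= max_attempts:
                raise RuntimeError(
                    f"{task.name} failed after {max_attempts} attempts"
                )
    return completed


if __name__ == "__main__":
    # Hypothetical behavior: "build" fails on its first attempt only.
    def flaky(task: Task) -> bool:
        return task.name != "build" or task.attempts >= 2

    print(run_tasks([Task("spec"), Task("build"), Task("deploy")], flaky))
```

Retrying a task in place, rather than moving on, keeps later tasks from running against unfinished earlier work. A real harness may well schedule tasks differently.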
The app-building process is governed by a set of skill files, which describe how to break work down into tasks and how those tasks should be performed. The skills emphasize a structured approach. During the initial app build, the agent does the following:
1. Designs a comprehensive test specification based on the initial spec.
2. Builds the app and writes tests to match the spec.
3. Gets all the tests to pass, deploys to production, and does further testing.
The initial build will not come out perfect. The agent can follow up with maintenance passes, in which it checks that it closely followed the spec and skill directives and fixes any issues it finds. It will also fix reported bugs and update the skills to avoid similar problems in the future.
As long as each individual step is within the agent’s capabilities, meaning the agent usually (but not always) gets it right, the agent will converge on an app that follows the initial spec and skill directives.
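To see why retries make convergence plausible, here is a simplified model, not a measurement of the builder. It assumes every attempt at a step succeeds independently with the same probability and that failed attempts are always detected (for example by the tests). The function names and the numbers in the usage example are illustrative only.

```python
def completion_probability(p_step: float, n_steps: int, max_attempts: int) -> float:
    """Probability that all n_steps finish within max_attempts tries each.

    Assumes independent attempts with success probability p_step and
    reliable detection of failures; both are simplifications.
    """
    if not 0.0 < p_step <= 1.0:
        raise ValueError("p_step must be in (0, 1]")
    if n_steps < 0 or max_attempts < 1:
        raise ValueError("n_steps must be >= 0 and max_attempts >= 1")
    # Chance that a single step succeeds at least once in max_attempts tries.
    p_one_step = 1.0 - (1.0 - p_step) ** max_attempts
    return p_one_step ** n_steps


def expected_attempts(p_step: float, n_steps: int) -> float:
    """Expected total attempts with unlimited retries (geometric distribution)."""
    if not 0.0 < p_step <= 1.0:
        raise ValueError("p_step must be in (0, 1]")
    return n_steps / p_step


if __name__ == "__main__":
    # Illustrative numbers: 10 steps, each succeeding 80% of the time.
    print(completion_probability(0.8, 10, max_attempts=1))
    print(completion_probability(0.8, 10, max_attempts=5))
    print(expected_attempts(0.8, 10))
```

Under these assumptions, a single pass through ten such steps rarely succeeds end to end, while a handful of retries per step makes completion very likely. The model says nothing about steps the agent cannot do at all, which is why the condition on capabilities matters.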
The hardest problems the agent will have to deal with are test failures. It records tests using the Replay browser and uses Replay MCP to diagnose why those tests are failing so the agent can write appropriate fixes instead of papering over problems.
We’re also launching the Loop Builder (source), which builds apps in a continuous loop, working through apps submitted by users when there are any and falling back to a default prompt otherwise. We’ll use it to keep studying the builder and improving its skills so that results are consistently high quality.
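The selection logic just described might look something like the sketch below. It is only a guess at the shape of the idea: `DEFAULT_PROMPT`, `next_prompt`, `build_loop`, and the `build` callback are hypothetical and are not taken from the Loop Builder source.

```python
from collections import deque
from typing import Callable, Deque, List

# Placeholder text; the real default prompt is not given in this post.
DEFAULT_PROMPT = "Build a simple to-do list web app."


def next_prompt(user_queue: Deque[str], default_prompt: str = DEFAULT_PROMPT) -> str:
    """Take the oldest user-submitted prompt, or fall back to the default."""
    return user_queue.popleft() if user_queue else default_prompt


def build_loop(
    user_queue: Deque[str],
    build: Callable[[str], str],
    iterations: int,
) -> List[str]:
    """Run `build` repeatedly; bounded here so the sketch terminates."""
    return [build(next_prompt(user_queue)) for _ in range(iterations)]


if __name__ == "__main__":
    queue: Deque[str] = deque(["A recipe-sharing app", "A habit tracker"])
    for result in build_loop(queue, lambda prompt: f"built: {prompt}", iterations=3):
        print(result)
```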
