The autonomous pipeline
The Bugzy team uses a /tasks:send command to send development tasks through an AI-powered pipeline:
Task creation
A developer runs /tasks:send <task>, creating a GitHub issue with the autonomous-pipeline label.
Planning
A planning agent (Claude Code running in GitHub Actions) analyzes the task, creates an implementation plan, and opens a Plan PR with the plan-review label.
Human review
The team reviews the plan in the PR diff. They can iterate by commenting with @claude, and the agent revises the plan based on the feedback. Once satisfied, the team merges the plan PR.
Implementation
Merging the plan PR automatically triggers an implementation agent. It writes the code and opens a Code PR with the implementation label.
Preview deployment
Vercel auto-deploys a preview environment with a Supabase preview branch. The preview is a fully functional copy of the application with the new changes.
Bugzy verification
The deployment webhook triggers Bugzy’s verify-changes task. Bugzy runs tests against the preview URL, using the same tests it would run for any deployment.
Results posted
Bugzy posts test results to the PR as a GitHub check run: “Bugzy QA / Preview Tests”. Pass/fail status, test counts, and any bugs found are visible directly on the PR.
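The steps above can be read as a small state machine. The following is a minimal sketch of that reading, not Bugzy's implementation: the three labels come from the steps, while the stage names, event names, and the `advance` and `run` helpers are hypothetical and exist only for illustration.

```python
from enum import Enum
from typing import Iterable


class Stage(Enum):
    """Hypothetical stage names, one per point where the pipeline waits."""
    ISSUE_OPEN = "issue_open"
    PLAN_REVIEW = "plan_review"
    IMPLEMENTING = "implementing"
    CODE_PR_OPEN = "code_pr_open"
    PREVIEW_READY = "preview_ready"
    RESULTS_POSTED = "results_posted"


# Labels named in the steps above, keyed by the stage where they first appear.
STAGE_LABELS = {
    Stage.ISSUE_OPEN: "autonomous-pipeline",
    Stage.PLAN_REVIEW: "plan-review",
    Stage.CODE_PR_OPEN: "implementation",
}

# (current stage, event) -> next stage. Event names are assumptions.
TRANSITIONS = {
    (Stage.ISSUE_OPEN, "plan_pr_opened"): Stage.PLAN_REVIEW,
    # An @claude comment sends the plan back for revision; review continues.
    (Stage.PLAN_REVIEW, "plan_feedback"): Stage.PLAN_REVIEW,
    (Stage.PLAN_REVIEW, "plan_pr_merged"): Stage.IMPLEMENTING,
    (Stage.IMPLEMENTING, "code_pr_opened"): Stage.CODE_PR_OPEN,
    (Stage.CODE_PR_OPEN, "preview_deployed"): Stage.PREVIEW_READY,
    (Stage.PREVIEW_READY, "results_posted"): Stage.RESULTS_POSTED,
    # A pushed fix leads to a new preview deployment and another verification.
    (Stage.RESULTS_POSTED, "fix_pushed"): Stage.CODE_PR_OPEN,
}


def advance(stage: Stage, event: str) -> Stage:
    """Return the next stage, or raise if the event is out of order."""
    key = (stage, event)
    if key not in TRANSITIONS:
        raise ValueError(f"event {event!r} is not valid in stage {stage.name}")
    return TRANSITIONS[key]


def run(events: Iterable[str], start: Stage = Stage.ISSUE_OPEN) -> Stage:
    """Apply a sequence of events and return the final stage."""
    stage = start
    for event in events:
        stage = advance(stage, event)
    return stage
```

Writing the flow down as an explicit transition table makes the two loops visible: plan revision stays inside plan review, and a pushed fix returns the Code PR to the point where it waits for a fresh preview.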
AI verifying AI
In this pipeline, the code is written by one AI agent (Claude Code) and verified by another (Bugzy). The human’s role shifts from writing and testing code to reviewing plans, checking results, and making final merge decisions.

| Property | Detail |
|---|---|
| Automated QA gate | Every code PR gets tested before human review |
| Preview testing | Tests run against the actual deployed preview, not just the code diff |
| Continuous feedback | If the AI agent pushes fixes, a new preview deploys and Bugzy re-verifies |
| Check run integration | “Bugzy QA / Preview Tests” shows pass/fail status directly on the PR |
| Branch protection | Teams can require the Bugzy check run to pass before merging |
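
To make the check run rows concrete, here is a sketch of how test results might be turned into a request body for GitHub's "create a check run" endpoint. The fields `name`, `head_sha`, `status`, `conclusion`, and `output` belong to that API, and the check name comes from the table. The `PreviewResult` type, the `build_check_run` helper, and the summary wording are assumptions, since the actual payload Bugzy sends is not described here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

CHECK_NAME = "Bugzy QA / Preview Tests"  # check run name quoted in the table


@dataclass
class PreviewResult:
    """Hypothetical container for the outcome of one preview test run."""
    passed: int
    failed: int
    bugs: List[str] = field(default_factory=list)


def build_check_run(head_sha: str, result: PreviewResult) -> Dict[str, Any]:
    """Build a request body for POST /repos/{owner}/{repo}/check-runs."""
    # Assumption: any failed test or reported bug fails the check.
    ok = result.failed == 0 and not result.bugs
    lines = [f"{result.passed} passed, {result.failed} failed"]
    lines += [f"- {bug}" for bug in result.bugs]
    return {
        "name": CHECK_NAME,
        "head_sha": head_sha,
        "status": "completed",
        "conclusion": "success" if ok else "failure",
        "output": {
            "title": "Preview tests passed" if ok else "Preview tests failed",
            "summary": "\n".join(lines),
        },
    }
```

Because branch protection matches required checks by name, keeping the name in one constant avoids a mismatch between the check that gets posted and the check that gates merging.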
Why this matters
As AI agents write more production code, the risk of regressions and functional bugs grows. Manual code review catches logic issues, but functional regressions require running the application. Bugzy fills this gap by automatically testing every deployment, whether the code was written by a human or an AI.
This isn’t theoretical: it’s how the Bugzy platform itself is developed. Every feature, bug fix, and refactor goes through this autonomous pipeline.
Try it yourself
Verify changes
Learn how Bugzy verifies code changes — from human PRs to AI-generated deployments.
Quickstart
Get Bugzy running on your project in 10 minutes.
