Every task Bugzy performs runs inside an isolated, ephemeral Cloud Run container. This page covers the execution infrastructure — what’s inside each container, the step-by-step test pipeline, and where test artifacts are stored.

Execution flow

Trigger (dashboard / webhook / cron / Slack message)
  → API route (Next.js on Vercel)
    → Task queue (Supabase)
      → Cloud Run Job (isolated container)
        → Claude Code agent (Anthropic)
          → MCP servers + Playwright browser
            → Results committed to Git
              → Dashboard updated in real time
Triggers include clicking “Run Tests” in the dashboard, a GitHub push/PR webhook, a cron schedule, or a Slack message mentioning Bugzy. API routes validate the request, create an execution record in Supabase, and enqueue a task.

Cloud Run Jobs pick up the task, decrypt credentials via Cloud KMS, clone the project repo, and launch the Claude Code agent with the appropriate task configuration. Claude Code executes a step-based task using MCP servers (Slack, Jira, GitHub, etc.) and Playwright for browser automation.

Results (test reports, code fixes, bug filings) are committed back to the repo. Real-time updates flow from Supabase subscriptions and Ably channels to the dashboard, so you see execution progress as it happens.
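
The first hops of this flow (validate the request, create an execution record, enqueue a task) can be sketched as follows. This is only an illustration under stated assumptions, not Bugzy's actual code: the type names, `handleTrigger`, and the in-memory arrays are hypothetical stand-ins for the real API route, the Supabase execution table, and the task queue.

```typescript
// Hypothetical names throughout. The two arrays stand in for Supabase tables.
type TriggerSource = "dashboard" | "webhook" | "cron" | "slack";

interface TriggerRequest {
  projectId: string;
  source: TriggerSource;
  selector: string; // e.g. "@smoke", a file path, or "all"
}

interface ExecutionRecord {
  id: string;
  projectId: string;
  source: TriggerSource;
  status: "queued" | "running" | "completed" | "failed";
}

interface QueuedTask {
  executionId: string;
  selector: string;
}

const SOURCES: readonly TriggerSource[] = ["dashboard", "webhook", "cron", "slack"];
const executions: ExecutionRecord[] = [];
const taskQueue: QueuedTask[] = [];

// Reject malformed input before anything is written or enqueued.
function validate(body: unknown): TriggerRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("body must be an object");
  }
  const { projectId, source, selector } = body as Record<string, unknown>;
  if (typeof projectId !== "string" || projectId.length === 0) {
    throw new Error("projectId is required");
  }
  if (typeof source !== "string" || SOURCES.indexOf(source as TriggerSource) === -1) {
    throw new Error("unknown trigger source");
  }
  if (typeof selector !== "string" || selector.length === 0) {
    throw new Error("selector is required");
  }
  return { projectId, source: source as TriggerSource, selector };
}

// Validate, record the execution, then enqueue a task that references it.
function handleTrigger(body: unknown): ExecutionRecord {
  const request = validate(body);
  const record: ExecutionRecord = {
    id: `exec-${executions.length + 1}`,
    projectId: request.projectId,
    source: request.source,
    status: "queued",
  };
  executions.push(record);
  taskQueue.push({ executionId: record.id, selector: request.selector });
  return record;
}
```

Creating the execution record before enqueuing means the dashboard has something to display as soon as the trigger is accepted, even if no job has picked up the task yet.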

Container properties

| Property | Detail |
| --- | --- |
| Runtime | Docker container on Google Cloud Run Jobs |
| Browser | Pre-installed Chromium via Playwright (version-pinned) |
| Credentials | Encrypted at rest with Cloud KMS, decrypted at job start |
| Workspace | Ephemeral: cloned from Git, destroyed after execution |
| MCP servers | Configured per project integrations (Slack, Jira, GitHub, etc.) |
| AI backbone | Claude Code (Anthropic SDK) |

Containers are single-use. No state persists between executions — every run starts from a clean Git clone with fresh credentials.
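
In Bugzy the isolation comes from the container itself, but the same create, use, destroy lifecycle can be illustrated at the directory level. The sketch below is a minimal analogy, not the platform's implementation: `withEphemeralWorkspace` is a hypothetical helper, and where a real job would clone the repository into the directory, the example just writes a file.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical helper: run `task` in a fresh directory and always remove
// the directory afterwards, even if the task throws.
function withEphemeralWorkspace<T>(task: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), "workspace-"));
  try {
    return task(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// Usage: the file exists while the task runs, and nothing is left afterwards.
let seenDir = "";
const existedDuringRun = withEphemeralWorkspace((dir) => {
  seenDir = dir;
  // A real job would clone the project repository here (assumption).
  writeFileSync(join(dir, "manifest.json"), JSON.stringify({ passed: 1 }));
  return existsSync(join(dir, "manifest.json"));
});
console.log(existedDuringRun, existsSync(seenDir)); // prints: true false
```

Putting the cleanup in `finally` is what guarantees that a failed run leaves no state behind for the next one.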

The 15-step test execution pipeline

When you trigger a test run, Bugzy executes a 15-step pipeline:
1. Run Tests Overview: Load task configuration and read tests/CLAUDE.md for project-specific test instructions.
2. Security Notice: Enforce security boundaries; the agent operates within defined permissions.
3. Parse Arguments: Extract test selection criteria: file pattern, tag (@smoke), specific file path, or “all”.
4. Read Test Strategy: Load test-execution-strategy.md for context on test tiers and priorities.
5. Clarification Protocol: If the request is ambiguous, confirm with the team before proceeding.
6. Identify Tests: Resolve the selector to specific test files and confirm the selection.
7. Run Tests: Execute selected Playwright tests inside the container with Chromium. Generate JSON reports.
8. Normalize Results: Convert raw test output into a standardized format.
9. Parse Results: Extract pass/fail status, error messages, screenshots, and traces from JSON reports.
10. Triage Failures: Classify each failure as a product bug (real application issue) or a test issue (broken selector, timing problem, flaky assertion). Uses the knowledge base for accuracy.
11. Fix Test Issues: The test-engineer subagent auto-fixes test issues such as broken selectors, timing problems, and stale references. Retries up to 3 times.
12. Log Product Bugs: File product bugs in your connected issue tracker (Jira, Azure DevOps, Asana, Linear) with screenshots and reproduction steps.
13. Handle Special Cases: Address edge cases such as missing test files, invalid test cases, and browser automation failures.
14. Update Knowledge Base: Record learnings in knowledge-base.md for future triage accuracy.
15. Notify Team: Post a summary to Slack, Microsoft Teams, or email with pass/fail counts, bug links, and fix details.
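
Steps 3 and 6 turn a free-form selector into a concrete set of tests. The sketch below shows one way that mapping could look; it is an assumption-laden illustration rather than the agent's actual logic. `TestSelector`, `parseSelector`, and `toPlaywrightArgs` are hypothetical names, and treating any argument ending in `.spec.ts` as a file path is a simplification. The Playwright side is standard: `--grep` filters by title or tag, and positional arguments are matched against test file paths.

```typescript
// Hypothetical selector model covering the four cases from step 3.
type TestSelector =
  | { kind: "all" }
  | { kind: "tag"; tag: string }
  | { kind: "file"; path: string }
  | { kind: "pattern"; pattern: string };

function parseSelector(raw: string): TestSelector {
  const arg = raw.trim();
  if (arg === "") {
    // An empty selector is ambiguous; step 5 would ask the team instead.
    throw new Error("empty selector: clarification needed");
  }
  if (arg.toLowerCase() === "all") return { kind: "all" };
  if (arg.startsWith("@")) return { kind: "tag", tag: arg };
  if (arg.endsWith(".spec.ts")) return { kind: "file", path: arg };
  return { kind: "pattern", pattern: arg };
}

// Build the argument list for `npx playwright <args>`.
function toPlaywrightArgs(selector: TestSelector): string[] {
  switch (selector.kind) {
    case "all":
      return ["test"];
    case "tag":
      return ["test", "--grep", selector.tag];
    case "file":
      return ["test", selector.path];
    case "pattern":
      // Playwright treats positional arguments as file-path filters.
      return ["test", selector.pattern];
  }
}
```

For example, `toPlaywrightArgs(parseSelector("@smoke"))` yields `["test", "--grep", "@smoke"]`, which corresponds to running `npx playwright test --grep @smoke`.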

Where test artifacts live

All test artifacts are committed to your project’s Git repository:

| Artifact | Location | Format |
| --- | --- | --- |
| Test plan | test-plan.md | Markdown |
| Manual test cases | test-cases/TC-XXX.md | Markdown |
| Automated tests | tests/specs/*.spec.ts | Playwright TypeScript |
| Test configuration | tests/CLAUDE.md | Markdown |
| Knowledge base | knowledge-base.md | Markdown |
| Disputed findings | disputed-findings.md | Markdown |
| Execution results | Dashboard + manifest.json | JSON |

Because everything lives in Git, you get full version history, code review via PRs, and the ability to run tests locally with standard Playwright commands.
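
For local runs, a Playwright configuration along the following lines would pick up the specs from the location listed above. This is a sketch, not the configuration Bugzy generates: the report path `test-results/report.json` and the trace and screenshot settings are assumptions chosen for illustration, and the file requires the `@playwright/test` package to be installed.

```typescript
// playwright.config.ts (illustrative; adjust to the project's real config)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  // Location of automated tests, per the artifact table.
  testDir: "tests/specs",
  // Human-readable output plus a JSON report; the output path is an assumption.
  reporter: [["list"], ["json", { outputFile: "test-results/report.json" }]],
  use: {
    // Keep traces and screenshots only for failing tests.
    trace: "retain-on-failure",
    screenshot: "only-on-failure",
  },
  // Chromium only, matching the browser used inside the container.
  projects: [{ name: "chromium", use: { ...devices["Desktop Chrome"] } }],
});
```

With a configuration like this in place, `npx playwright test` runs the full suite and `npx playwright test --grep @smoke` runs only tests tagged `@smoke`.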