Execution environment

Every task Bugzy performs runs inside an isolated, ephemeral Cloud Run container. This page covers the execution infrastructure — what’s inside each container, the step-by-step test pipeline, and where test artifacts are stored.

Execution flow

Trigger (dashboard / webhook / cron / Slack message)
  → API route (Next.js on Vercel)
    → Task queue (Supabase)
      → Cloud Run Job (isolated container)
        → Claude Code agent (Anthropic)
          → MCP servers + Playwright browser
            → Results committed to Git
              → Dashboard updated in real time

Triggers include clicking “Run Tests” in the dashboard, a GitHub push/PR webhook, a cron schedule, or a Slack message mentioning Bugzy.

API routes validate the request, create an execution record in Supabase, and enqueue a task.

Cloud Run Jobs pick up the task, decrypt credentials via Cloud KMS, clone the project repo, and launch the Claude Code agent with the appropriate task configuration.

Claude Code executes a step-based task using MCP servers (Slack, Jira, GitHub, etc.) and Playwright for browser automation.

Results (test reports, code fixes, bug filings) are committed back to the repo.

Real-time updates flow from Supabase subscriptions and Ably channels to the dashboard, so you see execution progress as it happens.
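
The API-route stage (validate the request, create an execution record, enqueue a task) can be sketched roughly as follows. This is an in-memory illustration only, not Bugzy's actual code: the `TriggerRequest` and `ExecutionRecord` shapes, the `handleTrigger` function, and the two arrays standing in for Supabase are all hypothetical.

```typescript
// Hypothetical shapes; the real Bugzy schema is not described on this page.
type TriggerSource = "dashboard" | "webhook" | "cron" | "slack";

interface TriggerRequest {
  projectId: string;
  source: TriggerSource;
  task: string; // e.g. "run-tests"
}

interface ExecutionRecord {
  id: number;
  projectId: string;
  source: TriggerSource;
  task: string;
  status: "queued" | "running" | "done";
}

// In-memory stand-ins for the Supabase execution table and task queue.
const executions: ExecutionRecord[] = [];
const taskQueue: number[] = [];

const VALID_SOURCES: string[] = ["dashboard", "webhook", "cron", "slack"];

// Validate the request, create an execution record, and enqueue the task.
function handleTrigger(req: TriggerRequest): ExecutionRecord {
  if (req.projectId.trim() === "") {
    throw new Error("projectId is required");
  }
  // Runtime check, since request bodies arrive untyped over HTTP.
  if (VALID_SOURCES.indexOf(req.source) === -1) {
    throw new Error("unknown trigger source: " + req.source);
  }
  const record: ExecutionRecord = {
    id: executions.length + 1,
    projectId: req.projectId,
    source: req.source,
    task: req.task,
    status: "queued",
  };
  executions.push(record);
  taskQueue.push(record.id); // a job runner would later pick this id up
  return record;
}
```

In this sketch the queue holds only execution ids, so the job runner reads everything else from the execution record. That split is an assumption made for illustration.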

Container properties

Property	Detail
Runtime	Docker container on Google Cloud Run Jobs
Browser	Pre-installed Chromium via Playwright (version-pinned)
Credentials	Encrypted at rest with Cloud KMS, decrypted at job start
Workspace	Ephemeral — cloned from Git, destroyed after execution
MCP servers	Configured per project integrations (Slack, Jira, GitHub, etc.)
AI backbone	Claude Code (Anthropic SDK)

Containers are single-use. No state persists between executions — every run starts from a clean Git clone with fresh credentials.

The 15-step test execution pipeline

When you trigger a test run, Bugzy executes a 15-step pipeline:

Run Tests Overview

Load task configuration and read `tests/CLAUDE.md` for project-specific test instructions.

Security Notice

Enforce security boundaries — the agent operates within defined permissions.

Parse Arguments

Extract test selection criteria: file pattern, tag (@smoke), specific file path, or “all”.
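
As a rough illustration of the four selection modes, the sketch below maps a raw argument string to a tagged selector. The `TestSelector` type and `parseSelector` function are hypothetical names, and the rules used to tell a file path from a pattern (no `*` and a `.spec.ts` suffix), as well as treating an empty argument as “all”, are assumptions.

```typescript
// Hypothetical selector model for the four selection modes.
type TestSelector =
  | { kind: "all" }
  | { kind: "tag"; tag: string }
  | { kind: "file"; path: string }
  | { kind: "pattern"; pattern: string };

function parseSelector(raw: string): TestSelector {
  const arg = raw.trim();
  // Assumption: an empty argument means "run everything".
  if (arg === "" || arg.toLowerCase() === "all") {
    return { kind: "all" };
  }
  // Tags are written with a leading "@", e.g. "@smoke".
  if (arg.charAt(0) === "@") {
    return { kind: "tag", tag: arg };
  }
  // A concrete spec file: no wildcard and a .spec.ts suffix.
  if (arg.indexOf("*") === -1 && /\.spec\.ts$/.test(arg)) {
    return { kind: "file", path: arg };
  }
  // Anything else is treated as a file pattern.
  return { kind: "pattern", pattern: arg };
}
```

For example, `parseSelector("@smoke")` yields a `tag` selector, while `parseSelector("tests/specs/checkout*")` yields a `pattern` selector.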

Read Test Strategy

Load `test-execution-strategy.md` for context on test tiers and priorities.

Clarification Protocol

If the request is ambiguous, confirm with the team before proceeding.

Identify Tests

Resolve the selector to specific test files and confirm the selection.

Run Tests

Execute selected Playwright tests inside the container with Chromium. Generate JSON reports.

Normalize Results

Convert raw test output into a standardized format.

Parse Results

Extract pass/fail status, error messages, screenshots, and traces from JSON reports.
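
A minimal sketch of this extraction is shown below. The interfaces model only a simplified subset of the output of Playwright's JSON reporter (nested suites, specs, and per-retry results), and the `ParsedTest` shape and `parseReport` function are hypothetical. Check the field names against the report your Playwright version actually emits.

```typescript
// Simplified subset of a Playwright JSON report; only the fields used here.
interface Attachment { name: string; path?: string; contentType: string; }
interface TestResult { status: string; error?: { message?: string }; attachments?: Attachment[]; }
interface TestEntry { results: TestResult[]; }
interface Spec { title: string; ok: boolean; tests: TestEntry[]; }
interface Suite { title: string; file?: string; specs?: Spec[]; suites?: Suite[]; }
interface Report { suites: Suite[]; }

// Hypothetical flattened record, one per spec.
interface ParsedTest {
  title: string;
  file: string;
  passed: boolean;
  errorMessage: string | null;
  attachmentPaths: string[]; // screenshots, traces, and other files
}

function parseReport(report: Report): ParsedTest[] {
  const out: ParsedTest[] = [];

  // Suites nest, so walk them recursively and inherit the file name.
  const walk = (suite: Suite, inheritedFile: string): void => {
    const file = suite.file ?? inheritedFile;
    for (const spec of suite.specs ?? []) {
      let errorMessage: string | null = null;
      const attachmentPaths: string[] = [];
      for (const entry of spec.tests) {
        // The last result is the final outcome after any retries.
        const last = entry.results[entry.results.length - 1];
        if (last === undefined) continue;
        if (errorMessage === null && last.error && last.error.message) {
          errorMessage = last.error.message;
        }
        for (const a of last.attachments ?? []) {
          if (a.path) attachmentPaths.push(a.path);
        }
      }
      out.push({ title: spec.title, file, passed: spec.ok, errorMessage, attachmentPaths });
    }
    for (const child of suite.suites ?? []) walk(child, file);
  };

  for (const top of report.suites) walk(top, "");
  return out;
}
```

The sketch reads only the final result of each test entry, so attachments from earlier retries are ignored. Keep them if you need traces from flaky attempts.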

Triage Failures

Classify each failure as a product bug (a real application issue) or a test issue (broken selector, timing problem, flaky assertion). Triage consults the knowledge base to improve classification accuracy.
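
To make the two-way classification concrete, here is a deliberately naive sketch. Bugzy's actual triage is performed by the agent and is not specified on this page. The `triageFailure` function, the `KnownPattern` shape, and the regular expressions are all hypothetical, chosen only to show how knowledge-base entries could take precedence over generic hints.

```typescript
type FailureClass = "product-bug" | "test-issue";

// Hypothetical knowledge-base entry: a pattern learned from an earlier run.
interface KnownPattern {
  pattern: RegExp;
  classification: FailureClass;
}

// Illustrative hints only; these are not Bugzy's actual rules.
const TEST_ISSUE_HINTS: RegExp[] = [
  /strict mode violation/i,   // selector matched more than one element
  /timeout \d+ms exceeded/i,  // often a timing problem in the test
  /element is not attached/i, // stale reference
];

function triageFailure(errorMessage: string, knowledgeBase: KnownPattern[]): FailureClass {
  // Prior learnings take precedence over the generic hints.
  for (const known of knowledgeBase) {
    if (known.pattern.test(errorMessage)) return known.classification;
  }
  for (const hint of TEST_ISSUE_HINTS) {
    if (hint.test(errorMessage)) return "test-issue";
  }
  // Default: treat unexplained failures as potential product bugs.
  return "product-bug";
}
```

Defaulting to `product-bug` is a design choice in this sketch: it prefers reporting a possible application issue over silently rewriting a test.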

Fix Test Issues

The test-engineer subagent auto-fixes test issues such as broken selectors, timing problems, and stale references. It retries up to 3 times.
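
The retry limit can be pictured as a bounded loop. The sketch below assumes that “up to 3 times” means three attempts in total. `fixWithRetries` and its `attemptFix` callback are hypothetical and do not reflect how the subagent is actually invoked.

```typescript
const MAX_FIX_ATTEMPTS = 3; // assumption: three attempts in total

interface FixOutcome {
  fixed: boolean;
  attempts: number;
}

// `attemptFix` is a hypothetical callback: apply a candidate fix, rerun the
// test, and report whether it now passes.
function fixWithRetries(attemptFix: (attempt: number) => boolean): FixOutcome {
  for (let attempt = 1; attempt <= MAX_FIX_ATTEMPTS; attempt++) {
    if (attemptFix(attempt)) {
      return { fixed: true, attempts: attempt };
    }
  }
  // All attempts used; the failure stays unresolved.
  return { fixed: false, attempts: MAX_FIX_ATTEMPTS };
}
```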

Log Product Bugs

File product bugs in your connected issue tracker (Jira, Azure DevOps, Asana, Linear) with screenshots and reproduction steps.

Handle Special Cases

Address edge cases — missing test files, invalid test cases, browser automation failures.

Update Knowledge Base

Record learnings in `knowledge-base.md` for future triage accuracy.

Notify Team

Post a summary to Slack, Microsoft Teams, or email with pass/fail counts, bug links, and fix details.
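
As an illustration of what such a summary might contain, the sketch below builds a plain-text message from run totals. The `RunSummary` shape, the `formatSummary` function, and the message wording are hypothetical. The actual message format is not documented here.

```typescript
// Hypothetical summary of one test run.
interface RunSummary {
  passed: number;
  failed: number;
  bugLinks: string[];   // links to issues filed in the tracker
  fixedTests: string[]; // test files that were auto-fixed
}

function formatSummary(s: RunSummary): string {
  const lines: string[] = ["Test run: " + s.passed + " passed, " + s.failed + " failed"];
  // Optional sections appear only when there is something to report.
  if (s.bugLinks.length > 0) {
    lines.push("Bugs filed: " + s.bugLinks.join(", "));
  }
  if (s.fixedTests.length > 0) {
    lines.push("Tests auto-fixed: " + s.fixedTests.join(", "));
  }
  return lines.join("\n");
}
```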

Where test artifacts live

All test artifacts are committed to your project’s Git repository:

Artifact	Location	Format
Test plan	`test-plan.md`	Markdown
Manual test cases	`test-cases/TC-XXX.md`	Markdown
Automated tests	`tests/specs/*.spec.ts`	Playwright TypeScript
Test configuration	`tests/CLAUDE.md`	Markdown
Knowledge base	`knowledge-base.md`	Markdown
Disputed findings	`disputed-findings.md`	Markdown
Execution results	Dashboard + `manifest.json`	JSON

Because everything lives in Git, you get full version history, code review via PRs, and the ability to run tests locally with standard Playwright commands.
