Architecture
System design, security architecture, and runtime execution
The Five-Stage Pipeline
Every agentic workflow moves through five stages. Each stage enforces distinct security properties — failures in later stages cannot bypass constraints established by earlier ones.
Key insight: The Markdown file you edit is not what GitHub Actions executes. gh-aw compiles your declarative intent into a deterministic .lock.yml execution plan. Editing the Markdown without recompiling changes nothing at runtime.
| Stage | Input | Output | Security Property |
|---|---|---|---|
| Author | Natural language + YAML frontmatter | .md file | Human-readable, version-controlled intent |
| Compile | .md file + imports | .lock.yml + actions-lock.json | Schema validation, SHA pinning, security scanning |
| Execute | .lock.yml on Actions runner | Buffered artifacts | Read-only permissions, network firewall, container isolation |
| Detect | Buffered artifacts | Pass/fail verdict | AI-powered threat analysis in isolated job |
| Apply | Approved artifacts | GitHub API writes | Scoped write permissions per output type |
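As a concrete illustration of the Author stage, a minimal workflow file might look like the sketch below. The trigger, permission, and safe-output field names here are illustrative assumptions, not authoritative gh-aw syntax:

```markdown
---
# Illustrative frontmatter sketch; exact field names may differ
on:
  issues:
    types: [opened]
permissions: read-all      # agent job itself stays read-only
engine: copilot
safe-outputs:
  add-comment:             # the only write this workflow may externalize
---

# Triage new issues

Read the newly opened issue and post a short triage comment
summarizing its category and likely severity.
```

The natural-language body below the frontmatter becomes the agent's instructions; the frontmatter declares the triggers and outputs that compilation turns into the .lock.yml plan.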
Four-Phase Compilation Pipeline
The gh aw compile command transforms your Markdown workflow through four sequential security phases before producing the .lock.yml execution plan.
Schema Validation
Every frontmatter field is validated against a JSON schema. Invalid field names, wrong types, and malformed configurations are rejected before any code is generated.
Invalid fields → rejected
Expression Safety
Only allowlisted GitHub Actions expressions are permitted. Expressions referencing secrets.* in unsafe positions are blocked — preventing injection attacks.
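For instance, a prompt body that interpolates a secret directly would be rejected at compile time. A hypothetical example of the kind of expression that is blocked (not actual compiler output):

```text
Run the deploy script using the token ${{ secrets.DEPLOY_TOKEN }}.
^ blocked: secrets.* is not an allowlisted expression in this position
```

Allowlisted context expressions (such as event metadata) pass through; anything that could leak credentials into the prompt or enable injection does not.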
Action SHA Pinning
Every Action reference is resolved to its full commit SHA and recorded in actions-lock.json. Defends against tag-hijacking and supply-chain attacks.
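In the compiled output, a mutable tag reference becomes an immutable commit SHA. A sketch of the shape of this transformation (the SHA below is a placeholder, not a real pin):

```yaml
# Authored reference (mutable tag):
#   uses: actions/checkout@v4
# Compiled reference in .lock.yml (placeholder SHA, shown for shape only):
- uses: actions/checkout@0000000000000000000000000000000000000000 # v4
```

Because the SHA is also recorded in actions-lock.json, a later retag of `v4` upstream cannot silently change what the workflow executes.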
Security Scanners
The compiled output passes through actionlint (workflow linting), zizmor (vulnerability detection), and poutine (supply-chain risk analysis).
actionlint
Workflow structure linting with integrated shellcheck and pyflakes. Catches syntax errors, type mismatches, and workflow misconfigurations.
zizmor
Security vulnerability detection. Scans for privilege escalation paths, security misconfigurations, and dangerous patterns in the compiled workflow.
poutine
Supply-chain risk analysis. Evaluates third-party action risks, dependency vulnerabilities, and potential for compromised upstream actions.
Strict Mode (default): Additionally enforces no write permissions on agent jobs, explicit network configuration, no wildcard domains in allowlists, and no deprecated fields.
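A hypothetical frontmatter sketch of a workflow that satisfies strict mode's constraints (the `strict` field name and exact keys are assumptions):

```yaml
strict: true               # assumed field: opt into strict compilation
permissions: read-all      # strict mode rejects write permissions on agent jobs
network:
  allowed:
    - "api.example.com"    # explicit domain, no wildcards
```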
Three-Layer Trust Model
Each layer enforces distinct security properties under different failure assumptions. Breaching one layer does not compromise the guarantees of layers below it.
Staged Execution & Safe Outputs
Constrains component behavior over time. Decomposes workflows into stages with distinct active components, permissions, and admissible data consumers. The SafeOutputs subsystem is the primary instantiation.
Compiler, Schemas & SHA Pinning
Declarative artifacts and toolchains that instantiate the system’s structure. Controls which components are loaded, how they connect, which channels are permitted, and what privileges each receives. Token distribution is controlled here.
VM, Docker, iptables & Hardware
Hardware and OS foundation. Provides memory isolation, CPU/resource isolation, and kernel-enforced communication boundaries. Guarantees hold even if the agent container is fully compromised.
Trust violation at the substrate layer requires exploiting the container runtime, kernel, hypervisor, or hardware. If this layer fails, all higher-level guarantees are void — but this is the most difficult layer to breach.
Privileged Containers
Three privileged containers run alongside the untrusted agent container, each responsible for mediating a specific concern. The agent never directly accesses secrets, the network, or MCP servers.
Network Firewall
Creates a private Docker network, binds the agent, and routes all traffic through a Squid proxy enforcing domain allowlists.
- Creates private network (172.30.0.0/24)
- iptables-based traffic redirection
- Domain allowlist enforcement via Squid
- Drops own iptables capabilities before agent launch
API Proxy
Routes LLM traffic and holds endpoint-specific credentials. The agent sends unauthenticated requests; the proxy injects auth headers.
- Holds API tokens (agent has none)
- Injects Authorization header
- Supports OpenAI, Anthropic, Copilot
- Captures request/response metadata
MCP Gateway
Spawns isolated MCP server containers via the Docker socket. Each server runs in its own container with no shared state.
- Per-container domain allowlists
- Tool allowlisting (explicit allowed: lists)
- Secret injection via env vars only
- Audit logging of all tool invocations
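A tool allowlist of the kind the gateway enforces might be declared in frontmatter roughly as follows. The field names and the custom-server image are illustrative assumptions:

```yaml
tools:
  github:
    allowed: [get_issue, list_issues]     # only these tools are exposed to the agent
  custom-server:
    container: ghcr.io/example/mcp-server # hypothetical image name
    env:
      API_KEY: ${{ secrets.EXAMPLE_KEY }} # injected by the gateway; never visible to the agent
```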
Runtime Execution Flow
When a workflow triggers, the agent job follows a precise sequence of setup steps before the AI engine enters its execution loop.
Repository Checkout
Standard actions/checkout (SHA-pinned) clones the repository into the runner workspace.
Runtime Setup
Language runtimes (Node.js, Python, Go) installed via setup actions. Dependencies resolved and cached.
Cache Restore
Dependency caches restored from previous runs for faster startup. Reduces re-installation overhead.
MCP Container Start
MCP Gateway spawns isolated server containers via Docker socket. Each server gets its own network and tool allowlist.
Prompt Generation
Workflow context, sanitized user input, and Markdown instructions assembled into the agent prompt.
AI Engine Loop
The selected engine (Copilot, Claude, Codex) reads the prompt, uses MCP tools, writes code. All GitHub writes are buffered by SafeOutputs MCP — never applied directly. Network requests pass through AWF’s Squid proxy; LLM calls route through the API Proxy.
Secret Redaction
Runs unconditionally (if: always()). Scans all files in /tmp/gh-aw and replaces detected secrets with masked format (first 3 chars + asterisks).
Upload Output Artifacts
Buffered artifacts (agent_output.json, aw.patch, prompt.txt) uploaded via Actions artifacts for the detection job.
Save Cache
Updated dependency caches saved for subsequent runs.
Artifact Buffering & Threat Detection
The agent cannot write directly to the repository or GitHub API. Every output flows through a four-stage security pipeline before reaching the outside world.
Buffer
Agent job produces agent_output.json, aw.patch, and prompt.txt. All writes go to SafeOutputs MCP — never the API.
Redact
Secret redaction scans all output files (.txt, .json, .log, .md, .yml). Detected secrets masked: abc*****.
Detect
Isolated detection job runs AI + custom scanners. Analyzes for secret leaks, malicious patches, and policy violations. Produces pass/fail verdict.
Apply
Only if detection passes. Each safe output job receives the minimum write permission needed. Blocked if any threat detected.
Safe Output Permission Mapping
| Safe Output Job | Required Permission | Operation |
|---|---|---|
| create_issue | issues: write | Creates a new GitHub issue |
| add_comment | issues: write | Adds a comment to an issue or PR |
| create_pull_request | contents: write, pull-requests: write | Creates a PR from the buffered patch |
| add_labels | issues: write | Applies labels to issues or PRs |
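Declaring safe outputs in frontmatter is what grants the corresponding apply jobs their scoped permissions. A hedged sketch, where the option names under each output are assumptions:

```yaml
safe-outputs:
  create-issue:
    max: 1                    # assumed option: cap issues opened per run
  add-comment:
  add-labels:
    allowed: [bug, triage]    # assumed option: restrict applicable labels
```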
Customizable Threat Detection
Extend detection with custom prompts and external security scanners:
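A hypothetical configuration, assuming a `threat-detection` block that accepts an extra analysis `prompt` and additional scanner `steps` (these key names are assumptions, not confirmed syntax):

```yaml
safe-outputs:
  threat-detection:
    prompt: |
      Also flag any patch that modifies CI configuration files.
    steps:
      - name: Run TruffleHog
        uses: trufflesecurity/trufflehog@main   # pin to a full SHA in practice
```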
Content Sanitization
Before any user-generated content reaches the agent, it passes through seven sanitization mechanisms at the activation stage boundary — ensuring the agent never processes raw, potentially adversarial input.
| Mechanism | Transformation | Protection |
|---|---|---|
| ① @mention neutralization | @user → (escaped) | Prevents unintended notifications and social engineering |
| ② Bot trigger protection | fixes #123 → (escaped) | Prevents automatic issue linking and closure |
| ③ XML/HTML tag conversion | <script> → (script) | Prevents injection via markup tags |
| ④ URI filtering | http://evil.com → (redacted) | Restricts to HTTPS from trusted domains only |
| ⑤ Content limits | Large payloads → truncated | 0.5 MB max, 65k lines max to prevent denial-of-service |
| ⑥ Control character removal | ANSI escapes → stripped | Removes terminal manipulation sequences |
| ⑦ Unicode normalization | Special chars → normalized | Prevents homoglyph and confusable character attacks |
Job DAG Structure
The compiled .lock.yml decomposes into a strict Directed Acyclic Graph (DAG) of jobs. Each job boundary is a trust boundary — data flows only through Actions artifacts, never shared memory.
Pre-Activation
- Role & permission checks
- Deadline validation
- Skip-if-match evaluation
- Command position validation
Activation
- Prepare workflow context
- Sanitize user input (7 mechanisms)
- Validate lock file integrity
Agent Job
- Repository checkout & runtime setup
- Cache restore & MCP container start
- Prompt generation & AI engine execution
- Secret redaction (if: always())
- Upload output artifacts
Detection Job
- Download buffered artifacts
- AI-powered threat analysis
- Custom scanner steps (TruffleHog, Semgrep)
- Emit pass/fail security verdict
Safe Output Jobs
- Scoped write permissions
- Create issues, PRs, comments
- Apply labels
Blocked
- No writes externalized
- Workflow fails
- Artifacts preserved for audit
Conclusion
- Aggregate results
- Generate summary
Trust boundary enforcement: Data flows between jobs only through Actions artifacts and outputs — never through shared memory or filesystem. A compromised agent job cannot influence detection or safe output jobs except through its declared artifacts. The DAG is expressed natively via Actions’ needs: syntax.
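In compiled form, the DAG above is expressed as ordinary `needs:` wiring between jobs. A simplified sketch of the job skeleton (job names and the omitted step bodies are illustrative):

```yaml
jobs:
  activation:
    # context preparation and input sanitization
  agent:
    needs: activation
    permissions:
      contents: read      # read-only: the agent job cannot write to GitHub
  detection:
    needs: agent          # consumes only the agent's uploaded artifacts
  create_issue:
    needs: detection      # runs only if the detection verdict passes
    permissions:
      issues: write       # minimum scope for this output type
```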
Agent Runtime Environment
The agent runs inside the AWF’s containerized environment with controlled filesystem access, network isolation, and engine selection.
Chroot Mode
Many coding agents expect full host access. The AWF provides a chroot mode that gives the agent access to host-installed tools while maintaining isolation:
- Mounts the entire VM filesystem read-only at /host
- Overlays sensitive directories with empty tmpfs layers
- Mounts HOME and /tmp read-write
- Imports a subset of host env vars (USER, PATH)
- Launches the agent in a /host chroot jail
This lets the agent use Python, Node.js, Go at their normal paths while the AWF controls network and secrets.
Supported AI Engines
The engine: frontmatter field selects which AI agent runs inside the container:
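For example (the engine identifiers are assumptions inferred from the engines named in this document):

```yaml
engine: claude      # or: copilot, codex
```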
Domain Allowlist Configuration
The AWF enforces network rules from the workflow frontmatter. Ecosystem bundles provide pre-configured domain sets for common toolchains:
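A hedged sketch of a network allowlist combining an ecosystem bundle with an explicit domain (the `defaults` and `node` bundle identifiers are assumptions):

```yaml
network:
  allowed:
    - defaults            # assumed bundle: basic infrastructure domains
    - node                # assumed bundle: npm registry and related hosts
    - "api.example.com"   # explicit extra domain; wildcards are rejected in strict mode
```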
Logging at Every Trust Boundary
Every trust boundary doubles as a logging point — and every logging point is a potential mediation point for future information-flow controls.
The Critical Distinction
"By default, everything in an action runs in the same trust domain. Rogue agents can interfere with MCP servers, access authentication secrets, and make network requests to arbitrary hosts."
gh-aw constrains agent capabilities through architecture rather than relying on procedural safeguards. The system makes it structurally impossible for agents to access secrets, write directly to the repository, or communicate with unauthorized endpoints — even if the agent is fully compromised.