🏗️ Architecture

System design, security architecture, and runtime execution

The Five-Stage Pipeline

Every agentic workflow moves through five stages. Each stage enforces distinct security properties — failures in later stages cannot bypass constraints established by earlier ones.

1. Author (.md file): Natural language intent + YAML frontmatter. Human-readable, version-controlled.
2. Compile (.lock.yml): Deterministic execution plan. Schema validation, SHA pinning, security scanning.
3. Execute (Agent Job): Sandboxed AI runs with read-only permissions. All writes buffered as artifacts.
4. Detect (Threat Analysis): AI-powered detection reviews buffered artifacts for secrets, threats, and policy violations.
5. Apply (Safe Outputs): Scoped write permissions. Only approved artifacts reach the GitHub API.

Key insight: The Markdown file you edit is not what GitHub Actions executes. gh-aw compiles your declarative intent into a deterministic .lock.yml execution plan. Editing the Markdown without recompiling changes nothing at runtime.
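As a minimal sketch of the authoring format (the field names here are illustrative; consult the gh-aw frontmatter schema for the exact set), a workflow file pairs YAML frontmatter with natural-language instructions:

```markdown
---
# Illustrative frontmatter; real field names may differ in detail
on:
  issues:
    types: [opened]
permissions: read-all
engine: copilot
safe-outputs:
  add-comment: {}
---

# Triage new issues

Read the issue body and add a comment summarizing severity and affected area.
```

Running `gh aw compile` then emits the `.lock.yml`; edits to this file change nothing at runtime until it is recompiled.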

| Stage | Input | Output | Security Property |
| --- | --- | --- | --- |
| Author | Natural language + YAML frontmatter | .md file | Human-readable, version-controlled intent |
| Compile | .md file + imports | .lock.yml + actions-lock.json | Schema validation, SHA pinning, security scanning |
| Execute | .lock.yml on Actions runner | Buffered artifacts | Read-only permissions, network firewall, container isolation |
| Detect | Buffered artifacts | Pass/fail verdict | AI-powered threat analysis in isolated job |
| Apply | Approved artifacts | GitHub API writes | Scoped write permissions per output type |

Four-Phase Compilation Pipeline

The gh aw compile command transforms your Markdown workflow through four sequential security phases before producing the .lock.yml execution plan.

1. Schema Validation: Every frontmatter field is validated against a JSON schema. Invalid field names, wrong types, and malformed configurations are rejected before any code is generated.
2. Expression Safety: Only allowlisted GitHub Actions expressions are permitted. Expressions referencing secrets.* in unsafe positions are blocked, preventing injection attacks.
3. Action SHA Pinning: Every action reference is resolved to its full commit SHA and recorded in actions-lock.json (actions/checkout@v4 → SHA). This defends against tag hijacking and supply-chain attacks.
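In the compiled output, tag references become pinned SHAs. A sketch of what a pinned step looks like (the SHA below is a placeholder, not a real checkout commit):

```yaml
# Before compilation (.md):
#   uses: actions/checkout@v4
# After compilation (.lock.yml), with a placeholder SHA:
- uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v4
```

The tag-to-SHA mapping is recorded in actions-lock.json so later compiles reuse the same pin.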
4. Security Scanners: The compiled output passes through actionlint (workflow linting), zizmor (vulnerability detection), and poutine (supply-chain risk analysis).
  • actionlint (linting): Workflow structure linting with integrated shellcheck and pyflakes. Catches syntax errors, type mismatches, and workflow misconfigurations.
  • zizmor (security): Security vulnerability detection. Scans for privilege escalation paths, security misconfigurations, and dangerous patterns in the compiled workflow.
  • poutine (supply chain): Supply-chain risk analysis. Evaluates third-party action risks, dependency vulnerabilities, and potential for compromised upstream actions.

Strict Mode (default): Additionally enforces no write permissions on agent jobs, explicit network configuration, no wildcard domains in allowlists, and no deprecated fields.
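An illustrative frontmatter fragment that satisfies strict mode's intent (a sketch, not the full schema; field names follow the examples elsewhere on this page):

```yaml
permissions: read-all   # strict mode: no write permissions on the agent job
network:
  firewall: true        # explicit network configuration required
  allowed:
    - defaults          # no wildcard domains in the allowlist
    - "api.example.com"
```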

Three-Layer Trust Model

Each layer enforces distinct security properties under different failure assumptions. Breaching one layer does not compromise the guarantees of layers below it.

Layer 3 — Plan-Level Trust

Staged Execution & Safe Outputs

Constrains component behavior over time. Decomposes workflows into stages with distinct active components, permissions, and admissible data consumers. The SafeOutputs subsystem is the primary instantiation.

Pre-activation checks · Content sanitization · Agent execution (read-only) · Secret redaction · Threat detection · Safe output application · Temporal constraints
Layer 2 — Configuration-Level Trust

Compiler, Schemas & SHA Pinning

Declarative artifacts and toolchains that instantiate the system’s structure. Controls which components are loaded, how they connect, which channels are permitted, and what privileges each receives. Token distribution is controlled here.

JSON Schema validation · Expression allowlist · SHA pinning · Security scanners · Strict mode · Token distribution
Layer 1 — Substrate-Level Trust

VM, Docker, iptables & Hardware

Hardware and OS foundation. Provides memory isolation, CPU/resource isolation, and kernel-enforced communication boundaries. Guarantees hold even if the agent container is fully compromised.

CPU / MMU / Kernel · Hypervisor · Docker runtime · Network Firewall (AWF) · API Proxy · MCP Gateway · iptables

Trust violation at the substrate layer requires exploiting the container runtime, kernel, hypervisor, or hardware. If this layer fails, all higher-level guarantees are void — but this is the most difficult layer to breach.

Privileged Containers

Three privileged containers run alongside the untrusted agent container, each responsible for mediating a specific concern. The agent never directly accesses secrets, the network, or MCP servers.

🛡️ Network Firewall (AWF, the Agent Workflow Firewall)

Creates a private Docker network, binds the agent, and routes all traffic through a Squid proxy enforcing domain allowlists.

  • Creates private network (172.30.0.0/24)
  • iptables-based traffic redirection
  • Domain allowlist enforcement via Squid
  • Drops own iptables capabilities before agent launch
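The enforcement itself lives in Squid's ACL configuration; the matching rule it implements can be sketched in Python (an illustration, not gh-aw code): a host is allowed if it equals an allowlisted domain or is a subdomain of one.

```python
def is_allowed(host: str, allowlist: list[str]) -> bool:
    """Illustrative domain-allowlist check: exact match or subdomain match.
    The real enforcement is done by Squid ACLs inside the AWF proxy."""
    host = host.lower().rstrip(".")
    return any(
        host == entry.lower() or host.endswith("." + entry.lower())
        for entry in allowlist
    )

allowed = ["pypi.org", "registry.npmjs.org"]
# is_allowed("files.pypi.org", allowed) -> True
# is_allowed("evil.com", allowed)       -> False
```

Note the leading-dot check: `notpypi.org` does not match `pypi.org`, so attackers cannot smuggle traffic through lookalike domains.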
🔑 API Proxy (token holder)

Routes LLM traffic and holds endpoint-specific credentials. The agent sends unauthenticated requests; the proxy injects auth headers.

  • Holds API tokens (agent has none)
  • Injects Authorization header
  • Supports OpenAI, Anthropic, Copilot
  • Captures request/response metadata
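The proxy's core move can be sketched as a pure function (illustrative, not the actual implementation): strip whatever credentials the agent attempted to send, then attach the real token that only the proxy holds.

```python
def inject_auth(headers: dict[str, str], token: str) -> dict[str, str]:
    """Illustrative API Proxy step: the agent's request arrives without valid
    credentials; the proxy injects the Authorization header before forwarding."""
    forwarded = dict(headers)              # copy; never mutate the agent's request
    forwarded.pop("Authorization", None)   # discard anything the agent supplied
    forwarded["Authorization"] = f"Bearer {token}"
    return forwarded
```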
🔌 MCP Gateway (gh-aw-mcpg)

Spawns isolated MCP server containers via the Docker socket. Each server runs in its own container with no shared state.

  • Per-container domain allowlists
  • Tool allowlisting (explicit allowed: lists)
  • Secret injection via env vars only
  • Audit logging of all tool invocations
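Gateway-side tool allowlisting can be sketched the same way (illustrative only): of the tools an MCP server advertises, only those on the workflow's explicit allowed: list are exposed to the agent.

```python
def filter_tools(advertised: list[str], allowed: set[str]) -> list[str]:
    """Illustrative MCP Gateway filtering: hide every advertised tool that
    is not explicitly allowlisted in the workflow configuration."""
    return [tool for tool in advertised if tool in allowed]

# filter_tools(["read_file", "delete_repo"], {"read_file"}) -> ["read_file"]
```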
Inside the Actions Runner VM, the AWF private Docker network (172.30.0.0/24) connects:

  • 🤖 Agent Container (172.30.0.20): runs Copilot / Claude / Codex; all HTTP egress routes to the Squid proxy
  • 🔒 Squid Proxy (172.30.0.10): enforces the domain allowlist; allowed domains pass, blocked requests are dropped
  • 🔑 API Proxy: holds the API token
  • 🔌 MCP Gateway (host:80 → 8000): spawns MCP server containers via Docker

Runtime Execution Flow

When a workflow triggers, the agent job follows a precise sequence of setup steps before the AI engine enters its execution loop.

1. Repository Checkout: Standard actions/checkout (SHA-pinned) clones the repository into the runner workspace.
2. Runtime Setup: Language runtimes (Node.js, Python, Go) are installed via setup actions; dependencies are resolved and cached.
3. Cache Restore: Dependency caches are restored from previous runs for faster startup, reducing re-installation overhead.
4. MCP Container Start: The MCP Gateway spawns isolated server containers via the Docker socket. Each server gets its own network and tool allowlist.
5. Prompt Generation: Workflow context, sanitized user input, and Markdown instructions are assembled into the agent prompt.
6. AI Engine Loop: The selected engine (Copilot, Claude, Codex) reads the prompt, uses MCP tools, and writes code. All GitHub writes are buffered by the SafeOutputs MCP, never applied directly. Network requests pass through AWF's Squid proxy; LLM calls route through the API Proxy.
7. Secret Redaction: Runs unconditionally (if: always()). Scans all files in /tmp/gh-aw and replaces detected secrets with a masked form (first 3 characters + asterisks).
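The masking rule described in the redaction step can be sketched as follows (an approximation; gh-aw's scanner logic and mask length may differ in detail):

```python
def mask_secret(secret: str) -> str:
    """Illustrative redaction: keep the first 3 characters so a leak is
    identifiable in logs, replace the rest with asterisks."""
    if len(secret) <= 3:
        return "***"
    return secret[:3] + "*" * (len(secret) - 3)

# mask_secret("abcdefgh") -> "abc*****"
```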

8. Upload Output Artifacts: Buffered artifacts (agent_output.json, aw.patch, prompt.txt) are uploaded via Actions artifacts for the detection job.
9. Save Cache: Updated dependency caches are saved for subsequent runs.

Artifact Buffering & Threat Detection

The agent cannot write directly to the repository or GitHub API. Every output flows through a four-stage security pipeline before reaching the outside world.

1. 📝 Buffer: The agent job produces agent_output.json, aw.patch, and prompt.txt. All writes go to the SafeOutputs MCP, never the API.
2. 🔒 Redact: Secret redaction scans all output files (.txt, .json, .log, .md, .yml). Detected secrets are masked: abc*****.
3. 🔍 Detect: An isolated detection job runs AI + custom scanners, analyzing for secret leaks, malicious patches, and policy violations. Produces a pass/fail verdict.
4. ✅ Apply: Only if detection passes. Each safe output job receives the minimum write permission needed. Blocked if any threat is detected.

Safe Output Permission Mapping

| Safe Output Job | Required Permission | Operation |
| --- | --- | --- |
| create_issue | issues: write | Creates a new GitHub issue |
| add_comment | issues: write | Adds a comment to an issue or PR |
| create_pull_request | contents: write, pull-requests: write | Creates a PR from the buffered patch |
| add_labels | issues: write | Applies labels to issues or PRs |

Customizable Threat Detection

Extend detection with custom prompts and external security scanners:

```yaml
threat-detection:
  prompt: |
    Additionally check for:
    - References to internal infrastructure URLs
    - Attempts to modify CI/CD configuration files
    - Changes to security-sensitive files
  steps:
    - name: Run TruffleHog
      run: trufflehog filesystem /tmp/gh-aw --only-verified
    - name: Run Semgrep
      run: semgrep scan /tmp/gh-aw/aw.patch --config=auto
```

Content Sanitization

Before any user-generated content reaches the agent, it passes through seven sanitization mechanisms at the activation stage boundary — ensuring the agent never processes raw, potentially adversarial input.

| Mechanism | Transformation | Protection |
| --- | --- | --- |
| @mention neutralization | @user → `@user` | Prevents unintended notifications and social engineering |
| Bot trigger protection | fixes #123 → `fixes #123` | Prevents automatic issue linking and closure |
| XML/HTML tag conversion | <script> → (script) | Prevents injection via markup tags |
| URI filtering | http://evil.com → (redacted) | Restricts to HTTPS from trusted domains only |
| Content limits | Large payloads truncated | 0.5 MB max, 65k lines max to prevent denial-of-service |
| Control character removal | ANSI escapes stripped | Removes terminal manipulation sequences |
| Unicode normalization | Special chars normalized | Prevents homoglyph and confusable character attacks |
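Two of the seven mechanisms can be sketched in Python (approximations of the behavior described above, not gh-aw's actual implementation):

```python
import re

def neutralize_mentions(text: str) -> str:
    """Wrap @mentions in backticks so GitHub renders them without
    notifying the named user (approximation)."""
    return re.sub(r"(?<![`\w])@([A-Za-z0-9-]+)", r"`@\1`", text)

def convert_tags(text: str) -> str:
    """Rewrite <tag>-style markup as (tag) to defuse markup injection."""
    return re.sub(r"</?([A-Za-z][A-Za-z0-9-]*)[^>]*>", r"(\1)", text)

# neutralize_mentions("thanks @alice!")        -> "thanks `@alice`!"
# convert_tags("<script>alert(1)</script>")    -> "(script)alert(1)(script)"
```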

Job DAG Structure

The compiled .lock.yml decomposes into a strict Directed Acyclic Graph (DAG) of jobs. Each job boundary is a trust boundary — data flows only through Actions artifacts, never shared memory.

Gate: Pre-Activation
  • Role & permission checks
  • Deadline validation
  • Skip-if-match evaluation
  • Command position validation

Setup: Activation
  • Prepare workflow context
  • Sanitize user input (7 mechanisms)
  • Validate lock file integrity

Core: Agent Job
  • Repository checkout & runtime setup
  • Cache restore & MCP container start
  • Prompt generation & AI engine execution
  • Secret redaction (if: always())
  • Upload output artifacts

Security: Detection Job
  • Download buffered artifacts
  • AI-powered threat analysis
  • Custom scanner steps (TruffleHog, Semgrep)
  • Emit pass/fail security verdict

✓ Safe: Safe Output Jobs
  • Scoped write permissions
  • Create issues, PRs, comments
  • Apply labels

✗ Threat: Blocked
  • No writes externalized
  • Workflow fails
  • Artifacts preserved for audit

Final: Conclusion
  • Aggregate results
  • Generate summary

Trust boundary enforcement: Data flows between jobs only through Actions artifacts and outputs — never through shared memory or filesystem. A compromised agent job cannot influence detection or safe output jobs except through its declared artifacts. The DAG is expressed natively via Actions’ needs: syntax.
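Expressed in Actions terms, the DAG skeleton might look like this (job names and the verdict output are illustrative; the compiled .lock.yml is considerably more detailed):

```yaml
jobs:
  activation:
    # pre-activation checks, input sanitization
  agent:
    needs: activation
    permissions: read-all          # agent job never holds write tokens
  detection:
    needs: agent                   # sees only the agent's uploaded artifacts
  safe_outputs:
    needs: detection
    if: needs.detection.outputs.verdict == 'pass'
    permissions:
      issues: write                # minimum scope for this output type
```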

Agent Runtime Environment

The agent runs inside the AWF’s containerized environment with controlled filesystem access, network isolation, and engine selection.

Filesystem: Chroot Mode

Many coding agents expect full host access. The AWF provides a chroot mode that gives the agent access to host-installed tools while maintaining isolation:

  1. Mounts the entire VM filesystem read-only at /host
  2. Overlays sensitive directories with empty tmpfs layers
  3. Mounts HOME and /tmp read-write
  4. Imports a subset of host env vars (USER, PATH)
  5. Launches the agent in a /host chroot jail

This lets the agent use Python, Node.js, and Go at their normal paths while the AWF controls network access and secrets.

Engines: Supported AI Engines

The engine: frontmatter field selects which AI agent runs inside the container:

  • copilot: GitHub (default)
  • claude-code: Anthropic
  • codex: OpenAI
  • cai: GitHub (experimental)

Domain Allowlist Configuration

The AWF enforces network rules from the workflow frontmatter. Ecosystem bundles provide pre-configured domain sets for common toolchains:

```yaml
network:
  firewall: true
  allowed:
    - defaults          # Certificates, JSON schema
    - python            # PyPI, Conda
    - node              # npm, npmjs.com
    - "api.example.com" # Custom domain
```
Observability: Logging at Every Trust Boundary

Every trust boundary doubles as a logging point — and every logging point is a potential mediation point for future information-flow controls.

  • 🛡️ Firewall (AWF): Network destinations, blocked requests, egress patterns
  • 🔑 API Proxy: Model request/response metadata, token usage
  • 🔌 MCP Gateway: Tool invocations, server lifecycle, tool filtering
  • 🤖 Agent Container: Env var accesses, internal instrumentation
  • ✅ Safe Outputs: Buffered operations, verdicts, applied writes
  • 📋 Artifacts: prompt.txt, agent_output.json, aw.patch, engine logs

The Critical Distinction

“By default, everything in an action runs in the same trust domain. Rogue agents can interfere with MCP servers, access authentication secrets, and make network requests to arbitrary hosts.”

gh-aw constrains agent capabilities through architecture rather than relying on procedural safeguards. The system makes it structurally impossible for agents to access secrets, write directly to the repository, or communicate with unauthorized endpoints — even if the agent is fully compromised.