🏗️ Architecture

System design, security architecture, and runtime execution

The Five-Stage Pipeline

Every agentic workflow moves through five stages. Each stage enforces distinct security properties — failures in later stages cannot bypass constraints established by earlier ones.

1. Author (.md file): Natural language intent + YAML frontmatter. Human-readable, version-controlled.
2. Compile (.lock.yml): Deterministic execution plan. Schema validation, SHA pinning, security scanning.
3. Execute (Agent Job): Sandboxed AI runs with read-only permissions. All writes buffered as artifacts.
4. Detect (Threat Analysis): AI-powered detection reviews buffered artifacts for secrets, threats, and policy violations.
5. Apply (Safe Outputs): Scoped write permissions. Only approved artifacts reach the GitHub API.

Key insight: The Markdown file you edit is not what GitHub Actions executes. gh-aw compiles your declarative intent into a deterministic .lock.yml execution plan. Editing the Markdown without recompiling changes nothing at runtime.
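As a minimal sketch of the authoring format (the field names here are illustrative; consult the gh-aw frontmatter schema for the exact set), a workflow file pairs YAML frontmatter with natural-language instructions:

```markdown
---
# Illustrative frontmatter; real field names may differ in detail
on:
  issues:
    types: [opened]
permissions: read-all
engine: copilot
safe-outputs:
  add-comment: {}
---

# Triage new issues

Read the issue body and add a comment summarizing severity and affected area.
```

Running `gh aw compile` then emits the `.lock.yml`; edits to this file change nothing at runtime until it is recompiled.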

| Stage | Input | Output | Security Property |
| --- | --- | --- | --- |
| Author | Natural language + YAML frontmatter | .md file | Human-readable, version-controlled intent |
| Compile | .md file + imports | .lock.yml + actions-lock.json | Schema validation, SHA pinning, security scanning |
| Execute | .lock.yml on Actions runner | Buffered artifacts | Read-only permissions, network firewall, container isolation |
| Detect | Buffered artifacts | Pass/fail verdict | AI-powered threat analysis in isolated job |
| Apply | Approved artifacts | GitHub API writes | Scoped write permissions per output type |

Four-Phase Compilation Pipeline

The gh aw compile command transforms your Markdown workflow through four sequential security phases before producing the .lock.yml execution plan.

1. Schema Validation: Every frontmatter field is validated against a JSON schema. Invalid field names, wrong types, and malformed configurations are rejected before any code is generated.
2. Expression Safety: Only allowlisted GitHub Actions expressions are permitted. Expressions referencing secrets.* in unsafe positions are blocked, preventing injection attacks.
3. Action SHA Pinning: Every action reference is resolved to its full commit SHA and recorded in actions-lock.json (actions/checkout@v4 → SHA). This defends against tag hijacking and supply-chain attacks.
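In the compiled output, tag references become pinned SHAs. A sketch of what a pinned step looks like (the SHA below is a placeholder, not a real checkout commit):

```yaml
# Before compilation (.md):
#   uses: actions/checkout@v4
# After compilation (.lock.yml), with a placeholder SHA:
- uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v4
```

The tag-to-SHA mapping is recorded in actions-lock.json so later compiles reuse the same pin.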
4. Security Scanners: The compiled output passes through actionlint (workflow linting), zizmor (vulnerability detection), and poutine (supply-chain risk analysis).
  • actionlint (linting): Workflow structure linting with integrated shellcheck and pyflakes. Catches syntax errors, type mismatches, and workflow misconfigurations.
  • zizmor (security): Security vulnerability detection. Scans for privilege escalation paths, security misconfigurations, and dangerous patterns in the compiled workflow.
  • poutine (supply chain): Supply-chain risk analysis. Evaluates third-party action risks, dependency vulnerabilities, and potential for compromised upstream actions.

Strict Mode (default): Additionally enforces no write permissions on agent jobs, explicit network configuration, no wildcard domains in allowlists, and no deprecated fields.
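An illustrative frontmatter fragment that satisfies strict mode's intent (a sketch, not the full schema; field names follow the examples elsewhere on this page):

```yaml
permissions: read-all   # strict mode: no write permissions on the agent job
network:
  firewall: true        # explicit network configuration required
  allowed:
    - defaults          # no wildcard domains in the allowlist
    - "api.example.com"
```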

Three-Layer Trust Model

Each layer enforces distinct security properties under different failure assumptions. Breaching one layer does not compromise the guarantees of layers below it.

Layer 3 — Plan-Level Trust

Staged Execution & Safe Outputs

Constrains component behavior over time. Decomposes workflows into stages with distinct active components, permissions, and admissible data consumers. The SafeOutputs subsystem is the primary instantiation.

Pre-activation checks · Content sanitization · Agent execution (read-only) · Secret redaction · Threat detection · Safe output application · Temporal constraints
Layer 2 — Configuration-Level Trust

Compiler, Schemas & SHA Pinning

Declarative artifacts and toolchains that instantiate the system’s structure. Controls which components are loaded, how they connect, which channels are permitted, and what privileges each receives. Token distribution is controlled here.

JSON Schema validation · Expression allowlist · SHA pinning · Security scanners · Strict mode · Token distribution
Layer 1 — Substrate-Level Trust

VM, Docker, iptables & Hardware

Hardware and OS foundation. Provides memory isolation, CPU/resource isolation, and kernel-enforced communication boundaries. Guarantees hold even if the agent container is fully compromised.

CPU / MMU / Kernel · Hypervisor · Docker runtime · Network Firewall (AWF) · API Proxy · MCP Gateway · iptables

Trust violation at the substrate layer requires exploiting the container runtime, kernel, hypervisor, or hardware. If this layer fails, all higher-level guarantees are void — but this is the most difficult layer to breach.

Privileged Containers

Three privileged containers run alongside the untrusted agent container, each responsible for mediating a specific concern. The agent never directly accesses secrets, the network, or MCP servers.

🛡️ Network Firewall (AWF, the Agent Workflow Firewall)

Creates a private Docker network, binds the agent, and routes all traffic through a Squid proxy enforcing domain allowlists.

  • Creates private network (172.30.0.0/24)
  • iptables-based traffic redirection
  • Domain allowlist enforcement via Squid
  • Drops own iptables capabilities before agent launch
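The enforcement itself lives in Squid's ACL configuration; the matching rule it implements can be sketched in Python (an illustration, not gh-aw code): a host is allowed if it equals an allowlisted domain or is a subdomain of one.

```python
def is_allowed(host: str, allowlist: list[str]) -> bool:
    """Illustrative domain-allowlist check: exact match or subdomain match.
    The real enforcement is done by Squid ACLs inside the AWF proxy."""
    host = host.lower().rstrip(".")
    return any(
        host == entry.lower() or host.endswith("." + entry.lower())
        for entry in allowlist
    )

allowed = ["pypi.org", "registry.npmjs.org"]
# is_allowed("files.pypi.org", allowed) -> True
# is_allowed("evil.com", allowed)       -> False
```

Note the leading-dot check: `notpypi.org` does not match `pypi.org`, so attackers cannot smuggle traffic through lookalike domains.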
🔑 API Proxy (token holder)

Routes LLM traffic and holds endpoint-specific credentials. The agent sends unauthenticated requests; the proxy injects auth headers.

  • Holds API tokens (agent has none)
  • Injects Authorization header
  • Supports OpenAI, Anthropic, Copilot
  • Captures request/response metadata
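The proxy's core move can be sketched as a pure function (illustrative, not the actual implementation): strip whatever credentials the agent attempted to send, then attach the real token that only the proxy holds.

```python
def inject_auth(headers: dict[str, str], token: str) -> dict[str, str]:
    """Illustrative API Proxy step: the agent's request arrives without valid
    credentials; the proxy injects the Authorization header before forwarding."""
    forwarded = dict(headers)              # copy; never mutate the agent's request
    forwarded.pop("Authorization", None)   # discard anything the agent supplied
    forwarded["Authorization"] = f"Bearer {token}"
    return forwarded
```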
🔌 MCP Gateway (gh-aw-mcpg)

Spawns isolated MCP server containers via the Docker socket. Each server runs in its own container with no shared state.

  • Per-container domain allowlists
  • Tool allowlisting (explicit allowed: lists)
  • Secret injection via env vars only
  • Audit logging of all tool invocations
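Gateway-side tool allowlisting can be sketched the same way (illustrative only): of the tools an MCP server advertises, only those on the workflow's explicit allowed: list are exposed to the agent.

```python
def filter_tools(advertised: list[str], allowed: set[str]) -> list[str]:
    """Illustrative MCP Gateway filtering: hide every advertised tool that
    is not explicitly allowlisted in the workflow configuration."""
    return [tool for tool in advertised if tool in allowed]

# filter_tools(["read_file", "delete_repo"], {"read_file"}) -> ["read_file"]
```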
Inside the Actions Runner VM, the AWF private Docker network (172.30.0.0/24) connects:

  • 🤖 Agent Container (172.30.0.20): runs Copilot / Claude / Codex; all HTTP egress routes to the Squid proxy
  • 🔒 Squid Proxy (172.30.0.10): enforces the domain allowlist; allowed domains pass, blocked requests are dropped
  • 🔑 API Proxy: holds the API token
  • 🔌 MCP Gateway (host:80 → 8000): spawns MCP server containers via Docker

Runtime Execution Flow

When a workflow triggers, the agent job follows a precise sequence of setup steps before the AI engine enters its execution loop.

1. Repository Checkout: Standard actions/checkout (SHA-pinned) clones the repository into the runner workspace.
2. Runtime Setup: Language runtimes (Node.js, Python, Go) are installed via setup actions; dependencies are resolved and cached.
3. Cache Restore: Dependency caches are restored from previous runs for faster startup, reducing re-installation overhead.
4. MCP Container Start: The MCP Gateway spawns isolated server containers via the Docker socket. Each server gets its own network and tool allowlist.
5. Prompt Generation: Workflow context, sanitized user input, and Markdown instructions are assembled into the agent prompt.
6. AI Engine Loop: The selected engine (Copilot, Claude, Codex) reads the prompt, uses MCP tools, and writes code. All GitHub writes are buffered by the SafeOutputs MCP, never applied directly. Network requests pass through AWF's Squid proxy; LLM calls route through the API Proxy.
7. Secret Redaction: Runs unconditionally (if: always()). Scans all files in /tmp/gh-aw and replaces detected secrets with a masked form (first 3 characters + asterisks).
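The masking rule described in the redaction step can be sketched as follows (an approximation; gh-aw's scanner logic and mask length may differ in detail):

```python
def mask_secret(secret: str) -> str:
    """Illustrative redaction: keep the first 3 characters so a leak is
    identifiable in logs, replace the rest with asterisks."""
    if len(secret) <= 3:
        return "***"
    return secret[:3] + "*" * (len(secret) - 3)

# mask_secret("abcdefgh") -> "abc*****"
```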

8. Upload Output Artifacts: Buffered artifacts (agent_output.json, aw.patch, prompt.txt) are uploaded via Actions artifacts for the detection job.
9. Save Cache: Updated dependency caches are saved for subsequent runs.

Artifact Buffering & Threat Detection

The agent cannot write directly to the repository or GitHub API. Every output flows through a four-stage security pipeline before reaching the outside world.

1. 📝 Buffer: The agent job produces agent_output.json, aw.patch, and prompt.txt. All writes go to the SafeOutputs MCP, never the API.
2. 🔒 Redact: Secret redaction scans all output files (.txt, .json, .log, .md, .yml). Detected secrets are masked: abc*****.
3. 🔍 Detect: An isolated detection job runs AI + custom scanners, analyzing for secret leaks, malicious patches, and policy violations. Produces a pass/fail verdict.
4. ✅ Apply: Only if detection passes. Each safe output job receives the minimum write permission needed. Blocked if any threat is detected.

Safe Output Permission Mapping

| Safe Output Job | Required Permission | Operation |
| --- | --- | --- |
| create_issue | issues: write | Creates a new GitHub issue |
| add_comment | issues: write | Adds a comment to an issue or PR |
| create_pull_request | contents: write, pull-requests: write | Creates a PR from the buffered patch |
| add_labels | issues: write | Applies labels to issues or PRs |

Customizable Threat Detection

Extend detection with custom prompts and external security scanners:

```yaml
threat-detection:
  prompt: |
    Additionally check for:
    - References to internal infrastructure URLs
    - Attempts to modify CI/CD configuration files
    - Changes to security-sensitive files
  steps:
    - name: Run TruffleHog
      run: trufflehog filesystem /tmp/gh-aw --only-verified
    - name: Run Semgrep
      run: semgrep scan /tmp/gh-aw/aw.patch --config=auto
```

Content Sanitization

Before any user-generated content reaches the agent, it passes through seven sanitization mechanisms at the activation stage boundary — ensuring the agent never processes raw, potentially adversarial input.

| Mechanism | Transformation | Protection |
| --- | --- | --- |
| @mention neutralization | @user → `@user` | Prevents unintended notifications and social engineering |
| Bot trigger protection | fixes #123 → `fixes #123` | Prevents automatic issue linking and closure |
| XML/HTML tag conversion | <script> → (script) | Prevents injection via markup tags |
| URI filtering | http://evil.com → (redacted) | Restricts to HTTPS from trusted domains only |
| Content limits | Large payloads truncated | 0.5 MB max, 65k lines max to prevent denial-of-service |
| Control character removal | ANSI escapes stripped | Removes terminal manipulation sequences |
| Unicode normalization | Special chars normalized | Prevents homoglyph and confusable character attacks |
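Two of the seven mechanisms can be sketched in Python (approximations of the behavior described above, not gh-aw's actual implementation):

```python
import re

def neutralize_mentions(text: str) -> str:
    """Wrap @mentions in backticks so GitHub renders them without
    notifying the named user (approximation)."""
    return re.sub(r"(?<![`\w])@([A-Za-z0-9-]+)", r"`@\1`", text)

def convert_tags(text: str) -> str:
    """Rewrite <tag>-style markup as (tag) to defuse markup injection."""
    return re.sub(r"</?([A-Za-z][A-Za-z0-9-]*)[^>]*>", r"(\1)", text)

# neutralize_mentions("thanks @alice!")        -> "thanks `@alice`!"
# convert_tags("<script>alert(1)</script>")    -> "(script)alert(1)(script)"
```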

Job DAG Structure

The compiled .lock.yml decomposes into a strict Directed Acyclic Graph (DAG) of jobs. Each job boundary is a trust boundary — data flows only through Actions artifacts, never shared memory.

Gate: Pre-Activation
  • Role & permission checks
  • Deadline validation
  • Skip-if-match evaluation
  • Command position validation

Setup: Activation
  • Prepare workflow context
  • Sanitize user input (7 mechanisms)
  • Validate lock file integrity

Core: Agent Job
  • Repository checkout & runtime setup
  • Cache restore & MCP container start
  • Prompt generation & AI engine execution
  • Secret redaction (if: always())
  • Upload output artifacts

Security: Detection Job
  • Download buffered artifacts
  • AI-powered threat analysis
  • Custom scanner steps (TruffleHog, Semgrep)
  • Emit pass/fail security verdict

✓ Safe: Safe Output Jobs
  • Scoped write permissions
  • Create issues, PRs, comments
  • Apply labels

✗ Threat: Blocked
  • No writes externalized
  • Workflow fails
  • Artifacts preserved for audit

Final: Conclusion
  • Aggregate results
  • Generate summary

Trust boundary enforcement: Data flows between jobs only through Actions artifacts and outputs — never through shared memory or filesystem. A compromised agent job cannot influence detection or safe output jobs except through its declared artifacts. The DAG is expressed natively via Actions’ needs: syntax.
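Expressed in Actions terms, the DAG skeleton might look like this (job names and the verdict output are illustrative; the compiled .lock.yml is considerably more detailed):

```yaml
jobs:
  activation:
    # pre-activation checks, input sanitization
  agent:
    needs: activation
    permissions: read-all          # agent job never holds write tokens
  detection:
    needs: agent                   # sees only the agent's uploaded artifacts
  safe_outputs:
    needs: detection
    if: needs.detection.outputs.verdict == 'pass'
    permissions:
      issues: write                # minimum scope for this output type
```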

Agent Runtime Environment

The agent runs inside the AWF’s containerized environment with controlled filesystem access, network isolation, and engine selection.

Filesystem: Chroot Mode

Many coding agents expect full host access. The AWF provides a chroot mode that gives the agent access to host-installed tools while maintaining isolation:

  1. Mounts the entire VM filesystem read-only at /host
  2. Overlays sensitive directories with empty tmpfs layers
  3. Mounts HOME and /tmp read-write
  4. Imports a subset of host env vars (USER, PATH)
  5. Launches the agent in a /host chroot jail

This lets the agent use Python, Node.js, and Go at their normal paths while the AWF controls network access and secrets.

Engines: Supported AI Engines

The engine: frontmatter field selects which AI agent runs inside the container:

  • copilot: GitHub (default)
  • claude-code: Anthropic
  • codex: OpenAI
  • cai: GitHub (experimental)

Domain Allowlist Configuration

The AWF enforces network rules from the workflow frontmatter. Ecosystem bundles provide pre-configured domain sets for common toolchains:

```yaml
network:
  firewall: true
  allowed:
    - defaults          # Certificates, JSON schema
    - python            # PyPI, Conda
    - node              # npm, npmjs.com
    - "api.example.com" # Custom domain
```
Observability: Logging at Every Trust Boundary

Every trust boundary doubles as a logging point — and every logging point is a potential mediation point for future information-flow controls.

  • 🛡️ Firewall (AWF): Network destinations, blocked requests, egress patterns
  • 🔑 API Proxy: Model request/response metadata, token usage
  • 🔌 MCP Gateway: Tool invocations, server lifecycle, tool filtering
  • 🤖 Agent Container: Env var accesses, internal instrumentation
  • ✅ Safe Outputs: Buffered operations, verdicts, applied writes
  • 📋 Artifacts: prompt.txt, agent_output.json, aw.patch, engine logs

The Critical Distinction

“By default, everything in an action runs in the same trust domain. Rogue agents can interfere with MCP servers, access authentication secrets, and make network requests to arbitrary hosts.”

gh-aw constrains agent capabilities through architecture rather than relying on procedural safeguards. The system makes it structurally impossible for agents to access secrets, write directly to the repository, or communicate with unauthorized endpoints — even if the agent is fully compromised.