Codebase Map
Quick orientation guide for anyone working on this codebase — human contributors and AI coding agents alike.
What this project is
Ralphify is a CLI tool (ralph) that runs AI coding agents in autonomous loops. It reads a RALPH.md file from a ralph directory, runs commands, assembles a prompt with the output, pipes it to an agent command via stdin, waits for it to finish, then repeats. Each iteration gets a fresh context window. Progress is tracked through git commits.
The core loop is simple. The complexity lives in prompt assembly — running commands and resolving placeholders into the prompt before each iteration.
Directory structure
src/ralphify/ # All source code
├── __init__.py # Version detection + app entry point
├── cli.py # CLI commands (run, scaffold) — delegates to engine for the loop
├── engine.py # Core run loop orchestration with structured event emission
├── manager.py # Multi-run orchestration (concurrent runs via threads)
├── _resolver.py # Template placeholder resolution ({{ commands.* }}, {{ args.* }}, {{ ralph.* }})
├── _agent.py # Run agent subprocesses (streaming + blocking modes, log writing)
├── _run_types.py # RunConfig, RunState, RunStatus, Command — shared data types
├── _runner.py # Execute shell commands with timeout and capture output
├── _frontmatter.py # Parse YAML frontmatter from RALPH.md, marker constants
├── _console_emitter.py # Rich console renderer for run-loop events (ConsoleEmitter)
├── _events.py # Event types, emitter protocol, and BoundEmitter convenience wrapper
├── _keypress.py # Cross-platform single-keypress listener (powers the `p` peek toggle)
├── _output.py # ProcessResult base class, subprocess constants (SESSION_KWARGS, SUBPROCESS_TEXT_KWARGS), format durations
└── _brand.py # Brand color constants shared across CLI and console rendering
tests/ # Pytest tests — one test file per module
docs/ # MkDocs site (Material theme) — user-facing documentation
docs/contributing/ # Contributor documentation (this section)
.github/workflows/
├── test.yml # Run tests on push to main and PRs (Python 3.11–3.13)
├── docs.yml # Deploy docs to GitHub Pages on push to main
└── publish.yml # Publish to PyPI on release (with test gate)
Architecture: how the pieces connect
The CLI entry point is cli.py:run(), which parses options, reads the ralph directory path, and delegates to engine.py:run_loop() for the actual iteration cycle. The engine emits structured events via an EventEmitter, making the same loop reusable from both the CLI and any external orchestration layer (such as manager.py).
ralph run my-ralph
│
├── cli.py:run() — parse options, print banner
│ ├── Read RALPH.md from the given directory
│ ├── Parse frontmatter (agent, commands, args)
│ └── Build RunConfig and call engine.run_loop()
│
└── engine.py:run_loop(config, state, emitter)
└── Loop:
├── Re-read RALPH.md from disk
├── Run commands → capture output
├── Resolve {{ commands.* }}, {{ args.* }}, and {{ ralph.* }} placeholders
├── Pipe assembled prompt to agent command via subprocess
├── Emit iteration events (started, completed, failed, timed_out)
├── Handle pause/resume/stop requests via RunState
└── Repeat
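The loop shape in the diagram can be sketched in a few lines. This is a simplified illustration, not the code in engine.py: run_loop_sketch, build_prompt, and the use of a threading.Event as the stop flag are all assumptions made for the example.

```python
import subprocess
import sys
import threading


def run_loop_sketch(agent_cmd, build_prompt, max_iterations, stop_requested):
    """Run the agent until max_iterations is reached or a stop is requested."""
    return_codes = []
    for iteration in range(1, max_iterations + 1):
        if stop_requested.is_set():
            break
        # The prompt is rebuilt from scratch on every iteration.
        prompt = build_prompt(iteration)
        # Each iteration is a new subprocess, so the agent starts with a fresh
        # context; the assembled prompt arrives on its stdin.
        proc = subprocess.run(agent_cmd, input=prompt, text=True, capture_output=True)
        return_codes.append(proc.returncode)
    return return_codes


# A stand-in "agent" that exits 0 when the prompt looks as expected.
fake_agent = [
    sys.executable,
    "-c",
    "import sys; sys.exit(0 if sys.stdin.read().startswith('iteration') else 1)",
]
codes = run_loop_sketch(fake_agent, lambda i: f"iteration {i}", 3, threading.Event())
```

The real loop also re-reads RALPH.md, runs commands, and emits events on each pass; those steps are omitted here to keep the control flow visible.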
Placeholder resolution
The resolver (_resolver.py) handles three placeholder kinds: {{ commands.<name> }}, {{ args.<name> }}, and {{ ralph.<name> }}. Two functions:
- resolve_all(): resolves all three placeholder kinds in a single pass so that a value inserted by one kind (e.g., an arg value containing {{ commands.foo }}) is never re-processed as another kind. Used by the engine for final prompt assembly. The ralph.* placeholders (ralph.name, ralph.iteration, ralph.max_iterations) provide runtime metadata and require no frontmatter configuration.
- resolve_args(): resolves only {{ args.<name> }} placeholders. Used by the engine to expand arg references inside command run strings before executing them.
Unmatched placeholders resolve to empty string in both functions.
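One way to get the single-pass behaviour described above is a single regular expression with a substitution callback. The sketch below is an illustration under assumptions, not the code in _resolver.py: the function name resolve_all_sketch, its signature, and the exact placeholder grammar (allowed name characters, optional whitespace) are invented for the example.

```python
import re

# Assumed grammar: {{ kind.name }} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(commands|args|ralph)\.([A-Za-z0-9_-]+)\s*\}\}")


def resolve_all_sketch(prompt, commands, args, ralph):
    """Resolve all three placeholder kinds in one pass over the prompt."""
    values = {"commands": commands, "args": args, "ralph": ralph}

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        # Unmatched placeholders resolve to an empty string.
        return str(values[kind].get(name, ""))

    # re.sub scans the original text once; substituted values are not rescanned,
    # so a placeholder-looking arg value is left alone.
    return _PLACEHOLDER.sub(substitute, prompt)


resolved = resolve_all_sketch(
    "A={{ args.x }} C={{ commands.foo }} R={{ ralph.iteration }} M={{ args.missing }}",
    commands={"foo": "out"},
    args={"x": "{{ commands.foo }}"},
    ralph={"iteration": 2},
)
# resolved == "A={{ commands.foo }} C=out R=2 M="
```

Running sequential str.replace calls per kind would not have this property: the arg value would be inserted first and then expanded again by the commands pass.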
Event system
The run loop communicates via structured events (_events.py). Each event has a type (EventType enum), run ID, typed data payload, and UTC timestamp.
Event data uses TypedDict classes — one per event type — rather than free-form dicts. The key types:
- RunStartedData / RunStoppedData: run lifecycle (stop reason is a StopReason literal: "completed", "error", "user_requested")
- IterationStartedData / IterationEndedData: per-iteration data (return code, duration, log path)
- CommandsStartedData / CommandsCompletedData: command execution bookends
- PromptAssembledData: prompt length after placeholder resolution
- AgentActivityData: streaming agent output (Claude Code only)
- AgentOutputLineData: raw line of agent output from any agent (powers live peek)
- LogMessageData: info/error messages with optional traceback
All payload types are unioned as EventData.
Emitter implementations:
- EventEmitter: protocol that any listener implements (just an emit(event) method)
- NullEmitter: discards events (used in tests)
- QueueEmitter: pushes events into a queue.Queue for async consumption
- FanoutEmitter: broadcasts events to multiple emitters
- BoundEmitter: wraps any emitter with a fixed run ID, so callers don't have to pass the ID on every emit. The engine creates one per run and threads it through all internal functions.
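The emitter roles listed above compose naturally. The following is a minimal sketch of how they might fit together; the class names match the list, but the Event shape, the constructors, and the emit_type helper on BoundEmitter are assumptions made for illustration rather than the signatures in _events.py.

```python
import queue
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass
class Event:  # hypothetical, simplified event shape
    type: str
    run_id: str
    data: dict[str, Any]


class EventEmitter(Protocol):
    def emit(self, event: Event) -> None: ...


class NullEmitter:
    def emit(self, event: Event) -> None:
        pass  # discard


class QueueEmitter:
    def __init__(self) -> None:
        self.events: "queue.Queue[Event]" = queue.Queue()

    def emit(self, event: Event) -> None:
        self.events.put(event)


class FanoutEmitter:
    def __init__(self, emitters: list[EventEmitter]) -> None:
        self._emitters = list(emitters)

    def emit(self, event: Event) -> None:
        for emitter in self._emitters:
            emitter.emit(event)


class BoundEmitter:
    """Fixes the run ID once so callers only supply type and payload."""

    def __init__(self, emitter: EventEmitter, run_id: str) -> None:
        self._emitter = emitter
        self._run_id = run_id

    def emit_type(self, event_type: str, data: dict[str, Any]) -> None:
        self._emitter.emit(Event(event_type, self._run_id, data))


first, second = QueueEmitter(), QueueEmitter()
bound = BoundEmitter(FanoutEmitter([first, second, NullEmitter()]), run_id="run-1")
bound.emit_type("iteration_started", {"iteration": 1})
```

Because EventEmitter is a structural Protocol, none of the concrete classes needs to inherit from it; anything with a compatible emit method qualifies.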
The CLI uses a ConsoleEmitter (defined in _console_emitter.py) that renders events to the terminal with Rich formatting.
Multi-run management
manager.py:RunManager orchestrates concurrent runs:
- Creates runs with unique IDs and wraps them in ManagedRun (config + state + emitter + thread)
- Starts each run in a daemon thread via engine.run_loop()
- Supports pause/resume/stop per run via RunState thread-safe control methods
- Uses FanoutEmitter to broadcast events to multiple listeners
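The responsibilities above can be sketched with the standard library alone. This is an illustrative outline, not manager.py: RunManagerSketch, ManagedRunSketch, the injected loop_fn (standing in for engine.run_loop), and the use of a threading.Event in place of RunState are all assumptions.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ManagedRunSketch:
    run_id: str
    stop_requested: threading.Event = field(default_factory=threading.Event)
    thread: Optional[threading.Thread] = None


class RunManagerSketch:
    def __init__(self, loop_fn):
        self._loop_fn = loop_fn  # stands in for engine.run_loop
        self._runs = {}
        self._lock = threading.Lock()  # guards the run registry

    def start_run(self) -> str:
        run = ManagedRunSketch(run_id=uuid.uuid4().hex[:8])
        # Daemon threads do not keep the interpreter alive on exit.
        run.thread = threading.Thread(target=self._loop_fn, args=(run,), daemon=True)
        with self._lock:
            self._runs[run.run_id] = run
        run.thread.start()
        return run.run_id

    def stop_run(self, run_id: str) -> None:
        with self._lock:
            run = self._runs[run_id]
        run.stop_requested.set()

    def join(self, run_id: str, timeout: Optional[float] = None) -> None:
        with self._lock:
            run = self._runs[run_id]
        run.thread.join(timeout)


finished = []


def fake_loop(run):
    # A stand-in loop that idles until it is asked to stop.
    run.stop_requested.wait(timeout=5)
    finished.append(run.run_id)


manager = RunManagerSketch(fake_loop)
run_a = manager.start_run()
run_b = manager.start_run()
manager.stop_run(run_a)
manager.stop_run(run_b)
manager.join(run_a, timeout=5)
manager.join(run_b, timeout=5)
```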
Key files to understand first
- engine.py: The core run loop. Uses RunConfig and RunState (from _run_types.py) and EventEmitter. This is where iteration logic lives.
- _run_types.py: RunConfig, RunState, RunStatus, and Command. These are the shared data types used by the engine, CLI, and manager.
- cli.py: All CLI commands. Validates frontmatter fields via extracted helpers (_validate_agent, _validate_commands, _validate_credit, _validate_run_options, _validate_declared_args), builds a RunConfig, and delegates to engine.run_loop() for the actual loop. Terminal event rendering lives in _console_emitter.py.
- _frontmatter.py: YAML frontmatter parsing. Extracts agent, commands, args from the RALPH.md file.
- _resolver.py: Template placeholder logic. Small file but critical.
Traps and gotchas
If you change frontmatter fields...
Frontmatter parsing is in _frontmatter.py:parse_frontmatter(), which returns a raw dict. Each field is then validated and coerced by a dedicated helper in cli.py — e.g. _validate_agent(), _validate_commands(), _validate_credit(). Adding a new frontmatter field means adding a new validator in cli.py and wiring it into _build_run_config().
Field name constants (FIELD_AGENT, FIELD_COMMANDS, FIELD_ARGS, FIELD_CREDIT, CMD_FIELD_NAME, CMD_FIELD_RUN, CMD_FIELD_TIMEOUT) are centralized in _frontmatter.py. Always import these constants instead of hardcoding strings like "agent" or "commands" — this keeps error messages, validation, and placeholder resolution in sync when fields are renamed.
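As a rough illustration of the validator pattern and of why the constants matter, here is a hedged sketch. The constant name FIELD_AGENT comes from the text above and its value is inferred from the strings mentioned there; the function name validate_agent_sketch, its signature, and its error wording are assumptions and do not reproduce cli.py.

```python
FIELD_AGENT = "agent"  # name from _frontmatter.py; value inferred from the text


def validate_agent_sketch(frontmatter: dict) -> str:
    """Return the agent command string or raise ValueError (hypothetical contract)."""
    agent = frontmatter.get(FIELD_AGENT)
    if not isinstance(agent, str) or not agent.strip():
        # The message reuses the constant, so renaming the field updates it too.
        raise ValueError(f"'{FIELD_AGENT}' must be a non-empty string")
    return agent.strip()


agent_cmd = validate_agent_sketch({FIELD_AGENT: "  claude -p  "})
# agent_cmd == "claude -p"
```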
If you add a new CLI command...
Add it in cli.py. The CLI uses Typer. Update docs/cli.md to document the new command.
If you change the event system...
Events are defined in _events.py:EventType, with a corresponding TypedDict payload class for each type. Adding a new event type requires a new EventType member, a new TypedDict payload class, adding it to the EventData union, and handling it in ConsoleEmitter (_console_emitter.py).
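The four steps above might look like this in miniature. Everything marked hypothetical is invented for the example: CHECKPOINT_SAVED, CheckpointSavedData, the prompt_length field name, and the render function that stands in for the ConsoleEmitter handler.

```python
from enum import Enum
from typing import TypedDict, Union


class EventType(Enum):
    PROMPT_ASSEMBLED = "prompt_assembled"  # stand-in for an existing member
    CHECKPOINT_SAVED = "checkpoint_saved"  # step 1: hypothetical new member


class PromptAssembledData(TypedDict):
    prompt_length: int  # field name assumed


class CheckpointSavedData(TypedDict):  # step 2: hypothetical new payload class
    path: str


# Step 3: add the new payload to the union.
EventData = Union[PromptAssembledData, CheckpointSavedData]


def render(event_type: EventType, data: EventData) -> str:
    """Step 4: stand-in for the rendering branch in ConsoleEmitter."""
    if event_type is EventType.CHECKPOINT_SAVED:
        return f"checkpoint saved: {data['path']}"
    if event_type is EventType.PROMPT_ASSEMBLED:
        return f"prompt assembled ({data['prompt_length']} chars)"
    raise ValueError(f"unhandled event type: {event_type}")


line = render(EventType.CHECKPOINT_SAVED, {"path": "state.json"})
# line == "checkpoint saved: state.json"
```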
Live agent output (peek)
Both execution paths in _agent.py accept an on_output_line(line, stream) callback and drain the agent's stdout/stderr line-by-line — the blocking path uses two background reader threads, and the streaming path forwards each raw line from _read_agent_stream. The engine wires this callback to emit EventType.AGENT_OUTPUT_LINE events, which the ConsoleEmitter renders only while peek is enabled. The p keybinding flips that state via ConsoleEmitter.toggle_peek(), driven by KeypressListener in _keypress.py. The listener only activates on a real TTY; in CI or when stdin is piped it silently no-ops.
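The blocking path's two-reader-thread approach can be sketched as follows. Only the on_output_line(line, stream) callback shape comes from the text above; the function name run_with_line_callback and the rest of the structure are assumptions, not the code in _agent.py.

```python
import subprocess
import sys
import threading


def run_with_line_callback(cmd, on_output_line):
    """Run cmd and forward each stdout/stderr line to on_output_line(line, stream)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )

    def drain(pipe, stream_name):
        # Iterating a text pipe yields one line at a time until EOF.
        for line in pipe:
            on_output_line(line.rstrip("\n"), stream_name)
        pipe.close()

    # One reader per stream, so a full stderr pipe cannot stall stdout (or vice versa).
    readers = [
        threading.Thread(target=drain, args=(proc.stdout, "stdout"), daemon=True),
        threading.Thread(target=drain, args=(proc.stderr, "stderr"), daemon=True),
    ]
    for reader in readers:
        reader.start()
    returncode = proc.wait()
    for reader in readers:
        reader.join()
    return returncode


seen = []
rc = run_with_line_callback(
    [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"],
    lambda line, stream: seen.append((line, stream)),
)
```

In the real engine the callback would emit an AGENT_OUTPUT_LINE event instead of appending to a list.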
The compact peek panel (_IterationPanel / _IterationSpinner) renders the most recent _MAX_VISIBLE_SCROLL lines while buffering up to _MAX_SCROLL_LINES (_console_emitter.py). Shift+P (FULLSCREEN_PEEK_KEY) enters a _FullscreenPeek view — a Rich Live with screen=True that renders the full buffer on the alt screen and accepts vim/less-style navigation (j/k, space/b, g/G, q). All keys route through a single ConsoleEmitter.handle_key() method that owns the keybinding map for both compact and fullscreen modes. Entering fullscreen stops the compact Live so only one renderer owns the terminal; exiting (or iteration end) tears the alt screen down and restores the compact panel with its still-growing buffer.
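The relationship between the two limits (a large buffer, a small visible window) can be illustrated with a bounded deque. The constant names come from the text above, but the values below are placeholders chosen for the example, and ScrollBufferSketch is a hypothetical class, not part of _console_emitter.py.

```python
from collections import deque

_MAX_SCROLL_LINES = 1000  # placeholder value, not taken from the source
_MAX_VISIBLE_SCROLL = 5   # placeholder value, not taken from the source


class ScrollBufferSketch:
    def __init__(self):
        # A bounded deque drops the oldest line once the cap is reached.
        self._lines = deque(maxlen=_MAX_SCROLL_LINES)

    def append(self, line):
        self._lines.append(line)

    def visible(self):
        """Lines the compact panel would show: the most recent few."""
        return list(self._lines)[-_MAX_VISIBLE_SCROLL:]

    def full(self):
        """Lines a fullscreen view could scroll through: the whole buffer."""
        return list(self._lines)


buf = ScrollBufferSketch()
for i in range(1200):
    buf.append(f"line {i}")
```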
Credit trailer
When credit is true (the default), engine.py:_assemble_prompt() appends _CREDIT_INSTRUCTION to the prompt — a short instruction telling the agent to include a Co-authored-by: Ralphify trailer in git commits. Users can opt out with credit: false in frontmatter.
Subprocess result types
_output.py defines ProcessResult, the base dataclass for subprocess results (provides returncode, timed_out, and a success property). Both _runner.py:RunResult (command execution) and _agent.py:AgentResult (agent execution) extend it. If you add a new subprocess wrapper, inherit from ProcessResult to get consistent success/timeout semantics. The module also provides ensure_str() for bytes-to-string decoding, collect_output() for combining stdout+stderr, and SUBPROCESS_TEXT_KWARGS — the shared kwargs dict used by all subprocess.Popen calls to ensure consistent encoding and stream handling.
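A minimal sketch of the inheritance pattern, assuming that success means a zero return code with no timeout. The field names returncode and timed_out and the success property come from the text above; the exact success rule, the Optional return code, and the ScriptResult subclass are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessResult:
    returncode: Optional[int]
    timed_out: bool = False

    @property
    def success(self) -> bool:
        # Assumed semantics: exit code 0 and no timeout.
        return self.returncode == 0 and not self.timed_out


@dataclass
class ScriptResult(ProcessResult):  # hypothetical new subprocess wrapper result
    output: str = ""


ok = ScriptResult(returncode=0, output="done")
slow = ScriptResult(returncode=None, timed_out=True)
```

Because the base class already has a field with a default, every field added by a subclass also needs a default, as output does here.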
If you change shutdown or signal handling...
Agent subprocesses run in their own process group (start_new_session=True on POSIX, configured via SESSION_KWARGS in _output.py). This lets _kill_process_group() send signals to the agent and all its children at once.
The two-stage Ctrl+C flow:
- First Ctrl+C: the engine's SIGINT handler sets RunState.stop_requested, which lets the current iteration finish gracefully.
- Second Ctrl+C: KeyboardInterrupt propagates normally and the agent process is killed.
Timeout and cancellation both use a two-step kill: SIGTERM first, then SIGKILL after _SIGTERM_GRACE_PERIOD seconds (3s). If you add a new subprocess wrapper, use _kill_process_group() and SESSION_KWARGS to get consistent cleanup behavior.
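The two-step kill can be sketched for POSIX systems as follows. This is an illustration under assumptions, not the body of _kill_process_group(): the function name kill_process_group_sketch and its grace parameter are invented, and the code relies on the child having been started with start_new_session=True so that its process group ID equals its PID.

```python
import os
import signal
import subprocess
import sys

_SIGTERM_GRACE_PERIOD = 3  # seconds, per the text above


def kill_process_group_sketch(proc, grace=_SIGTERM_GRACE_PERIOD):
    """POSIX only: SIGTERM the whole group, then SIGKILL if it is still alive."""
    if proc.poll() is not None:
        return  # already exited
    try:
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        return  # group vanished between poll() and killpg()
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        proc.wait()


# start_new_session=True puts the child in its own session and process group.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    start_new_session=True,
)
kill_process_group_sketch(child, grace=2)
```

Signalling the group rather than the single PID matters when the agent spawns its own children, which would otherwise survive the parent.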
Command parsing
Commands in RALPH.md frontmatter are parsed with shlex.split() and executed without a shell, so shell features such as pipes, redirects, and variable expansion are not available. For those, users point the run field at a script.
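A short example shows what shlex.split() does with shell syntax: it only tokenizes, so a pipe character or a variable reference becomes an ordinary argument. The command strings here are made up for the demonstration.

```python
import shlex

# Quoting is honoured, but "|" is just another token, not a pipe.
argv = shlex.split('pytest -k "slow tests" | tail -n 5')
# argv == ['pytest', '-k', 'slow tests', '|', 'tail', '-n', '5']

# Variables are not expanded either; "$HOME" stays literal.
literal = shlex.split("ls $HOME")
# literal == ['ls', '$HOME']
```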
Testing
Tests are in tests/ with one file per module. All tests use temporary directories and don't require any external services.
Dependencies
Minimal by design:
- typer: CLI framework
- rich: Terminal formatting (used via typer's console)
- pyyaml: YAML frontmatter parsing in _frontmatter.py
Dev dependencies: pytest, mkdocs, mkdocs-material.