Architecture
A user-facing overview of how cmakefmt works and why it is built the way it
is.
Mental Model
cmakefmt is not a regex-based text rewriter. It is a structured pipeline:
discover files
-> resolve config
-> parse CMake source
-> classify commands using the command registry
-> build formatted layout decisions
-> emit text / diff / check result / in-place rewrite
That structure is what makes the tool safe and predictable — and what separates it from a simple line-by-line formatter.
Main Layers
Parser
The parser is built on a pest PEG grammar. It understands:
- command invocations
- quoted, unquoted, and bracket arguments
- comments
- variable references
- generator expressions
- continuation lines
Comments are preserved as real syntax nodes throughout — they are never stripped and guessed at later.
Command Registry
The registry is what gives cmakefmt its semantic awareness.
Without it, every argument in:
target_link_libraries(foo PUBLIC bar PRIVATE baz)
looks like a generic positional token. The registry knows that PUBLIC and
PRIVATE are not generic tokens — they start new argument groups. That
knowledge is what lets the formatter produce keyword-aware, correctly grouped
output instead of flattened token streams.
The registry is populated from two sources:
- built-in specs for CMake commands and supported module commands (audited through CMake 4.3.1)
- optional user config under
commands:
Formatter
Once the source is parsed and command shapes are known, the formatter converts the AST into layout decisions using a Wadler-Lindig-style document model.
In practice, this means it can ask:
- can this stay on one line?
- if not, should it hang-wrap?
- if not, should it go fully vertical?
That is how cmakefmt gets stable, principled wrapping behavior instead of
ad-hoc line splitting that changes every time you touch a file.
Config
Config resolution is layered — later layers only apply when earlier ones are absent:
- CLI overrides
- explicit
--config-filefiles, if any - nearest discovered
.cmakefmt.yaml,.cmakefmt.yml, or.cmakefmt.toml - home-directory fallback config
- built-in defaults
Make the resolution process visible with:
cmakefmt --show-config-path src/CMakeLists.txt
cmakefmt --show-config src/CMakeLists.txt
cmakefmt --explain-config
CLI Workflow Layer
The CLI is far more than a thin wrapper around format_source. It handles:
- recursive file discovery
- ignore files and Git-aware selection
--check,--diff, and JSON reporting- in-place rewrites
- partial and range formatting
- progress bars and parallel execution
- diagnostics and summary reporting
That workflow layer is a large part of what makes cmakefmt useful in real
repositories rather than just in toy examples.
Diagnostics
When something goes wrong, cmakefmt tries hard to explain:
- which file failed
- where it failed
- what source text was involved
- what config was active
- what likely caused the failure
This is possible because the architecture keeps spans, config provenance, and formatter decision context around long enough to report them meaningfully — rather than discarding context as soon as each stage completes.
Design Priorities
The codebase is intentionally optimized around:
- correctness over cleverness — no surprising heuristics
- speed that is visible in day-to-day workflows — 20× faster than
cmake-formaton real corpora - strong diagnostics — failures explain themselves
- configurability without scriptable config files — powerful without being dangerous
- maintainability of the grammar/registry/formatter pipeline — easy to extend correctly