Architecture
A user-facing overview of how cmakefmt works and why it is built the way it
is.
Mental Model
cmakefmt is not a regex-based text rewriter. It is a structured pipeline:
    discover files
    -> resolve config
    -> parse CMake source
    -> classify commands using the command registry
    -> build formatted layout decisions
    -> emit text / diff / check result / in-place rewrite
That structure is what makes the tool safe and predictable, and what separates it from a simple line-by-line formatter.
Main Layers
Parser
The parser is built on a pest PEG grammar. It understands:
- command invocations
- quoted, unquoted, and bracket arguments
- comments
- variable references
- generator expressions
- continuation lines
Comments are preserved as real syntax nodes throughout — they are never stripped and guessed at later.
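To make "comments as real syntax nodes" concrete, here is a minimal sketch in Python. It is not cmakefmt's pest grammar: the node classes and the parse function are hypothetical, and the scanner handles only a tiny subset (line comments, quoted and unquoted arguments, no nested parentheses, escapes, or bracket arguments). The point is only that a comment lands in the tree at the position where it appeared.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str


@dataclass
class Argument:
    text: str


@dataclass
class Command:
    name: str
    # Argument and Comment nodes, kept in source order
    children: list = field(default_factory=list)


def parse(source: str) -> list:
    """Parse a tiny CMake subset into Command and Comment nodes."""
    nodes, i, n = [], 0, len(source)
    current = None  # the Command whose parentheses we are inside, if any
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch == "#":
            # a line comment becomes a node, attached where it was found
            end = source.find("\n", i)
            end = n if end == -1 else end
            (current.children if current else nodes).append(Comment(source[i:end]))
            i = end
        elif current is None:
            # the command name runs up to the opening parenthesis
            end = source.index("(", i)
            current = Command(source[i:end].strip())
            nodes.append(current)
            i = end + 1
        elif ch == ")":
            current = None
            i += 1
        elif ch == '"':
            end = source.index('"', i + 1) + 1
            current.children.append(Argument(source[i:end]))
            i = end
        else:
            # unquoted argument: read until whitespace or a delimiter
            end = i
            while end < n and not source[end].isspace() and source[end] not in ')#"':
                end += 1
            current.children.append(Argument(source[i:end]))
            i = end
    return nodes
```

Because the comment sits between the arguments it annotates, a later formatting stage can re-emit it in place instead of guessing where it belonged.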
Command Registry
The registry is what gives cmakefmt its semantic awareness.
Without it, every argument in:
    target_link_libraries(foo PUBLIC bar PRIVATE baz)
looks like a generic positional token. With it, cmakefmt recognizes that
PUBLIC and PRIVATE are semantic keywords that start new argument groups.
That knowledge is what lets the formatter produce keyword-aware, correctly
grouped output instead of flattened token streams.
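The grouping step can be sketched in a few lines of Python. The REGISTRY layout and the group_arguments helper below are hypothetical and hand-written for illustration; cmakefmt's real spec format is likely richer. The keyword set for target_link_libraries is limited to PUBLIC, PRIVATE, and INTERFACE here.

```python
# Hypothetical, hand-written spec; not cmakefmt's actual registry format.
REGISTRY = {
    "target_link_libraries": {
        "positional": 1,  # the target name
        "keywords": {"PUBLIC", "PRIVATE", "INTERFACE"},
    },
}


def group_arguments(command, args):
    """Split args into (keyword, values) groups; keyword is None for positionals."""
    spec = REGISTRY.get(command.lower())
    if spec is None:
        # unknown command: everything is a generic positional token
        return [(None, list(args))]
    count = spec["positional"]
    groups = [(None, list(args[:count]))]
    for arg in args[count:]:
        if arg in spec["keywords"]:
            groups.append((arg, []))  # a keyword starts a new group
        else:
            groups[-1][1].append(arg)
    return groups
```

For the call above, this yields one positional group containing foo, a PUBLIC group containing bar, and a PRIVATE group containing baz. A command missing from the registry falls back to a single flat group.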
The registry is populated from two sources:
- built-in specs for CMake commands and supported module commands (audited through CMake 4.3.1)
- optional user config under commands:
Formatter
Once the source is parsed and command shapes are known, the formatter converts the AST into layout decisions using a document model inspired by Wadler and Lindig’s pretty-printing algorithm.
In practice, this means it can ask:
- can this stay on one line?
- if not, should it hang-wrap?
- if not, should it go fully vertical?
That is how cmakefmt gets stable, principled wrapping behavior instead of
ad-hoc line splitting that changes every time you touch a file.
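The three questions above can be illustrated with a small Python sketch. This is a greedy approximation written for illustration, not the Wadler/Lindig-style document model the formatter uses, and the function names, the default width, and the indent are assumptions.

```python
def hang_wrap(name, args, width):
    """Keep the first argument on the command line and align the rest under it."""
    prefix = f"{name}("
    pad = " " * len(prefix)
    lines = [prefix + args[0]]
    for arg in args[1:]:
        if len(lines[-1]) + 1 + len(arg) <= width:
            lines[-1] += " " + arg
        else:
            lines.append(pad + arg)
    lines[-1] += ")"
    return lines


def vertical(name, args, indent):
    """Put every argument on its own indented line."""
    return [f"{name}("] + [" " * indent + arg for arg in args] + [")"]


def layout(name, args, width=80, indent=2):
    # 1. can this stay on one line?
    one_line = f"{name}({' '.join(args)})"
    if len(one_line) <= width or not args:
        return one_line
    # 2. if not, does a hang-wrap fit within the width?
    hung = hang_wrap(name, args, width)
    if all(len(line) <= width for line in hung):
        return "\n".join(hung)
    # 3. otherwise, go fully vertical
    return "\n".join(vertical(name, args, indent))
```

The same input and width always produce the same output, which is the property that keeps wrapping stable across edits.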
Config
Config resolution is layered; later layers only apply when earlier ones are absent:
- CLI overrides
- explicit --config-file files, if any
- nearest discovered .cmakefmt.yaml, .cmakefmt.yml, or .cmakefmt.toml
- home-directory fallback config
- built-in defaults
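One way to picture the layering is a first-match-wins lookup, sketched here in Python with collections.ChainMap. This assumes resolution happens per option, which the list above does not state; the resolve function, the option names, and the default values are all hypothetical.

```python
from collections import ChainMap

# Hypothetical option names and values, for illustration only.
DEFAULTS = {"line_width": 80, "tab_size": 2}


def resolve(cli=None, explicit=None, discovered=None, home=None):
    """Return the effective options; earlier layers shadow later ones."""
    layers = [layer for layer in (cli, explicit, discovered, home) if layer]
    # ChainMap looks keys up in order, so the first layer defining a key wins
    return dict(ChainMap(*layers, DEFAULTS))
```

With a CLI override of line_width and a discovered file that sets both options, the CLI value wins for line_width while tab_size still comes from the discovered file.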
Make the resolution process visible with:
    cmakefmt --show-config-path src/CMakeLists.txt
    cmakefmt --show-config src/CMakeLists.txt
    cmakefmt --explain-config
CLI Workflow Layer
The CLI is far more than a thin wrapper around format_source. It handles:
- recursive file discovery
- ignore files and Git-aware selection
- --check, --diff, and JSON reporting
- in-place rewrites
- partial and range formatting
- progress bars and parallel execution
- diagnostics and summary reporting
That workflow layer is a large part of what makes cmakefmt useful in real
repositories rather than just in toy examples.
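As a rough picture of what a check or diff mode has to do on top of plain formatting, the Python sketch below compares original and formatted text and produces a unified diff with the standard difflib module. The check function and the file labels are hypothetical; cmakefmt's actual diff output and exit codes are not shown here.

```python
import difflib


def check(path, original, formatted):
    """Return (would_change, unified_diff_text) for one file."""
    if original == formatted:
        return False, ""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        formatted.splitlines(keepends=True),
        fromfile=path,
        tofile=f"{path} (formatted)",
    )
    return True, "".join(diff)
```

A check-style run only needs the boolean, a diff-style run prints the text, and an in-place rewrite writes the formatted string back when the boolean is true.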
Diagnostics
When something goes wrong, cmakefmt tries hard to explain:
- which file failed
- where it failed
- what source text was involved
- what config was active
- what likely caused the failure
This is possible because the architecture keeps spans, config provenance, and formatter decision context around long enough to report them meaningfully — rather than discarding context as soon as each stage completes.
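Keeping spans around is what makes "where it failed" and "what source text was involved" cheap to report. The Python sketch below turns a character offset into a line, a column, and a caret-annotated excerpt. The locate and render helpers and the message layout are hypothetical, not cmakefmt's actual diagnostic format.

```python
def locate(source, offset):
    """Map a character offset to a 1-based (line, column) plus the line's text."""
    line_start = source.rfind("\n", 0, offset) + 1
    line_no = source.count("\n", 0, offset) + 1
    line_end = source.find("\n", offset)
    if line_end == -1:
        line_end = len(source)
    return line_no, offset - line_start + 1, source[line_start:line_end]


def render(path, source, offset, message):
    """Format a diagnostic with the offending line and a caret under the column."""
    line, col, text = locate(source, offset)
    caret = " " * (col - 1) + "^"
    return f"{path}:{line}:{col}: {message}\n  {text}\n  {caret}"
```

If a stage threw away the offset after parsing, none of this could be reconstructed later, which is why the pipeline carries the context forward.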
Design Priorities
The codebase is intentionally optimized around:
- correctness over cleverness: no surprising heuristics
- speed that is visible in day-to-day workflows: 20× faster than cmake-format on real corpora
- strong diagnostics: failures explain themselves
- configurability without scriptable config files: powerful without being dangerous
- maintainability of the grammar/registry/formatter pipeline: easy to extend correctly