* feat(executor): close major GHA expression and inter-job gaps
The expression evaluator was basically living in a fantasy world
where every step succeeds, secrets don't exist in expressions,
and jobs can't talk to each other. That's not how GitHub Actions
works, and it meant a *lot* of real-world workflows would silently
produce wrong results or just fail to resolve context variables.
Here's what was broken and what's fixed:
Step outcome/conclusion were hardcoded to "success" — always.
So `if: steps.build.outcome == 'failure'` would never be true,
and `continue-on-error` semantics were completely wrong.
ExpressionContext now tracks real outcome (before continue-on-error)
and conclusion (after). The success()/failure()/cancelled() builtins
now consult actual job state instead of returning constants.
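A minimal sketch of the outcome/conclusion split described above. The `StepStatus` variants and the `outcome_and_conclusion` helper are illustrative names, not the actual ExpressionContext API:

```rust
// Hypothetical status type; the variants in the real crate may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StepStatus {
    Success,
    Failure,
    Skipped,
}

/// Returns `(outcome, conclusion)` for a finished step.
/// `outcome` is the raw result before continue-on-error is applied;
/// `conclusion` is what the rest of the job sees afterwards.
fn outcome_and_conclusion(raw: StepStatus, continue_on_error: bool) -> (StepStatus, StepStatus) {
    let conclusion = if raw == StepStatus::Failure && continue_on_error {
        StepStatus::Success
    } else {
        raw
    };
    (raw, conclusion)
}
```

With this split, `steps.build.outcome == 'failure'` can be true for a step whose conclusion is still `success`.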
secrets.* was not available in expressions at all. You could use
SecretSubstitution for env var injection, but `if: secrets.TOKEN`
in a conditional? Nope, just resolves to null. Pre-resolve all
referenced secrets into a HashMap at job start and thread it through
the expression context. Evaluator stays synchronous.
needs.* context didn't exist. Job outputs were computed and then
thrown away — nothing flowed between job batches. Add outputs field
to JobResult, resolve job-level output mappings from step outputs,
accumulate across batches in the main execution loop, filter to
declared needs dependencies, and wire into ExpressionContext.
jobs.* context gets the same treatment for free.
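The "filter to declared needs" step might look roughly like this. The signature is an assumption made for illustration; only the name build_needs_context comes from the code base:

```rust
use std::collections::HashMap;

// Outputs of a single job: output name -> value.
type JobOutputs = HashMap<String, String>;

/// Keeps only the outputs of jobs listed in `needs`.
/// `all_outputs` stands in for the map accumulated across batches.
/// (Assumed signature, for illustration only.)
fn build_needs_context(
    all_outputs: &HashMap<String, JobOutputs>,
    needs: &[String],
) -> HashMap<String, JobOutputs> {
    needs
        .iter()
        .filter_map(|job| all_outputs.get(job).map(|out| (job.clone(), out.clone())))
        .collect()
}
```

A job that is not declared under `needs:` stays invisible to the expression context even if it already ran.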
While at it, add three new modules that were completely missing:
- artifacts.rs: Local ArtifactStore for actions/upload-artifact
and actions/download-artifact emulation. Filesystem-based,
Arc<RwLock> for cross-job sharing.
- cache.rs: Persistent CacheStore under ~/.wrkflw/cache/ for
actions/cache emulation. SHA-256 keyed with prefix matching
for restore-keys.
- workflow_commands.rs: Parser for ::error::, ::warning::,
::set-output::, ::add-mask::, ::group:: and friends from
step stdout. Handles stop-commands/resume semantics.
414 tests pass, 0 failures, clippy clean.
* fix(executor): fix matrix job context gaps and parameter explosion
The previous commit added expression context enrichment (secrets,
needs, step outcomes) but only wired it into the non-matrix execution
path. Matrix jobs — arguably the *most common* GHA pattern — were
silently getting empty contexts for everything. So secrets.* in a
matrix job? Null. needs.* from an upstream job? Null. steps.X.outcome
after a failure? "success". Brilliant.
It gets worse. Job outputs were keyed by job_result.name, which for
matrix jobs is the display name ("build (os: ubuntu)") rather than
the canonical job key ("build"). Since build_needs_context looks up
by the canonical name from job.needs, matrix job outputs were being
accumulated and then *never found* by downstream jobs.
While at it, unknown step IDs were defaulting to "success" instead
of null, which means `steps.nonexistent.outcome == 'success'` was
true. That's not how GitHub Actions works.
Here's what's fixed:
- Thread secrets, needs context, and step status tracking through
MatrixExecutionContext into execute_matrix_job. Pre-resolve secrets
once before the parallel fan-out, not per-combination.
- Add canonical_name field to JobResult and key the accumulation maps
by it. Matrix combinations use last-write-wins, matching GHA.
- Track step_status_map and job_status_str in the matrix step loop,
mirroring what execute_job already does.
- Resolve job outputs in execute_matrix_job instead of returning
an empty HashMap.
- Thread secrets into execute_composite_action (GHA composites
inherit the calling workflow's secrets).
- Fix unknown step ID fallback from "success" to Null.
- Refactor preprocess_expressions (10 params → 3) and
evaluate_condition_with_context (9 params → 2) to accept
ExpressionContext directly. Every new context dimension was
requiring changes to 30+ call sites. Not anymore.
274 tests pass, clippy clean.
* fix(executor): address review findings from GHA emulation PR
The previous two commits closed a bunch of GHA expression gaps, but
the review turned up several things that shouldn't ship as-is.
The worst offender: execute_composite_action was passing
step_statuses: &HashMap::new() and job_status: "success" as
*constants* into every step execution. So steps.X.outcome inside
a composite action? Always Null. success()/failure()? Always
true/false. Composite actions with conditional steps based on
prior step results were quietly broken.
Fix that by tracking composite_step_statuses and
composite_job_status within the composite loop, same pattern as
execute_job and execute_matrix_job.
While at it:
- Extract record_step_status() to deduplicate ~15 identical lines
of StepStatus-to-string conversion + map insertion that were
copy-pasted between execute_job and execute_matrix_job. Those
were one edit away from diverging.
- Derive Copy on StepStatus (it's a fieldless enum — cloning it
was just noise) and add Display so we stop writing the same
match arm in three places.
- Pass actual job_status_str into resolve_job_outputs instead of
hardcoding "success". If a job output expression references
success()/failure(), it should get the real answer.
- Change CacheStore::new() from Option<Self> to Result<Self, String>
so callers can actually tell *why* cache init failed instead of
getting a silent None.
- Downgrade artifacts and cache modules from pub to pub(crate) —
they're not wired into the engine yet, so exposing them as public
API is just asking for someone to depend on an unstable interface.
- Replace Box::leak in expression.rs tests with lazy_static
constants. The leaks were harmless in practice but fragile if
anyone ever puts empty_ctx() in a loop.
274 tests pass, clippy clean.
* feat(executor): wire up artifact, cache, and workflow command modules
Three modules — artifacts.rs, cache.rs, workflow_commands.rs — were
sitting in the crate fully implemented, fully tested, and fully
*unused*. 751 lines of dead code that the compiler was rightly
complaining about. Let's fix that.
ArtifactStore now gets created per workflow run and threaded through
the entire execution context chain (execute_job_batch → job_with_matrix
→ matrix_job → StepExecutionContext). actions/upload-artifact and
actions/download-artifact are emulated inline in execute_step(),
following the same pattern as actions/checkout.
CacheStore is stateless (filesystem-backed at ~/.wrkflw/cache/), so
it gets created on-demand — no threading needed. actions/cache does
restore-or-save in one shot and writes cache-hit to GITHUB_OUTPUT.
Not the full dual-phase model GHA uses, but good enough for local
emulation.
Workflow commands (::set-output::, ::error::, ::warning::, etc.) are
now parsed from step stdout via process_workflow_commands() in all
three step loops (execute_job, execute_matrix_job, composite action).
The deprecated ::set-output:: feeds into step_outputs_map *before*
GITHUB_OUTPUT file processing, so the modern mechanism takes
precedence. ::add-mask:: is logged but not wired to SecretMasker yet
because that requires Arc<Mutex<>> wrapping — deferred to a follow-up.
While at it, added preprocess_with_value() so action with: params
get ${{ }} expression resolution before being used by the emulation
branches.
* fix(executor): fix review findings — cache lifecycle, secret duplication, path traversal
The PR review flagged several real issues. Let's go through them.
CacheStore was being created *per step* inside execute_step, while
ArtifactStore was sensibly created once per workflow run. This meant
parallel matrix jobs would race on the same ~/.wrkflw/cache/
directory with zero coordination. Promote CacheStore to per-run
lifecycle matching ArtifactStore, threading it through
JobExecutionContext → MatrixExecutionContext → StepExecutionContext.
resolve_secrets_for_context was called once in
execute_job_with_matrix (for the matrix path) and then *again* inside
execute_job (for the non-matrix path), completely wasting the first
resolution. Move the call before the matrix/non-matrix branch and
thread the result via a new secrets_context field on
JobExecutionContext.
ExpressionContext was constructed verbatim ~10 times across
engine.rs — 8 fields, every single time. Every new field meant
updating all 10 sites. Add expr_context() and
expr_context_with_env() methods to StepExecutionContext so
this boilerplate collapses to one-line calls.
While at it, add path traversal and symlink protections:
walk_files (artifacts) and copy_dir_contents (cache) now skip
symlinks; artifact upload validates glob results are within the
workspace via canonicalization; cache save/restore validates paths
don't escape the workspace.
* fix(executor): fix review findings — output context, reusable stores, path validation
It turns out that resolve_job_outputs was constructing its
ExpressionContext with an *empty* step_statuses map, so any job
output expression referencing steps.<id>.conclusion would silently
resolve to Null. Not great when you're trying to emulate GHA
faithfully.
While at it, the reusable workflow paths were each creating their
own ArtifactStore rooted at Path::new(".") instead of reusing the
parent workflow's stores. This meant artifacts uploaded in a parent
job were invisible to the reusable workflow, and vice versa. Same
story for inter-job needs context — reusable workflows were passing
empty HashMaps, so multi-job called workflows with `needs:` deps
would get nothing. Let's fix that.
The matrix output overwrite was a known GHA semantic edge case, but
we were silently clobbering previous combinations' outputs. Now we
at least warn about it so nobody spends an afternoon debugging why
needs.build.outputs only has values from the last matrix combo.
Also replaced the naive `path.contains("..")` check in cache path
validation with proper Path::components() inspection, because a
directory literally named "..bar" is not a traversal attack.
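To illustrate the difference between the substring check and component inspection, here is a small sketch. `has_parent_traversal` is a hypothetical helper name, not the function in cache.rs:

```rust
use std::path::{Component, Path};

/// True if any path component is exactly `..`.
/// Unlike `path.contains("..")`, a directory named "..bar" does not match.
fn has_parent_traversal(path: &str) -> bool {
    Path::new(path)
        .components()
        .any(|c| matches!(c, Component::ParentDir))
}
```

The substring check flags `..bar/cache` as an attack; the component check only flags real `..` segments.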
* fix(executor): close review findings — path traversal, JobStatus duplication, stale plan.md
The download-artifact emulation was joining user-provided `path`
input directly via ctx.working_dir.join() without ever checking
that the result stays inside the workspace. A crafted `with.path`
containing `../` could write artifact files anywhere on disk.
Add canonicalize + starts_with validation matching the pattern
already used in cache.rs. This was the only action emulation path
missing the check — upload-artifact and cache both had it.
While at it, add a Display impl for JobStatus to kill the three
identical match blocks converting status to "success"/"failure"/
"skipped" strings. StepStatus already had this — JobStatus just
got forgotten.
Also remove plan.md (development scratch file that snuck into the
branch), add TODO comments for reusable workflow secret propagation,
and clarify why GitLab pipelines pass empty needs context maps.
* fix(executor): close review findings — masker wiring, loop dedup, cache eviction
The code review flagged six issues ranging from "this will leak
secrets" to "this will eat your disk." Let's go through them.
The ::add-mask:: workflow command was being parsed and then *thrown
away*. The SecretMasker took &mut self for add_secret(), making it
impossible to call through the shared &SecretMasker references
threaded through step execution. Fix this by switching SecretMasker
to interior mutability (RwLock) so add_secret() takes &self. Now
process_workflow_commands() actually wires AddMask values into the
masker. Secrets no longer appear in plain text in output logs.
The step-execution loops in execute_job and execute_matrix_job were
copy-pasted with minor formatting differences — the kind of thing
that drifts apart over time until someone spends a week debugging
why matrix jobs don't process workflow commands the same way as
regular jobs. Extract a StepLoopState struct with a process_outcome
method that handles status recording, logging, command parsing, and
env file application in one place.
The cache at ~/.wrkflw/cache/ had no eviction and no size limit,
which is the kind of optimism I find *alarming* in something that
writes to the user's home directory. Add a 1 GiB cap with LRU
eviction by mtime, triggered automatically after each save.
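The eviction policy can be sketched as a pure function over directory metadata. `CacheEntry` and `entries_to_evict` are made-up names, and the real implementation reads sizes and mtimes from the filesystem rather than taking them as input:

```rust
use std::time::SystemTime;

// Illustrative stand-in for one cache directory entry.
#[derive(Debug, Clone)]
struct CacheEntry {
    name: String,
    size: u64,
    mtime: SystemTime,
}

/// Returns the names of entries to delete, oldest mtime first,
/// until the remaining total fits under `max_size` bytes.
fn entries_to_evict(mut entries: Vec<CacheEntry>, max_size: u64) -> Vec<String> {
    entries.sort_by_key(|e| e.mtime); // oldest first
    let mut total: u64 = entries.iter().map(|e| e.size).sum();
    let mut evicted = Vec::new();
    for entry in entries {
        if total <= max_size {
            break;
        }
        total -= entry.size;
        evicted.push(entry.name);
    }
    evicted
}
```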
While at it: fix cache.rs file_name().unwrap_or_default() to return
a proper error instead of writing to an empty filename, fix
artifacts.rs strip_prefix fallback to error instead of silently
using absolute paths, and aggregate reusable workflow outputs into
the parent JobResult instead of always returning an empty map.
14 new tests covering all the above. 292 tests pass, zero failures.
* fix(executor): harden artifact store, masker locks, and cache prefix matching
It turns out that ArtifactStore was happily accepting names like
"../../escape" and joining them directly onto the artifact root
path. That's a path traversal bug. The artifact name comes from
user workflow `with.name`, so *any* workflow could write outside
the artifact directory. Let's not do that.
Added sanitize_artifact_name() that rejects names containing path
separators, "..", null bytes, or leading dots. Both upload() and
download() now validate before touching the filesystem.
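A rough sketch of the validation rules listed above. The return type and error messages are assumptions; only the function name and the rejected patterns come from this commit:

```rust
/// Rejects artifact names that could escape the artifact root.
/// (Sketch only: the real function's signature and messages may differ.)
fn sanitize_artifact_name(name: &str) -> Result<&str, String> {
    if name.contains('/') || name.contains('\\') {
        return Err(format!("artifact name {name:?} contains a path separator"));
    }
    if name.contains("..") {
        return Err(format!("artifact name {name:?} contains '..'"));
    }
    if name.contains('\0') {
        return Err(format!("artifact name {name:?} contains a null byte"));
    }
    if name.starts_with('.') {
        return Err(format!("artifact name {name:?} starts with a dot"));
    }
    Ok(name)
}
```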
While at it, the upload/download methods were async functions that
did nothing but blocking std::fs calls — which blocks the tokio
runtime thread. Wrapped the file I/O in spawn_blocking so we stop
lying about being async.
SecretMasker's RwLock calls all used .unwrap(), which means a
single panic while holding a write guard poisons the lock and
cascades into panics on every subsequent access. Replaced with
.unwrap_or_else(|e| e.into_inner()) so we recover from poisoned
locks instead of bringing down the whole workflow execution.
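The poison-recovery pattern in isolation, using a plain `RwLock<Vec<String>>` as a stand-in for the masker's state. `poison` and `secret_count` are illustrative helpers:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Poisons the lock by panicking in another thread while holding the write guard.
fn poison(lock: &Arc<RwLock<Vec<String>>>) {
    let lock = Arc::clone(lock);
    let _ = thread::spawn(move || {
        let _guard = lock.write().unwrap();
        panic!("simulated panic while holding the write guard");
    })
    .join();
}

/// Reads through a possibly poisoned lock instead of panicking.
/// `into_inner()` on the PoisonError hands back the guard.
fn secret_count(lock: &RwLock<Vec<String>>) -> usize {
    lock.read().unwrap_or_else(|e| e.into_inner()).len()
}
```

The trade-off is that the data behind a poisoned lock may be half-updated; for a list of strings to mask, that seems like an acceptable price for not cascading panics.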
CacheStore::find_by_prefix was returning whichever entry read_dir
happened to yield first — completely non-deterministic. Now scans
all entries and picks the one with the newest mtime, which matches
GitHub Actions' behavior of preferring the most recently created
matching key.
Also removed an unnecessary clone in StepLoopState::process_outcome
(Skipped arm was using ref + clone when it could just move), and
fixed pre-existing clippy warnings in the secrets crate tests.
* fix(executor): wrap CacheStore I/O in spawn_blocking, add missing tests
CacheStore was doing all its filesystem I/O — recursive directory
copies, eviction scans, the whole shebang — directly on the async
executor thread. ArtifactStore already knew better and used
spawn_blocking. CacheStore just... didn't. Let's fix that.
Make CacheStore Clone and move restore/save into async wrappers
that offload the real work to spawn_blocking. The internal logic
is unchanged, just properly sequestered where it won't block the
Tokio runtime.
While at it, document the cache save-on-miss divergence from real
GitHub Actions (we save eagerly; GHA uses a post-step hook), add
a proper doc comment explaining the RwLock poison recovery strategy
in SecretMasker, and add unit tests for build_needs_context and
resolve_job_outputs that were previously untested.
* fix(executor): close silent failure modes found in PR review
Four things came out of the review that were all variations of the
same theme: "something goes wrong and nobody hears about it."
The cache restore task was swallowing panics via `.ok()?`, which is
a lovely way to make debugging impossible. Match the error handling
pattern already used in save() — log the panic, return None.
The GITHUB_OUTPUT write for cache-hit was using `let _ =` to discard
I/O errors. This was the *only* silent error discard in the entire
module, which makes it feel less like a deliberate choice and more
like an oversight. Log the failure instead.
`secrets: inherit` on reusable workflows was silently ignored because
the code only handled the mapping case and inherit comes through as
a bare string. Emit an explicit warning so users know their secrets
aren't actually being inherited yet.
While at it, replace `&line[2..]` with `strip_prefix("::")` in the
workflow command parser. The starts_with check above guarantees
safety, but direct byte indexing is fragile and strip_prefix says
what it means.
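For illustration, a stripped-down command-line parser built on `strip_prefix`. `parse_command` is a hypothetical helper that ignores the `key=value` properties the real parser handles:

```rust
/// Splits `::name props::value` into `(name, value)`.
/// Returns None for lines that do not start with `::` at column 0.
fn parse_command(line: &str) -> Option<(&str, &str)> {
    // strip_prefix both checks for and removes the leading "::".
    let rest = line.strip_prefix("::")?;
    let (head, value) = rest.split_once("::")?;
    // The command name is everything before the first space in the head.
    let name = head.split(' ').next().filter(|n| !n.is_empty())?;
    Some((name, value))
}
```

There is no index arithmetic left to get wrong, and a missing prefix falls out as `None` through the `?` operator.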
* fix(executor): fix atomicity, eager save, and untested eviction
Three bugs found during PR review, all quietly waiting to bite
someone:
1. SecretMasker::add_secret() was doing two separate RwLock
acquisitions — one for secret_cache, one for secrets. A concurrent
mask() call between the two writes sees inconsistent state. The fix
is embarrassingly simple: put both collections behind a *single*
RwLock<SecretData>. One lock, one write, no window.
2. The actions/cache emulation was saving eagerly on cache miss,
right there in the step. Real GitHub Actions saves in a post-step
hook that only runs *after all steps complete* and *only if the job
succeeded*. Our eager save could persist stale or incomplete data
into ~/.wrkflw/cache/ and poison future runs. Defer saves into a
Mutex<Vec<PendingCacheSave>> on the step context, flush at
end-of-job conditional on success.
3. The eviction test was, shall we say, *aspirational*. It created
two tiny entries well under the 1 GiB limit, then asserted both
still existed. That's not testing eviction, that's testing storage.
Make max_size configurable on CacheStore, set it to 30 bytes in the
test, and actually verify the oldest entry gets evicted.
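Item 1 above (one lock around both collections) could look something like the following. The field names and the fixed `***` replacement are assumptions made for the sketch:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

// Both collections live behind the same lock, so a reader can never
// observe one updated without the other. Field names are illustrative.
#[derive(Default)]
struct SecretData {
    secrets: HashSet<String>,
    secret_cache: HashMap<String, String>,
}

#[derive(Default)]
struct SecretMasker {
    data: RwLock<SecretData>,
}

impl SecretMasker {
    /// Takes `&self`: a single write-lock acquisition covers both inserts.
    fn add_secret(&self, secret: &str) {
        let mut data = self.data.write().unwrap_or_else(|e| e.into_inner());
        data.secrets.insert(secret.to_string());
        data.secret_cache.insert(secret.to_string(), "***".to_string());
    }

    fn mask(&self, text: &str) -> String {
        let data = self.data.read().unwrap_or_else(|e| e.into_inner());
        let mut out = text.to_string();
        for secret in &data.secrets {
            if let Some(mask) = data.secret_cache.get(secret) {
                out = out.replace(secret.as_str(), mask);
            }
        }
        out
    }
}
```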
* fix(executor): close reusable workflow secrets gap, wire matrix env substitution
The reusable workflow path was passing `None, None` for
secret_manager and secret_masker into child execute_job_batch
calls. This meant that `${{ secrets.* }}` expressions inside
called workflows resolved to null even when the parent had
explicitly passed secrets via the `secrets:` mapping.
That's... not great. Pass the parent's secret_manager and
secret_masker through so secrets actually work in reusable
workflows.
While at it, the matrix job path had a lingering TODO where
job-level env values like `MY_VAR: ${{ matrix.os }}` were
inserted raw without expression substitution. Wire up
preprocess_expressions with the matrix combination context so
these actually resolve. Collect resolved values before
insertion to dodge the borrow checker, which rightfully
complains about mutating job_env while reading from it.
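The collect-then-insert pattern, sketched with a toy `substitute` function standing in for preprocess_expressions. Both function names and the string-replace substitution are illustrative only:

```rust
use std::collections::HashMap;

/// Toy replacement for `${{ matrix.<key> }}` placeholders.
/// The real code goes through the expression evaluator instead.
fn substitute(value: &str, matrix: &HashMap<String, String>) -> String {
    let mut out = value.to_string();
    for (key, val) in matrix {
        // The format string below produces the literal text `${{ matrix.<key> }}`.
        out = out.replace(&format!("${{{{ matrix.{} }}}}", key), val);
    }
    out
}

/// Resolves every value first, then writes back, so the map is never
/// mutated while it is still borrowed for reading.
fn resolve_job_env(job_env: &mut HashMap<String, String>, matrix: &HashMap<String, String>) {
    let resolved: Vec<(String, String)> = job_env
        .iter()
        .map(|(k, v)| (k.clone(), substitute(v, matrix)))
        .collect();
    for (k, v) in resolved {
        job_env.insert(k, v);
    }
}
```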
Also add a fast-path to CacheStore::find_by_prefix that checks
the exact key hash before scanning the entire cache directory,
and add integration tests that exercise the artifact upload/
download, cache miss-defer-flush-restore, needs.* context,
step outcome/conclusion, and secrets.* expression paths through
the actual execution wiring — not just the isolated helpers.
* fix(executor): close security holes and correctness bugs from review
The code review turned up some genuinely scary stuff. Let's go
through it.
The artifact download had an unwrap_or(&file_path) fallback that
meant if strip_prefix failed for *any* reason, we'd pass the
absolute file path to Path::join — which silently replaces the
base. So target.join("/etc/passwd") gives you "/etc/passwd".
That's an arbitrary file write. Not great.
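The `Path::join` behavior is easy to reproduce. The sketch below assumes Unix-style paths, and `safe_join` is a hypothetical guard, not the function used in artifacts.rs:

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `rel` onto `base` only if `rel` consists purely of normal
/// (or `.`) components. Absolute paths and `..` are rejected, because
/// `Path::join` with an absolute argument discards `base` entirely.
fn safe_join(base: &Path, rel: &Path) -> Result<PathBuf, String> {
    let is_safe = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if !is_safe {
        return Err(format!("refusing to join unsafe path {}", rel.display()));
    }
    Ok(base.join(rel))
}
```

On Unix, `Path::new("/tmp/target").join("/etc/passwd")` evaluates to `/etc/passwd`, which is why the fallback had to become an error.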
The upload side was also doing strip_prefix against the
non-canonical workspace path while the security filter used the
canonical one. Inconsistent canonicalization is how you get path
traversal bugs. Use canonical paths for both.
success() was implemented as `job_status != "failure"`, which
means it returns true when the job is *cancelled*. In real GHA,
success() and cancelled() are mutually exclusive. This also meant
both could be true simultaneously, which is just nonsensical.
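A minimal sketch of the corrected status checks, assuming the job status is carried as one of the strings "success", "failure", or "cancelled" (the free functions are illustrative; the real builtins live in the expression evaluator):

```rust
// Each builtin matches exactly one status string, so at most one of
// them is true for any given job status.
fn success(job_status: &str) -> bool {
    job_status == "success"
}

fn failure(job_status: &str) -> bool {
    job_status == "failure"
}

fn cancelled(job_status: &str) -> bool {
    job_status == "cancelled"
}
```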
The workflow command parser had no URL-decoding at all. GHA
percent-encodes newlines, colons, and percent signs in command
values. Without decoding, ::add-mask:: masks the encoded form
while the actual secret appears unmasked in logs. That's a secret
leak.
Condition evaluation was defaulting to true on parse errors,
silently running steps that should be skipped. A typo in an if:
condition should not mean "yes please run this." Default to false.
secrets: inherit was logging a "not yet supported" warning and
then doing absolutely nothing. Now it actually propagates the
parent secrets to the child workflow, which is what it's supposed
to do.
While at it: filter .cache_key metadata from cache restore,
remove the phantom jobs.*.result context that doesn't exist in
real GHA, make aggregate_reusable_workflow_outputs deterministic,
fix walk_files silently swallowing errors, reject empty
set-output/save-state names, add missing annotation fields, and
stop preprocess_with_value from returning raw ${{ }} templates
on error.
* fix(executor): fix decode ordering, multi-path cache, and missing tests
The decode_value() function in workflow_commands.rs was decoding %25
(percent) *first* in the replacement chain. This means an input like
%250A would first become %0A, and then the next replacement would
happily turn that into a newline. Classic double-decode bug — the
kind of thing that looks correct until someone actually sends a
literal percent sign through the system.
Move %25 decoding to the *end* of the chain, matching what GitHub
Actions actually does.
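The ordering problem in miniature. Both functions are simplified stand-ins for decode_value() and only cover the encodings mentioned so far (newline, carriage return, colon, percent):

```rust
/// Buggy order: decoding %25 first lets "%250A" collapse into a newline.
fn decode_percent_first(raw: &str) -> String {
    raw.replace("%25", "%")
        .replace("%0D", "\r")
        .replace("%0A", "\n")
        .replace("%3A", ":")
}

/// Fixed order: %25 is decoded last, so an escaped percent sign can
/// never combine with the characters after it into a second escape.
fn decode_value(raw: &str) -> String {
    raw.replace("%0D", "\r")
        .replace("%0A", "\n")
        .replace("%3A", ":")
        .replace("%25", "%")
}
```

With the fixed order, `%250A` decodes to the literal text `%0A` rather than a newline.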
While at it, actions/cache's `path` input supports multiple
newline-separated paths (e.g. "node_modules\n~/.npm"), but we were
passing the raw multi-line string as a single path. The second path
would just silently not get cached. Fix this by splitting on
newlines and saving/restoring each path independently. The cache
store now uses a composite (key, path) hash so multiple paths under
the same key don't stomp on each other.
Also add the tests that should have been there from the start:
download-artifact path traversal rejection, download-all-artifacts
flow, cache empty key/path validation, cache multi-path deferred
save, and the double-decode regression test.
* fix(executor): close correctness, security, and async safety gaps from review
The PR review turned up a frankly embarrassing number of issues
across the GHA emulation stack. Let's go through the highlights.
The expression evaluator had *multiple* correctness bugs: toJSON()
was emitting unescaped strings (producing invalid JSON), format()
was doing sequential replacements so arg content could be consumed
by later placeholders, fromJSON() was using trim_matches('"')
which strips quotes greedily from both ends, and github.event.*
nested context access silently returned Null because of an overly
strict parts.len() == 2 guard. All fixed.
The secret masker was leaking the first and last characters of
every secret longer than 8 bytes. This is not how you do secret
masking. Replaced with fixed *** output matching GHA behavior.
Also fixed byte/char length confusion on multi-byte secrets,
sorted secrets longest-first to handle overlapping matches, and
added the missing aws_secret and api_key checks to
has_secret_patterns().
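Why longest-first ordering matters for overlapping secrets, as a small sketch. This free function is illustrative; the real masker is a struct with cached state:

```rust
/// Replaces every secret with a fixed "***".
/// Longer secrets are replaced first so that a shorter secret which is
/// a prefix of a longer one cannot leave the tail of the longer one exposed.
fn mask(text: &str, secrets: &[&str]) -> String {
    let mut sorted: Vec<&str> = secrets.iter().copied().filter(|s| !s.is_empty()).collect();
    sorted.sort_by(|a, b| b.len().cmp(&a.len()));
    let mut out = text.to_string();
    for secret in sorted {
        out = out.replace(secret, "***");
    }
    out
}
```

If `abc` were replaced before `abcdef`, the output would contain `***def` and leak half of the longer secret.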
Three std::process::Command calls in the Rust action handler were
blocking the tokio runtime. Under parallel matrix execution this
would starve the thread pool. Replaced with tokio::process::Command.
Cache saves were non-atomic — if the process died mid-copy, both
old and new entries were lost. Now writes to a .tmp dir first and
renames into place. While at it, made all .cache_key references
use the CACHE_KEY_METADATA_FILE constant instead of hardcoding the
string in three places.
Other fixes: workflow command parser was trim()ing lines (GHA only
recognizes commands at column 0), INPUT_* env vars were being
logged in verbose output without masking, local reusable workflow
paths had no traversal validation, artifact names with null bytes
were silently stripped instead of rejected, hashFiles() wasn't
checking for symlink traversal, and workflow_commands was
unnecessarily pub.
* fix(executor): fix UTF-8 corruption in expression tokenizer and format()
It turns out the expression tokenizer was storing its input as `&[u8]`
and happily casting individual bytes to `char` via `u8 as char`. This
works fine for ASCII, which is *most* of what GHA expressions contain.
But the moment someone puts a multi-byte UTF-8 character inside a
string literal — say, `format('{0} → {1}', 'a', 'b')` — each byte
of `→` (0xE2 0x86 0x92) gets treated as a separate Latin-1 character.
The arrow becomes `â\u{86}\u{92}`. Not great.
The same bug existed in the `format()` builtin, which iterated
`fmt.as_bytes()` and did the same `u8 as char` dance.
Fix both: change the `Tokenizer` to store `&str` instead of `&[u8]`,
use proper char-aware iteration in `read_string()` (advancing by
`ch.len_utf8()` instead of always 1), and rewrite the `format()`
replacement loop to use `char_indices()`. All the ASCII-only paths
(operators, identifiers, numbers) still use byte-level access since
they only care about ASCII — no performance regression there.
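The corruption and a char-aware, single-pass replacement can both be shown in a few lines. `latin1_misread` and `format_builtin` are illustrative names, the sketch uses a peekable `chars()` iterator rather than `char_indices()`, and leaving out-of-range placeholders untouched is an assumption:

```rust
/// Reproduces the bug: each UTF-8 byte is cast to its own char.
fn latin1_misread(s: &str) -> String {
    s.bytes().map(|b| b as char).collect()
}

/// Single pass over chars: substituted argument text is never rescanned,
/// so an argument containing "{1}" is not expanded a second time.
fn format_builtin(fmt: &str, args: &[&str]) -> String {
    let mut out = String::with_capacity(fmt.len());
    let mut chars = fmt.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch != '{' {
            out.push(ch);
            continue;
        }
        // Collect the digits of a candidate placeholder index.
        let mut digits = String::new();
        while let Some(&d) = chars.peek() {
            if !d.is_ascii_digit() {
                break;
            }
            digits.push(d);
            chars.next();
        }
        if !digits.is_empty() && chars.peek() == Some(&'}') {
            chars.next(); // consume the closing brace
            match digits.parse::<usize>().ok().and_then(|i| args.get(i)) {
                Some(arg) => out.push_str(arg),
                // Assumption: out-of-range placeholders are kept verbatim.
                None => {
                    out.push('{');
                    out.push_str(&digits);
                    out.push('}');
                }
            }
        } else {
            // Not a placeholder: emit what was consumed unchanged.
            out.push('{');
            out.push_str(&digits);
        }
    }
    out
}
```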
While at it, fix an unnecessary `String::clone` per comparison in
`aggregate_reusable_workflow_outputs` where `sort_by_key` was
cloning the key string. `sort_by(|a, b| a.0.cmp(b.0))` does the
same thing without the allocation.
* refactor(executor): introduce JobServices struct, fix async mutex misuse
The review found that every function in the job/step execution
hierarchy was threading 5-6 individual parameters (secrets_context,
needs_context, needs_results, artifact_store, cache_store) through
their signatures, leading to 15+ argument functions peppered with
#[allow(clippy::too_many_arguments)].
This is not great.
Introduce a JobServices<'a> struct that groups those five fields
into a single borrowing container. JobExecutionContext,
MatrixExecutionContext, and StepExecutionContext all now carry a
`services: JobServices<'a>` instead of five separate fields. The
net result is ~20 fewer lines despite adding a new struct, because
the call sites got dramatically simpler.
While at it, fix the pending_cache_saves Mutex: it was using
std::sync::Mutex in async context. It worked *only* because the
lock was never held across an await point, but that's the kind of
thing that breaks silently the moment someone adds an await inside
the critical section. Switch to tokio::sync::Mutex so the compiler
will actually stop you from doing something stupid.
Also improve CacheStore::new() to fall back to $HOME when
dirs::home_dir() returns None (hello, minimal Docker containers),
and document the github.event.* env-var approximation limitation
in expression.rs so nobody has to rediscover it the hard way.
* fix(executor): address review findings — cache safety, mask perf, missing tests
The cache save_inner claimed to do an "atomic swap" via remove_dir_all
followed by rename. It turns out that two operations are not, in fact,
one operation. If the process dies between the remove and the rename,
the cache entry is just *gone* and the temp dir is orphaned.
Fix the save path to rename-aside-then-swap: old entry goes to `.old`,
`.tmp` renames into place, then `.old` gets cleaned up. Still not truly
atomic (POSIX rename cannot replace a non-empty directory in one step),
but the window where the entry is absent shrinks to the gap between two
renames instead of spanning a full remove_dir_all.
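A sketch of the rename-aside-then-swap sequence using only std::fs. The function name, the explicit `tmp` and `old` parameters, and the rollback on failure are assumptions, not the actual save_inner code:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Moves `tmp` into place at `entry`, parking any existing entry at `old`
/// first. A crash between the two renames leaves the previous contents
/// recoverable at `old` instead of deleted.
fn swap_into_place(entry: &Path, tmp: &Path, old: &Path) -> io::Result<()> {
    // Clear leftovers from an earlier interrupted run.
    if old.exists() {
        fs::remove_dir_all(old)?;
    }
    let had_entry = entry.exists();
    if had_entry {
        fs::rename(entry, old)?;
    }
    if let Err(err) = fs::rename(tmp, entry) {
        // Best-effort rollback so the old entry is not lost.
        if had_entry {
            let _ = fs::rename(old, entry);
        }
        return Err(err);
    }
    if had_entry {
        fs::remove_dir_all(old)?;
    }
    Ok(())
}
```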
While at it, fix three other review findings:
- SecretMasker::mask() was cloning and sorting all secret pairs on
*every single call*. Cache the sorted pairs behind an Option that
invalidates on mutation. The read-path fast-path avoids the write
lock entirely.
- evaluate_condition_with_context now says "failed to parse" instead
of "failed to evaluate" when a condition errors, so users can
actually tell the difference between "your condition is false" and
"your condition is broken syntax and we're skipping your step."
- aggregate_reusable_workflow_outputs silently overwrote colliding
keys from different jobs. Now it warns when that happens, because
non-deterministic output clobbering deserves at least a log line.
Add unit tests for build_needs_context filtering, output key collision
behavior, and empty-input edge cases.
* fix(executor): close correctness, perf, and architecture gaps from review
The PR review turned up a nice collection of things that were either
wrong, wasteful, or unnecessarily messy. Let's go through them.
current_dir().unwrap_or_default() in execute_matrix_job was silently
producing an empty PathBuf on failure, which would then quietly
poison every expression resolution downstream. That's not error
handling, that's *wishful thinking*. Propagate the error properly.
toJSON() had hand-rolled string escaping that missed control chars,
null bytes, and who knows what else. We already depend on serde_json
— just use serde_json::to_string() like a normal person.
SecretMasker::mask() was deep-cloning the entire sorted_pairs Vec on
every single call. With 100 secrets that's ~6KB of allocation per
log line. Wrap it in Arc so the clone is an atomic refcount bump.
pending_cache_saves used tokio::sync::Mutex despite never being held
across an await point. Switch to std::sync::Mutex — no async overhead
for what is fundamentally a synchronous lock-push-unlock pattern.
resolve_job_outputs was silently swallowing expression evaluation
errors and returning empty strings. Now it logs a warning so you
at least have a prayer of debugging broken output references.
The local-file and remote-file branches of execute_reusable_workflow_job
had ~80 lines of duplicated code for env setup, secrets propagation,
batch execution, and result aggregation. Extract run_called_workflow()
so we stop maintaining the same logic in two places that will
inevitably diverge.
While at it, fold secret_manager and secret_masker into JobServices
to reduce the parameter sprawl. Mark with_mask_char as #[cfg(test)]
since create_mask always produces "***". Add tests for toJSON with
control chars, multi-path cache restore, and missing step references.
* fix(executor): close review findings — decode gaps, path hardening, refactor
The PR review caught several real issues that needed addressing.
First, decode_value() in the workflow commands parser was missing
%2C (comma) and %3B (semicolon) from its percent-decoding table.
GitHub Actions encodes these in command parameter values, so any
annotation with a comma in the file path would silently corrupt
the value. Not great when your whole job is to parse these things.
Second, the download-artifact path traversal check was using
unwrap_or_else on a failed canonicalize of the *workspace itself*,
falling back to a potentially-relative path as the safety baseline.
If the workspace path was somehow non-canonical, the starts_with
check becomes meaningless. Now it rejects outright if the workspace
can't be canonicalized.
Third, add_secret was using chars().count() for the min_length
check, which counts Unicode scalar values rather than bytes. This
was an unintentional behavior change from the original len() — a
2-char emoji string at 8 bytes would get rejected despite being
substantial. Reverted to len().
While at it, extracted the three inline action handlers
(upload-artifact, download-artifact, cache) from the 300+ line
execute_step match chain into dedicated handle_* functions. The
function was getting *unwieldy* and this makes each handler
independently testable.
Added tests for: condition parse-error → false behavior change,
format() out-of-bounds placeholders, format() no double-substitution,
%2C/%3B decoding, download-all with empty store, and
process_workflow_commands edge cases.
WRKFLW
A command-line tool for validating and executing GitHub Actions workflows locally. Test your workflows on your machine before pushing to GitHub.
Features
- TUI interface — interactive terminal UI for browsing, running, and monitoring workflows
- Workflow validation — syntax checks, structural validation, and composite action input cross-checking with CI/CD-friendly exit codes
- Local execution — run workflows using Docker, Podman, or emulation mode (no containers)
- Job selection — run individual jobs with `--job` flag or via TUI job selection mode
- Job dependency resolution — automatic ordering based on `needs` with parallel execution of independent jobs
- Action support — Docker container actions, JavaScript actions, composite actions, and local actions
- Reusable workflows — execute caller jobs via `jobs.<id>.uses` (local or `owner/repo/path@ref`)
- GitHub context emulation — environment variables, `GITHUB_OUTPUT`, `GITHUB_ENV`, `GITHUB_PATH`, `GITHUB_STEP_SUMMARY`
- Matrix builds — full support for `include`, `exclude`, `max-parallel`, and `fail-fast`
- Secrets management — multiple providers (env, file, Vault, AWS, Azure, GCP) with masking and encryption
- Remote triggering — trigger `workflow_dispatch` runs on GitHub or GitLab pipelines
- GitLab support — validate and trigger GitLab CI pipelines
Installation
cargo install wrkflw
Or build from source:
git clone https://github.com/bahdotsh/wrkflw.git
cd wrkflw
cargo build --release
Quick Start
# Launch the TUI (auto-detects .github/workflows)
wrkflw
# Validate workflows
wrkflw validate
# Run a workflow
wrkflw run .github/workflows/ci.yml
Usage
Validation
# Validate all workflows in .github/workflows
wrkflw validate
# Validate specific files or directories
wrkflw validate path/to/workflow.yml
wrkflw validate path/to/workflows/
# Validate multiple paths
wrkflw validate flow-1.yml flow-2.yml path/to/workflows/
# GitLab pipelines
wrkflw validate .gitlab-ci.yml --gitlab
# Verbose output
wrkflw validate --verbose path/to/workflow.yml
Exit codes: 0 = all valid, 1 = validation failures, 2 = usage error. Use --no-exit-code to disable.
Execution
# Run with Docker (default)
wrkflw run .github/workflows/ci.yml
# Run with Podman
wrkflw run --runtime podman .github/workflows/ci.yml
# Run in emulation mode (no containers)
wrkflw run --runtime emulation .github/workflows/ci.yml
# Run a specific job
wrkflw run --job build .github/workflows/ci.yml
# List jobs in a workflow
wrkflw run --jobs .github/workflows/ci.yml
# Preserve failed containers for debugging
wrkflw run --preserve-containers-on-failure .github/workflows/ci.yml
TUI
# Open TUI with default directory
wrkflw tui
# Open with specific runtime
wrkflw tui --runtime podman
Controls:
| Key | Action |
|---|---|
| Tab / 1-4 | Switch tabs (Workflows, Execution, Logs, Help) |
| Up/Down or j/k | Navigate |
| Space | Toggle selection |
| Enter | Run / View details |
| r | Run selected workflows |
| a / n | Select all / Deselect all |
| e | Cycle runtime (Docker / Podman / Emulation) |
| v | Toggle Execution / Validation mode |
| t | Trigger remote workflow |
| q / Esc | Quit / Back |
Remote Triggering
Trigger workflow_dispatch events on GitHub or GitLab.
# GitHub (requires GITHUB_TOKEN env var)
wrkflw trigger workflow-name --branch main --input key=value
# GitLab (requires GITLAB_TOKEN env var)
wrkflw trigger-gitlab --branch main --variable key=value
Runtime Modes
| Mode | Description | Best for |
|---|---|---|
| Docker (default) | Full container isolation, closest to GitHub runners | Production, CI/CD |
| Podman | Rootless containers, no daemon required | Security-conscious environments |
| Emulation | Runs directly on host, no containers needed | Quick local testing |
Reusable Workflows
jobs:
  call-local:
    uses: ./.github/workflows/shared.yml
  call-remote:
    uses: my-org/my-repo/.github/workflows/shared.yml@v1
    with:
      foo: bar
    secrets:
      token: ${{ secrets.MY_TOKEN }}
- Local refs resolve relative to the working directory
- Remote refs are shallow-cloned at the specified @ref
- with: entries become INPUT_<KEY> env vars; secrets: become SECRET_<KEY>
Limitations: outputs from called workflows are not propagated back; secrets: inherit is not supported; private repos for remote uses: are not yet supported.
Secrets Management
WRKFLW supports GitHub Actions-compatible ${{ secrets.* }} syntax with multiple providers:
# Environment variables (simplest)
export GITHUB_TOKEN="ghp_..."
wrkflw run .github/workflows/ci.yml
# File-based secrets (JSON, YAML, or .env format)
# Configure in ~/.wrkflw/secrets.yml
Supported providers: environment variables, file-based, HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, Google Cloud Secret Manager. See the secrets demo for detailed examples.
Limitations
Supported
- Workflow syntax validation with exit codes
- Job dependency resolution and parallel execution
- Matrix builds, environment variables, GitHub context
- Container, JavaScript, composite, and local actions
- Reusable workflows (caller jobs)
- Environment files (GITHUB_OUTPUT, GITHUB_ENV, GITHUB_PATH, GITHUB_STEP_SUMMARY)
- TUI and CLI interfaces
- Container cleanup (even on Ctrl+C)
Not Supported
- GitHub encrypted secrets and fine-grained permissions
- actions/cache (no persistent cache between runs)
- Artifact upload/download between jobs
- Event triggers other than workflow_dispatch
- Windows and macOS runners
- Job/step timeouts, concurrency, and cancellation
- Service containers in emulation mode
- Reusable workflow output propagation (needs.<id>.outputs.*)
Project Structure
WRKFLW is organized as a Cargo workspace with focused crates:
| Crate | Purpose |
|---|---|
| wrkflw | CLI binary and library entry point |
| wrkflw-executor | Workflow execution engine |
| wrkflw-parser | Workflow file parsing and schema validation |
| wrkflw-evaluator | Structural evaluation of workflow files |
| wrkflw-validators | Validation rules for jobs, steps, triggers |
| wrkflw-runtime | Container and emulation runtime abstractions |
| wrkflw-ui | Terminal user interface |
| wrkflw-models | Shared data structures |
| wrkflw-matrix | Matrix expansion utilities |
| wrkflw-secrets | Secrets management with multiple providers |
| wrkflw-github | GitHub API integration |
| wrkflw-gitlab | GitLab API integration |
| wrkflw-logging | In-memory logging for TUI/CLI |
| wrkflw-utils | Shared helpers |
License
MIT License - see LICENSE for details.
