Commit Graph

12 Commits

Author SHA1 Message Date
Gokul
6016887a3b feat(executor): easy GHA emulation fixes for better compatibility (#82)
* feat(executor): add easy GHA emulation fixes for better compatibility

- Expand github.* context with 13 missing env vars (CI, GITHUB_ACTIONS,
  GITHUB_REF_NAME, GITHUB_REF_TYPE, GITHUB_REPOSITORY_OWNER, etc.) and
  improve GITHUB_ACTOR to use git config / $USER instead of hardcoded value
- Enforce timeout-minutes at both job level (default 360m per GHA spec)
  and step level via tokio::time::timeout
- Implement defaults.run.shell and defaults.run.working-directory with
  proper fallback chain: step > job defaults > workflow defaults > bash
- Implement hashFiles() expression function with glob matching, sorted
  file hashing (SHA-256), and integration into the substitution pipeline

* fix(executor): harden hashFiles, working-directory, and shell -e

Three issues from code review, all in the "we got the GHA emulation
*almost* right" category:

1. hashFiles() was returning an empty string when no files matched.
   GHA returns the SHA-256 of empty input (e3b0c44...), not nothing.
   An empty string as a cache key component is the kind of thing
   that silently ruins your day. Also, unreadable files were being
   skipped without a peep — now we at least warn about it.

2. The working-directory default resolution was doing a naive
   Path::join with user-controlled input. If someone writes
   `working-directory: ../../../etc` or an absolute path, join
   happily replaces the base. Inside a container this is *somewhat*
   contained, but in emulation mode it's a real path traversal.
   Normalize the path and reject anything that escapes the
   workspace.

3. The bash -e flag change (correct per GHA spec) was undocumented.
   Scripts that relied on intermediate commands failing without
   aborting the step will now break. Document it in
   BREAKING_CHANGES.md so users aren't left guessing.

* fix(executor): complete the GHA shell invocation and harden hashFiles

The previous commit added `-e` to bash but stopped there, even
though the BREAKING_CHANGES.md *literally documented* the full GHA
invocation as `bash --noprofile --norc -e -o pipefail {0}`. So we
were advertising behavior we weren't actually implementing. This is
not great.

Without `-o pipefail`, piped commands like `false | echo ok` would
silently succeed, which is exactly the kind of divergence that makes
you distrust an emulator. And without `--noprofile --norc`, user
profile scripts can interfere with reproducibility.

While at it, fix hashFiles error handling — it was silently
swallowing read errors and producing a partial hash, which is worse
than failing because you get a *wrong* cache key with no indication
anything went sideways. preprocess_hash_files and
preprocess_expressions now return Result and the engine surfaces
failures as step errors.

Also add the tests that should have been there from the start:
shell invocation flags, working-directory path traversal rejection,
and defaults cascade (step > job > workflow).

* fix(executor): harden hashFiles, timeout, and shell edge cases

The previous round of GHA emulation fixes left a few holes that
would bite you in production:

hashFiles() would happily glob '../../etc/passwd' and hash whatever
it found outside the workspace. It also loaded entire file contents
into memory before hashing, which is *not great* when someone points
it at a large binary artifact. The glob patterns now reject '..'
traversal, and file contents are streamed into the SHA-256 hasher
via io::copy instead.

timeout-minutes accepted any f64 from YAML, including negative
values, NaN, and infinity — all of which make Duration::from_secs_f64
panic. Non-finite and non-positive values now fall back to the GHA
default of 360 minutes.

Unknown shell values were silently accepted with a '-c' fallback.
Now they emit a warning so you at least *know* something is off.

While at it, replaced the hash_files_read_error_returns_err test
that was testing two Ok paths (despite its name) with proper
path-traversal rejection tests.

* fix(executor): fix shadowed timeout_mins and extract sanitization helper

It turns out the job timeout error path was re-reading the *raw*
timeout_minutes value instead of using the already-sanitized one.
If someone set timeout-minutes to NaN or a negative number, the
sanitization would correctly fall back to 360, but the error
message would happily print "Job exceeded timeout of NaN minutes."

Not great.

Extract sanitize_timeout_minutes() so both the job and step
timeout paths use the same logic instead of duplicating the
is_finite/positive/clamp dance. While at it, add proper tests
for NaN, Infinity, negative, zero, and the max clamp — plus a
test that actually exercises the job-level timeout expiry branch,
which previously had zero coverage.
2026-04-02 18:00:41 +05:30
Gokul
fd348a460e fix: actually execute Docker-based GitHub Actions instead of emulating them (#74)
* fix(executor): actually execute Docker-based GitHub Actions instead of emulating them

Third-party GitHub Actions that use Docker (like super-linter) were
silently passing without ever *actually running*. The engine would
resolve the action, pick a Docker image, and then... run
`echo 'Would execute GitHub action: ...'` inside it. Every single
time. Regardless of runtime mode. Confusion ensues.

It turns out there were two separate failures conspiring here:

1. `prepare_action()` would error out on `ActionType::DockerBuild`
   with "not yet supported", fall back to `determine_action_image()`,
   and cheerfully return `node:20-slim` for super-linter. This is
   not great.

2. The `PreparedAction::Image` execution branch had three sub-paths
   for is_docker, is_local, and everything else — and *all three*
   just ran echo commands. The image was resolved correctly and then
   completely ignored.

The fix has several parts:

- Add a `NativeDocker` variant to `PreparedAction` that means "run
  this image with its built-in ENTRYPOINT, no command override."
  Docker registry actions and DockerBuild actions both use this.

- Implement DockerBuild properly: clone the repo, resolve the
  Dockerfile path from action.yml, build it, return the tag.
  Uses the existing `shallow_clone` and `runtime.build_image`.

- Fix `build_image_inner` to tar the *full context directory*
  instead of just the Dockerfile. The old code had `_context_dir`
  sitting right there, computed and unused. COPY instructions in
  Dockerfiles need the context, obviously.

- Allow empty `cmd` in `run_container` to mean "use the image's
  default ENTRYPOINT/CMD". The Docker impl now sets `config.cmd =
  None` when cmd is empty. Podman already handled this correctly.

The existing `PreparedAction::Image` path with all its special-cased
action handling (actions-rs, checkout, etc.) is completely untouched.

Closes #59

* fix(executor): fix macOS entrypoint hang, path traversal, and silent emulation pass

Three bugs in the Docker action execution path from the previous
commit:

1. The macOS emulation entrypoint override (`bash -l -c`) was applied
*unconditionally*, even when cmd was empty (NativeDocker path). That
means Docker actions running on macOS emu images would get bash with
no argument — which either hangs forever or exits immediately. The
image's real ENTRYPOINT gets discarded either way. This is not great.

Fix: capture `has_cmd` before cmd_vec is moved into the config, only
apply the bash wrapper when there's actually a command to wrap.

2. The `dockerfile_rel` extracted from action.yml's `runs.image` was
not sanitized after stripping the `docker://` prefix. A malicious
action.yml with `docker:///etc/shadow` or `../../sensitive` would
escape the action directory via Path::join's absolute-path behavior
or dotdot traversal.

Fix: strip leading slashes and reject any path containing `..`.

3. Emulation mode returned exit_code 0 for Docker actions it *didn't
actually run*. Users got a green checkmark for actions that were
silently skipped. Confusion ensues.

Fix: return exit_code 1 with a clear stderr message explaining the
action was not executed and needs --runtime docker.

While at it, add tests for all three fixes: NativeDocker variant
construction, dockerfile path sanitization (6 cases), and emulation
empty-cmd failure behavior.

* fix(executor): harden Docker action security and fix docker:// execution path

Three issues found during review, all in the Docker action plumbing:

1. The `is_docker` path in `prepare_action()` was returning
`PreparedAction::Image` instead of `NativeDocker`, which means
`docker://` prefixed actions in `uses:` went straight through the
legacy echo-command path and *never actually executed*. Same class
of bug we just fixed for DockerBuild, hiding in plain sight.

2. The path traversal check for Dockerfile paths used
`contains("..")`, which rejects perfectly legitimate directory
names like `foo..bar/`. Check for `..` as an actual path
*component* instead via `split('/').any(|c| c == "..")`.

3. `build_image_inner` was calling `append_dir_all` on untrusted
action repositories without disabling symlink following. A
malicious action repo could plant a symlink pointing at the host
filesystem and have its contents shipped into the Docker build
context. That's the kind of thing that makes security auditors
lose sleep. Set `follow_symlinks(false)` on the tar builder.

* fix(executor): wire up runs.entrypoint, runs.args, and fix local Docker dispatch

The previous commits got the NativeDocker path working for remote
actions, but left several holes that a code review correctly
identified. Let's fix them all.

First, local Docker actions (uses: ./my-action with a Dockerfile)
were *still* returning PreparedAction::Image instead of NativeDocker.
Same class of bug we just fixed for remote actions, hiding one
function call away. They now go through NativeDocker and parse the
local action.yml for entrypoint/args.

Second, runs.entrypoint and runs.args from action.yml were being
completely ignored. Docker actions that declare their entrypoint in
action.yml (which is, you know, *a lot of them*) would silently
use the wrong entrypoint. Add an entrypoint parameter to the
ContainerRuntime trait and thread it through all four implementations:
Docker sets Config.entrypoint, Podman passes --entrypoint, and the
emulation runtimes accept-and-ignore it.

Third, with.args from workflow steps (uses: docker://alpine with
args: "echo hello") was not being passed as container CMD. It now
overrides runs.args when present, matching GitHub Actions behavior.

While at it:
- Extract sanitize_dockerfile_rel into a real function instead of
  having the tests duplicate the logic and test their own copy.
  Testing a copy of your code instead of the actual code is not
  what I'd call confidence-inspiring.
- Add canonicalize() defense-in-depth after Dockerfile path
  resolution to catch symlink escapes.
- Document the build_image_inner context directory invariant.

* fix(executor): fix broken args parsing, empty dockerfile path, and silent entrypoint drop

Three correctness bugs found during review of the Docker action
execution path:

1. with.args was being split on whitespace like a caveman. An
   argument like "hello world" would turn into two separate args,
   which is *not* how GitHub Actions works. Use shlex::split() for
   proper shell-word parsing, with a whitespace fallback for
   malformed input that shlex chokes on.

2. sanitize_dockerfile_rel() happily accepted empty strings. Feed
   it "" or "docker://" and it would produce an empty path, which
   then joins to a directory instead of a file. The subsequent
   docker build would fail with a confusing error. Let's just
   reject empty paths upfront.

3. SecureEmulationRuntime silently swallowed the entrypoint
   override without telling anyone. If you're running in secure
   emulation mode and your action specifies runs.entrypoint, you
   deserve to know it's being ignored — not left wondering why
   your action isn't doing what you expect.

* fix(executor)!: pass explicit build context to Docker image builds

It turns out that build_image_inner was deriving the Docker build
context from dockerfile.parent(), which is *wrong* when the
Dockerfile lives in a subdirectory of the action root. An action
with runs.image: subdir/Dockerfile would get subdir/ as its build
context instead of the action root, silently breaking every COPY
instruction that references files outside that subdirectory.

The fix is straightforward: add an explicit context_dir parameter
to the ContainerRuntime::build_image trait so callers tell us what
the context is instead of us guessing from the Dockerfile path.
The DockerBuild path in engine.rs now passes &action_dir, and the
Docker inner implementation computes the Dockerfile path relative
to context_dir via strip_prefix instead of just using file_name().

While at it, add a warning log when shlex::split fails to parse
with.args (unmatched quotes). Previously this silently fell back
to naive whitespace splitting, which is the kind of thing that
makes you stare at container logs for an hour wondering why your
quoted argument got split into three pieces.

* fix(executor): reject bad dockerfile paths instead of silently guessing

Three bugs found during review:

The build_image_inner strip_prefix fallback was *silently* using just
the filename when the Dockerfile wasn't a clean descendant of the
context directory. So if something weird happened with the path,
you'd just get the wrong Dockerfile used for the build with zero
indication anything went wrong. That's not a fallback, that's a
footgun. Return an error instead.

sanitize_dockerfile_rel was happily preserving a leading "./" from
the raw path, which then caused strip_prefix to fail (because
"./build/Dockerfile" is not a prefix-match for a joined path).
Strip it early so the downstream path arithmetic actually works.

While at it, extract_docker_runs_config was using filter_map on
runs.args, which means non-string YAML values like integers and
booleans were silently dropped. GitHub Actions coerces those to
strings, so we should too.

* fix(executor): handle string-form args and reject malformed with.args

It turns out that extract_docker_runs_config only handled runs.args
as a YAML sequence. If an action.yml declared args as a plain string
(which GitHub Actions absolutely allows), we'd silently drop the
entire argument. Not great.

While at it, the with.args parser had the opposite problem — when
shlex::split hit an unmatched quote, it shrugged and fell back to
naive whitespace splitting. That's the kind of "graceful degradation"
that produces subtly wrong container invocations and makes you spend
an afternoon wondering why your action is getting the wrong flags.

Fix both: extract_docker_runs_config now handles args as either a
YAML sequence or a string (shell-tokenized via shlex). The with.args
path now returns a hard error on malformed quoting instead of
pretending everything is fine. Added tests for string-form args
including the bad-quoting edge case.

* fix(executor): close sub_path traversal hole and make args parsing consistent

It turns out that sub_path from action references (the part after
owner/repo in owner/repo/some/subdir) was being joined to the clone
directory with absolutely no sanitization. A crafted sub_path like
"../../etc" would escape the cloned repo and get passed as the
Docker *build context*. Please don't do that.

Add sanitize_sub_path() that rejects any path component equal to
"..", and apply it in both the DockerBuild and Composite action
paths. For DockerBuild, also canonicalize the resolved action_dir
and verify it's still inside the repo_dir — because symlinks exist
and trusting user-controlled paths is how we end up on HN.

While at it, fix a behavioral inconsistency in args parsing: bad
quoting in action.yml's runs.args was silently falling back to the
raw string, while the exact same bad quoting in a workflow's
with.args was a hard error. Now both are errors, because silently
doing the wrong thing is worse than loudly refusing.

* fix(executor): harden Docker build context, sanitize inputs, deduplicate mount setup

The PR review flagged several issues ranging from correctness to
performance to plain old code smell. Let's address them all.

It turns out that build_image_inner was happily tarring the *entire*
context directory and shipping it to the Docker daemon, cheerfully
ignoring any .dockerignore file. For large action repos with test
fixtures, docs, and who knows what else, this is not great. When a
.dockerignore exists, we now use the `ignore` crate's WalkBuilder to
walk only non-ignored files. Falls back to the old append_dir_all
when there's no .dockerignore, because we're not breaking anything
that already works.

The sanitize_sub_path and sanitize_dockerfile_rel functions checked
for ".." traversal but not null bytes, which can cause truncation at
OS boundaries and potentially bypass the traversal check. Please
don't do that. Added null byte rejection to both.

extract_docker_runs_config was taking &Option<T> instead of
Option<&T>, which is the Rust equivalent of wearing your shirt
inside out — it works, but everyone who sees it knows something is
wrong. Fixed the signature and all callers.

The with.args empty-string handling was also wrong: `with.args: ""`
was treated as "no override" instead of "pass zero args", which
doesn't match GitHub Actions behavior where the presence of the key
is the signal, not its value.

While at it, extracted the volume/env/mount setup boilerplate that
was copy-pasted across three execution paths into a
StepContainerContext helper. Not because I enjoy moving code around,
but because the same 12 lines in three places is not my idea of
maintainability.

* fix(executor): cap build context size, disable git hooks, add NativeDocker tests

Three security and reliability fixes from the PR review:

The build_image_inner tar buffer was completely unbounded — a
malicious or just absurdly large action repo with no .dockerignore
would happily try to load the entire thing into memory. Now we
track cumulative file sizes and bail at 500 MB. The old
append_dir_all fallback had to go since it gives us no per-file
hook; replaced it with an ignore::WalkBuilder walk (already a dep)
so both paths enforce the same limit.

shallow_clone was happily running git checkout on untrusted repos
without disabling hooks. A cloned repo's .git/hooks/post-checkout
runs automatically, which is the kind of thing that makes security
reviewers lose sleep. Pass -c core.hooksPath=/dev/null to every
git invocation so cloned repos can't execute anything on our host.

While at it, add a MockContainerRuntime and four integration tests
that exercise the NativeDocker execute_step path end-to-end:
entrypoint passthrough, with.args override + INPUT_* injection,
empty args, and step/job env propagation. This path previously had
zero test coverage for the runtime flow.

* fix(executor): deduplicate build context walker, harden sub_path, add missing tests

The build_image_inner code in docker.rs had two near-identical
~50-line walker loops — one for when .dockerignore exists and one
for when it doesn't. The *only* difference was a single
add_custom_ignore_filename() call on the builder. Copy-paste like
that drifts. Let's not.

Merged into a single loop with a conditional on the WalkBuilder
before iteration. Same behavior, half the code.

While at it, sanitize_sub_path now splits on both '/' and '\' so
a Windows-style traversal like "a\..\..\etc" doesn't sneak past
the check. Also expanded the PreparedAction::Image doc comment to
explain which code paths still produce it and why it's distinct
from NativeDocker — future contributors shouldn't have to guess.

Added tests for: unmatched-quote error path in with.args, with.args
overriding runs.args, and backslash path traversal in sub_path.

* fix(executor): close backslash traversal gap and add with.entrypoint override

It turns out that sanitize_dockerfile_rel was only splitting on '/'
to catch ".." traversal, while its sibling sanitize_sub_path was
correctly splitting on both '/' and '\\'. So a crafted Dockerfile
path like "..\\..\\etc\\shadow" would sail right past the
sanitizer.

The canonicalize() defense-in-depth below *would* catch this in
practice, but relying on one security layer to cover a hole in
another is not great. Let's just make them consistent.

While at it, the NativeDocker execution path was missing support
for with.entrypoint — a documented GitHub Actions feature that
lets workflow steps override the Docker container's ENTRYPOINT.
We were already handling with.args but silently ignoring
with.entrypoint, which is the kind of asymmetry that bites you
the moment someone actually tries to use it.

* fix(executor): close composite sub_path symlink hole and filter empty entrypoint

The DockerBuild handler had a proper canonicalize + starts_with
defense-in-depth check after resolving sub_path, but the composite
action handler just blindly trusted sanitize_sub_path() and called
repo_dir.join(p) without verifying the result stayed inside the
cloned repo. A symlink named "legit" pointing to "../../secrets"
would sail right through the string-only sanitizer.

That is not great.

Add the same canonicalize + starts_with check to the composite
action path so both handlers have identical protection.

While at it, filter empty-string entrypoint values to None in both
extract_docker_runs_config and the Docker runtime layer. An empty
runs.entrypoint in action.yml should mean "use the image default",
not "tell Docker to clear the entrypoint" — which is what passing
Some("") actually does. Added tests for both the with.entrypoint
override path and the empty entrypoint filtering.

* fix(executor): filter empty podman entrypoint and extract NativeDocker step handler

The podman runtime was happily passing `--entrypoint ""` to podman
when a workflow set `with.entrypoint: ""`, while Docker correctly
filtered it out via `.filter(|s| !s.is_empty())`. So the two
runtimes silently diverged on empty entrypoint behavior. Not great.

Add the same filter to podman's entrypoint handling so both
runtimes treat empty strings as "use the image default."

While at it, extract the ~90-line NativeDocker match arm from
execute_step into its own `execute_native_docker_step` function.
That match block was getting unwieldy, and this keeps each action
type's execution logic self-contained.

Also drop a TODO on the in-memory tar buffer in build_image_inner —
it holds the entire build context in a Vec<u8>, which gets
uncomfortable as repos approach the 500 MB cap.
2026-04-02 10:28:37 +05:30
bahdotsh
f53a45e25d fix(executor): fix docker digest parsing, token leak in redirects, and missing tests
It turns out that resolve_action was blindly splitting on '@' for
*all* action references, including Docker image refs like
docker://alpine@sha256:abc123. The '@' in a Docker digest is not a
version separator — it's part of the image reference. Splitting it
produces a nonsensical repository and a fake "version" that happens
to be a SHA256 digest. Nobody noticed because the Docker path
doesn't use the version field, but the parsed data was still wrong.

While at it, the auth retry path in fetch_and_parse was constructing
a brand new reqwest::Client on every single 404-then-retry cycle.
That means a fresh TLS handshake each time, which is wasteful when
we already have a perfectly good static client pattern. Promote the
no-redirect client to a static Lazy, same as HTTP_CLIENT.

The auth redirect flow — where we send GITHUB_TOKEN to the origin
but strip it before following a redirect to a CDN — had zero test
coverage. This is the kind of security invariant that *really*
should not depend on code review alone. Add wiremock-based tests
that verify the token does not leak to redirect targets, plus tests
for the basic auth retry and 404 paths. Parameterize fetch_and_parse
with a base_url so wiremock can intercept the requests.
2026-03-28 16:42:36 +05:30
bahdotsh
f2c6097534 fix: resolve action.yml from remote repos to determine correct Docker image
Fixes #48. When encountering third-party GitHub Actions, wrkflw previously
defaulted to node:20-slim for all unknown actions. Now it fetches the
action's action.yml from raw.githubusercontent.com, parses runs.using to
determine the action type (Node/Docker/Composite), and selects the
appropriate Docker image. Falls back to the existing hardcoded mapping on
any failure.
2026-03-28 11:14:10 +05:30
bahdotsh
b1cc74639c version fix 2025-09-05 08:22:15 +05:30
bahdotsh
51a655f07b version fixes 2025-08-28 12:56:05 +05:30
bahdotsh
8005cbb7ee feat: Add .gitignore support for file copying
- Add ignore crate dependency to executor and runtime crates
- Implement gitignore-aware file copying in engine.rs and emulation.rs
- Support for .gitignore patterns, whitelist rules, and default ignore patterns
- Maintain backward compatibility with projects without .gitignore files
- Add proper error handling and debug logging for ignored files

This ensures that files marked in .gitignore are not copied to containers
or emulation workspaces, improving performance and security.
2025-08-21 15:27:00 +05:30
bahdotsh
250a88ba94 feat: implement robust secrets management with multi-provider support, masking, and security features 2025-08-14 23:26:30 +05:30
bahdotsh
960f7486a2 Release 0.7.0
wrkflw@0.7.0
wrkflw-evaluator@0.7.0
wrkflw-executor@0.7.0
wrkflw-github@0.7.0
wrkflw-gitlab@0.7.0
wrkflw-logging@0.7.0
wrkflw-matrix@0.7.0
wrkflw-models@0.7.0
wrkflw-parser@0.7.0
wrkflw-runtime@0.7.0
wrkflw-ui@0.7.0
wrkflw-utils@0.7.0
wrkflw-validators@0.7.0

Generated by cargo-workspaces
2025-08-13 18:07:11 +05:30
bahdotsh
537bf2f9d1 chore: bump version to 0.6.0
- Updated workspace version from 0.5.0 to 0.6.0
- Updated all internal crate dependencies to 0.6.0
- Verified all tests pass and builds succeed
2025-08-09 17:46:09 +05:30
bahdotsh
f0b6633cb8 renamed 2025-08-09 17:03:03 +05:30
bahdotsh
470132c5bf Refactor: Migrate modules to workspace crates
- Extracted functionality from the `src/` directory into individual crates within the `crates/` directory. This improves modularity, organization, and separation of concerns.
- Migrated modules include: models, evaluator, ui, gitlab, utils, logging, github, matrix, executor, runtime, parser, and validators.
- Removed the original source files and directories from `src/` after successful migration.
- This change sets the stage for better code management and potentially independent development/versioning of workspace members.
2025-05-02 12:53:41 +05:30