mirror of
https://github.com/bahdotsh/wrkflw.git
synced 2026-05-18 05:05:35 +02:00
6016887a3b5c71601bdb00331b24186bcd279c55
12 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6016887a3b |
feat(executor): easy GHA emulation fixes for better compatibility (#82)
* feat(executor): add easy GHA emulation fixes for better compatibility
- Expand github.* context with 13 missing env vars (CI, GITHUB_ACTIONS,
GITHUB_REF_NAME, GITHUB_REF_TYPE, GITHUB_REPOSITORY_OWNER, etc.) and
improve GITHUB_ACTOR to use git config / $USER instead of hardcoded value
- Enforce timeout-minutes at both job level (default 360m per GHA spec)
and step level via tokio::time::timeout
- Implement defaults.run.shell and defaults.run.working-directory with
proper fallback chain: step > job defaults > workflow defaults > bash
- Implement hashFiles() expression function with glob matching, sorted
file hashing (SHA-256), and integration into the substitution pipeline
* fix(executor): harden hashFiles, working-directory, and shell -e
Three issues from code review, all in the "we got the GHA emulation
*almost* right" category:
1. hashFiles() was returning an empty string when no files matched.
GHA returns the SHA-256 of empty input (e3b0c44...), not nothing.
An empty string as a cache key component is the kind of thing
that silently ruins your day. Also, unreadable files were being
skipped without a peep — now we at least warn about it.
2. The working-directory default resolution was doing a naive
Path::join with user-controlled input. If someone writes
`working-directory: ../../../etc` or an absolute path, join
happily replaces the base. Inside a container this is *somewhat*
contained, but in emulation mode it's a real path traversal.
Normalize the path and reject anything that escapes the
workspace.
3. The bash -e flag change (correct per GHA spec) was undocumented.
Scripts that relied on intermediate commands failing without
aborting the step will now break. Document it in
BREAKING_CHANGES.md so users aren't left guessing.
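A minimal sketch of the working-directory check described in item 2, assuming a helper named `resolve_working_directory` (the name and error strings are illustrative, not taken from the wrkflw source). It walks the requested path component by component instead of relying on `Path::join`, so absolute paths and `..` sequences that climb above the workspace are rejected:

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolve `requested` beneath `workspace` lexically and
/// refuse anything that would land outside the workspace.
fn resolve_working_directory(workspace: &Path, requested: &str) -> Result<PathBuf, String> {
    let requested = Path::new(requested);
    // `Path::join` replaces the base when handed an absolute path, so refuse early.
    if requested.is_absolute() {
        return Err(format!("absolute working-directory rejected: {}", requested.display()));
    }
    let mut resolved = workspace.to_path_buf();
    // Number of components pushed below the workspace root so far.
    let mut depth: usize = 0;
    for component in requested.components() {
        match component {
            Component::CurDir => {}
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::ParentDir => {
                // Going up is fine only while we are still below the workspace root.
                if depth == 0 {
                    return Err(format!("working-directory escapes workspace: {}", requested.display()));
                }
                resolved.pop();
                depth -= 1;
            }
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("rooted working-directory rejected: {}", requested.display()));
            }
        }
    }
    Ok(resolved)
}
```

This check is purely lexical and does not follow symlinks; a real implementation would likely want a `canonicalize()` pass on top of it.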
* fix(executor): complete the GHA shell invocation and harden hashFiles
The previous commit added `-e` to bash but stopped there, even
though the BREAKING_CHANGES.md *literally documented* the full GHA
invocation as `bash --noprofile --norc -e -o pipefail {0}`. So we
were advertising behavior we weren't actually implementing. This is
not great.
Without `-o pipefail`, piped commands like `false | echo ok` would
silently succeed, which is exactly the kind of divergence that makes
you distrust an emulator. And without `--noprofile --norc`, user
profile scripts can interfere with reproducibility.
While at it, fix hashFiles error handling — it was silently
swallowing read errors and producing a partial hash, which is worse
than failing because you get a *wrong* cache key with no indication
anything went sideways. preprocess_hash_files and
preprocess_expressions now return Result and the engine surfaces
failures as step errors.
Also add the tests that should have been there from the start:
shell invocation flags, working-directory path traversal rejection,
and defaults cascade (step > job > workflow).
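The fallback chain and the bash invocation named in this commit can be sketched as follows. The function names `resolve_shell` and `bash_invocation` are assumptions made for illustration; only the precedence order and the flag list come from the commit message:

```rust
/// Hypothetical helper: pick the shell using the documented precedence
/// step > job defaults > workflow defaults > bash.
fn resolve_shell<'a>(
    step: Option<&'a str>,
    job_default: Option<&'a str>,
    workflow_default: Option<&'a str>,
) -> &'a str {
    step.or(job_default).or(workflow_default).unwrap_or("bash")
}

/// Hypothetical helper: build the argv for the bash invocation quoted in the
/// commit message, with `{0}` replaced by the script path.
fn bash_invocation(script_path: &str) -> Vec<String> {
    ["bash", "--noprofile", "--norc", "-e", "-o", "pipefail", script_path]
        .iter()
        .map(|s| s.to_string())
        .collect()
}
```

With `-o pipefail` in place, a pipeline such as `false | echo ok` reports the failure of `false` instead of the success of `echo`.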
* fix(executor): harden hashFiles, timeout, and shell edge cases
The previous round of GHA emulation fixes left a few holes that
would bite you in production:
hashFiles() would happily glob '../../etc/passwd' and hash whatever
it found outside the workspace. It also loaded entire file contents
into memory before hashing, which is *not great* when someone points
it at a large binary artifact. The glob patterns now reject '..'
traversal, and file contents are streamed into the SHA-256 hasher
via io::copy instead.
timeout-minutes accepted any f64 from YAML, including negative
values, NaN, and infinity — all of which make Duration::from_secs_f64
panic. Non-finite and non-positive values now fall back to the GHA
default of 360 minutes.
Unknown shell values were silently accepted with a '-c' fallback.
Now they emit a warning so you at least *know* something is off.
While at it, replaced the hash_files_read_error_returns_err test
that was testing two Ok paths (despite its name) with proper
path-traversal rejection tests.
* fix(executor): fix shadowed timeout_mins and extract sanitization helper
It turns out the job timeout error path was re-reading the *raw*
timeout_minutes value instead of using the already-sanitized one.
If someone set timeout-minutes to NaN or a negative number, the
sanitization would correctly fall back to 360, but the error
message would happily print "Job exceeded timeout of NaN minutes."
Not great.
Extract sanitize_timeout_minutes() so both the job and step
timeout paths use the same logic instead of duplicating the
is_finite/positive/clamp dance. While at it, add proper tests
for NaN, Infinity, negative, zero, and the max clamp — plus a
test that actually exercises the job-level timeout expiry branch,
which previously had zero coverage.
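A rough sketch of what a shared `sanitize_timeout_minutes()` might look like. The 360-minute default is from the commit message; the signature and the `MAX_TIMEOUT_MINUTES` ceiling are assumptions, since the message mentions a max clamp without giving its value:

```rust
use std::time::Duration;

/// Default from the commit message (GHA job default).
const DEFAULT_TIMEOUT_MINUTES: f64 = 360.0;
/// Assumed ceiling for illustration only; the real clamp value is not stated.
const MAX_TIMEOUT_MINUTES: f64 = 360.0 * 24.0;

/// NaN, infinity, zero, and negative values fall back to the default;
/// everything else is clamped to the ceiling.
fn sanitize_timeout_minutes(raw: Option<f64>) -> f64 {
    match raw {
        Some(minutes) if minutes.is_finite() && minutes > 0.0 => minutes.min(MAX_TIMEOUT_MINUTES),
        _ => DEFAULT_TIMEOUT_MINUTES,
    }
}

/// Safe to call with any YAML-provided value, because the input to
/// `Duration::from_secs_f64` is always finite and positive.
fn timeout_duration(raw: Option<f64>) -> Duration {
    Duration::from_secs_f64(sanitize_timeout_minutes(raw) * 60.0)
}
```

Routing both the timer and the error message through the sanitized value avoids the "NaN minutes" message described above.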
|
||
|
|
fd348a460e |
fix: actually execute Docker-based GitHub Actions instead of emulating them (#74)
* fix(executor): actually execute Docker-based GitHub Actions instead of emulating them
Third-party GitHub Actions that use Docker (like super-linter) were silently passing without ever *actually running*. The engine would resolve the action, pick a Docker image, and then... run `echo 'Would execute GitHub action: ...'` inside it. Every single time. Regardless of runtime mode. Confusion ensues.
It turns out there were two separate failures conspiring here:
1. `prepare_action()` would error out on `ActionType::DockerBuild` with "not yet supported", fall back to `determine_action_image()`, and cheerfully return `node:20-slim` for super-linter. This is not great.
2. The `PreparedAction::Image` execution branch had three sub-paths for is_docker, is_local, and everything else, and *all three* just ran echo commands. The image was resolved correctly and then completely ignored.
The fix has several parts:
- Add a `NativeDocker` variant to `PreparedAction` that means "run this image with its built-in ENTRYPOINT, no command override." Docker registry actions and DockerBuild actions both use this.
- Implement DockerBuild properly: clone the repo, resolve the Dockerfile path from action.yml, build it, return the tag. Uses the existing `shallow_clone` and `runtime.build_image`.
- Fix `build_image_inner` to tar the *full context directory* instead of just the Dockerfile. The old code had `_context_dir` sitting right there, computed and unused. COPY instructions in Dockerfiles need the context, obviously.
- Allow empty `cmd` in `run_container` to mean "use the image's default ENTRYPOINT/CMD". The Docker impl now sets `config.cmd = None` when cmd is empty. Podman already handled this correctly.
The existing `PreparedAction::Image` path with all its special-cased action handling (actions-rs, checkout, etc.) is completely untouched.
Closes #59
* fix(executor): fix macOS entrypoint hang, path traversal, and silent emulation pass
Three bugs in the Docker action execution path from the previous commit:
1. The macOS emulation entrypoint override (`bash -l -c`) was applied *unconditionally*, even when cmd was empty (NativeDocker path). That means Docker actions running on macOS emu images would get bash with no argument, which either hangs forever or exits immediately. The image's real ENTRYPOINT gets discarded either way. This is not great. Fix: capture `has_cmd` before cmd_vec is moved into the config, only apply the bash wrapper when there's actually a command to wrap.
2. The `dockerfile_rel` extracted from action.yml's `runs.image` was not sanitized after stripping the `docker://` prefix. A malicious action.yml with `docker:///etc/shadow` or `../../sensitive` would escape the action directory via Path::join's absolute-path behavior or dotdot traversal. Fix: strip leading slashes and reject any path containing `..`.
3. Emulation mode returned exit_code 0 for Docker actions it *didn't actually run*. Users got a green checkmark for actions that were silently skipped. Confusion ensues. Fix: return exit_code 1 with a clear stderr message explaining the action was not executed and needs --runtime docker.
While at it, add tests for all three fixes: NativeDocker variant construction, dockerfile path sanitization (6 cases), and emulation empty-cmd failure behavior.
* fix(executor): harden Docker action security and fix docker:// execution path
Three issues found during review, all in the Docker action plumbing:
1. The `is_docker` path in `prepare_action()` was returning `PreparedAction::Image` instead of `NativeDocker`, which means `docker://` prefixed actions in `uses:` went straight through the legacy echo-command path and *never actually executed*. Same class of bug we just fixed for DockerBuild, hiding in plain sight.
2. The path traversal check for Dockerfile paths used `contains("..")`, which rejects perfectly legitimate directory names like `foo..bar/`. Check for `..` as an actual path *component* instead via `split('/').any(|c| c == "..")`.
3. `build_image_inner` was calling `append_dir_all` on untrusted action repositories without disabling symlink following. A malicious action repo could plant a symlink pointing at the host filesystem and have its contents shipped into the Docker build context. That's the kind of thing that makes security auditors lose sleep. Set `follow_symlinks(false)` on the tar builder.
* fix(executor): wire up runs.entrypoint, runs.args, and fix local Docker dispatch
The previous commits got the NativeDocker path working for remote actions, but left several holes that a code review correctly identified. Let's fix them all.
First, local Docker actions (uses: ./my-action with a Dockerfile) were *still* returning PreparedAction::Image instead of NativeDocker. Same class of bug we just fixed for remote actions, hiding one function call away. They now go through NativeDocker and parse the local action.yml for entrypoint/args.
Second, runs.entrypoint and runs.args from action.yml were being completely ignored. Docker actions that declare their entrypoint in action.yml (which is, you know, *a lot of them*) would silently use the wrong entrypoint. Add an entrypoint parameter to the ContainerRuntime trait and thread it through all four implementations: Docker sets Config.entrypoint, Podman passes --entrypoint, and the emulation runtimes accept-and-ignore it.
Third, with.args from workflow steps (uses: docker://alpine with args: "echo hello") was not being passed as container CMD. It now overrides runs.args when present, matching GitHub Actions behavior.
While at it:
- Extract sanitize_dockerfile_rel into a real function instead of having the tests duplicate the logic and test their own copy. Testing a copy of your code instead of the actual code is not what I'd call confidence-inspiring.
- Add canonicalize() defense-in-depth after Dockerfile path resolution to catch symlink escapes.
- Document the build_image_inner context directory invariant.
* fix(executor): fix broken args parsing, empty dockerfile path, and silent entrypoint drop
Three correctness bugs found during review of the Docker action execution path:
1. with.args was being split on whitespace like a caveman. An argument like "hello world" would turn into two separate args, which is *not* how GitHub Actions works. Use shlex::split() for proper shell-word parsing, with a whitespace fallback for malformed input that shlex chokes on.
2. sanitize_dockerfile_rel() happily accepted empty strings. Feed it "" or "docker://" and it would produce an empty path, which then joins to a directory instead of a file. The subsequent docker build would fail with a confusing error. Let's just reject empty paths upfront.
3. SecureEmulationRuntime silently swallowed the entrypoint override without telling anyone. If you're running in secure emulation mode and your action specifies runs.entrypoint, you deserve to know it's being ignored, not left wondering why your action isn't doing what you expect.
* fix(executor)!: pass explicit build context to Docker image builds
It turns out that build_image_inner was deriving the Docker build context from dockerfile.parent(), which is *wrong* when the Dockerfile lives in a subdirectory of the action root. An action with runs.image: subdir/Dockerfile would get subdir/ as its build context instead of the action root, silently breaking every COPY instruction that references files outside that subdirectory.
The fix is straightforward: add an explicit context_dir parameter to the ContainerRuntime::build_image trait so callers tell us what the context is instead of us guessing from the Dockerfile path. The DockerBuild path in engine.rs now passes &action_dir, and the Docker inner implementation computes the Dockerfile path relative to context_dir via strip_prefix instead of just using file_name().
While at it, add a warning log when shlex::split fails to parse with.args (unmatched quotes). Previously this silently fell back to naive whitespace splitting, which is the kind of thing that makes you stare at container logs for an hour wondering why your quoted argument got split into three pieces.
* fix(executor): reject bad dockerfile paths instead of silently guessing
Three bugs found during review:
The build_image_inner strip_prefix fallback was *silently* using just the filename when the Dockerfile wasn't a clean descendant of the context directory. So if something weird happened with the path, you'd just get the wrong Dockerfile used for the build with zero indication anything went wrong. That's not a fallback, that's a footgun. Return an error instead.
sanitize_dockerfile_rel was happily preserving a leading "./" from the raw path, which then caused strip_prefix to fail (because "./build/Dockerfile" is not a prefix-match for a joined path). Strip it early so the downstream path arithmetic actually works.
While at it, extract_docker_runs_config was using filter_map on runs.args, which means non-string YAML values like integers and booleans were silently dropped. GitHub Actions coerces those to strings, so we should too.
* fix(executor): handle string-form args and reject malformed with.args
It turns out that extract_docker_runs_config only handled runs.args as a YAML sequence. If an action.yml declared args as a plain string (which GitHub Actions absolutely allows), we'd silently drop the entire argument. Not great.
While at it, the with.args parser had the opposite problem: when shlex::split hit an unmatched quote, it shrugged and fell back to naive whitespace splitting. That's the kind of "graceful degradation" that produces subtly wrong container invocations and makes you spend an afternoon wondering why your action is getting the wrong flags.
Fix both: extract_docker_runs_config now handles args as either a YAML sequence or a string (shell-tokenized via shlex). The with.args path now returns a hard error on malformed quoting instead of pretending everything is fine.
Added tests for string-form args including the bad-quoting edge case.
* fix(executor): close sub_path traversal hole and make args parsing consistent
It turns out that sub_path from action references (the part after owner/repo in owner/repo/some/subdir) was being joined to the clone directory with absolutely no sanitization. A crafted sub_path like "../../etc" would escape the cloned repo and get passed as the Docker *build context*. Please don't do that.
Add sanitize_sub_path() that rejects any path component equal to "..", and apply it in both the DockerBuild and Composite action paths. For DockerBuild, also canonicalize the resolved action_dir and verify it's still inside the repo_dir, because symlinks exist and trusting user-controlled paths is how we end up on HN.
While at it, fix a behavioral inconsistency in args parsing: bad quoting in action.yml's runs.args was silently falling back to the raw string, while the exact same bad quoting in a workflow's with.args was a hard error. Now both are errors, because silently doing the wrong thing is worse than loudly refusing.
* fix(executor): harden Docker build context, sanitize inputs, deduplicate mount setup
The PR review flagged several issues ranging from correctness to performance to plain old code smell. Let's address them all.
It turns out that build_image_inner was happily tarring the *entire* context directory and shipping it to the Docker daemon, cheerfully ignoring any .dockerignore file. For large action repos with test fixtures, docs, and who knows what else, this is not great. When a .dockerignore exists, we now use the `ignore` crate's WalkBuilder to walk only non-ignored files. Falls back to the old append_dir_all when there's no .dockerignore, because we're not breaking anything that already works.
The sanitize_sub_path and sanitize_dockerfile_rel functions checked for ".." traversal but not null bytes, which can cause truncation at OS boundaries and potentially bypass the traversal check. Please don't do that. Added null byte rejection to both.
extract_docker_runs_config was taking &Option<T> instead of Option<&T>, which is the Rust equivalent of wearing your shirt inside out: it works, but everyone who sees it knows something is wrong. Fixed the signature and all callers.
The with.args empty-string handling was also wrong: `with.args: ""` was treated as "no override" instead of "pass zero args", which doesn't match GitHub Actions behavior where the presence of the key is the signal, not its value.
While at it, extracted the volume/env/mount setup boilerplate that was copy-pasted across three execution paths into a StepContainerContext helper. Not because I enjoy moving code around, but because the same 12 lines in three places is not my idea of maintainability.
* fix(executor): cap build context size, disable git hooks, add NativeDocker tests
Three security and reliability fixes from the PR review:
The build_image_inner tar buffer was completely unbounded: a malicious or just absurdly large action repo with no .dockerignore would happily try to load the entire thing into memory. Now we track cumulative file sizes and bail at 500 MB. The old append_dir_all fallback had to go since it gives us no per-file hook; replaced it with an ignore::WalkBuilder walk (already a dep) so both paths enforce the same limit.
shallow_clone was happily running git checkout on untrusted repos without disabling hooks. A cloned repo's .git/hooks/post-checkout runs automatically, which is the kind of thing that makes security reviewers lose sleep. Pass -c core.hooksPath=/dev/null to every git invocation so cloned repos can't execute anything on our host.
While at it, add a MockContainerRuntime and four integration tests that exercise the NativeDocker execute_step path end-to-end: entrypoint passthrough, with.args override + INPUT_* injection, empty args, and step/job env propagation. This path previously had zero test coverage for the runtime flow.
* fix(executor): deduplicate build context walker, harden sub_path, add missing tests
The build_image_inner code in docker.rs had two near-identical ~50-line walker loops, one for when .dockerignore exists and one for when it doesn't. The *only* difference was a single add_custom_ignore_filename() call on the builder. Copy-paste like that drifts. Let's not. Merged into a single loop with a conditional on the WalkBuilder before iteration. Same behavior, half the code.
While at it, sanitize_sub_path now splits on both '/' and '\' so a Windows-style traversal like "a\..\..\etc" doesn't sneak past the check. Also expanded the PreparedAction::Image doc comment to explain which code paths still produce it and why it's distinct from NativeDocker; future contributors shouldn't have to guess.
Added tests for: unmatched-quote error path in with.args, with.args overriding runs.args, and backslash path traversal in sub_path.
* fix(executor): close backslash traversal gap and add with.entrypoint override
It turns out that sanitize_dockerfile_rel was only splitting on '/' to catch ".." traversal, while its sibling sanitize_sub_path was correctly splitting on both '/' and '\\'. So a crafted Dockerfile path like "..\\..\\etc\\shadow" would sail right past the sanitizer. The canonicalize() defense-in-depth below *would* catch this in practice, but relying on one security layer to cover a hole in another is not great. Let's just make them consistent.
While at it, the NativeDocker execution path was missing support for with.entrypoint, a documented GitHub Actions feature that lets workflow steps override the Docker container's ENTRYPOINT. We were already handling with.args but silently ignoring with.entrypoint, which is the kind of asymmetry that bites you the moment someone actually tries to use it.
* fix(executor): close composite sub_path symlink hole and filter empty entrypoint
The DockerBuild handler had a proper canonicalize + starts_with defense-in-depth check after resolving sub_path, but the composite action handler just blindly trusted sanitize_sub_path() and called repo_dir.join(p) without verifying the result stayed inside the cloned repo. A symlink named "legit" pointing to "../../secrets" would sail right through the string-only sanitizer. That is not great.
Add the same canonicalize + starts_with check to the composite action path so both handlers have identical protection.
While at it, filter empty-string entrypoint values to None in both extract_docker_runs_config and the Docker runtime layer. An empty runs.entrypoint in action.yml should mean "use the image default", not "tell Docker to clear the entrypoint", which is what passing Some("") actually does.
Added tests for both the with.entrypoint override path and the empty entrypoint filtering.
* fix(executor): filter empty podman entrypoint and extract NativeDocker step handler
The podman runtime was happily passing `--entrypoint ""` to podman when a workflow set `with.entrypoint: ""`, while Docker correctly filtered it out via `.filter(|s| !s.is_empty())`. So the two runtimes silently diverged on empty entrypoint behavior. Not great.
Add the same filter to podman's entrypoint handling so both runtimes treat empty strings as "use the image default."
While at it, extract the ~90-line NativeDocker match arm from execute_step into its own `execute_native_docker_step` function. That match block was getting unwieldy, and this keeps each action type's execution logic self-contained.
Also drop a TODO on the in-memory tar buffer in build_image_inner: it holds the entire build context in a Vec<u8>, which gets uncomfortable as repos approach the 500 MB cap. |
||
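The Dockerfile path rules that accumulate across the commits above (strip the `docker://` prefix, strip leading slashes and `./`, reject empty paths, null bytes, and `..` components split on both `/` and `\`) can be sketched roughly as below. The function name matches the one mentioned in the messages, but the signature, the order of checks, and the error strings are assumptions; the real code also adds a `canonicalize()` check that this sketch omits:

```rust
/// Illustrative sketch only: not the actual wrkflw implementation.
fn sanitize_dockerfile_rel(raw: &str) -> Result<String, String> {
    // Null bytes can truncate paths at OS boundaries.
    if raw.contains('\0') {
        return Err("dockerfile path contains a null byte".to_string());
    }
    let without_scheme = raw.strip_prefix("docker://").unwrap_or(raw);
    // Leading slashes would make `Path::join` replace the base directory.
    let mut rel = without_scheme.trim_start_matches('/');
    // A leading "./" breaks later strip_prefix arithmetic, so drop it early.
    while let Some(rest) = rel.strip_prefix("./") {
        rel = rest.trim_start_matches('/');
    }
    if rel.is_empty() {
        return Err("dockerfile path is empty".to_string());
    }
    // Compare whole components so names like `foo..bar` stay legal.
    if rel.split(|c: char| c == '/' || c == '\\').any(|part| part == "..") {
        return Err(format!("dockerfile path escapes the action directory: {rel}"));
    }
    Ok(rel.to_string())
}
```

Checking components rather than using `contains("..")` is what keeps a directory such as `foo..bar/` valid while still rejecting `..\..\etc\shadow`.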
|
|
f53a45e25d |
fix(executor): fix docker digest parsing, token leak in redirects, and missing tests
It turns out that resolve_action was blindly splitting on '@' for *all* action references, including Docker image refs like docker://alpine@sha256:abc123. The '@' in a Docker digest is not a version separator; it's part of the image reference. Splitting it produces a nonsensical repository and a fake "version" that happens to be a SHA256 digest. Nobody noticed because the Docker path doesn't use the version field, but the parsed data was still wrong.
While at it, the auth retry path in fetch_and_parse was constructing a brand new reqwest::Client on every single 404-then-retry cycle. That means a fresh TLS handshake each time, which is wasteful when we already have a perfectly good static client pattern. Promote the no-redirect client to a static Lazy, same as HTTP_CLIENT.
The auth redirect flow (where we send GITHUB_TOKEN to the origin but strip it before following a redirect to a CDN) had zero test coverage. This is the kind of security invariant that *really* should not depend on code review alone. Add wiremock-based tests that verify the token does not leak to redirect targets, plus tests for the basic auth retry and 404 paths. Parameterize fetch_and_parse with a base_url so wiremock can intercept the requests. |
||
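To illustrate the digest issue, here is a small sketch of parsing a `uses:` value so that the `@` inside a Docker image reference is left alone. The `ActionRef` struct and `parse_action_ref` function are hypothetical names for this example, not the types used by `resolve_action`:

```rust
/// Hypothetical parsed form of a `uses:` value.
#[derive(Debug, PartialEq)]
struct ActionRef {
    repository: String,
    version: Option<String>,
    is_docker: bool,
}

/// Split `owner/repo@ref` on '@', but keep Docker image references whole,
/// since `image@sha256:...` uses '@' as part of the image reference.
fn parse_action_ref(uses: &str) -> ActionRef {
    if let Some(image) = uses.strip_prefix("docker://") {
        return ActionRef {
            repository: image.to_string(),
            version: None,
            is_docker: true,
        };
    }
    match uses.split_once('@') {
        Some((repository, version)) => ActionRef {
            repository: repository.to_string(),
            version: Some(version.to_string()),
            is_docker: false,
        },
        None => ActionRef {
            repository: uses.to_string(),
            version: None,
            is_docker: false,
        },
    }
}
```

For `docker://alpine@sha256:abc123` this yields the full `alpine@sha256:abc123` as the image and no version, rather than `alpine` plus a digest masquerading as a version.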
|
|
f2c6097534 |
fix: resolve action.yml from remote repos to determine correct Docker image
Fixes #48. When encountering third-party GitHub Actions, wrkflw previously defaulted to node:20-slim for all unknown actions. Now it fetches the action's action.yml from raw.githubusercontent.com, parses runs.using to determine the action type (Node/Docker/Composite), and selects the appropriate Docker image. Falls back to the existing hardcoded mapping on any failure. |
||
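As a rough illustration of the selection logic this commit describes, the sketch below classifies a `runs.using` value and picks an image, falling back to `node:20-slim` (the previous default named in the message). The `ActionKind` enum, both function names, and the exact mapping are assumptions; the real code may choose images differently:

```rust
/// Fallback image named in the commit message.
const FALLBACK_IMAGE: &str = "node:20-slim";

/// Hypothetical classification of `runs.using`.
#[derive(Debug, PartialEq)]
enum ActionKind {
    Node,
    Docker,
    Composite,
    Unknown,
}

fn classify_runs_using(using: &str) -> ActionKind {
    match using.trim() {
        "docker" => ActionKind::Docker,
        "composite" => ActionKind::Composite,
        other if other.starts_with("node") => ActionKind::Node,
        _ => ActionKind::Unknown,
    }
}

/// Assumed mapping: a Docker action with a `docker://` image runs that image;
/// every other case (including fetch or parse failure, modelled as `None`)
/// falls back to the hardcoded default.
fn select_image(using: Option<&str>, runs_image: Option<&str>) -> String {
    match (using.map(classify_runs_using), runs_image) {
        (Some(ActionKind::Docker), Some(image)) => match image.strip_prefix("docker://") {
            Some(registry_image) if !registry_image.is_empty() => registry_image.to_string(),
            // A Dockerfile path would need a build step, which this sketch does not model.
            _ => FALLBACK_IMAGE.to_string(),
        },
        _ => FALLBACK_IMAGE.to_string(),
    }
}
```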
|
|
b1cc74639c | version fix | ||
|
|
51a655f07b | version fixes | ||
|
|
8005cbb7ee |
feat: Add .gitignore support for file copying
- Add ignore crate dependency to executor and runtime crates
- Implement gitignore-aware file copying in engine.rs and emulation.rs
- Support for .gitignore patterns, whitelist rules, and default ignore patterns
- Maintain backward compatibility with projects without .gitignore files
- Add proper error handling and debug logging for ignored files
This ensures that files marked in .gitignore are not copied to containers or emulation workspaces, improving performance and security. |
||
|
|
250a88ba94 | feat: implement robust secrets management with multi-provider support, masking, and security features | ||
|
|
960f7486a2 |
Release 0.7.0
wrkflw@0.7.0 wrkflw-evaluator@0.7.0 wrkflw-executor@0.7.0 wrkflw-github@0.7.0 wrkflw-gitlab@0.7.0 wrkflw-logging@0.7.0 wrkflw-matrix@0.7.0 wrkflw-models@0.7.0 wrkflw-parser@0.7.0 wrkflw-runtime@0.7.0 wrkflw-ui@0.7.0 wrkflw-utils@0.7.0 wrkflw-validators@0.7.0 Generated by cargo-workspaces |
||
|
|
537bf2f9d1 |
chore: bump version to 0.6.0
- Updated workspace version from 0.5.0 to 0.6.0
- Updated all internal crate dependencies to 0.6.0
- Verified all tests pass and builds succeed |
||
|
|
f0b6633cb8 | renamed | ||
|
|
470132c5bf |
Refactor: Migrate modules to workspace crates
- Extracted functionality from the `src/` directory into individual crates within the `crates/` directory. This improves modularity, organization, and separation of concerns.
- Migrated modules include: models, evaluator, ui, gitlab, utils, logging, github, matrix, executor, runtime, parser, and validators.
- Removed the original source files and directories from `src/` after successful migration.
- This change sets the stage for better code management and potentially independent development/versioning of workspace members. |