Commit Graph

3 Commits

Author SHA1 Message Date
Gokul
6016887a3b feat(executor): easy GHA emulation fixes for better compatibility (#82)
* feat(executor): add easy GHA emulation fixes for better compatibility

- Expand github.* context with 13 missing env vars (CI, GITHUB_ACTIONS,
  GITHUB_REF_NAME, GITHUB_REF_TYPE, GITHUB_REPOSITORY_OWNER, etc.) and
  improve GITHUB_ACTOR to use git config / $USER instead of hardcoded value
- Enforce timeout-minutes at both job level (default 360m per GHA spec)
  and step level via tokio::time::timeout
- Implement defaults.run.shell and defaults.run.working-directory with
  proper fallback chain: step > job defaults > workflow defaults > bash
- Implement hashFiles() expression function with glob matching, sorted
  file hashing (SHA-256), and integration into the substitution pipeline

* fix(executor): harden hashFiles, working-directory, and shell -e

Three issues from code review, all in the "we got the GHA emulation
*almost* right" category:

1. hashFiles() was returning an empty string when no files matched.
   GHA returns the SHA-256 of empty input (e3b0c44...), not nothing.
   An empty string as a cache key component is the kind of thing
   that silently ruins your day. Also, unreadable files were being
   skipped without a peep — now we at least warn about it.

2. The working-directory default resolution was doing a naive
   Path::join with user-controlled input. If someone writes
   `working-directory: ../../../etc` or an absolute path, join
   happily replaces the base. Inside a container this is *somewhat*
   contained, but in emulation mode it's a real path traversal.
   Normalize the path and reject anything that escapes the
   workspace.

3. The bash -e flag change (correct per GHA spec) was undocumented.
   Scripts that relied on intermediate commands failing without
   aborting the step will now break. Document it in
   BREAKING_CHANGES.md so users aren't left guessing.

* fix(executor): complete the GHA shell invocation and harden hashFiles

The previous commit added `-e` to bash but stopped there, even
though the BREAKING_CHANGES.md *literally documented* the full GHA
invocation as `bash --noprofile --norc -e -o pipefail {0}`. So we
were advertising behavior we weren't actually implementing. This is
not great.

Without `-o pipefail`, piped commands like `false | echo ok` would
silently succeed, which is exactly the kind of divergence that makes
you distrust an emulator. And without `--noprofile --norc`, user
profile scripts can interfere with reproducibility.

While at it, fix hashFiles error handling — it was silently
swallowing read errors and producing a partial hash, which is worse
than failing because you get a *wrong* cache key with no indication
anything went sideways. preprocess_hash_files and
preprocess_expressions now return Result and the engine surfaces
failures as step errors.

Also add the tests that should have been there from the start:
shell invocation flags, working-directory path traversal rejection,
and defaults cascade (step > job > workflow).

* fix(executor): harden hashFiles, timeout, and shell edge cases

The previous round of GHA emulation fixes left a few holes that
would bite you in production:

hashFiles() would happily glob '../../etc/passwd' and hash whatever
it found outside the workspace. It also loaded entire file contents
into memory before hashing, which is *not great* when someone points
it at a large binary artifact. The glob patterns now reject '..'
traversal, and file contents are streamed into the SHA-256 hasher
via io::copy instead.

timeout-minutes accepted any f64 from YAML, including negative
values, NaN, and infinity — all of which make Duration::from_secs_f64
panic. Non-finite and non-positive values now fall back to the GHA
default of 360 minutes.

Unknown shell values were silently accepted with a '-c' fallback.
Now they emit a warning so you at least *know* something is off.

While at it, replaced the hash_files_read_error_returns_err test
that was testing two Ok paths (despite its name) with proper
path-traversal rejection tests.

* fix(executor): fix shadowed timeout_mins and extract sanitization helper

It turns out the job timeout error path was re-reading the *raw*
timeout_minutes value instead of using the already-sanitized one.
If someone set timeout-minutes to NaN or a negative number, the
sanitization would correctly fall back to 360, but the error
message would happily print "Job exceeded timeout of NaN minutes."

Not great.

Extract sanitize_timeout_minutes() so both the job and step
timeout paths use the same logic instead of duplicating the
is_finite/positive/clamp dance. While at it, add proper tests
for NaN, Infinity, negative, zero, and the max clamp — plus a
test that actually exercises the job-level timeout expiry branch,
which previously had zero coverage.
2026-04-02 18:00:41 +05:30
Gokul
8a8d7e5eec fix: resolve correctness, security, and parsing bugs across codebase (#73)
* fix: resolve 10 bugs found during full codebase review

- Fix memory leak from Box::leak() in status bar render loop
- Fix AES-GCM nonce reuse vulnerability in encrypted secret storage
- Fix Default impl for EncryptedSecretStore that discarded encryption key
- Fix early return in list command that prevented GitLab pipeline listing
- Fix double validation call in validate_github_workflow
- Wire --show-action-messages CLI flag through ExecutionConfig
- Add serde(rename = "if") to GitLab Rule if_ field for correct deserialization
- Fix potential panic on multibyte paths in workflow tab path shortening
- Include involved job names in circular dependency error messages
- Improve cron syntax validation to check value ranges, steps, and expressions

* fix: address PR review feedback

- Remove dead `_nonce` parameter from `EncryptedSecretStore::from_data`
- Add clarifying comment for inverted show/hide action messages mapping
- Add comprehensive cron validation tests (valid expressions, out-of-range
  values, wrong part count, invalid steps, invalid ranges, edge cases)

* fix: resolve 27 bugs found during full codebase verification

A thorough manual verification of every feature uncovered a
*remarkable* collection of bugs hiding in plain sight. The
highlights:

The `strategy.matrix` YAML structure was never parsed. The Job
struct had `matrix` at the top level, but GitHub Actions nests it
under `strategy.matrix`. Serde silently ignored the `strategy`
key, so matrix expansion code existed but could never run. For
absolutely no reason. Introduce a proper `Strategy` struct and
wire it through the executor.

The Step struct was missing `if`, `id`, `working-directory`,
`shell`, and `timeout-minutes` fields. Step-level conditionals
were silently dropped — every step always ran regardless of its
`if` condition. While at it, `continue-on-error` was in the
struct but had no serde rename and was never checked during
execution. Fix all of that.

The validator cheerfully reported cyclic `needs` dependencies as
"Valid". Add DFS cycle detection so `A -> B -> C -> A` is caught
at validation time instead of blowing up at execution time.

Five of eight GitLab CI test fixtures failed to parse because the
model was too rigid: `extends` only accepted arrays (not strings),
`variables` rejected integers, `cache.key` rejected structured
formats, and `script` rejected single strings. Add custom
deserializers following the existing codebase pattern.

The GitHub trigger function leaked the auth token via curl process
arguments visible in `/proc/[pid]/cmdline`. Replace with reqwest,
matching the pattern already used elsewhere. Also add symlink and
path traversal protections in the executor.

Other fixes: hardcoded matrix variable stripping replaced with
proper substitution, `show_action_messages` wired through TUI,
dead `if true {}` removed, default branch detection uses remote
HEAD instead of current branch, cron validator accepts named
days/months, reusable workflow ref validation loosened from OR
to AND, matrix include entries merge into all matching combos.

* fix: harden step-level evaluation, volume checks, and add tests

The PR review turned up a few things that needed fixing before this
was actually ready.

The step-level `if` condition evaluator was silently reusing the
job-level `evaluate_job_condition` function, which knows nothing
about step-scoped expressions like `steps.<id>.outcome`, `success()`,
`failure()`, `always()`, or `cancelled()`. These would fall through
to the generic "unknown condition" path without so much as a warning.
Now they're detected early, a warning is logged, and they default to
true — which is at least *honest* about the limitation.

The volume path traversal check (`..`) was applied to the entire
volume spec string, meaning a perfectly legitimate container path
like `/safe/host:/container/..weird` would get rejected. The check
now only inspects the host path component after splitting on `:`,
which is the part that actually matters for traversal attacks.

While at it, renamed the awkwardly-named `step_name_for_skip` to
just `step_name` in `execute_matrix_job` for consistency with
`execute_job`, and added a BREAKING_CHANGES.md documenting the
EncryptedSecretStore serialization format change.

Added 19 new tests covering matrix include/exclude merge semantics,
step condition evaluation for unsupported expressions, volume path
traversal edge cases, and continue-on-error + step-level if parsing.

* fix: correct condition defaults, path traversal check, and null variable handling

The previous commit defaulted *all* unsupported step-level
condition functions (failure(), cancelled(), always(), success())
to true. It turns out that defaulting failure() and cancelled()
to true is semantically wrong — it means steps guarded by
`if: failure()` will *always* run, even when nothing failed.
That's not a feature, that's a bug.

Default each function to its most likely state: always() and
success() return true, failure() and cancelled() return false.
Not perfect (we still can't track actual step outcomes), but at
least we're not silently running cleanup steps on every build.

The path traversal check was using `contains("..")` which is a
substring match. A directory literally named `..hidden` would
get rejected. Use Path::components() to detect actual ParentDir
components instead of playing string matching games.

While at it, fix deserialize_variables in the GitLab models to
handle YAML null values as empty strings instead of producing
"~\n". Also trim the catch-all serialization output.

* fix: correct cycle detection, condition evaluation, and matrix continue-on-error

The DFS cycle detector in `dfs_detect_cycle` had a genuinely nasty bug:
when a cycle was found, it returned early *without popping itself from
rec_stack*. This left stale entries that corrupted the stack for
subsequent DFS traversals. Net result: cross-edges to already-visited
nodes would be falsely reported as cycles. A→B→A is a cycle, but
D→E→A is just a cross-edge. The old code couldn't tell the difference.

Fix this properly by introducing a separate `in_stack` HashSet for O(1)
membership checks, while keeping the Vec for path reconstruction. Both
are now correctly cleaned up — no early returns skip the cleanup.

While at it, `execute_matrix_job` was silently ignoring `continue-on-error`
on the Err branch. The non-matrix `execute_job` handled it correctly,
but the matrix path would just abort the entire job. Copy-paste bugs
are fun like that. Let's fix that.

The `evaluate_job_condition` status function handling was doing sequential
`contains()` checks with early returns, which meant compound expressions
like `failure() || success()` would match `failure()` first and return
false. Now we scan for all status functions in one pass and pick the
most permissive default when positive functions are present.

Also: `convert_yaml_to_step` was hardcoding `None` for `if_condition`,
`id`, `working_directory`, `shell`, and `timeout_minutes` despite the
YAML potentially having them. And `is_valid_cron_atom` was rejecting
valid POSIX cron syntax like `5/2`.

* refactor(executor): extract step guards into shared helper, fix steps.* default

The step-level if-condition check and continue-on-error handling was
copy-pasted between execute_job and execute_matrix_job with subtly
different control flow — one sets job_success=false and breaks, the
other returns Ok(JobResult{Failure}) immediately. Two copies of the
same logic that *already* disagree is not redundancy, it's a bug
waiting to happen. Let's fix that.

Extract run_step_with_guards() that encapsulates the if-condition
evaluation, execute_step call, and continue-on-error wrapping into
a single StepOutcome enum. Both job execution paths now call this
shared helper.

While at it, fix the condition evaluator defaulting bare steps.*
references to true — "steps.build.outcome == 'failure'" should
*not* optimistically run the step. Now only always() and success()
default to true; everything else (bare step refs, failure(),
cancelled()) conservatively defaults to false.

Also add serde alias "matrix" on Job.strategy so old workflows with
flat matrix: at job level still parse, and document the intentional
or_insert_with in matrix include merging per GitHub Actions spec.

* fix: clean up review findings in step guards, secret store, and test fixture

The PR review flagged three issues worth fixing before merge.

First, run_step_with_guards had a bogus StepStatus::Skipped check
in the abort_job logic. The condition tested for Failure *or*
Skipped, then only actually aborted on Failure — meaning the
Skipped branch did nothing except confuse anyone reading the code.
Simplify to just check Failure directly.

Second, EncryptedSecretStore::from_json would silently fail with a
generic serde error when fed the old serialization format (which
had a shared top-level nonce field). Now it detects the old format
by checking for the "nonce" key and returns a clear error pointing
at BREAKING_CHANGES.md. Added a test for this.

Third, tests/workflows/continue-on-error-test.yml was an orphan
fixture — nothing referenced it. The same content is already
tested inline by parse_continue_on_error_workflow in the parser.
Removed it.

* fix: correct cron day-of-week range, steps. false positive, and Step boilerplate

Three issues from PR review, all straightforward:

The cron validator was rejecting day-of-week value 7, which is a
perfectly valid Sunday alias in both POSIX cron and GitHub Actions.
The max was 6 when it should be 7. The named-value resolver guard
also needed updating from `max == 6` to `max >= 6` so named days
still resolve correctly with the wider range.

The `evaluate_job_condition` heuristic for detecting `steps.*`
references was using a bare `contains("steps.")`, which means an
env var like `env.MY_STEPS_COUNT` would falsely trigger it and
short-circuit to false. Now we check that the character before
"steps." is either start-of-string or non-alphanumeric. Not a
full expression parser, but it stops the obvious false positives.

While at it, add a `Step::with_run` constructor so the GitLab
converter doesn't need three identical 12-field struct literals
that silently break every time someone adds a field to Step.

* fix: harden steps. boundary check, document condition semantics, dedup cycles

The steps. word-boundary heuristic in evaluate_job_condition was
checking for alphanumeric characters before "steps." to avoid false
positives on env vars like "env.MY_STEPS_COUNT". It turns out that
underscore is *not* alphanumeric, so "env._STEPS_CHECK" would
incorrectly trigger the step-reference path and return false.

While at it, the always() && failure() compound expression returning
true got a proper comment explaining *why* that's intentional — we
lack step-status context locally, so we'd rather over-run than
silently skip steps. Not ideal, but honest.

The DFS cycle detector in detect_cyclic_needs could report the same
cycle multiple times depending on HashMap iteration order. Normalize
cycles by rotating the node list to start at the lexicographically
smallest node, then deduplicate via a HashSet. Same cycle from
different entry points now gets reported exactly once.

* fix: squash review nits — double parse, clippy warnings, lost flag

Three leftover issues from the codebase review PR:

The from_json() deserialization was parsing the JSON *twice* — once
into serde_json::Value to sniff for the old nonce field, then again
from the raw string into the actual struct. Parse once, use
from_value() on the already-parsed Value. Not rocket science.

The cycle detector had two clippy warnings: .iter().cloned().collect()
on a slice (just use .to_vec(), please) and .min_by_key() cloning a
double reference instead of comparing properly. Switch to .min_by()
with an explicit cmp.

The show_action_messages flag was being silently dropped in
execute_workflow_cli — hardcoded to false regardless of what the user
asked for. Propagate it through the function signature and the TUI
fallback path so it actually does something.
2026-04-01 23:06:48 +05:30
bahdotsh
470132c5bf Refactor: Migrate modules to workspace crates
- Extracted functionality from the `src/` directory into individual crates within the `crates/` directory. This improves modularity, organization, and separation of concerns.
- Migrated modules include: models, evaluator, ui, gitlab, utils, logging, github, matrix, executor, runtime, parser, and validators.
- Removed the original source files and directories from `src/` after successful migration.
- This change sets the stage for better code management and potentially independent development/versioning of workspace members.
2025-05-02 12:53:41 +05:30