fix: artifact actions under --runtime emulation (#93)

* test(executor): cover run-step → upload/download-artifact handoff under emulation

Drive a real EmulationRuntime through execute_step for three steps (run,
upload-artifact, download-artifact) and assert the payload round-trips
byte-for-byte. Reproduces #88: today the run step writes via the
GITHUB_WORKSPACE reroute while upload reads ctx.working_dir, so the two
workspaces diverge and the upload fails with "No files found matching
pattern".

Test is intentionally failing at this commit; the fix follows.

* runtime(container): add resolve_host_working_dir helper

Introduces a pure function that rebases a container-visible working
directory onto its host-side volume source by finding the longest
component-boundary prefix in the volumes list and grafting the suffix.

This is the building block for making non-container runtimes
(emulation, secure_emulation) honor the same mount semantics as docker,
so a run: step and an artifact handler observe the same host workspace
(fix for #88). The helper itself is pure and pub(crate); wiring follows
in the next commit.

Component-boundary matching (via Path::strip_prefix) ensures
/github/workspace-foo is not matched by a /github/workspace mount.

* fix(runtime): emulation honors volume mounts for working dir (#88)

EmulationRuntime::run_container now rebases a container-visible
working directory onto its host-side volume source via
resolve_host_working_dir, matching the mount semantics docker already
provides. The old GITHUB_WORKSPACE / CI_PROJECT_DIR reroute is gone:
the volumes parameter is authoritative, and calling with a container
path that no volume covers is now a hard error instead of a silent
substitution.

This makes `run:` steps and artifact/cache handlers observe the same
host workspace, fixing the reported symptom where upload-artifact
returned "No files found matching pattern" because the run step had
written through the reroute to the real project directory while the
upload was searching the per-job tempdir.

Also adjust the #88 regression test to use a glob pattern compatible
with the `glob` crate's handling of trailing `**` (which matches zero
components in that position). The pattern quirk is orthogonal to #88;
tracking fixing artifact glob semantics separately.

* fix(runtime): secure_emulation honors volume mounts; sandbox runs in-place (#88)

Brings SecureEmulationRuntime and the Sandbox under the same mount
contract as EmulationRuntime and docker: a container-visible working
dir is rebased via `resolve_host_working_dir` onto its host-side volume
source, and the command is executed directly in that host path.

The Sandbox used to copy files into its own private workspace and run
there, so nothing the caller's `run:` step wrote was visible to any
subsequent artifact/cache handler — the same #88 symptom that hit
plain emulation, just via a different mechanism. The copy dance was
also not a meaningful security layer; validation comes from
`validate_command` (command whitelist, dangerous patterns) and
`is_env_var_safe` (env filtering), which still run.

Removed: Sandbox::setup_sandbox_environment, copy_safe_files,
is_path_allowed, should_skip_file, should_skip_directory, and the
`workspace: TempDir` field. Removed the file-filtering test that
covered only the deleted copy path. Added two targeted tests for the
new rebase behavior and the error-when-no-volume-covers path.

Uncovered container working dirs now fail loudly with
"not covered by any volume mount" instead of silently running in an
incorrect location.

* refactor(runtime): centralize rebase helper and tighten semantics

PR #93 landed the #88 fix, but review pointed out three things
I'd missed, and they're the kind that turn into the next bug
report if you leave them alone.

First: both EmulationRuntime and SecureEmulationRuntime did
`if working_dir.exists() { use it } else { rebase via volumes }`.
That's the wrong way around. If a dev happens to have a real
`/github/workspace` directory on their host — containerized dev
env, weird NFS mount, somebody being creative — the short-circuit
silently skips the rebase and we're right back to two workspaces
disagreeing about where files live. The whole point of the #88
fix is that the volume mapping is *authoritative*. Flipping the
branch order makes it actually authoritative: the volume wins
first, and we only fall back to \`working_dir\` as-is when no
volume claims it.

Second: ~50 lines of near-identical rebase + create_dir_all +
log + error boilerplate sat duplicated in two runtimes. The next
fix would get applied to one file and not the other. Extracted
into a single \`rebase_working_dir_or_error\` helper in
container.rs. Both call sites collapse to one line.

Third: SandboxConfig was still carrying \`allowed_read_paths\` and
\`allowed_write_paths\` fields that nothing reads. The old
\`is_path_allowed\` that used to check them is gone. Leaving dead
fields on a type named \`Sandbox\` is worse than removing them — a
reader a year from now will assume they're enforced and ship
something based on that assumption. Please don't do that.
Deleted them, along with the factory code that was still
populating them for no reason.

While at it: documented the \`>=\` tiebreaker in
\`resolve_host_working_dir\` (equal-depth ties only happen on
duplicate volume entries, so first-wins is deliberate, not a
typo), added the matching emulation error test for symmetry
with secure_emulation, added four new unit tests for the helper,
and put a module-level doc comment on SandboxConfig explaining
what the sandbox actually enforces (command validation, not
filesystem isolation).

19/19 runtime tests (up from 14), 333/333 executor tests
including the #88 regression, workspace clippy clean.

* fix(test): skip emulation artifact roundtrip test on Windows

The artifact roundtrip test runs `bash -c "mkdir ... && echo ..."`
through EmulationRuntime, which does `Command::new("bash")`. On
Windows CI runners, this resolves to WSL's `bash.exe` sitting in
`C:\Windows\System32\`, not Git Bash. And since the runner has no
WSL distributions installed, the test blows up with a helpful
"Windows Subsystem for Linux has no installed distributions" and
a link to the Microsoft Store.

The test is inherently Unix-only — it exercises Unix shell commands
on a real EmulationRuntime. Gate it with `cfg(not(target_os =
"windows"))`.

* fix(test): skip secure_emulation rebase test on Windows

Same story as 024a300 — the test runs `pwd` via `sh -c` and then
tries to `canonicalize()` the output. On Windows, `sh` (from MSYS2)
prints a Unix-style path that `std::path::PathBuf::canonicalize()`
has absolutely no idea what to do with.

The rebase logic itself is tested by the pure path-math tests in
container.rs which work fine everywhere. This test is specifically
about the end-to-end sandbox execution path, and that requires a
real Unix `sh`. Skip it on Windows.
This commit is contained in:
Gokul
2026-04-13 10:51:38 +05:30
committed by GitHub
parent f4e9f84fc4
commit a668d815c4
5 changed files with 561 additions and 222 deletions

View File

@@ -7618,6 +7618,189 @@ runs:
assert!(dl_result.output.contains("Downloaded artifact 'my-build'"));
}
/// Regression test for #88.
///
/// A `run:` step that writes a file must land in the same workspace that
/// `actions/upload-artifact` subsequently reads from. Under the buggy
/// emulation runtime, run steps were rerouted to `GITHUB_WORKSPACE` (i.e.
/// the real project directory) while artifact handlers kept using the
/// per-job tempdir — so uploads could never find files the run step had
/// just written.
///
/// This test drives a real `EmulationRuntime` end-to-end through
/// run → upload-artifact → download-artifact and asserts the payload
/// round-trips byte-for-byte.
#[cfg(not(target_os = "windows"))]
#[tokio::test]
async fn run_step_upload_download_artifact_roundtrip_emulation() {
let runtime = emulation::EmulationRuntime::new();
let workflow = minimal_workflow();
let working_dir = tempfile::tempdir().unwrap();
// Point GITHUB_WORKSPACE at an isolated tempdir. On buggy main this is
// where the rerouted run step writes, so upload (which reads
// `ctx.working_dir`) finds nothing and the test fails. After the fix,
// emulation honors the volume mount and the run step writes directly
// into `ctx.working_dir`, so this path is irrelevant — but we still
// isolate it to keep the test from touching the real project tree.
let fake_github_ws = tempfile::tempdir().unwrap();
let artifact_dir = tempfile::tempdir().unwrap();
let artifact_store = crate::artifacts::ArtifactStore::new(artifact_dir.path()).unwrap();
let cache_dir = tempfile::tempdir().unwrap();
let cache_store =
crate::cache::CacheStore::with_root(cache_dir.path().to_path_buf()).unwrap();
let pending = std::sync::Mutex::new(Vec::<PendingCacheSave>::new());
let mut job_env = HashMap::new();
job_env.insert(
"GITHUB_WORKSPACE".to_string(),
fake_github_ws.path().to_string_lossy().to_string(),
);
// --- Step 1: run step that writes a file into the workspace ---
let run_step = make_step_run("mkdir artifact-dir && echo hello > artifact-dir/payload.txt");
let run_ctx = StepExecutionContext {
step: &run_step,
step_idx: 0,
job_env: &job_env,
working_dir: working_dir.path(),
runtime: &runtime,
workflow: &workflow,
runner_image: "ubuntu:latest",
verbose: false,
matrix_combination: &None,
container_config: None,
workflow_defaults: None,
job_defaults: None,
step_outputs: &HashMap::new(),
step_statuses: &HashMap::new(),
job_status: "success",
services: JobServices {
secret_manager: None,
secret_masker: None,
secrets_context: &HashMap::new(),
needs_context: &HashMap::new(),
needs_results: &HashMap::new(),
artifact_store: &artifact_store,
cache_store: &cache_store,
},
pending_cache_saves: &pending,
};
let run_result = execute_step(run_ctx).await.unwrap();
assert_eq!(
run_result.status,
StepStatus::Success,
"run step failed: {}",
run_result.output
);
// --- Step 2: upload-artifact reads the file the run step just wrote ---
let mut up_with = HashMap::new();
up_with.insert("name".to_string(), "payload".to_string());
up_with.insert("path".to_string(), "artifact-dir/payload.txt".to_string());
let up_step = make_step(
"upload",
"actions/upload-artifact@v4",
Some(up_with),
HashMap::new(),
);
let up_ctx = StepExecutionContext {
step: &up_step,
step_idx: 1,
job_env: &job_env,
working_dir: working_dir.path(),
runtime: &runtime,
workflow: &workflow,
runner_image: "ubuntu:latest",
verbose: false,
matrix_combination: &None,
container_config: None,
workflow_defaults: None,
job_defaults: None,
step_outputs: &HashMap::new(),
step_statuses: &HashMap::new(),
job_status: "success",
services: JobServices {
secret_manager: None,
secret_masker: None,
secrets_context: &HashMap::new(),
needs_context: &HashMap::new(),
needs_results: &HashMap::new(),
artifact_store: &artifact_store,
cache_store: &cache_store,
},
pending_cache_saves: &pending,
};
let up_result = execute_step(up_ctx).await.unwrap();
assert_eq!(
up_result.status,
StepStatus::Success,
"upload step failed (this is the #88 regression): {}",
up_result.output
);
assert!(
up_result.output.contains("Uploaded artifact 'payload'"),
"unexpected upload output: {}",
up_result.output
);
// --- Step 3: download-artifact into a fresh dir and verify byte equality ---
let dl_workspace = tempfile::tempdir().unwrap();
let mut dl_with = HashMap::new();
dl_with.insert("name".to_string(), "payload".to_string());
dl_with.insert("path".to_string(), "dl".to_string());
let dl_step = make_step(
"download",
"actions/download-artifact@v4",
Some(dl_with),
HashMap::new(),
);
let dl_ctx = StepExecutionContext {
step: &dl_step,
step_idx: 2,
job_env: &job_env,
working_dir: dl_workspace.path(),
runtime: &runtime,
workflow: &workflow,
runner_image: "ubuntu:latest",
verbose: false,
matrix_combination: &None,
container_config: None,
workflow_defaults: None,
job_defaults: None,
step_outputs: &HashMap::new(),
step_statuses: &HashMap::new(),
job_status: "success",
services: JobServices {
secret_manager: None,
secret_masker: None,
secrets_context: &HashMap::new(),
needs_context: &HashMap::new(),
needs_results: &HashMap::new(),
artifact_store: &artifact_store,
cache_store: &cache_store,
},
pending_cache_saves: &pending,
};
let dl_result = execute_step(dl_ctx).await.unwrap();
assert_eq!(
dl_result.status,
StepStatus::Success,
"download step failed: {}",
dl_result.output
);
let downloaded = std::fs::read_to_string(
dl_workspace
.path()
.join("dl")
.join("artifact-dir/payload.txt"),
)
.expect("downloaded payload.txt should exist");
assert_eq!(downloaded, "hello\n");
}
#[tokio::test]
async fn cache_step_miss_defers_save_and_flush_works() {
let runtime = MockContainerRuntime::default();

View File

@@ -1,5 +1,7 @@
use async_trait::async_trait;
use std::path::Path;
use std::fs;
use std::path::{Path, PathBuf};
use wrkflw_logging;
/// Prefix for all locally-built images. Used to skip registry pulls.
pub const LOCAL_IMAGE_PREFIX: &str = "wrkflw-";
@@ -67,6 +69,102 @@ pub enum ContainerError {
NetworkOperation(String),
}
/// Rebase a container-visible working directory onto its host-side volume
/// source.
///
/// Given a `container_dir` like `/github/workspace/sub` and a `volumes` list
/// that maps `(host, container)` pairs (e.g. `(/tmp/job-xxxx, /github/workspace)`),
/// return the corresponding host path (`/tmp/job-xxxx/sub`) by locating the
/// longest `container` path that is a component-boundary prefix of
/// `container_dir` and grafting the remainder onto its `host` counterpart.
///
/// Returns `None` if no volume covers `container_dir`.
///
/// This is the mount-semantics bridge used by non-container runtimes
/// (emulation, secure_emulation) so that a `run:` step and an
/// artifact/cache handler observe the same host workspace. It is the fix
/// for #88.
pub(crate) fn resolve_host_working_dir(
container_dir: &Path,
volumes: &[(&Path, &Path)],
) -> Option<PathBuf> {
let mut best: Option<(usize, PathBuf)> = None;
for (host, container) in volumes {
if let Ok(suffix) = container_dir.strip_prefix(container) {
// `Path::strip_prefix` respects component boundaries, so
// `/github/workspace-foo` is NOT matched by `/github/workspace`.
let depth = container.components().count();
let candidate = host.join(suffix);
match &best {
// Equal-depth ties can only occur when two volume entries
// share the same `container` prefix (two distinct container
// paths that are both strict component-boundary prefixes of
// the same `container_dir` must have different component
// counts, because one has to contain the other). In that
// duplicate-entry case, first-seen wins — the `>=` is the
// intentional conflict resolution, not a typo for `>`.
Some((best_depth, _)) if *best_depth >= depth => {}
_ => best = Some((depth, candidate)),
}
}
}
best.map(|(_, path)| path)
}
/// Resolve the host working directory for a non-container runtime call, or
/// return a `ContainerError` describing why it couldn't.
///
/// This is the shared wiring used by `EmulationRuntime::run_container` and
/// `SecureEmulationRuntime::run_container`. It enforces one invariant: when
/// `volumes` covers `working_dir`, the volume mapping **always wins** over
/// any accidentally-existing host path — so a dev-environment quirk like
/// `/github/workspace` happening to exist on the host cannot silently skip
/// the rebase and reintroduce #88.
///
/// - If a volume covers `working_dir`, rebase it, `create_dir_all` the host
/// side if it doesn't exist yet (matching docker's bind-mount behavior of
/// creating the mount target on first access), and return the host path.
/// - If no volume covers `working_dir` but `working_dir` itself exists on
/// the host, accept it as a caller-provided host path.
/// - Otherwise, return a loud, descriptive error. No silent fallback.
///
/// `runtime_label` is used as a prefix in log and error messages so the
/// reader can tell which runtime produced a given line.
pub(crate) fn rebase_working_dir_or_error(
working_dir: &Path,
volumes: &[(&Path, &Path)],
runtime_label: &str,
) -> Result<PathBuf, ContainerError> {
match resolve_host_working_dir(working_dir, volumes) {
Some(host) => {
if !host.exists() {
fs::create_dir_all(&host).map_err(|e| {
ContainerError::ContainerExecution(format!(
"{}: failed to create host working directory '{}': {}",
runtime_label,
host.display(),
e
))
})?;
}
wrkflw_logging::info(&format!(
"{}: rebased container path '{}' to host path '{}' via volume mount",
runtime_label,
working_dir.display(),
host.display()
));
Ok(host)
}
None if working_dir.exists() => Ok(working_dir.to_path_buf()),
None => Err(ContainerError::ContainerExecution(format!(
"{}: container working dir '{}' is not covered by any volume mount; \
caller must pass volumes",
runtime_label,
working_dir.display()
))),
}
}
impl fmt::Display for ContainerError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
@@ -87,3 +185,142 @@ impl fmt::Display for ContainerError {
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn resolve_host_working_dir_exact_match() {
let host = Path::new("/host/tmp/job");
let container = Path::new("/github/workspace");
let volumes = [(host, container)];
assert_eq!(
resolve_host_working_dir(Path::new("/github/workspace"), &volumes),
Some(PathBuf::from("/host/tmp/job"))
);
}
#[test]
fn resolve_host_working_dir_sub_path() {
let host = Path::new("/host/tmp/job");
let container = Path::new("/github/workspace");
let volumes = [(host, container)];
assert_eq!(
resolve_host_working_dir(Path::new("/github/workspace/src/lib"), &volumes),
Some(PathBuf::from("/host/tmp/job/src/lib"))
);
}
#[test]
fn resolve_host_working_dir_longest_prefix_wins() {
let outer_host = Path::new("/host/outer");
let outer_container = Path::new("/a");
let inner_host = Path::new("/host/inner");
let inner_container = Path::new("/a/b");
// Order shouldn't matter — longest prefix always wins.
let volumes = [(outer_host, outer_container), (inner_host, inner_container)];
assert_eq!(
resolve_host_working_dir(Path::new("/a/b/c"), &volumes),
Some(PathBuf::from("/host/inner/c"))
);
let reversed = [(inner_host, inner_container), (outer_host, outer_container)];
assert_eq!(
resolve_host_working_dir(Path::new("/a/b/c"), &reversed),
Some(PathBuf::from("/host/inner/c"))
);
}
#[test]
fn resolve_host_working_dir_no_match() {
let host = Path::new("/host/tmp/job");
let container = Path::new("/different");
let volumes = [(host, container)];
assert_eq!(
resolve_host_working_dir(Path::new("/github/workspace"), &volumes),
None
);
}
#[test]
fn resolve_host_working_dir_empty_volumes() {
assert_eq!(
resolve_host_working_dir(Path::new("/github/workspace"), &[]),
None
);
}
/// Critical: a string-prefix match would incorrectly rebase
/// `/github/workspace-foo` onto the mount for `/github/workspace`.
/// `Path::strip_prefix` respects component boundaries, so this must
/// return `None`.
#[test]
fn resolve_host_working_dir_component_boundary_is_respected() {
let host = Path::new("/host/tmp/job");
let container = Path::new("/github/workspace");
let volumes = [(host, container)];
assert_eq!(
resolve_host_working_dir(Path::new("/github/workspace-foo"), &volumes),
None
);
}
/// Core invariant of `rebase_working_dir_or_error`: when a volume
/// covers `working_dir`, the rebase wins even if `working_dir` also
/// happens to exist on the host. Without this, a dev environment with
/// a real `/github/workspace` directory would silently skip the rebase
/// and reintroduce the #88 class of bug.
#[test]
fn rebase_prefers_volume_mapping_over_accidentally_existing_host_path() {
// `container_dir` points at a real host tempdir (so `.exists()`
// returns true), but we also supply a volume mapping that claims
// that same path for a different host location. The volume must win.
let existing = tempfile::tempdir().unwrap();
let mapped = tempfile::tempdir().unwrap();
let volumes = [(mapped.path(), existing.path())];
let resolved = rebase_working_dir_or_error(existing.path(), &volumes, "test").unwrap();
assert_eq!(resolved, mapped.path().to_path_buf());
}
#[test]
fn rebase_accepts_existing_host_path_when_no_volume_covers_it() {
let host_dir = tempfile::tempdir().unwrap();
let resolved = rebase_working_dir_or_error(host_dir.path(), &[], "test").unwrap();
assert_eq!(resolved, host_dir.path().to_path_buf());
}
#[test]
fn rebase_errors_loudly_when_no_volume_and_path_does_not_exist() {
let err = rebase_working_dir_or_error(
Path::new("/definitely/does/not/exist/wrkflw-test"),
&[],
"test",
)
.expect_err("should error");
let msg = err.to_string();
assert!(
msg.contains("not covered by any volume mount"),
"unexpected error: {}",
msg
);
assert!(msg.contains("test:"), "error should carry runtime label");
}
#[test]
fn rebase_creates_host_side_of_mount_if_missing() {
// Simulate `working-directory: sub` pointing at a container subdir
// that hasn't been created yet. The helper must `create_dir_all`
// the host side so the subsequent `Command` can `current_dir` into
// it. This matches docker's bind-mount behavior.
let host_root = tempfile::tempdir().unwrap();
let container_root = Path::new("/github/workspace");
let volumes = [(host_root.path(), container_root)];
let container_sub = Path::new("/github/workspace/sub/nested");
let resolved = rebase_working_dir_or_error(container_sub, &volumes, "test").unwrap();
let expected = host_root.path().join("sub/nested");
assert_eq!(resolved, expected);
assert!(expected.exists(), "helper should have created the subdir");
}
}

View File

@@ -1,4 +1,6 @@
use crate::container::{ContainerError, ContainerOutput, ContainerRuntime};
use crate::container::{
rebase_working_dir_or_error, ContainerError, ContainerOutput, ContainerRuntime,
};
use async_trait::async_trait;
use once_cell::sync::Lazy;
use std::collections::HashMap;
@@ -152,7 +154,7 @@ impl ContainerRuntime for EmulationRuntime {
command: &[&str],
env_vars: &[(&str, &str)],
working_dir: &Path,
_volumes: &[(&Path, &Path)],
volumes: &[(&Path, &Path)],
_entrypoint: Option<&str>,
) -> Result<ContainerOutput, ContainerError> {
// Build command string
@@ -197,47 +199,13 @@ impl ContainerRuntime for EmulationRuntime {
wrkflw_logging::info(&format!(" {}={}", key, value));
}
// Find actual working directory - determine if we should use the current directory instead
let actual_working_dir: PathBuf = if !working_dir.exists() {
// Look for GITHUB_WORKSPACE or CI_PROJECT_DIR in env_vars
let mut workspace_path = None;
for (key, value) in env_vars {
if *key == "GITHUB_WORKSPACE" || *key == "CI_PROJECT_DIR" {
workspace_path = Some(PathBuf::from(value));
break;
}
}
// If found, use that as the working directory
if let Some(path) = workspace_path {
if path.exists() {
wrkflw_logging::info(&format!(
"Using environment-defined workspace: {}",
path.display()
));
path
} else {
// Fallback to current directory
let current_dir =
std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."));
wrkflw_logging::info(&format!(
"Using current directory: {}",
current_dir.display()
));
current_dir
}
} else {
// Fallback to current directory
let current_dir = std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."));
wrkflw_logging::info(&format!(
"Using current directory: {}",
current_dir.display()
));
current_dir
}
} else {
working_dir.to_path_buf()
};
// Resolve the host working directory via the shared mount-semantics
// helper. When `volumes` covers `working_dir` (the normal case for
// container-visible paths like `/github/workspace`), the rebase
// always wins over an accidentally-existing host path — so a dev
// environment with a real `/github/workspace` directory cannot
// silently reintroduce #88.
let actual_working_dir = rebase_working_dir_or_error(working_dir, volumes, "emulation")?;
wrkflw_logging::info(&format!(
"Using actual working directory: {}",
@@ -823,4 +791,38 @@ mod tests {
);
assert!(output.stdout.is_empty(), "stdout should be empty");
}
/// Regression for #88: if the caller passes a container-visible working
/// directory that no volume covers, emulation must hard-error instead of
/// silently rerouting to `GITHUB_WORKSPACE`. Symmetry test with the
/// matching `secure_emulation_errors_when_volumes_dont_cover_working_dir`.
#[tokio::test]
async fn emulation_errors_when_volumes_dont_cover_working_dir() {
let runtime = EmulationRuntime::new();
// /github/workspace doesn't exist on host and no volume covers it.
let result = runtime
.run_container(
"alpine:latest",
&["echo", "nope"],
&[],
Path::new("/github/workspace"),
&[],
None,
)
.await;
let err = result.expect_err("should error when no volume covers working_dir");
let msg = err.to_string();
assert!(
msg.contains("not covered by any volume mount"),
"unexpected error: {}",
msg
);
assert!(
msg.contains("emulation:"),
"error should carry runtime label, got: {}",
msg
);
}
}

View File

@@ -1,13 +1,18 @@
use regex::Regex;
use std::collections::HashSet;
use std::fs;
use std::path::{Path, PathBuf};
use std::path::Path;
use std::process::{Command, Stdio};
use std::time::Duration;
use tempfile::TempDir;
use wrkflw_logging;
/// Configuration for sandbox execution
/// Configuration for sandbox execution.
///
/// Note: this sandbox provides **command-level** validation (whitelist,
/// blocklist, dangerous-pattern regexes) and **environment-variable**
/// filtering. It does NOT provide filesystem isolation — the command runs
/// in the caller's working directory as a plain subprocess, so absolute
/// paths and `..` sequences can reach anything the host user can reach.
/// If filesystem isolation is needed, use the Docker or Podman runtime.
#[derive(Debug, Clone)]
pub struct SandboxConfig {
/// Maximum execution time for commands
@@ -20,10 +25,6 @@ pub struct SandboxConfig {
pub allowed_commands: HashSet<String>,
/// Blocked commands (blacklist)
pub blocked_commands: HashSet<String>,
/// Allowed file system paths (read-only)
pub allowed_read_paths: HashSet<PathBuf>,
/// Allowed file system paths (read-write)
pub allowed_write_paths: HashSet<PathBuf>,
/// Whether to enable network access
pub allow_network: bool,
/// Maximum number of processes
@@ -143,8 +144,6 @@ impl Default for SandboxConfig {
max_cpu_percent: 80,
allowed_commands,
blocked_commands,
allowed_read_paths: HashSet::new(),
allowed_write_paths: HashSet::new(),
allow_network: false,
max_processes: 10,
strict_mode: true,
@@ -180,32 +179,35 @@ pub enum SandboxError {
/// Secure sandbox for executing commands in emulation mode
pub struct Sandbox {
config: SandboxConfig,
workspace: TempDir,
dangerous_patterns: Vec<Regex>,
}
impl Sandbox {
/// Create a new sandbox with the given configuration
pub fn new(config: SandboxConfig) -> Result<Self, SandboxError> {
let workspace = tempfile::tempdir().map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to create sandbox workspace: {}", e),
})?;
let dangerous_patterns = Self::compile_dangerous_patterns();
wrkflw_logging::info(&format!(
"Created new sandbox with workspace: {}",
workspace.path().display()
));
wrkflw_logging::info("Created new sandbox");
Ok(Self {
config,
workspace,
dangerous_patterns,
})
}
/// Execute a command in the sandbox
/// Execute a command in the sandbox.
///
/// The command runs **in-place** in `working_dir` (no file copying). The
/// caller is responsible for ensuring `working_dir` is the correct host
/// workspace — `SecureEmulationRuntime::run_container` rebases container
/// paths via the volume mount before calling in, matching the mount
/// semantics of docker/podman (#88).
///
/// Security is enforced by `validate_command` (command whitelist / blocked
/// commands / dangerous patterns), `is_env_var_safe` (env var filtering),
/// and `execute_with_limits` (timeout). The previous copy-files-to-a-
/// private-workspace layer was not actually providing isolation — it was
/// breaking the run-step / artifact-handler workspace invariant.
pub async fn execute_command(
&self,
command: &[&str],
@@ -223,11 +225,8 @@ impl Sandbox {
// Step 1: Validate command
self.validate_command(&command_str)?;
// Step 2: Setup sandbox environment
let sandbox_dir = self.setup_sandbox_environment(working_dir)?;
// Step 3: Execute with limits
self.execute_with_limits(command, env_vars, &sandbox_dir)
// Step 2: Execute in-place with limits
self.execute_with_limits(command, env_vars, working_dir)
.await
}
@@ -333,63 +332,6 @@ impl Sandbox {
builtins.contains(&command)
}
/// Setup isolated sandbox environment
fn setup_sandbox_environment(&self, working_dir: &Path) -> Result<PathBuf, SandboxError> {
let sandbox_root = self.workspace.path();
let sandbox_workspace = sandbox_root.join("workspace");
// Create sandbox directory structure
fs::create_dir_all(&sandbox_workspace).map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to create sandbox workspace: {}", e),
})?;
// Copy allowed files to sandbox (if working_dir exists and is allowed)
if working_dir.exists() && self.is_path_allowed(working_dir, false) {
self.copy_safe_files(working_dir, &sandbox_workspace)?;
}
wrkflw_logging::info(&format!(
"Sandbox environment ready: {}",
sandbox_workspace.display()
));
Ok(sandbox_workspace)
}
/// Copy files safely to sandbox, excluding dangerous files
fn copy_safe_files(&self, source: &Path, dest: &Path) -> Result<(), SandboxError> {
for entry in fs::read_dir(source).map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to read source directory: {}", e),
})? {
let entry = entry.map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to read directory entry: {}", e),
})?;
let path = entry.path();
let file_name = path.file_name().and_then(|s| s.to_str()).unwrap_or("");
// Skip dangerous or sensitive files
if self.should_skip_file(file_name) {
continue;
}
let dest_path = dest.join(file_name);
if path.is_file() {
fs::copy(&path, &dest_path).map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to copy file: {}", e),
})?;
} else if path.is_dir() && !self.should_skip_directory(file_name) {
fs::create_dir_all(&dest_path).map_err(|e| SandboxError::SandboxSetupError {
reason: format!("Failed to create directory: {}", e),
})?;
self.copy_safe_files(&path, &dest_path)?;
}
}
Ok(())
}
/// Execute command with resource limits and monitoring
async fn execute_with_limits(
&self,
@@ -466,28 +408,6 @@ impl Sandbox {
}
}
/// Check if a path is allowed for access
fn is_path_allowed(&self, path: &Path, write_access: bool) -> bool {
let abs_path = path.canonicalize().unwrap_or_else(|_| path.to_path_buf());
if write_access {
self.config
.allowed_write_paths
.iter()
.any(|allowed| abs_path.starts_with(allowed))
} else {
self.config
.allowed_read_paths
.iter()
.any(|allowed| abs_path.starts_with(allowed))
|| self
.config
.allowed_write_paths
.iter()
.any(|allowed| abs_path.starts_with(allowed))
}
}
/// Check if an environment variable is safe to pass through
fn is_env_var_safe(&self, key: &str) -> bool {
// Block dangerous environment variables
@@ -504,45 +424,6 @@ impl Sandbox {
!dangerous_env_vars.contains(&key)
}
/// Check if a file should be skipped during copying
fn should_skip_file(&self, filename: &str) -> bool {
let dangerous_files = [
".ssh",
".gnupg",
".aws",
".docker",
"id_rsa",
"id_ed25519",
"credentials",
"config",
".env",
".secrets",
];
dangerous_files
.iter()
.any(|pattern| filename.contains(pattern))
|| filename.starts_with('.') && filename != ".gitignore" && filename != ".github"
}
/// Check if a directory should be skipped
fn should_skip_directory(&self, dirname: &str) -> bool {
let skip_dirs = [
"target",
"node_modules",
".git",
".cargo",
".npm",
".cache",
"build",
"dist",
"tmp",
"temp",
];
skip_dirs.contains(&dirname)
}
/// Compile regex patterns for dangerous command detection
fn compile_dangerous_patterns() -> Vec<Regex> {
let patterns = [
@@ -586,32 +467,18 @@ impl Sandbox {
/// Create a default sandbox configuration for CI/CD workflows
pub fn create_workflow_sandbox_config() -> SandboxConfig {
let mut allowed_read_paths = HashSet::new();
allowed_read_paths.insert(PathBuf::from("."));
let mut allowed_write_paths = HashSet::new();
allowed_write_paths.insert(PathBuf::from("."));
SandboxConfig {
max_execution_time: Duration::from_secs(1800), // 30 minutes
max_memory_mb: 2048, // 2GB
max_processes: 50,
allow_network: true,
strict_mode: false,
allowed_read_paths,
allowed_write_paths,
..Default::default()
}
}
/// Create a strict sandbox configuration for untrusted code
pub fn create_strict_sandbox_config() -> SandboxConfig {
let mut allowed_read_paths = HashSet::new();
allowed_read_paths.insert(PathBuf::from("."));
let mut allowed_write_paths = HashSet::new();
allowed_write_paths.insert(PathBuf::from("."));
// Very limited command set
let allowed_commands = ["echo", "cat", "ls", "pwd", "date"]
.iter()
@@ -624,8 +491,6 @@ pub fn create_strict_sandbox_config() -> SandboxConfig {
max_processes: 5,
allow_network: false,
strict_mode: true,
allowed_read_paths,
allowed_write_paths,
allowed_commands,
..Default::default()
}
@@ -666,19 +531,4 @@ mod tests {
assert!(sandbox.validate_command("git clone").is_err());
assert!(sandbox.validate_command("cargo build").is_err());
}
#[test]
fn test_file_filtering() {
let sandbox = Sandbox::new(SandboxConfig::default()).unwrap();
// Should skip dangerous files
assert!(sandbox.should_skip_file("id_rsa"));
assert!(sandbox.should_skip_file(".ssh"));
assert!(sandbox.should_skip_file("credentials"));
// Should allow safe files
assert!(!sandbox.should_skip_file("Cargo.toml"));
assert!(!sandbox.should_skip_file("README.md"));
assert!(!sandbox.should_skip_file(".gitignore"));
}
}

View File

@@ -1,4 +1,6 @@
use crate::container::{ContainerError, ContainerOutput, ContainerRuntime};
use crate::container::{
rebase_working_dir_or_error, ContainerError, ContainerOutput, ContainerRuntime,
};
use crate::sandbox::{create_workflow_sandbox_config, Sandbox, SandboxConfig, SandboxError};
use async_trait::async_trait;
use std::path::Path;
@@ -52,7 +54,7 @@ impl ContainerRuntime for SecureEmulationRuntime {
command: &[&str],
env_vars: &[(&str, &str)],
working_dir: &Path,
_volumes: &[(&Path, &Path)],
volumes: &[(&Path, &Path)],
entrypoint: Option<&str>,
) -> Result<ContainerOutput, ContainerError> {
if let Some(ep) = entrypoint {
@@ -70,10 +72,17 @@ impl ContainerRuntime for SecureEmulationRuntime {
image
));
// Rebase the container-visible working_dir onto its host-side volume
// source, matching EmulationRuntime and docker/podman (#88). Without
// this, `run:` steps and artifact/cache handlers observe different
// host directories.
let host_working_dir =
rebase_working_dir_or_error(working_dir, volumes, "secure_emulation")?;
// Use sandbox to execute the command safely
let result = self
.sandbox
.execute_command(command, env_vars, working_dir)
.execute_command(command, env_vars, &host_working_dir)
.await;
match result {
@@ -403,4 +412,62 @@ mod tests {
assert!(output.stdout.contains("hello world"));
assert_eq!(output.exit_code, 0);
}
/// Regression for #88: a container-visible working dir must be rebased
/// through the `volumes` mapping onto its host counterpart, so commands
/// run in the caller's workspace rather than a hidden sandbox copy.
#[cfg(not(target_os = "windows"))]
#[tokio::test]
async fn secure_emulation_rebases_container_working_dir_via_volumes() {
let runtime = SecureEmulationRuntime::new();
let host_tempdir = tempfile::tempdir().unwrap();
let host = host_tempdir.path();
let container = Path::new("/github/workspace");
// Run `pwd` in the container workspace; with the rebase it should
// print the host tempdir.
let result = runtime
.run_container(
"alpine:latest",
&["pwd"],
&[],
container,
&[(host, container)],
None,
)
.await
.expect("secure_emulation run failed");
assert_eq!(result.exit_code, 0, "stderr: {}", result.stderr);
// `pwd` may canonicalize /var → /private/var on macOS, so compare
// canonical forms.
let canon_host = host.canonicalize().unwrap();
let canon_pwd = PathBuf::from(result.stdout.trim()).canonicalize().unwrap();
assert_eq!(canon_pwd, canon_host);
}
#[tokio::test]
async fn secure_emulation_errors_when_volumes_dont_cover_working_dir() {
let runtime = SecureEmulationRuntime::new();
// /github/workspace doesn't exist on host and no volume covers it.
let result = runtime
.run_container(
"alpine:latest",
&["echo", "nope"],
&[],
Path::new("/github/workspace"),
&[],
None,
)
.await;
let err = result.expect_err("should error when no volume covers working_dir");
let msg = err.to_string();
assert!(
msg.contains("not covered by any volume mount"),
"unexpected error: {}",
msg
);
}
}