diff --git a/.github/workflows/tests.yaml b/.github/workflows/tests.yaml index 3035a2cd..713e513b 100644 --- a/.github/workflows/tests.yaml +++ b/.github/workflows/tests.yaml @@ -1,14 +1,15 @@ name: Tests -on: +on: pull_request: branches: - main types: [assigned, opened, synchronize, reopened, ready_for_review] - paths: + paths: - hs - .github/** - jit_runtime/** - - parallel-orch/** + - scheduler/** + - executor/** - preprocessor/** - overlay-sandbox/** - test/** @@ -16,10 +17,11 @@ on: push: branches: - main - paths: + paths: - hs - jit_runtime/** - - parallel-orch/** + - scheduler/** + - executor/** - preprocessor/** - overlay-sandbox/** - test/** @@ -31,12 +33,12 @@ jobs: strategy: fail-fast: false matrix: - os: + os: - ubuntu-24.04 runs-on: ${{ matrix.os }} if: github.event.pull_request.draft == false steps: - - uses: actions/checkout@v2 + - uses: actions/checkout@v2 - name: Set up Python 3.12 uses: actions/setup-python@v2 with: @@ -53,7 +55,7 @@ jobs: python_pkgs/bin/pip install -r requirements.txt python_pkgs/bin/pip install psutil python_pkgs/bin/python -c "import shasta; import libdash; import libbash" - (cd parallel-orch; make) + (cd executor; make) - name: Running Correctness Tests run: | export PASH_SPEC_TOP=$(pwd) diff --git a/README.md b/README.md index d60fde08..e5471ef5 100644 --- a/README.md +++ b/README.md @@ -15,8 +15,10 @@ The project's top-level directory contains the following: - `deps`: Dependencies required by `hs`. - `docs`: Documentation and architectural diagrams. - `model-checking`: Tools and utilities for model checking. -- `parallel-orch`: Main orchestration components. -- `pash-spec.sh`: Entry script to initiate the `hs` process. +- `scheduler`: Scheduler daemon — manages speculative execution order and dependency tracking. +- `executor`: Executor — runs commands in sandboxes with tracing. +- `jit_runtime`: JIT runtime — shell scripts sourced during execution for state management. 
+- `preprocessor`: Preprocessor — transforms shell ASTs for speculative execution. - `README.md`: This documentation file. - `report`: Generated reports related to test runs and performance metrics. - `requirements.txt`: List of Python dependencies. @@ -42,7 +44,7 @@ This script will handle all the necessary installations, including dependencies, ### Running `hs` -The main entry script to initiate `hs` is `pash-spec.sh`. This script sets up the necessary environment and invokes the orchestrator in `parallel-orch/orch.py`. It's designed to accept a variety of arguments to customize its behavior, such as setting debug levels or specifying log files. +The main entry script to initiate `hs` is the `hs` script itself. This script sets up the necessary environment, launches the scheduler daemon, preprocesses the input script, and executes it with speculative execution. It accepts a variety of arguments to customize its behavior, such as setting debug levels or specifying log files. Example of running the script: diff --git a/docs/component_pointers.md b/docs/component_pointers.md index 0027dacf..4209e345 100644 --- a/docs/component_pointers.md +++ b/docs/component_pointers.md @@ -2,7 +2,7 @@ ## hS Overview hS is a speculative execution system that can run shell programs in an out-of-order fashion -while maintaining equivalence to sequantial execution. +while maintaining equivalence to sequential execution. It does this by transforming and executing the program in a controlled, sandboxed manner, then carefully reasoning about dependencies before choosing to commit, discard, or re-execute the commands it speculated @@ -37,21 +37,18 @@ done ## Preprocessing This step analyzes and transforms the input shell script. -(Also this is the part where the most legacy code and technical debts are.)
Major callpath: ``` -deps/pash/pa.sh - deps/pash/compiler/pash.py:main - preprocess_and_execute_asts - deps/pash/compiler/preprocessor/preprocessor.py:preprocess - deps/pash/compiler/shell_ast/ast_to_ast.py:replace_ast_region - deps/pash/compiler/speculative/util_spec.py:serialize_partial_order - execute_script +hs (entry point) + preprocessor/preprocessor.py:preprocess + preprocessor/ast_transform.py:replace_ast_regions + preprocessor/transformation.py:serialize_partial_order + execute_script ``` `breakpoint()` at -- end of `deps/pash/compiler/preprocessor/preprocessor.py:preprocess_asts` to see "partial_order" files +- end of `preprocessor/preprocessor.py:preprocess_asts` to see "partial_order" files - `preprocess_and_execute_asts` to see preprocessed script files (a.k.a. program skeleton) Alternatively turn on `-d 2` and dig through logs @@ -59,13 +56,13 @@ Alternatively turn on `-d 2` and dig through logs ## Program Skeleton Looking at the program skeleton we can see -- Control flow: the control flow structure is kept and +- Control flow: the control flow structure is kept and - HS_LOOP_LIST: the implicit runtime control flow hint -Inside `pash_runtime.sh`, the major call path is: +Inside the JIT runtime, the major call path is: ``` -deps/pash/compiler/pash_runtime.sh - deps/pash/compiler/orchestrator_runtime/speculative/speculative_runtime.sh +jit_runtime/jit.sh + (communicates with scheduler via unix socket) ``` We will come back to this when we look at command execution @@ -78,13 +75,12 @@ We will come back to this when we look at command execution ### Core Data Structures Major callpath: ``` -deps/pash/compiler/orchestrator_runtime/speculative/pash_spec_init_setup.sh - scheduler_server.py:main - Scheduler.run +scheduler/scheduler_server.py:main + Scheduler.run ``` -parallel_orch/node.py:`HSProg` --- backend's representation of the shell program +scheduler/node.py:`HSProg` --- backend's representation of the shell program `HSBasicBlock` --- backend's 
representation of basic block -parallel_orch/partial_program_order.py:`PartialProgramOrder` --- Execution state of the shell program +scheduler/partial_program_order.py:`PartialProgramOrder` --- Execution state of the shell program ```Python class CFGEdgeType(Enum): IF_TAKEN = auto() @@ -121,7 +117,7 @@ All of these information are directly parsed from the "partial order" file ## Command Execution and Sandboxing ### Core Data Structure -parallel_orch/node.py: +scheduler/node.py: ```Python class NodeState(Enum): INIT = auto() @@ -138,41 +134,41 @@ class ConcreteNode: cnid: ConcreteNodeId abstract_node: Node state: NodeState - # exists for EXEC or SPEC_E or subsequent states, erased for READY + # exists for EXEC or SPEC_E or subsequent states, erased for READY exec_id: int - # Nodes to check for fs dependencies before this node can be committed - # for this particular execution of the main sandbox. - # No need to do the same for the background sandbox since it will always get committed. + # Nodes to check for fs dependencies before this node can be committed + # for this particular execution of the main sandbox. + # No need to do the same for the background sandbox since it will always get committed. to_be_resolved_snapshot: "set[NodeId]" - # Read and write sets for this node + # Read and write sets for this node rwset: RWSet - # The wait trace file for this node + # The wait trace file for this node wait_env_file: str - # This can only be set while in the frontier and the background node execution is enabled - # TODO: For now ignore this. Maybe there is a better way to do this. - # background_sandbox: Sandbox + # This can only be set while in the frontier and the background node execution is enabled + # TODO: For now ignore this. Maybe there is a better way to do this. 
+ # background_sandbox: Sandbox - # Exists when the node is in COMMITED or SPEC_F + # Exists when the node is in COMMITTED or SPEC_F exec_result: ExecResult - # Updated when the node is loop changing and the node is transitioning - # into COMMITTED or SPEC_F + # Updated when the node is loop changing and the node is transitioning + # into COMMITTED or SPEC_F loop_list_context: HSLoopListContext - # read-only, the value of initial loop_list_context - # used when reset_to_ready + # read-only, the value of the initial loop_list_context + # used when reset_to_ready init_loop_list_context: HSLoopListContext spec_pre_env: str - # Exists when node is in READY + # Exists when node is in READY assignments: "list[NodeId]" - # Exists when node is in EXE or SPEC_EXE, it acts as a cache for - # the trace file content + # Exists when node is in EXE or SPEC_EXE, it acts as a cache for + # the trace file content trace_lines: list - # Exists when node is in EXE or SPEC_EXE, it it an opened file - # or none when such file doesn't exist + # Exists when node is in EXE or SPEC_EXE, it is an opened file + # or None when such a file doesn't exist trace_fd=None trace_ctx=None ``` @@ -181,18 +177,18 @@ Major callpath: ``` Node.start_executing Node.start_command - parallel_orch/executor.py:run_trace_sandboxed - parallel-orch/run_command.sh + executor/executor.py:run_trace_sandboxed + executor/run_command.sh fd_util -> try -> strace ``` -The `env_file` is explicitly passed around. The environment capturing and restoring happens at `deps/pash/compiler/orchestrator_runtime/pash_declare_vars.sh` and `deps/pash/compiler/orchestrator_runtime/pash_source_declare_vars.sh` +The `env_file` is explicitly passed around. 
The environment capturing and restoring happens at `jit_runtime/pash_declare_vars.sh` and `jit_runtime/pash_source_declare_vars.sh` -`breakpoint()` at `parallel_orc/partial_program_order.py:handle_complete`, see the completed files +`breakpoint()` at `scheduler/partial_program_order.py:handle_complete`, see the completed files ## Backend: Speculation ### Core data structure -parallel-orch/partial_program_order.py:`PartialProgramOrder` +scheduler/partial_program_order.py:`PartialProgramOrder` ```Python class PartialProgramOrder: def __init__(self, abstract_nodes: "dict[NodeId, Node]", edges: "dict[NodeId, list[NodeId]]", @@ -200,10 +196,10 @@ class PartialProgramOrder: self.hsprog = hs_prog self.concrete_nodes: dict[ConcreteNodeId, ConcreteNode] = {} self.frontier = set() - # self.run_after = {} - # Nodes that we have received "wait" for + # self.run_after = {} + # Nodes that we have received "wait" for self.canon_exec_order: list[ConcreteNodeId] = list() - # Nodes that we think should happen, and haven't received "wait" for + # Nodes that we think should happen, and haven't received "wait" for self.spec_exec_order: list[ConcreteNodeId] = list() self.to_be_resolved: dict[ConcreteNodeId, list[ConcreteNodeId]] = {} self.temp_new_env = None @@ -212,7 +208,7 @@ class PartialProgramOrder: Major callpath: ``` -parallel-orch/scheduler_server.py:Scheduler.run +scheduler/scheduler_server.py:Scheduler.run Scheduler.process_next_cmd PartialProgramOrder.handle_complete PartialProgramOrder.handle_wait diff --git a/docs/pdb.patch b/docs/pdb.patch index aa621d74..0abeb193 100644 --- a/docs/pdb.patch +++ b/docs/pdb.patch @@ -6,10 +6,10 @@ index 0f692e96..b60db641 100644 start_server() { -- python3 -S "$PASH_SPEC_TOP/parallel-orch/scheduler_server.py" "$@" & +- python3 -S "$PASH_SPEC_TOP/scheduler/scheduler_server.py" "$@" & - export daemon_pid=$! 
- ## Wait until daemon has established connection -+ python3 -S "$PASH_SPEC_TOP/parallel-orch/scheduler_server.py" "$@" ++ python3 -S "$PASH_SPEC_TOP/scheduler/scheduler_server.py" "$@" + # export daemon_pid=$! + # # Wait until daemon has established connection + # pash_spec_wait_until_scheduler_listening @@ -96,7 +96,7 @@ index 8569e55..0852d1a 100755 @@ -35,7 +29,9 @@ export PASH_SPEC_SCHEDULER_SOCKET="${PASH_SPEC_TMP_PREFIX}/scheduler_socket" ## TODO: Replace this with a call to pa.sh (which will start the scheduler on its own). - # python3 "$PASH_SPEC_TOP/parallel-orch/orch.py" "$@" + # python3 "$PASH_SPEC_TOP/scheduler/scheduler_server.py" "$@" -"$PASH_TOP/pa.sh" --speculative "$@" +a=$1 +shift diff --git a/parallel-orch/Makefile b/executor/Makefile similarity index 100% rename from parallel-orch/Makefile rename to executor/Makefile diff --git a/parallel-orch/executor.py b/executor/executor.py similarity index 64% rename from parallel-orch/executor.py rename to executor/executor.py index 89bc33aa..3a049a2e 100644 --- a/parallel-orch/executor.py +++ b/executor/executor.py @@ -1,10 +1,12 @@ -import config +"""Executor component — launches commands in sandboxes and traces them.""" + import logging import subprocess -import util import os from dataclasses import dataclass +from executor_util import ptempfile, ptempdir, create_sandbox, copy, PASH_SPEC_TOP + @dataclass class ExecCtxt: @@ -30,53 +32,46 @@ class ExecArgs: speculate_mode: bool lower_sandboxes: list[str] -# This module executes a sequence of commands -# and traces them with Riker. -# All commands are run inside an overlay sandbox. 
def set_pgid(): os.setpgid(0, 0) def run_assignment_and_return_env_file(assignment: str, pre_execution_env_file: str): - post_execution_env_file = util.ptempfile(prefix='hs_assignment_post_env') + post_execution_env_file = ptempfile(prefix='hs_assignment_post_env') logging.debug(f'Running assignment: {assignment} | pre_execution_env_file: {pre_execution_env_file} | post_execution_env_file: {post_execution_env_file}') - run_script = f'{config.PASH_SPEC_TOP}/parallel-orch/run_assignment.sh' + run_script = f'{PASH_SPEC_TOP}/executor/run_assignment.sh' args = ["/bin/bash", run_script, assignment, pre_execution_env_file, post_execution_env_file] process = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True) - util.copy(pre_execution_env_file + '.fds', post_execution_env_file + '.fds') + copy(pre_execution_env_file + '.fds', post_execution_env_file + '.fds') return post_execution_env_file def run_trace_sandboxed(args: ExecArgs): - run_script = f'{config.PASH_SPEC_TOP}/parallel-orch/run_command.sh' + run_script = f'{PASH_SPEC_TOP}/executor/run_command.sh' - trace_file = util.ptempfile(prefix='hs_trace') - outfiles_dir = util.ptempdir(prefix='hs_outfiles') - stderr_file = util.ptempfile(prefix='hs_stderr') + trace_file = ptempfile(prefix='hs_trace') + outfiles_dir = ptempdir(prefix='hs_outfiles') + stderr_file = ptempfile(prefix='hs_stderr') logging.debug(f'Scheduler: Trace file for: {args.concrete_node_id}: {trace_file}') logging.debug(f'Scheduler: Stdout file for: {args.concrete_node_id} is: {outfiles_dir}') logging.debug(f'Scheduler: Stderr file for: {args.concrete_node_id} is: {stderr_file}') - sandbox_dir, tmp_dir = util.create_sandbox() - post_execution_env_file = util.ptempfile(prefix='hs_post_env') + sandbox_dir, tmp_dir = create_sandbox() + post_execution_env_file = ptempfile(prefix='hs_post_env') lower_dirs_str = ':'.join(args.lower_sandboxes) speculate_mode = "speculate" if args.speculate_mode else "standard" cmd = ["/bin/bash", 
run_script, args.command, trace_file, outfiles_dir, args.pre_execution_env_file, sandbox_dir, tmp_dir, speculate_mode, str(args.concrete_node_id), post_execution_env_file, str(args.execution_id), lower_dirs_str ] logging.debug(cmd) process = subprocess.Popen(cmd, stdout=None, stderr=None, preexec_fn=set_pgid) - # For debugging - # process = subprocess.Popen(cmd) return ExecCtxt(process, trace_file, outfiles_dir, stderr_file, args.pre_execution_env_file, post_execution_env_file, sandbox_dir) def commit_workspace(workspace_path): - ## Call commit-sandbox.sh to commit the uncommitted sandbox to the main workspace - run_script = f'{config.PASH_SPEC_TOP}/deps/try/try' + run_script = f'{PASH_SPEC_TOP}/deps/try/try' args = ["/bin/bash", run_script, "-i", "/run/mount", "commit", workspace_path] process = subprocess.check_output(args) return process -## Read trace and capture each command def read_trace(sandbox_dir, trace_file): if sandbox_dir == "": path = trace_file @@ -85,12 +80,3 @@ def read_trace(sandbox_dir, trace_file): logging.debug(f'Reading trace from: {path}') with open(path) as f: return f.read().split('\n')[:-1] - -def read_env_file(env_file, sandbox_dir=None): - if sandbox_dir is None: - path = env_file - else: - path = f"{sandbox_dir}/upperdir/{env_file}" - logging.debug(f'Reading env from: {path}') - out = subprocess.check_output([f"{os.getenv('PASH_TOP')}/compiler/orchestrator_runtime/pash_filter_vars.sh", path]) - return out.decode("utf-8") diff --git a/executor/executor_util.py b/executor/executor_util.py new file mode 100644 index 00000000..5938acc9 --- /dev/null +++ b/executor/executor_util.py @@ -0,0 +1,30 @@ +"""Standalone utilities for the executor component.""" + +import os +import shutil +import tempfile + +PASH_SPEC_TMP_PREFIX = os.environ.get("PASH_SPEC_TMP_PREFIX", "/tmp/pash_spec/") +PASH_SPEC_TOP = os.environ.get("PASH_SPEC_TOP", "") + + +def ptempfile(prefix=''): + fd, name = tempfile.mkstemp(dir=PASH_SPEC_TMP_PREFIX, prefix=prefix + '_') + 
os.close(fd) + return name + + +def ptempdir(prefix=''): + return tempfile.mkdtemp(dir=PASH_SPEC_TMP_PREFIX, prefix=prefix + '_') + + +def create_sandbox(): + os.makedirs(f"{PASH_SPEC_TMP_PREFIX}/tmp/pash_spec/a", exist_ok=True) + os.makedirs(f"{PASH_SPEC_TMP_PREFIX}/tmp/pash_spec/b", exist_ok=True) + sdir = tempfile.mkdtemp(dir=f"{PASH_SPEC_TMP_PREFIX}/tmp/pash_spec/a", prefix="sandbox_") + tdir = tempfile.mkdtemp(dir=f"{PASH_SPEC_TMP_PREFIX}/tmp/pash_spec/b", prefix="sandbox_") + return sdir, tdir + + +def copy(path_from, path_to): + shutil.copy(path_from, path_to) diff --git a/parallel-orch/fd_util.c b/executor/fd_util.c similarity index 100% rename from parallel-orch/fd_util.c rename to executor/fd_util.c diff --git a/parallel-orch/run_assignment.sh b/executor/run_assignment.sh similarity index 66% rename from parallel-orch/run_assignment.sh rename to executor/run_assignment.sh index 4e6a4b28..7d3f7c67 100644 --- a/parallel-orch/run_assignment.sh +++ b/executor/run_assignment.sh @@ -7,6 +7,6 @@ POST_EXEC_ENV=${3?No Riker env file given} ## Functions now exported from parent hs script, no need to source # source "$PASH_SPEC_TOP/jit_runtime/pash_spec_init_setup.sh" -RUN=$(printf 'source %s; %s\n source ${PASH_SPEC_TOP}/parallel-orch/pash_declare_vars.sh %s' "${PRE_ENV_FILE}" "${ASSIGNMENT_STRING}" "${POST_EXEC_ENV}") +RUN=$(printf 'source %s; %s\n source ${RUNTIME_DIR}/pash_declare_vars.sh %s' "${PRE_ENV_FILE}" "${ASSIGNMENT_STRING}" "${POST_EXEC_ENV}") bash -c "$RUN" diff --git a/parallel-orch/run_command.sh b/executor/run_command.sh similarity index 95% rename from parallel-orch/run_command.sh rename to executor/run_command.sh index 9c6805a0..e27231a8 100755 --- a/parallel-orch/run_command.sh +++ b/executor/run_command.sh @@ -31,7 +31,7 @@ fi # echo tempdir $TEMPDIR # echo sandbox $SANDBOX_DIR -${RUNTIME_LIBRARY_DIR}/fd_util -f "${LATEST_ENV_FILE}.fds" -p ${STDOUT_FILE} bash "${PASH_SPEC_TOP}/deps/try/try" -D "${SANDBOX_DIR}" -L "${LOWER_DIRS}" 
"${PASH_SPEC_TOP}/parallel-orch/template_script_to_execute.sh" +${RUNTIME_LIBRARY_DIR}/fd_util -f "${LATEST_ENV_FILE}.fds" -p ${STDOUT_FILE} bash "${PASH_SPEC_TOP}/deps/try/try" -D "${SANDBOX_DIR}" -L "${LOWER_DIRS}" "${PASH_SPEC_TOP}/executor/template_script_to_execute.sh" exit_code=$? ## Only used for debugging # ls -R "${SANDBOX_DIR}/upperdir" 1>&2 diff --git a/parallel-orch/set-diff.c b/executor/set-diff.c similarity index 100% rename from parallel-orch/set-diff.c rename to executor/set-diff.c diff --git a/parallel-orch/template_script_to_execute.sh b/executor/template_script_to_execute.sh similarity index 59% rename from parallel-orch/template_script_to_execute.sh rename to executor/template_script_to_execute.sh index fa9de046..b7e39a78 100755 --- a/parallel-orch/template_script_to_execute.sh +++ b/executor/template_script_to_execute.sh @@ -5,5 +5,5 @@ if [ "speculate" == "$EXEC_MODE" ]; then fi # Magic, don't touch without consulting Di -RUN=$(printf 'source %s 2>/dev/null; %s\n exit_code=$?; hs_runtime_tmp_args=("$@")\n hs_set_options_cmd="$(set +o)"; source ${PASH_SPEC_TOP}/parallel-orch/pash_declare_vars.sh %s; trap - EXIT; exit $exit_code' "${LATEST_ENV_FILE}" "${CMD_STRING}" "${POST_EXEC_ENV}") +RUN=$(printf 'source %s 2>/dev/null; %s\n exit_code=$?; hs_runtime_tmp_args=("$@")\n hs_set_options_cmd="$(set +o)"; source ${RUNTIME_DIR}/pash_declare_vars.sh %s; trap - EXIT; exit $exit_code' "${LATEST_ENV_FILE}" "${CMD_STRING}" "${POST_EXEC_ENV}") strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $TRACE_FILE env -i bash -c "$RUN" diff --git a/hs b/hs index f4944552..abfa2af0 100755 --- a/hs +++ b/hs @@ -17,7 +17,7 @@ export PASH_TOP="${PASH_TOP:-$PASH_SPEC_TOP}" ## Runtime directories export RUNTIME_DIR="$PASH_SPEC_TOP/jit_runtime" -export RUNTIME_LIBRARY_DIR="$PASH_SPEC_TOP/parallel-orch" +export RUNTIME_LIBRARY_DIR="$PASH_SPEC_TOP/executor" export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib/" ## Setup cgroups for memory protection (if writable) @@ 
-312,7 +312,7 @@ pash_spec_wait_until_scheduler_listening() { } start_server() { - "$PASH_PYTHON" "$PASH_SPEC_TOP/parallel-orch/scheduler_server.py" "$@" & + "$PASH_PYTHON" "$PASH_SPEC_TOP/scheduler/scheduler_server.py" "$@" & export daemon_pid=$! ## Wait until daemon has established connection pash_spec_wait_until_scheduler_listening diff --git a/jit_runtime/jit.sh b/jit_runtime/jit.sh index 06423989..43729709 100755 --- a/jit_runtime/jit.sh +++ b/jit_runtime/jit.sh @@ -1,21 +1,27 @@ #!/bin/bash -## Speculative-only JIT runtime for pash-spec +## JIT runtime for pash-spec speculative execution ## -## Assumes the following variable is set: -## pash_spec_command_id: the node id for the specific command - -## -## (1) Save shell state +## This script is sourced by the preprocessed shell script for each command. +## It saves the current shell state, communicates with the scheduler daemon, +## and either applies the scheduler's result or falls back to eval execution. ## +## Assumes the following variables are set: +## pash_spec_command_id: the node id for the specific command +## RUNTIME_DIR: path to jit_runtime directory +## RUNTIME_LIBRARY_DIR: path to executor directory (for fd_util, set-diff) + +############################################################################### +# Section 1: Save shell state +############################################################################### + +## Save exit status and shell options export pash_previous_exit_status="$?" 
export pash_previous_set_status=$- source "$RUNTIME_DIR/pash_set_from_to.sh" "$pash_previous_set_status" "${DEFAULT_SET_STATE:-huB}" pash_redir_output echo "$$: (1) Pre-ec, pre-set, jit-set: ($pash_previous_exit_status, $pash_previous_set_status, $-)" -## -## Save IFS for proper restoration in speculative_runtime.sh (matching fae47999 pattern) -## +## Save IFS for proper restoration later pash_redir_output echo "$$: [JIT] Before save - IFS=$(declare -p IFS 2>&1 || echo 'unset')" if [ -z "${IFS+x}" ]; then unset PASH_OLD_IFS @@ -26,27 +32,110 @@ pash_redir_output echo "$$: [JIT] Saved PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2 IFS=$' \t\n' pash_redir_output echo "$$: [JIT] After setting default - IFS=$(declare -p IFS 2>&1 || echo 'unset')" -## ## Save positional parameters and set options before they get overwritten by sourcing -## This is needed by pash_source_declare_vars.sh, because "source" mess up $@ -## +## This is needed by pash_source_declare_vars.sh, because "source" messes up $@ hs_runtime_tmp_args=("$@") hs_set_options_cmd="$(set +o)" pash_redir_output echo "$$: [JIT] Saved positional parameters: ${hs_runtime_tmp_args[@]}" -## -## (2) Speculative execution - ask scheduler -## +############################################################################### +# Section 2: Save variables and communicate with scheduler +############################################################################### + export pash_speculative_command_id=$pash_spec_command_id -source "$RUNTIME_DIR/speculative/speculative_runtime.sh" -## -## IFS is NOT restored here - it's handled by sourcing output_variable_file in speculative_runtime.sh -## This matches the fae47999 implementation pattern -## -pash_redir_output echo "$$: [JIT] End of speculative execution - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" +pash_redir_output echo "$$: (2) Before asking the scheduler for cmd: ${pash_speculative_command_id} exit code..." 
+ +## Save the shell variables to a file (necessary for expansion) +export pash_runtime_shell_variables_file="${PASH_TMP_PREFIX}/variables_$RANDOM$RANDOM$RANDOM" +unset cmd_exit_code +unset output_variable_file +unset stdout_file +set +u +pash_redir_output echo "$$: [JIT] Before restore for declare - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" +if [ -z "${PASH_OLD_IFS+x}" ]; then + unset IFS +else + IFS="$PASH_OLD_IFS" +fi +pash_redir_output echo "$$: [JIT] After restore, before pash_declare_vars - IFS=$(declare -p IFS 2>&1 || echo 'unset')" +source "$RUNTIME_DIR/pash_declare_vars.sh" "$pash_runtime_shell_variables_file" +pash_redir_output echo "$$: [JIT] After pash_declare_vars - IFS=$(declare -p IFS 2>&1 || echo 'unset')" +IFS=$' \t\n' +pash_redir_output echo "$$: [JIT] After setting default - IFS=$(declare -p IFS 2>&1 || echo 'unset')" +pash_redir_output echo "$$: (1) Bash variables saved in: $pash_runtime_shell_variables_file" +pash_redir_output echo "$$: [JIT] Contents of pash_runtime_shell_variables_file IFS/PASH_OLD_IFS lines:" +pash_redir_output grep -E "^declare.*IFS" "$pash_runtime_shell_variables_file" || pash_redir_output echo " (no IFS declarations found)" + +## Determine all current loop iterations and send them to the scheduler +pash_loop_iter_counters=${pash_loop_iters:-None} +pash_redir_output echo "$$: Loop node iteration counters: $pash_loop_iter_counters" + +## Send and receive from scheduler daemon (blocking) +msg="Wait:${pash_speculative_command_id}|Loop iters:${pash_loop_iter_counters}|Variables file:${pash_runtime_shell_variables_file}" +daemon_response=$(pash_spec_communicate_scheduler "$msg") + +############################################################################### +# Section 3: Handle scheduler response +############################################################################### + +if [[ "$daemon_response" == *"OK:"* ]]; then + # shellcheck disable=SC2206 + 
response_args=($daemon_response) + pash_redir_output echo "$$: (2) Scheduler responded: $daemon_response" + + cmd_exit_code=${response_args[1]} + output_variable_file=${response_args[2]} + stdout_file=${response_args[3]} + + pash_redir_output echo "$$: (2) Recovering stdout from: $stdout_file" + + pash_redir_output echo "$$: (2) Recovering script variables from: $output_variable_file" + pash_redir_output echo "$$: [JIT] Contents of output_variable_file IFS/PASH_OLD_IFS lines:" + pash_redir_output grep -E "^declare.*IFS" "$output_variable_file" || pash_redir_output echo " (no IFS declarations found)" + pash_redir_output echo "$$: [JIT] Before pash_restore_fds - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" + source "$RUNTIME_DIR/pash_restore_fds.sh" "${output_variable_file}.fds" "${stdout_file}" + pash_redir_output echo "$$: [JIT] After pash_restore_fds - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" + source "$RUNTIME_DIR/pash_source_declare_vars.sh" "$output_variable_file" + pash_redir_output echo "$$: [JIT] After sourcing output_variable_file: IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" + +elif [[ "$daemon_response" == *"UNSAFE:"* ]]; then + pash_redir_output echo "$$: (2) Scheduler responded: $daemon_response" + pash_redir_output echo "$$: (2) Executing command: $pash_speculative_command_id" + ## Execute the command via eval fallback + cmd="$(cat "$PASH_SPEC_NODE_DIRECTORY/$pash_speculative_command_id")" + pash_redir_output echo "$$: [JIT] UNSAFE: Before restore for eval - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" + if [ -z "${PASH_OLD_IFS+x}" ]; then + unset IFS + else + IFS="$PASH_OLD_IFS" + fi + pash_redir_output echo "$$: [JIT] UNSAFE: After restore for eval - IFS=$(declare -p IFS 2>&1 || echo 'unset')" + # 
shellcheck disable=SC2086 + eval "$cmd" + cmd_exit_code=$? +elif [ -z "$daemon_response" ]; then + ## Scheduler crashed + pash_redir_output echo "$$: ERROR: (2) Scheduler crashed!" + exit 1 +else + pash_redir_output echo "$$: ERROR: (2) Scheduler responded garbage ${daemon_response}!" + exit 1 +fi + +############################################################################### +# Section 4: Cleanup and return exit code +############################################################################### + +pash_redir_output echo "$$: (2) Scheduler returned exit code: ${cmd_exit_code} for cmd with id: ${pash_speculative_command_id}." + +pash_runtime_final_status=${cmd_exit_code} +unset cmd_exit_code +unset output_variable_file +unset cmd +unset stdout_file -pash_redir_output echo "$$: [JIT] End of jit.sh (returning to caller) - IFS=$(declare -p IFS 2>&1 || echo 'unset')" +pash_redir_output echo "$$: [JIT] End of jit.sh - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" ## Exit with the result (exit "$pash_runtime_final_status") diff --git a/jit_runtime/pash_filter_vars.sh b/jit_runtime/pash_filter_vars.sh deleted file mode 100755 index d33dccd3..00000000 --- a/jit_runtime/pash_filter_vars.sh +++ /dev/null @@ -1,19 +0,0 @@ -#!/bin/bash - -## This sources variables that were produced from `declare -p` - -## TODO: Fix this to not source read only variables -## TODO: Does this work with arrays - -## TODO: Fix this to not source pash variables so as to not invalidate PaSh progress - -## TODO: Fix this filtering - -filter_vars_file() -{ - cat "$1" | grep -v "^declare -\([A-Za-z]\|-\)* \(pash\|BASH\|LINENO\|EUID\|GROUPS\|cmd_exit_code\)" - # The extension below is done for the speculative pash - # | grep -v "LS_COLORS" -} - -filter_vars_file "$1" diff --git a/jit_runtime/speculative/speculative_runtime.sh b/jit_runtime/speculative/speculative_runtime.sh deleted file mode 100755 index 32260153..00000000 --- 
a/jit_runtime/speculative/speculative_runtime.sh +++ /dev/null @@ -1,106 +0,0 @@ -#!/bin/bash - - -## TODO: Define the client in pash_spec_init_setup (which should be sourced by pash_init_setup) - -pash_redir_output echo "$$: (2) Before asking the scheduler for cmd: ${pash_speculative_command_id} exit code..." - -## TODO: Correctly save variables -## Save the shell variables to a file (necessary for expansion) -export pash_runtime_shell_variables_file="${PASH_TMP_PREFIX}/variables_$RANDOM$RANDOM$RANDOM" -unset cmd_exit_code -unset output_variable_file -unset stdout_file -set +u -pash_redir_output echo "$$: [SPEC] Before restore for declare - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')" -if [ -z "${PASH_OLD_IFS+x}" ]; then - unset IFS -else - IFS="$PASH_OLD_IFS" -fi -pash_redir_output echo "$$: [SPEC] After restore, before pash_declare_vars - IFS=$(declare -p IFS 2>&1 || echo 'unset')" -source "$RUNTIME_DIR/pash_declare_vars.sh" "$pash_runtime_shell_variables_file" -pash_redir_output echo "$$: [SPEC] After pash_declare_vars - IFS=$(declare -p IFS 2>&1 || echo 'unset')" -IFS=$' \t\n' -pash_redir_output echo "$$: [SPEC] After setting default - IFS=$(declare -p IFS 2>&1 || echo 'unset')" -pash_redir_output echo "$$: (1) Bash variables saved in: $pash_runtime_shell_variables_file" -pash_redir_output echo "$$: [SPEC] Contents of pash_runtime_shell_variables_file IFS/PASH_OLD_IFS lines:" -pash_redir_output grep -E "^declare.*IFS" "$pash_runtime_shell_variables_file" || pash_redir_output echo " (no IFS declarations found)" - -## TODO: We want to send the environment to the scheduler. -## Once the scheduler determines if there are environment changes, it can then -## decide to rerun or not the speculated commands with the new environment. 
-
-
-## Determine all current loop iterations and send them to the scheduler
-pash_loop_iter_counters=${pash_loop_iters:-None}
-pash_redir_output echo "$$: Loop node iteration counters: $pash_loop_iter_counters"
-
-## Send and receive from daemon
-msg="Wait:${pash_speculative_command_id}|Loop iters:${pash_loop_iter_counters}|Variables file:${pash_runtime_shell_variables_file}"
-daemon_response=$(pash_spec_communicate_scheduler "$msg") # Blocking step, daemon will not send response until it's safe to continue
-
-## Receive an exit code
-if [[ "$daemon_response" == *"OK:"* ]]; then
-    # shellcheck disable=SC2206
-    response_args=($daemon_response)
-    pash_redir_output echo "$$: (2) Scheduler responded: $daemon_response"
-
-    cmd_exit_code=${response_args[1]}
-    output_variable_file=${response_args[2]}
-    stdout_file=${response_args[3]}
-
-    pash_redir_output echo "$$: (2) Recovering stdout from: $stdout_file"
-    #cat "${stdout_file}/1"
-
-    ## TODO: Restore the variables (doesn't work currently because variables are printed using `env`)
-    pash_redir_output echo "$$: (2) Recovering script variables from: $output_variable_file"
-    pash_redir_output echo "$$: [SPEC] Contents of output_variable_file IFS/PASH_OLD_IFS lines:"
-    pash_redir_output grep -E "^declare.*IFS" "$output_variable_file" || pash_redir_output echo "  (no IFS declarations found)"
-    pash_redir_output echo "$$: [SPEC] Before pash_restore_fds - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')"
-    source "$RUNTIME_DIR/pash_restore_fds.sh" "${output_variable_file}.fds" "${stdout_file}"
-    pash_redir_output echo "$$: [SPEC] After pash_restore_fds - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')"
-    source "$RUNTIME_DIR/pash_source_declare_vars.sh" "$output_variable_file"
-    pash_redir_output echo "$$: [SPEC] After sourcing output_variable_file: IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')"
-
-elif [[ "$daemon_response" == *"UNSAFE:"* ]]; then
-    pash_redir_output echo "$$: (2) Scheduler responded: $daemon_response"
-    pash_redir_output echo "$$: (2) Executing command: $pash_speculative_command_id"
-    ## Execute the command.
-    ## KK 2023-06-01 Does `eval` work in general? We need to be precise
-    ## about which commands are unsafe to determine how to execute them.
-    cmd="$(cat "$PASH_SPEC_NODE_DIRECTORY/$pash_speculative_command_id")"
-    pash_redir_output echo "$$: [SPEC] UNSAFE: Before restore for eval - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')"
-    if [ -z "${PASH_OLD_IFS+x}" ]; then
-        unset IFS
-    else
-        IFS="$PASH_OLD_IFS"
-    fi
-    pash_redir_output echo "$$: [SPEC] UNSAFE: After restore for eval - IFS=$(declare -p IFS 2>&1 || echo 'unset')"
-    ## KK 2023-06-01 Not sure if this shellcheck warning must be resolved:
-    ## > note: Double quote to prevent globbing and word splitting.
-    # shellcheck disable=SC2086
-    eval "$cmd"
-    cmd_exit_code=$?
-elif [ -z "$daemon_response" ]; then
-    ## Trouble... Daemon crashed, rip
-    pash_redir_output echo "$$: ERROR: (2) Scheduler crashed!"
-    exit 1
-else
-    pash_redir_output echo "$$: ERROR: (2) Scheduler responded garbage ${daemon_response}!"
-    exit 1
-fi
-
-
-pash_redir_output echo "$$: (2) Scheduler returned exit code: ${cmd_exit_code} for cmd with id: ${pash_speculative_command_id}."
-
-
-pash_runtime_final_status=${cmd_exit_code}
-unset cmd_exit_code
-unset output_variable_file
-unset cmd
-unset stdout_file
-
-pash_redir_output echo "$$: [SPEC] End of speculative_runtime.sh - IFS=$(declare -p IFS 2>&1 || echo 'unset'), PASH_OLD_IFS=$(declare -p PASH_OLD_IFS 2>&1 || echo 'unset')"
-
-## TODO: Also need to use wrap_vars maybe to `set` properly etc
diff --git a/parallel-orch/pash_declare_vars.sh b/parallel-orch/pash_declare_vars.sh
deleted file mode 100755
index 894c36bd..00000000
--- a/parallel-orch/pash_declare_vars.sh
+++ /dev/null
@@ -1,14 +0,0 @@
-#!/bin/bash
-
-vars_file="${1?File not given}"
-
-# pash_redir_output echo "Writing vars to: $vars_file"
-
-echo "cd \"${PWD}\"" > "$vars_file"
-declare -p >> "$vars_file"
-declare -f >> "$vars_file"
-trap >> "$vars_file"
-echo "BASH_ARGV0=\"$0\"" >> "$vars_file"
-echo '${hs_set_options_cmd}' >> "$vars_file"
-echo 'set -- "${hs_runtime_tmp_args[@]}"' >> "$vars_file"
-${RUNTIME_LIBRARY_DIR}/fd_util -s -f "$vars_file.fds"
diff --git a/parallel-orch/trace.py b/parallel-orch/trace.py
deleted file mode 100644
index e1ec859f..00000000
--- a/parallel-orch/trace.py
+++ /dev/null
@@ -1,473 +0,0 @@
-import re
-import sys
-import os
-from typing import Tuple
-from enum import Enum
-import logging
-from copy import deepcopy
-
-
-class Ref(Enum):
-
-    STDIN = sys.stdin
-    STDOUT = os.path.abspath(os.sep)
-    STDERR = sys.stderr
-    ROOT = os.path.abspath(os.sep)
-    # Not sure this is always correct
-    # but it doesn't affect the results
-    CWD = os.getcwd()
-    LAUNCH_EXE = ""
-
-
-class PathRef:
-
-    def __init__(self, ref, path, permissions, no_follow, env):
-        self.ref = PathRefKey(env, ref)
-        self.path = path
-        self.is_read, self.is_write, self.is_exec = self.resolve_permissions(
-            permissions)
-        self.is_nofollow = no_follow
-
-    def resolve_permissions(self, permissions: str):
-        if "r" in permissions:
-            is_read = True
-        else:
-            is_read = False
-        if "w" in permissions:
-            is_write = True
-        else:
-            is_write = False
-        if "x" in permissions:
-            is_exec = True
-        else:
-            is_exec = False
-        return is_read, is_write, is_exec
-
-    def __str__(self):
-        return f"PathRef({self.ref}, {self.path}, {'r' if self.is_read else '-'}{'w' if self.is_write else '-'}{'x' if self.is_exec else '-'} {'no follow' if self.is_nofollow else ''})"
-
-    def __repr__(self) -> str:
-        return self.__str__()
-
-    def get_resolved_path(self):
-
-        if isinstance(self.ref, PathRef):
-            self.ref = self.ref.get_resolved_path()
-
-        # Remove dupliate prefixes
-        if not self.path.startswith("/"):
-            modified_path = "/" + self.path
-        else:
-            modified_path = self.path
-        commonprefix = os.path.commonprefix([self.ref, modified_path])
-        ref_without_prefix = self.ref.replace(commonprefix, "", 1)
-        path_without_prefix = modified_path.replace(commonprefix, "", 1)
-
-        if path_without_prefix.startswith("/"):
-            path_without_prefix = path_without_prefix.replace("/", "", 1)
-
-        return os.path.join(commonprefix, ref_without_prefix, path_without_prefix).replace("/./", "/")
-
-
-
-class PathRefKey:
-
-    def __init__(self, env, lhs_ref) -> None:
-        self.env = env
-        self.lhs_ref = lhs_ref
-
-    def __eq__(self, other):
-        return (self.env, self.lhs_ref) == (other.env, other.lhs_ref)
-
-    def __ne__(self, other) -> bool:
-        return not (self == other)
-
-    def __hash__(self):
-        return hash((self.env, self.lhs_ref))
-
-    def __str__(self):
-        return f"Key({self.lhs_ref}@{self.env})"
-
-    def __repr__(self) -> str:
-        return self.__str__()
-
-
-class ExpectResult():
-
-    def __init__(self, ref, result):
-        self.ref = ref
-        self.result = result
-
-    def __str__(self, ref, result):
-        return f"ExpectResult({self.ref}, {self.result})"
-
-
-class PipeRef:
-
-    def __init__(self, lhs_ref, env):
-        self.ref = PathRefKey(env, lhs_ref)
-
-    def __str__(self):
-        return f"PipeRef({self.lhs_ref})"
-
-
-def log_resolved_trace_items(resolved_dict):
-    for k, v in resolved_dict.items():
-        try:
-            logging.debug(f"  {k}: {v}")
-        except:
-            logging.debug(f'{k}: {v}')
-
-
-def remove_command_redir(cmd):
-    return cmd.split(">")[0].rstrip()
-
-
-def remove_command_prefix(line) -> str:
-    return line.split(f"]: ")[1].rstrip()
-
-
-def get_command_prefix(line):
-    return line.split(f"]: ")[0].lstrip("[")
-
-
-def is_no_command_prefix(line):
-    return "No Command" in get_command_prefix(line)
-
-
-def is_new_path_ref(trace_item):
-    return "PathRef" in trace_item
-
-def is_pipe_ref(trace_item):
-    return "PipeRef" in trace_item
-
-
-def get_path_ref_id(trace_item):
-    return trace_item.split("=")[0].strip()
-
-
-def get_path_ref_open_config(trace_item):
-    assert (is_new_path_ref(trace_item))
-    # WARNING: HACK
-    open_config_suffix = trace_item.split(", ")[2]
-
-    open_config = re.split('\(|\)', open_config_suffix)[0]
-    # WARNING: HACK: hard-coded replacement
-    open_config = open_config.replace("truncate create", "").rstrip()
-    return open_config
-
-
-def get_path_ref_no_follow(trace_item):
-    return "nofollow" in trace_item
-
-
-def is_path_ref_read(trace_item: PathRef):
-    return trace_item.is_read
-
-
-def is_path_ref_write(trace_item: PathRef):
-    return trace_item.is_write
-
-
-def is_path_ref_execute(trace_item: PathRef):
-    return trace_item.is_write
-
-
-def is_path_ref_empty(trace_item: PathRef):
-    return not trace_item.is_read and not trace_item.is_write and not trace_item.is_exec
-
-
-def get_path_ref_name(trace_item):
-    assert (is_new_path_ref(trace_item))
-    return trace_item.split(", ")[1].replace('"', '')
-
-
-def get_path_ref_ref(trace_item):
-    assert (is_new_path_ref(trace_item))
-    return trace_item.split(", ")[0].split("(")[1]
-
-
-def is_no_command_prefix(line):
-    if line.startswith(f"[No Command"):
-        return True
-    return False
-
-
-def is_launch(line):
-    return "Launch(" in line
-
-
-def parse_launch_command(trace_item):
-    assignment_prefix = trace_item.split("], ")[0].split(
-        "([Command ")[1].rstrip("]").strip()
-    assignment_suffix = ", ".join(trace_item.split("], ")[1:]).strip()
-    assignment_string = assignment_suffix[1:-2].split(",")
-    assignments = [(x.split("=")) for x in assignment_string]
-    return assignment_prefix, assignments
-
-
-def get_lauch_name(trace_item):
-    assert (is_launch(trace_item))
-    launch_name_dirty = trace_item.split("],")[0]
-    launch_name = launch_name_dirty.split("Command ")[1]
-    return launch_name
-
-
-def get_no_command_ref_id(trace_item):
-    return trace_item.split("=")[0].strip()
-
-
-def get_no_command_ref_ref(trace_item):
-    return trace_item.split("=")[1].strip()
-
-
-def is_prefix_of_cmd(line, prefix):
-    if prefix is not None and prefix in get_command_prefix(line):
-        return True
-    return False
-
-
-def is_expect_result(trace_item):
-    return "ExpectResult(" in trace_item
-
-
-def parse_expect_result(trace_item):
-    return trace_item.lstrip("ExpectResult(").split(")")[0].split(", ")
-
-def parse_pipe_ref(trace_item):
-    return trace_item.split("] = ")[0].lstrip("[").split(", ")
-
-def parse_launch(refs_dict, keys_order, env, line) -> None:
-    assignment_prefix, assignments = parse_launch_command(
-        remove_command_prefix(line))
-    for assignment in assignments:
-        lhs_ref = PathRefKey(assignment_prefix, assignment[0].strip())
-        rhs_ref = PathRefKey(env, assignment[1].strip())
-        refs_dict[lhs_ref] = refs_dict[rhs_ref]
-        keys_order.append(lhs_ref)
-
-def add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, ref):
-    refs_dict[lhs_ref] = ref
-    keys_order.append(lhs_ref)
-
-
-def parse_final_refs(refs_dict, keys_order, env, line) -> None:
-    line = remove_command_prefix(line)
-    path_ref_id = get_no_command_ref_id(line).strip()
-    lhs_ref = PathRefKey(env, path_ref_id)
-    rhs_ref = get_no_command_ref_ref(line)
-    if rhs_ref == "CWD":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.CWD)
-    elif rhs_ref == "ROOT":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.ROOT)
-    elif rhs_ref == "STDERR":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.STDERR)
-    elif rhs_ref == "STDIN":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.STDIN)
-    elif rhs_ref == "STDOUT":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.STDOUT)
-    elif rhs_ref == "LAUNCH_EXE":
-        add_ref_to_refs_dict(refs_dict, keys_order, lhs_ref, Ref.LAUNCH_EXE)
-
-
-def parse_new_path_ref(refs_dict, keys_order, env, line):
-    line = remove_command_prefix(line).strip()
-    lhs_ref = PathRefKey(env, get_path_ref_id(line).strip())
-    ref = get_path_ref_ref(line)
-    name = get_path_ref_name(line)
-    open_config = get_path_ref_open_config(line)
-    no_follow = get_path_ref_no_follow(line)
-    path_ref = PathRef(ref, name, open_config, no_follow, env)
-    refs_dict[lhs_ref] = path_ref
-    keys_order.append(lhs_ref)
-
-def parse_expect_result_item(expect_result_dict, env, line):
-    line = remove_command_prefix(line).strip()
-    path_ref_id, result = parse_expect_result(line)
-    lhs_ref = PathRefKey(env, path_ref_id)
-    expect_result_dict[lhs_ref] = ExpectResult(lhs_ref, result)

-def parse_pipe_ref_item(refs_dict, keys_order, env, line):
-    line = remove_command_prefix(line).strip()
-    # lhs_ref, rhs_ref = parse_pipe_ref(line)
-    # Warning HACK: This is a hack to get the correct lhs_ref
-    # we are probably ok with this because it.
-    rhs_ref, lhs_ref = parse_pipe_ref(line)
-    lhs_key = PathRefKey(env, lhs_ref)
-    lhs_key_rev = PathRefKey(env, rhs_ref)
-    pipe_ref = PipeRef(rhs_ref, env)
-    pipe_ref_rev = PipeRef(lhs_ref, env)
-    refs_dict[lhs_key] = pipe_ref.ref
-    refs_dict[lhs_key_rev] = pipe_ref_rev.ref
-    keys_order.append(lhs_key)

-def parse_rw_sets(trace_object) -> None:
-    # logging.trace("".join(trace_object))
-    refs_dict = {}
-    expect_result_dict = {}
-    keys_order = []
-    # In the first iteration, we get the refs
-    for line in trace_object:
-        # This branch will always execute first
-        env = get_command_prefix(line).lstrip("Command").strip()
-        if is_no_command_prefix(line):
-            if is_launch(line):
-                parse_launch(refs_dict, keys_order, env, line)
-            elif " = " in line:
-                parse_final_refs(refs_dict, keys_order, env, line)
-        # Parses Launch(...)
-        elif is_launch(line):
-            parse_launch(refs_dict, keys_order, env, line)
-        # Parses PathRef(...)
-        elif is_new_path_ref(line):
-            parse_new_path_ref(refs_dict, keys_order, env, line)
-        # Parses PipeRef
-        elif is_pipe_ref(line):
-            parse_pipe_ref_item(refs_dict, keys_order, env, line)
-        # Parses ExpectResult(...)
-        elif is_expect_result(line):
-            parse_expect_result_item(expect_result_dict, env, line)
-    return refs_dict, expect_result_dict, keys_order
-
-
-def traverse_path_ref(refs_dict: dict, ref: PathRef):
-    if isinstance(ref, PathRef) and not ref.is_nofollow and isinstance(refs_dict[ref.ref], PathRef):
-        return traverse_path_ref(refs_dict, refs_dict[ref.ref])
-    else:
-        return ref.ref
-
-
-def resolve_rw_set_refs(refs_dict):
-    for ref_item, ref in refs_dict.items():
-        if isinstance(ref, PathRef):
-            refs_dict[ref_item].ref = traverse_path_ref(refs_dict, ref)
-    return refs_dict
-
-
-def replace_path_ref_terminal_nodes(refs_dict: dict):
-    refs_dict_new = {}
-    for i, ref in refs_dict.items():
-        if isinstance(ref, PathRef):
-
-            # HACK: This is hard-coded stdout
-            if ref.path == "" and ref.is_nofollow:
-                continue
-            else:
-                # If ref of ref is string, it means that we reached a terminal node.
-                if isinstance(ref.ref, str):
-                    pass
-                elif ref.ref not in refs_dict:
-                    key = PathRefKey("No Command", "r1")
-                    ref.ref = refs_dict[key].value
-                else:
-
-                    if isinstance(refs_dict[ref.ref], Ref):
-                        ref.ref = deepcopy(str(refs_dict[ref.ref].value))
-                    else:
-                        ref.ref = deepcopy(refs_dict[ref.ref])
-            assert(i not in refs_dict_new)
-            refs_dict_new[i] = deepcopy(ref)
-    return refs_dict_new
-
-
-def resolve_dir_rw_paths(read_set, write_set, dir_set):
-    prefix = os.path.commonprefix(dir_set)
-    suffixes = [dir.replace(prefix, "") for dir in dir_set]
-    # Warning: HACK
-    dir_string = prefix
-    for dir in suffixes:
-        dir_string = os.path.join(dir_string, dir)
-    to_add = os.path.join(prefix, dir_string)
-    if to_add.endswith("/"):
-        write_set.append(to_add)
-    else:
-        write_set.append(to_add + "/")
-
-
-def resolve_dir_accesses_from_parsed_items(resolved_dict_replaced, expect_result_dict,
-                                           key, previous_key, resolved_trace_object,
-                                           write_set, dir_set):
-    if previous_key in resolved_dict_replaced:
-        previous_resolved_trace_object = resolved_dict_replaced[previous_key]
-        relevant_current_expect_result = expect_result_dict.get(previous_key)
-        relevant_previous_expect_result = expect_result_dict.get(key)
-        if relevant_current_expect_result is not None and relevant_previous_expect_result is not None:
-            if isinstance(previous_resolved_trace_object, PathRef) and \
-               is_path_ref_write(previous_resolved_trace_object) and \
-               relevant_current_expect_result.result == "SUCCESS" and \
-               relevant_previous_expect_result.result == "SUCCESS":
-                dir_set.append(resolved_trace_object.get_resolved_path())
-                write_set.pop()
-
-
-def resolve_rw_sets_from_parsed_items(resolved_dict_replaced, expect_result_dict, keys_order):
-    read_set = set()
-    write_set = []
-    dir_set = []
-    for i, key in enumerate(keys_order):
-        if key not in resolved_dict_replaced:
-            continue
-        resolved_trace_object = resolved_dict_replaced[key]
-        if isinstance(resolved_trace_object, Ref):
-            continue
-        # WARNING: HACK: We need to make sure this condition does not lead to missed dependencies
-        # KK 2023-05-03: I don't see where this is useful
-        # if resolved_trace_object.get_resolved_path().startswith(os.path.abspath('/tmp/pash_spec')) or
-        #     continue
-
-        ## Each separate node has a different /dev/tty (even though they all seem to write to it)
-        ## so we never want to keep /dev/tty in the read-write sets of any node.
-        ## We take care of writes to stdout and stderr elsewhere in the code.
-        ## TODO: Generalize this to other special files too (make a global list of such files)
-        ## TODO: We actually want to add these to the read-write sets, but then don't take them
-        ##       into account when doing the resolution. Trace should not have any scheduling
-        ##       logic, it should just parse the trace.
-        if resolved_trace_object.get_resolved_path().startswith(os.path.abspath('/dev/tty')):
-            continue
-        if is_path_ref_read(resolved_trace_object):
-            read_set.add(resolved_trace_object.get_resolved_path())
-        if is_path_ref_write(resolved_trace_object):
-            write_set.append(resolved_trace_object.get_resolved_path())
-        # This is a sign that a directory declaration might exist
-        if is_path_ref_empty(resolved_trace_object) and i > 0:
-            pass
-    resolve_dir_rw_paths(read_set, write_set, dir_set)
-    return read_set, write_set
-
-# Parse the trace object and gather rw sets for this command
-# TODO: PathRefs now also contain environments.
-#       Figure out a way to resolve ref_id+env key combinations.
-
-
-def parse_and_gather_cmd_rw_sets(trace_object) -> Tuple[set, set]:
-    refs_dict, expect_result_dict, keys_order = parse_rw_sets(trace_object)
-    resolved_dict = resolve_rw_set_refs(refs_dict)
-    resolved_dict_replaced = replace_path_ref_terminal_nodes(resolved_dict)
-    read_set, write_set = resolve_rw_sets_from_parsed_items(
-        resolved_dict_replaced, expect_result_dict, keys_order)
-    return read_set, set(write_set)
-
-
-def parse_exit_code(trace_object) -> int:
-    for line in reversed(trace_object):
-        if "Exit(" in line:
-            return int(line.split("Exit(")[1].rstrip(")\n"))
-
-# Trace can be called as a script with the trace file to analyze as an argument
-def main():
-    logging.basicConfig(level=logging.DEBUG)
-    trace_file = sys.argv[1]
-    with open(trace_file, "r") as f:
-        trace_object = f.readlines()
-    read_set, write_set = parse_and_gather_cmd_rw_sets(trace_object)
-    print("Read set:")
-    for r in read_set:
-        print(r)
-    print("Write set:")
-    for w in write_set:
-        print(w)
-    print("Exit code:")
-    print(parse_exit_code(trace_object))
-
-if __name__ == "__main__":
-    main()
\ No newline at end of file
diff --git a/preprocessor/transformation.py b/preprocessor/transformation.py
index a3657ed5..02d1a4b3 100644
--- a/preprocessor/transformation.py
+++ b/preprocessor/transformation.py
@@ -27,7 +27,7 @@ class EdgeReason(Enum):
-    """CFG edge types — names must match CFGEdgeType in parallel-orch/node.py"""
+    """CFG edge types — names must match CFGEdgeType in scheduler/node.py"""
     IF_TAKEN = auto()
     ELSE_TAKEN = auto()
     LOOP_TAKEN = auto()
diff --git a/report/benchmarks/dgsh/inner/17-10000x/run_trace_parse b/report/benchmarks/dgsh/inner/17-10000x/run_trace_parse
index 62515c27..89b66474 100755
--- a/report/benchmarks/dgsh/inner/17-10000x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-10000x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-1000x/run_trace_parse b/report/benchmarks/dgsh/inner/17-1000x/run_trace_parse
index 100a23a1..cd8c9453 100755
--- a/report/benchmarks/dgsh/inner/17-1000x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-1000x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-100M/run_trace_parse b/report/benchmarks/dgsh/inner/17-100M/run_trace_parse
index e4220443..f76e6151 100755
--- a/report/benchmarks/dgsh/inner/17-100M/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-100M/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-100x/run_trace_parse b/report/benchmarks/dgsh/inner/17-100x/run_trace_parse
index 502c9892..e2d8f9f5 100755
--- a/report/benchmarks/dgsh/inner/17-100x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-100x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-10G/run_trace_parse b/report/benchmarks/dgsh/inner/17-10G/run_trace_parse
index 1fa97a5b..c2d9ab22 100755
--- a/report/benchmarks/dgsh/inner/17-10G/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-10G/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-10G
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-10x/run_trace_parse b/report/benchmarks/dgsh/inner/17-10x/run_trace_parse
index b3693ee7..868646ee 100755
--- a/report/benchmarks/dgsh/inner/17-10x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-10x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-10x
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17-1G/run_trace_parse b/report/benchmarks/dgsh/inner/17-1G/run_trace_parse
index 3225d76f..f69897cf 100755
--- a/report/benchmarks/dgsh/inner/17-1G/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17-1G/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-1G"
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/17/run_trace_parse b/report/benchmarks/dgsh/inner/17/run_trace_parse
index f7d765d8..979dc933 100755
--- a/report/benchmarks/dgsh/inner/17/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/17/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}")
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/2/run_trace_parse b/report/benchmarks/dgsh/inner/2/run_trace_parse
index 05d8efa9..6fd1c6b5 100755
--- a/report/benchmarks/dgsh/inner/2/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/2/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}")
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/4/run_trace_parse b/report/benchmarks/dgsh/inner/4/run_trace_parse
index 0a69e045..b88f3f84 100755
--- a/report/benchmarks/dgsh/inner/4/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/4/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}")
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/5-1G/run_trace_parse b/report/benchmarks/dgsh/inner/5-1G/run_trace_parse
index 7cb5fe85..188b735a 100755
--- a/report/benchmarks/dgsh/inner/5-1G/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/5-1G/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-1G"
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/7-1000x/run_trace_parse b/report/benchmarks/dgsh/inner/7-1000x/run_trace_parse
index bd07205e..d8ab768b 100755
--- a/report/benchmarks/dgsh/inner/7-1000x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/7-1000x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/7-100x/run_trace_parse b/report/benchmarks/dgsh/inner/7-100x/run_trace_parse
index cd897178..0ef7e434 100755
--- a/report/benchmarks/dgsh/inner/7-100x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/7-100x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/7-10x/run_trace_parse b/report/benchmarks/dgsh/inner/7-10x/run_trace_parse
index dda515ea..8c5560f7 100755
--- a/report/benchmarks/dgsh/inner/7-10x/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/7-10x/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-10x
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/7/run_trace_parse b/report/benchmarks/dgsh/inner/7/run_trace_parse
index acf9ee4f..adb6b498 100755
--- a/report/benchmarks/dgsh/inner/7/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/7/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}")
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/dgsh/inner/8-100M/run_trace_parse b/report/benchmarks/dgsh/inner/8-100M/run_trace_parse
index 5df4a4ee..2d734857 100755
--- a/report/benchmarks/dgsh/inner/8-100M/run_trace_parse
+++ b/report/benchmarks/dgsh/inner/8-100M/run_trace_parse
@@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/dgsh", f"{BENCHMARK_NO}-100
 TEST_UPPER = os.path.join(HS_BASE, "report")
 RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/dgsh")
 STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c"
-PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py")
+PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py")
 PARSE_CMD = f"python3 {PARSE}"
 
 # Script to run
diff --git a/report/benchmarks/micro/inner/100echos/run_trace_parse b/report/benchmarks/micro/inner/100echos/run_trace_parse
index eb0cd0ed..106caa4c 100755
--- a/report/benchmarks/micro/inner/100echos/run_trace_parse
+++ b/report/benchmarks/micro/inner/100echos/run_trace_parse
@@ -5,7 +5,7 @@ benchmark_dir="$hs_base/report/benchmarks/micro/scripts"
 run_script=$benchmark_dir/100echos.strace.parse.sh
 output_file="$hs_base/report/output/micro/100echos/strace_parse_time"
 
-export PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+export PARSE="python3 $hs_base/scheduler/trace_v2.py"
 
 # Ensure the output directory exists
 mkdir -p "$(dirname "$output_file")"
diff --git a/report/benchmarks/micro/inner/100echos2/run_trace_parse b/report/benchmarks/micro/inner/100echos2/run_trace_parse
index 93953205..82b911ea 100755
--- a/report/benchmarks/micro/inner/100echos2/run_trace_parse
+++ b/report/benchmarks/micro/inner/100echos2/run_trace_parse
@@ -5,7 +5,7 @@ benchmark_dir="$hs_base/report/benchmarks/micro/scripts"
 run_script=$benchmark_dir/100echos2.strace.parse.sh
 output_file="$hs_base/report/output/micro/100echos2/strace_parse_time"
 
-export PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+export PARSE="python3 $hs_base/scheduler/trace_v2.py"
 
 # Ensure the output directory exists
 mkdir -p "$(dirname "$output_file")"
diff --git a/report/benchmarks/micro/inner/giant_file/run_trace_parse b/report/benchmarks/micro/inner/giant_file/run_trace_parse
index 4eb8e414..f32418fb 100755
--- a/report/benchmarks/micro/inner/giant_file/run_trace_parse
+++ b/report/benchmarks/micro/inner/giant_file/run_trace_parse
@@ -5,7 +5,7 @@ benchmark_dir="$hs_base/report/benchmarks/micro/scripts"
 run_script=$benchmark_dir/giant_file.strace.parse.sh
 output_file="$hs_base/report/output/micro/giant_file/strace_parse_time"
 
-export PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+export PARSE="python3 $hs_base/scheduler/trace_v2.py"
 
 # Ensure the output directory exists
 mkdir -p "$(dirname "$output_file")"
diff --git a/report/benchmarks/micro/inner/giant_file2/run_trace_parse b/report/benchmarks/micro/inner/giant_file2/run_trace_parse
index 2d59e394..eb0641c7 100755
--- a/report/benchmarks/micro/inner/giant_file2/run_trace_parse
+++ b/report/benchmarks/micro/inner/giant_file2/run_trace_parse
@@ -5,7 +5,7 @@ benchmark_dir="$hs_base/report/benchmarks/micro/scripts"
 run_script=$benchmark_dir/giant_file2.strace.parse.sh
 output_file="$hs_base/report/output/micro/giant_file2/strace_parse_time"
 
-export PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+export PARSE="python3 $hs_base/scheduler/trace_v2.py"
 
 # Ensure the output directory exists
 mkdir -p "$(dirname "$output_file")"
diff --git a/report/benchmarks/micro/inner/multi_files/run_trace_parse b/report/benchmarks/micro/inner/multi_files/run_trace_parse
index 33e7893c..e906548d 100755
--- a/report/benchmarks/micro/inner/multi_files/run_trace_parse
+++ b/report/benchmarks/micro/inner/multi_files/run_trace_parse
@@ -5,7 +5,7 @@ benchmark_dir="$hs_base/report/benchmarks/micro/scripts"
 run_script=$benchmark_dir/multi_files.strace.parse.sh
 output_file="$hs_base/report/output/micro/multi_files/strace_parse_time"
 
-export PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+export PARSE="python3 $hs_base/scheduler/trace_v2.py"
 
 # Ensure the output directory exists
 mkdir -p "$(dirname "$output_file")"
diff --git a/report/benchmarks/micro/inner/run_trace_only b/report/benchmarks/micro/inner/run_trace_only
index 15bf7be3..a49ac86c 100755
--- a/report/benchmarks/micro/inner/run_trace_only
+++ b/report/benchmarks/micro/inner/run_trace_only
@@ -91,7 +91,7 @@ if __name__ == '__main__':
     output_base = hs_base / "report" / "output" / local_name
     output_base.mkdir(parents=True, exist_ok=True)
     scripts_dir = test_base / "scripts"
-    tracev2_base = hs_base / "parallel-orch" / "trace_v2.py"
+    tracev2_base = hs_base / "scheduler" / "trace_v2.py"
 
     #######################
     # SPECIFY ENV VARS HERE
diff --git a/report/benchmarks/micro/inner/scripts/100echos.strace.parse.sh b/report/benchmarks/micro/inner/scripts/100echos.strace.parse.sh
index 90c35841..5b43c811 100755
--- a/report/benchmarks/micro/inner/scripts/100echos.strace.parse.sh
+++ b/report/benchmarks/micro/inner/scripts/100echos.strace.parse.sh
@@ -14,7 +14,7 @@ generate_unique_file() {
     echo "$filename"
 }
 hs_base=$(git rev-parse --show-toplevel)
-PARSE="python3 $hs_base/parallel-orch/trace_v2.py"
+PARSE="python3 $hs_base/scheduler/trace_v2.py"
 logfile=$(generate_unique_file)
 
 strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $logfile env -i echo "Hello World!"
diff --git a/report/benchmarks/micro/inner/scripts/100echos.strace.sh b/report/benchmarks/micro/inner/scripts/100echos.strace.sh index f878b177..7bc6a89e 100755 --- a/report/benchmarks/micro/inner/scripts/100echos.strace.sh +++ b/report/benchmarks/micro/inner/scripts/100echos.strace.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" logfile=$(generate_unique_file) diff --git a/report/benchmarks/micro/inner/scripts/100echos2.strace.parse.sh b/report/benchmarks/micro/inner/scripts/100echos2.strace.parse.sh index c5761394..8175270a 100755 --- a/report/benchmarks/micro/inner/scripts/100echos2.strace.parse.sh +++ b/report/benchmarks/micro/inner/scripts/100echos2.strace.parse.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" logfile=$(generate_unique_file) diff --git a/report/benchmarks/micro/inner/scripts/100echos2.strace.sh b/report/benchmarks/micro/inner/scripts/100echos2.strace.sh index d5051d80..437d1d26 100755 --- a/report/benchmarks/micro/inner/scripts/100echos2.strace.sh +++ b/report/benchmarks/micro/inner/scripts/100echos2.strace.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" logfile=$(generate_unique_file) diff --git a/report/benchmarks/micro/inner/scripts/100echos_try.sh b/report/benchmarks/micro/inner/scripts/100echos_try.sh index a12f2c55..37114d53 100755 --- a/report/benchmarks/micro/inner/scripts/100echos_try.sh +++ b/report/benchmarks/micro/inner/scripts/100echos_try.sh @@ -1,6 +1,6 @@ tempfile=$(mktemp) command="strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $tempfile" 
-tracev2="python3 $hs_base/parallel-orch/tracev2" +tracev2="python3 $hs_base/scheduler/tracev2" $try echo "Hello World!" $try echo "Hello World!" diff --git a/report/benchmarks/micro/inner/scripts/giant_file.strace.parse.sh b/report/benchmarks/micro/inner/scripts/giant_file.strace.parse.sh index c6b3ff30..a4928f29 100755 --- a/report/benchmarks/micro/inner/scripts/giant_file.strace.parse.sh +++ b/report/benchmarks/micro/inner/scripts/giant_file.strace.parse.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" OUTPUT=${OUTPUT:-.} diff --git a/report/benchmarks/micro/inner/scripts/giant_file.strace.sh b/report/benchmarks/micro/inner/scripts/giant_file.strace.sh index f1177987..1923f880 100755 --- a/report/benchmarks/micro/inner/scripts/giant_file.strace.sh +++ b/report/benchmarks/micro/inner/scripts/giant_file.strace.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" OUTPUT=${OUTPUT:-.} diff --git a/report/benchmarks/micro/inner/scripts/giant_file2.strace.parse.sh b/report/benchmarks/micro/inner/scripts/giant_file2.strace.parse.sh index af0249b6..7493d5d3 100755 --- a/report/benchmarks/micro/inner/scripts/giant_file2.strace.parse.sh +++ b/report/benchmarks/micro/inner/scripts/giant_file2.strace.parse.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" OUTPUT=${OUTPUT:-.} diff --git a/report/benchmarks/micro/inner/scripts/giant_file2.strace.sh b/report/benchmarks/micro/inner/scripts/giant_file2.strace.sh index c65a314c..1accff49 100755 --- a/report/benchmarks/micro/inner/scripts/giant_file2.strace.sh +++ 
b/report/benchmarks/micro/inner/scripts/giant_file2.strace.sh @@ -14,7 +14,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" OUTPUT=${OUTPUT:-.} diff --git a/report/benchmarks/micro/inner/scripts/multi_files.strace.parse.sh b/report/benchmarks/micro/inner/scripts/multi_files.strace.parse.sh index 7be683b3..d1f09da3 100755 --- a/report/benchmarks/micro/inner/scripts/multi_files.strace.parse.sh +++ b/report/benchmarks/micro/inner/scripts/multi_files.strace.parse.sh @@ -17,7 +17,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" logfile=$(generate_unique_file) strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $logfile env -i python3 "$SCRIPTS"/multi_files.py "$OUTPUT"/foo diff --git a/report/benchmarks/micro/inner/scripts/multi_files.strace.sh b/report/benchmarks/micro/inner/scripts/multi_files.strace.sh index 8e1ae9e2..f25aa00a 100755 --- a/report/benchmarks/micro/inner/scripts/multi_files.strace.sh +++ b/report/benchmarks/micro/inner/scripts/multi_files.strace.sh @@ -17,7 +17,7 @@ generate_unique_file() { echo "$filename" } hs_base=$(git rev-parse --show-toplevel) -PARSE="python3 $hs_base/parallel-orch/trace_v2.py" +PARSE="python3 $hs_base/scheduler/trace_v2.py" logfile=$(generate_unique_file) diff --git a/report/benchmarks/nlp10m/inner/6_1/run_trace_parse b/report/benchmarks/nlp10m/inner/6_1/run_trace_parse index f7ba231e..8a220e2e 100755 --- a/report/benchmarks/nlp10m/inner/6_1/run_trace_parse +++ b/report/benchmarks/nlp10m/inner/6_1/run_trace_parse @@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/nlp10m", f"{BENCHMARK_NO}") TEST_UPPER = os.path.join(HS_BASE, "report") RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/nlp10m") STRACE = "strace -y -f 
--seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c" -PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py") +PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py") PARSE_CMD = f"python3 {PARSE}" # Script to run diff --git a/report/benchmarks/nlp10m/inner/6_7/run_trace_parse b/report/benchmarks/nlp10m/inner/6_7/run_trace_parse index 8015b65e..bf41a876 100755 --- a/report/benchmarks/nlp10m/inner/6_7/run_trace_parse +++ b/report/benchmarks/nlp10m/inner/6_7/run_trace_parse @@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/nlp10m", f"{BENCHMARK_NO}") TEST_UPPER = os.path.join(HS_BASE, "report") RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/nlp10m") STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c" -PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py") +PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py") PARSE_CMD = f"python3 {PARSE}" # Script to run diff --git a/report/benchmarks/nlp10m/inner/8.3_3/run_trace_parse b/report/benchmarks/nlp10m/inner/8.3_3/run_trace_parse index c198e8bc..f7997666 100755 --- a/report/benchmarks/nlp10m/inner/8.3_3/run_trace_parse +++ b/report/benchmarks/nlp10m/inner/8.3_3/run_trace_parse @@ -13,7 +13,7 @@ TEST_BASE = os.path.join(HS_BASE, "report/benchmarks/nlp10m", f"{BENCHMARK_NO}") TEST_UPPER = os.path.join(HS_BASE, "report") RESOURCE_DIR = os.path.join(HS_BASE, "report/resources/nlp10m") STRACE = "strace -y -f --seccomp-bpf --trace=fork,clone,%file -o $(mktemp) env -i bash -c" -PARSE = os.path.join(HS_BASE, "parallel-orch/trace_v2.py") +PARSE = os.path.join(HS_BASE, "scheduler/trace_v2.py") PARSE_CMD = f"python3 {PARSE}" # Script to run diff --git a/report/benchmarks/teraseq/inner/Dockerfile.hs b/report/benchmarks/teraseq/inner/Dockerfile.hs index fc47857d..eb1a3c91 100644 --- a/report/benchmarks/teraseq/inner/Dockerfile.hs +++ b/report/benchmarks/teraseq/inner/Dockerfile.hs @@ -13,7 +13,8 @@ RUN apt install -y bc curl graphviz bsdmainutils 
libffi-dev locales locales-all # try deps RUN apt install -y expect mergerfs attr COPY deps deps -COPY parallel-orch parallel-orch +COPY scheduler scheduler +COPY executor executor COPY .git .git RUN mkdir -p /srv/hs/report/benchmarks/teraseq COPY report/benchmarks/teraseq/inner /srv/hs/report/benchmarks/teraseq diff --git a/requirements.txt b/requirements.txt index 750ff598..b4647ff7 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,12 +1,11 @@ -# Python dependencies for pash-spec preprocessor -# These packages provide shell script parsing and AST transformation capabilities +# Python dependencies for pash-spec preprocessor and scheduler # Core AST and parsing libraries shasta~=0.5 libdash libbash~=0.1.14 -# Additional dependencies from original PaSh pyproject.toml -pash-annotations~=0.2.4 -graphviz +# Shell variable expansion (used by scheduler's static analysis) sh-expand~=0.2.0 + +psutil diff --git a/parallel-orch/analysis.py b/scheduler/analysis.py similarity index 100% rename from parallel-orch/analysis.py rename to scheduler/analysis.py diff --git a/parallel-orch/config.py b/scheduler/config.py similarity index 100% rename from parallel-orch/config.py rename to scheduler/config.py diff --git a/parallel-orch/node.py b/scheduler/node.py similarity index 100% rename from parallel-orch/node.py rename to scheduler/node.py diff --git a/parallel-orch/partial_program_order.py b/scheduler/partial_program_order.py similarity index 100% rename from parallel-orch/partial_program_order.py rename to scheduler/partial_program_order.py diff --git a/parallel-orch/scheduler_server.py b/scheduler/scheduler_server.py similarity index 98% rename from parallel-orch/scheduler_server.py rename to scheduler/scheduler_server.py index 6ad228af..12a49411 100644 --- a/parallel-orch/scheduler_server.py +++ b/scheduler/scheduler_server.py @@ -1,9 +1,14 @@ import argparse import logging import signal +import sys +import os + +# Add executor directory to sys.path so scheduler 
modules can import executor +sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'executor')) + import util import config -import os from partial_program_order import PartialProgramOrder, NodeId from node import LoopStack, ConcreteNodeId diff --git a/parallel-orch/trace_v2.py b/scheduler/trace_v2.py similarity index 100% rename from parallel-orch/trace_v2.py rename to scheduler/trace_v2.py diff --git a/parallel-orch/util.py b/scheduler/util.py similarity index 100% rename from parallel-orch/util.py rename to scheduler/util.py diff --git a/scripts/install_deps_ubuntu20.sh b/scripts/install_deps_ubuntu20.sh index b048779f..86dd8e76 100755 --- a/scripts/install_deps_ubuntu20.sh +++ b/scripts/install_deps_ubuntu20.sh @@ -9,6 +9,9 @@ export PASH_SPEC_TOP=${PASH_SPEC_TOP:-$(git rev-parse --show-toplevel --show-sup ## Download submodule dependencies (try only - deps/pash removed) git submodule update --init --recursive deps/try +## Build fd_util and set-diff for speculative execution +(cd executor; make) + ## Install Python dependencies for preprocessor # Find Python 3.12+ PASH_PYTHON="" @@ -49,10 +52,6 @@ echo "Upgrading pip..." echo "Installing Python dependencies for preprocessor..." "$PASH_VENV/bin/pip" install -r "$PASH_SPEC_TOP/requirements.txt" -# Install psutil for parallel-orch scheduler -echo "Installing psutil..." -"$PASH_VENV/bin/pip" install psutil - # Verify installation echo "Verifying Python dependencies..." "$PASH_VENV/bin/python" -c "import shasta; import libdash; import libbash" || { @@ -62,5 +61,4 @@ echo "Verifying Python dependencies..." echo "✓ Python dependencies installed successfully" -## Build fd_util for speculative execution -(cd parallel-orch; make) +
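As a side note on the `sys.path` manipulation this patch adds to `scheduler/scheduler_server.py`: the idea is to make modules in the sibling `executor/` directory importable by bare name from scheduler code. A minimal standalone sketch of that pattern (directory names are illustrative, not part of the patch):

```python
import os
import sys

# Resolve the sibling "executor" directory relative to this file and prepend
# it to sys.path, so modules living there can be imported directly by name
# (e.g. `import util`) from scheduler code.
executor_dir = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), '..', 'executor')
sys.path.insert(0, executor_dir)
```

Prepending (rather than appending) makes the executor directory win name collisions with other entries on the path; turning the two directories into proper packages would avoid the path hack, but the insert keeps the rename patch small.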