perf(startup): add report-only startup measurement#149
Conversation
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
📝 WalkthroughWalkthroughAdds a startup-time measurement harness (Rust CLI), documentation and planning files, and a report-only CI job that builds Forge artifacts and runs the harness to measure process spawn-to-exit startup latency across four execution modes. ChangesStartup Time Measurement Harness
Sequence DiagramsequenceDiagram
autonumber
participant CI as CI Job / Dev Shell
participant Harness as startup_time (parent)
participant FS as Filesystem
participant ForgeBuilder as cargo/rustc
participant Forge as Forge binary (child)
CI->>Harness: run ./target/startup_time --forge <path> --warmups 2 --reps 20
Harness->>FS: create temp workdir + write fixture.fg
Harness->>ForgeBuilder: run `forge build` (bytecode/native/AOT) with FORGE_LIB_DIR as needed
ForgeBuilder->>FS: write built artifacts (bytecode, native, AOT)
loop for each mode
Harness->>Forge: spawn child process (mode-specific command)
Forge-->>Harness: child exits with stdout "ok"
Harness->>Harness: record duration
end
Harness->>CI: print per-mode stats (min/median/p95/max) and exit status
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 6/10 reviews remaining, refill in 23 minutes and 4 seconds. Comment |
Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2d5070860
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| fn checked_command(command: &mut Command, label: &str) -> Result<Output, String> { | ||
| let output = command | ||
| .output() |
There was a problem hiding this comment.
Enforce child-process timeout in benchmark harness
checked_command uses Command::output(), which waits indefinitely for the child to exit. If any measured mode hangs (for example from a startup deadlock/regression), the new startup-benchmark CI job will block until GitHub Actions kills the job at its global timeout instead of failing fast, which defeats the intended report signal and burns CI time. Please run children with an explicit deadline and terminate them on timeout.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 13: Update the CHANGELOG entry for "Startup time measurement harness" by
appending the required PR reference suffix in the format " ([`#PR`](link))" so it
reads "- **Startup time measurement harness** — `tools/startup_time.rs` …
correctness checks. CI runs it as a report-only signal before the `<10ms` native
startup target becomes a hard gate. ([`#PR`](link))"; replace "PR" and "link" with
the actual pull request number and its GitHub URL and ensure the entry remains
under the [Unreleased] section.
In `@tools/startup_time.rs`:
- Around line 219-223: The checked_command function currently uses
Command::output() which can block; replace it with Command::spawn() to get a
Child, then poll the child using Child::try_wait in a loop with a chosen timeout
duration (e.g. const TIMEOUT: Duration = Duration::from_secs(30)); if try_wait
returns Some(status) return the captured output via Child::wait and
Child::stdout/stderr handling, but if the timeout elapses call Child::kill(),
wait for termination with Child::wait(), and return an Err including the label
and that the process timed out; use Child::kill, Child::try_wait, and
Child::wait to implement the logic inside checked_command.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 77834cc7-57f2-40bd-b648-ea090700afee
📒 Files selected for processing (5)
.github/workflows/ci.yml.planning/startup-time-under-10ms.plan.mdCHANGELOG.mddocs/performance/startup.mdtools/startup_time.rs
| ### Added | ||
|
|
||
| - **Standalone source-runtime native binaries for Forge servers** — `forge build --native` now links against `libforge_lang.a` when available and emits a single executable that embeds Forge source and starts interpreter-only runtime features like `@server` without shelling out to the `forge` CLI. `--aot` remains bytecode/VM-only and continues to reject decorator-driven servers with guidance to use `--native`. | ||
| - **Startup time measurement harness** — `tools/startup_time.rs` measures source, bytecode, native source-runtime, and bytecode AOT process startup with correctness checks. CI runs it as a report-only signal before the `<10ms` native startup target becomes a hard gate. |
There was a problem hiding this comment.
Add the PR reference link to this changelog entry.
This entry is missing the required ([#PR](link)) suffix.
Suggested fix
-- **Startup time measurement harness** — `tools/startup_time.rs` measures source, bytecode, native source-runtime, and bytecode AOT process startup with correctness checks. CI runs it as a report-only signal before the `<10ms` native startup target becomes a hard gate.
+- **Startup time measurement harness** — `tools/startup_time.rs` measures source, bytecode, native source-runtime, and bytecode AOT process startup with correctness checks. CI runs it as a report-only signal before the `<10ms` native startup target becomes a hard gate. ([`#149`](https://github.com/humancto/forge-lang/pull/149))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - **Startup time measurement harness** — `tools/startup_time.rs` measures source, bytecode, native source-runtime, and bytecode AOT process startup with correctness checks. CI runs it as a report-only signal before the `<10ms` native startup target becomes a hard gate. | |
| - **Startup time measurement harness** — `tools/startup_time.rs` measures source, bytecode, native source-runtime, and bytecode AOT process startup with correctness checks. CI runs it as a report-only signal before the `<10ms` native startup target becomes a hard gate. ([`#149`](https://github.com/humancto/forge-lang/pull/149)) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` at line 13, Update the CHANGELOG entry for "Startup time
measurement harness" by appending the required PR reference suffix in the format
" ([`#PR`](link))" so it reads "- **Startup time measurement harness** —
`tools/startup_time.rs` … correctness checks. CI runs it as a report-only signal
before the `<10ms` native startup target becomes a hard gate. ([`#PR`](link))";
replace "PR" and "link" with the actual pull request number and its GitHub URL
and ensure the entry remains under the [Unreleased] section.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/startup_time.rs`:
- Line 290: The poll loop currently sleeps with
std::thread::sleep(Duration::from_millis(5)), which is too coarse for a <10ms
benchmark and can bias timings; change the backoff in that wait loop to a much
shorter or adaptive interval (e.g., start with microsecond sleeps and
progressively increase up to at most 1ms) so that the call site using try_wait()
doesn't introduce ~5ms of artificial latency—locate the sleep call
(std::thread::sleep(Duration::from_millis(5))) and replace it with a
short/adaptive backoff strategy (microseconds-first with an upper bound of 1ms
or exponential backoff capped at 1ms).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 4fb60961-6dd1-4e2d-8d85-3518cc6e9327
📒 Files selected for processing (2)
.planning/startup-time-under-10ms.plan.mdtools/startup_time.rs
| CHILD_TIMEOUT.as_secs_f64() | ||
| )); | ||
| } | ||
| std::thread::sleep(Duration::from_millis(5)); |
There was a problem hiding this comment.
Polling granularity is too coarse for a <10ms benchmark.
Line 290 sleeps for 5ms between try_wait() polls, which can add up to ~5ms artificial latency per run and distort results in the exact range this benchmark targets. Use a shorter/adaptive backoff (e.g., microseconds to 1ms max).
Suggested change
- std::thread::sleep(Duration::from_millis(5));
+ // Keep timeout polling overhead below the measured startup budget.
+ std::thread::sleep(Duration::from_micros(200));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/startup_time.rs` at line 290, The poll loop currently sleeps with
std::thread::sleep(Duration::from_millis(5)), which is too coarse for a <10ms
benchmark and can bias timings; change the backoff in that wait loop to a much
shorter or adaptive interval (e.g., start with microsecond sleeps and
progressively increase up to at most 1ms) so that the call site using try_wait()
doesn't introduce ~5ms of artificial latency—locate the sleep call
(std::thread::sleep(Duration::from_millis(5))) and replace it with a
short/adaptive backoff strategy (microseconds-first with an upper bound of 1ms
or exponential backoff capped at 1ms).
Summary
forge, buildslibforge_lang.a, and measures startup without enforcing the<10mstarget yet.Roadmap
ROADMAP.md:Startup time: < 10ms (vs ~100ms for interpreter).Local startup sample
startup.source_run median_ms=4.169 p95_ms=4.667startup.bytecode_run median_ms=3.858 p95_ms=4.088startup.native_source_runtime median_ms=4.193 p95_ms=7.165startup.aot_bytecode median_ms=4.309 p95_ms=4.547Test plan
Made with Cursor
Summary by CodeRabbit
New Features
Documentation
Chores