-
Notifications
You must be signed in to change notification settings - Fork 0
polish: unified improvement pass — ZPE-Ink #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,6 +5,8 @@ on: | |
| pull_request: | ||
| types: [opened] | ||
|
|
||
| permissions: {} | ||
|
|
||
| jobs: | ||
| add-to-project: | ||
| runs-on: ubuntu-latest | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,6 +4,18 @@ | |
|
|
||
| # ZPE-Ink | ||
|
|
||
| 5.590209480060199x structured-tier ratio. | ||
|
|
||
| - `.zpink` packet codec. | ||
| - >5× compression vs raw float32. Structured-tier round-trip verified. Release surface: `FAIL`. | ||
| - Install unit: `code/`. | ||
| - Transfer unit: `.zpink` packets. | ||
| - Bindings: repo-local sources. | ||
| - Authority surface: `proofs/reruns/contradiction_resolution_local/contradiction_resolution_manifest.json`. | ||
| - Python 3.11+. | ||
| - Rust toolchain. | ||
| - `wasm32-unknown-unknown` target. | ||
|
|
||
| SAL v6.0 — free below $100M annual revenue. See [LICENSE](LICENSE). | ||
|
|
||
| --- | ||
|
|
@@ -42,7 +54,10 @@ Prereqs for local verification: Python 3.11+, Rust toolchain, and `wasm32-unknow | |
| <img src=".github/assets/readme/section-bars/what-this-is.svg" alt="WHAT THIS IS" width="100%"> | ||
| </p> | ||
|
|
||
| ZPE-Ink is the staged codec surface for `.zpink` stream encoding and decoding. The installable release unit is the Python package under `code/`. The Rust/WASM/Swift/C# bindings are repo-local source surfaces and are not part of the pip install unit. | ||
| - Deterministic `.zpink` transfer surface. | ||
| - Python package under `code/`. | ||
| - Rust/WASM/Swift/C# repo-local sources. | ||
| - Pip install excludes binding artifacts. | ||
|
|
||
| <p> | ||
| <img src=".github/assets/readme/section-bars/quickstart-and-authority-point.svg" alt="QUICKSTART AND AUTHORITY POINT" width="100%"> | ||
|
|
@@ -77,6 +92,18 @@ python -m zpe_ink verify-roundtrip</code></pre> | |
| <td>Contact</td> | ||
| <td><code>architects@zer0pa.ai</code></td> | ||
| </tr> | ||
| <tr> | ||
| <td>Install unit</td> | ||
| <td><code>code/</code> Python package</td> | ||
| </tr> | ||
| <tr> | ||
| <td>Transfer unit</td> | ||
| <td><code>.zpink</code> packet streams</td> | ||
| </tr> | ||
| <tr> | ||
| <td>Binding packaging</td> | ||
| <td><code>repo-local source surfaces</code></td> | ||
|
Comment on lines
+104
to
+105
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nitpick (typo): Consider using consistent terminology for bindings across the README. Here you use "repo-local source surfaces," but earlier you say "repo-local sources." Choosing one term and using it throughout will make the README clearer. |
||
| </tr> | ||
| <tr> | ||
| <td>Current verdict</td> | ||
| <td><code>INCONCLUSIVE</code> (release surface <code>FAIL</code>)</td> | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,12 +4,13 @@ | |
| import json | ||
| import math | ||
| import random | ||
| import shutil | ||
| import statistics | ||
| import sys | ||
| import tarfile | ||
| import time | ||
| import zipfile | ||
| from pathlib import Path | ||
| from pathlib import Path, PurePosixPath | ||
| from typing import Any | ||
|
|
||
| ROOT = Path(__file__).resolve().parents[1] | ||
|
|
@@ -146,6 +147,54 @@ def _parse_inkml_samples(root: Path, limit: int) -> tuple[list[Path], list[list[ | |
| return files, samples, parse_failures | ||
|
|
||
|
|
||
| def _validated_archive_path(destination_root: Path, archive_name: str) -> Path: | ||
| normalized = archive_name.replace("\\", "/") | ||
| relative_path = PurePosixPath(normalized) | ||
| if relative_path.is_absolute() or any(part == ".." for part in relative_path.parts): | ||
| raise ValueError(f"unsafe archive member path: {archive_name}") | ||
| if not relative_path.parts: | ||
| return destination_root | ||
| target = destination_root.joinpath(*relative_path.parts).resolve(strict=False) | ||
| if target != destination_root and destination_root not in target.parents: | ||
| raise ValueError(f"archive member escapes extraction root: {archive_name}") | ||
| return target | ||
|
|
||
|
|
||
| def _safe_extract_tar(archive_path: Path, destination_root: Path) -> None: | ||
| destination_root = destination_root.resolve() | ||
| with tarfile.open(archive_path, "r:*") as handle: | ||
| for member in handle.getmembers(): | ||
| if member.name in {"", ".", "./"}: | ||
| continue | ||
| target = _validated_archive_path(destination_root, member.name) | ||
| if member.isdir(): | ||
| target.mkdir(parents=True, exist_ok=True) | ||
| continue | ||
| if not member.isfile(): | ||
| raise ValueError(f"unsupported tar member type for safe extraction: {member.name}") | ||
| target.parent.mkdir(parents=True, exist_ok=True) | ||
| extracted = handle.extractfile(member) | ||
| if extracted is None: | ||
| raise ValueError(f"missing tar member payload: {member.name}") | ||
| with extracted, target.open("wb") as output: | ||
| shutil.copyfileobj(extracted, output) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🚨 suggestion (security): Zip extraction could benefit from a size guard to reduce risk from overly large or zip-bomb-style archives. As written, this will extract arbitrarily large members and an unbounded number of entries, which can be abused (e.g., zip bombs) if the input isn’t fully trusted. Consider enforcing:
You can enforce these by wrapping |
||
|
|
||
|
|
||
| def _safe_extract_zip(archive_path: Path, destination_root: Path) -> None: | ||
| destination_root = destination_root.resolve() | ||
| with zipfile.ZipFile(archive_path) as handle: | ||
| for member in handle.infolist(): | ||
| if member.filename in {"", ".", "./"}: | ||
| continue | ||
| target = _validated_archive_path(destination_root, member.filename) | ||
| if member.is_dir(): | ||
| target.mkdir(parents=True, exist_ok=True) | ||
| continue | ||
| target.parent.mkdir(parents=True, exist_ok=True) | ||
| with handle.open(member) as extracted, target.open("wb") as output: | ||
| shutil.copyfileobj(extracted, output) | ||
|
|
||
|
|
||
| def main() -> int: | ||
| parser = argparse.ArgumentParser() | ||
| parser.add_argument("--artifact-root", required=True) | ||
|
|
@@ -189,8 +238,7 @@ def main() -> int: | |
| log_path, | ||
| "net_new_mathwriting_download", | ||
| ) | ||
| with tarfile.open(math_tgz, "r:gz") as handle: | ||
| handle.extractall(math_dir) | ||
| _safe_extract_tar(math_tgz, math_dir) | ||
| append_command_log(log_path, "net_new_mathwriting_extract", f"tar -xzf {math_tgz}", 0, "extracted", "") | ||
|
|
||
| math_extract = math_dir / "mathwriting-2024-excerpt" | ||
|
|
@@ -229,8 +277,7 @@ def main() -> int: | |
| log_path, | ||
| "net_new_crohme_fallback_download", | ||
| ) | ||
| with zipfile.ZipFile(crohme_zip) as handle: | ||
| handle.extractall(crohme_dir) | ||
| _safe_extract_zip(crohme_zip, crohme_dir) | ||
| append_command_log(log_path, "net_new_crohme_extract", f"unzip {crohme_zip}", 0, "extracted", "") | ||
|
|
||
| crohme_extract = crohme_dir / "ICFHR_package" | ||
|
|
@@ -317,8 +364,7 @@ def main() -> int: | |
| log_path, | ||
| "net_new_uji_download", | ||
| ) | ||
| with zipfile.ZipFile(uji_zip) as handle: | ||
| handle.extractall(uji_dir) | ||
| _safe_extract_zip(uji_zip, uji_dir) | ||
| append_command_log(log_path, "net_new_uji_extract", f"unzip {uji_zip}", 0, "extracted", "") | ||
|
|
||
| uji_samples = load_uji_pen_characters(uji_dir, limit=140) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Explicitly disabling all default
GITHUB_TOKENpermissions may break actions that expect at leastcontents: read.This hard-disables the default
GITHUB_TOKENand can break any step that implicitly relies oncontents: reador other scopes (e.g. reading repo metadata, listing PRs, posting comments). If the workflow truly only uses a separate PAT and never the default token, this is fine; otherwise, consider explicitly granting only the needed scopes, for example:Please double-check the job steps to confirm none depend on the default token’s permissions before keeping
permissions: {}.