
Commit 5571bc5

committed
diagrams
1 parent b78ae04 commit 5571bc5

15 files changed

Lines changed: 128 additions & 7 deletions

ARCHITECTURE.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -66,8 +66,8 @@ in `docs/design/limb-design.md`, `docs/design/bigint-design.md`, and `docs/desig
 
 ## Visual guides
 
-- `docs/diagrams/architecture-stack-mermaid.md`: module layering from application to helper utilities.
-- `docs/diagrams/build-flow-mermaid.md`: configure/build/test/bindings workflow so new contributors can run the project end-to-end.
+- `docs/diagrams/core-architecture.mermaid.md`: layered helpers from `limb` → `bigint` → the umbrella helpers and GEMM stack.
+- [`docs/diagrams/docs-sitemap.mermaid.md`](diagrams/docs-sitemap.mermaid.md): site map summarizing the docs portal and related resources.
 
 This guide is intentionally light and developer-facing—if you need a runnable overview, `docs/index.md`
 acts as the higher-level docs portal introduced in the README.
```

BENCHMARKS.md

Lines changed: 4 additions & 0 deletions
```diff
@@ -43,6 +43,10 @@ Each row contains:
 
 Use this CSV to plot accuracy vs. storage or compare latency across the three modes.
 
+## Diagrams
+
+View the [benchmark comparison diagram](docs/diagrams/benchmarks.mermaid.md) for a quick latency/storage summary that highlights the 15–22× wins.
+
 ## Results sharing
 
 When opening a pull request, add the latest benchmark rows (or a summary table) to this file or reference the CSV as part of your performance discussion so reviewers can reproduce the numbers.
```

README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -102,7 +102,7 @@ target_link_libraries(... t81::t81lib)
 
 ## GPU backends
 
-Optional CUDA/ROCm backends can be enabled with `-DUSE_CUDA=ON` / `-DUSE_ROCM=ON` so the Python bindings link against the GPU kernels. `t81lib` exposes a compact `TensorMetadata` ABI that carries device, dtype, shape, and stride info, allowing `where`, `clamp`, `lerp`, and `addcmul` to work directly on NumPy arrays or Torch tensors. See [docs/gpu.md](docs/gpu.md) and [docs/torch.md](docs/torch.md) for build flags, device routing, supported ops, and lifetime details.
+Optional CUDA/ROCm backends can be enabled with `-DUSE_CUDA=ON` / `-DUSE_ROCM=ON` so the Python bindings link against the GPU kernels. `t81lib` exposes a compact `TensorMetadata` ABI that carries device, dtype, shape, and stride info, allowing `where`, `clamp`, `lerp`, and `addcmul` to work directly on NumPy arrays or Torch tensors. See [docs/gpu.md](docs/gpu.md), [docs/torch.md](docs/torch.md), and the [GPU dispatch diagram](docs/diagrams/gpu-dispatch.mermaid.md) for build flags, device routing, supported ops, and lifetime details.
 
 ## CLI helpers
 
```

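The `TensorMetadata` ABI described in the GPU backends hunk is easy to picture as a small record. A minimal Python sketch, where the field names and the contiguity helper are illustrative assumptions rather than the actual C++ ABI:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TensorMetadata:
    """Hypothetical mirror of the compact ABI: device, dtype, shape, strides."""
    device: str                # e.g. "cpu", "cuda:0", "rocm:0"
    dtype: str                 # e.g. "float32", "ternary"
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]   # element strides, not byte strides

    def is_contiguous(self) -> bool:
        # Row-major check: innermost stride is 1, each outer stride equals
        # the product of the sizes of all inner dimensions.
        expected = 1
        for size, stride in zip(reversed(self.shape), reversed(self.strides)):
            if stride != expected:
                return False
            expected *= size
        return True

meta = TensorMetadata("cuda:0", "float32", (4, 3), (3, 1))
```

A consumer would validate such a record before dispatching, which is exactly the role the GPU dispatch diagram assigns to its validate step.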
docs/api-overview.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 # API overview
 
-This page captures the high-level helpers exposed by the umbrella header so you can understand the building blocks without diving into every header.
+This page captures the high-level helpers exposed by the umbrella header so you can understand the building blocks without diving into every header. Review the [core architecture diagram](diagrams/core-architecture.mermaid.md) for an inheritance/data-flow sketch of the same helpers.
 
 ## Core numerics
```

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@

```mermaid
pie title Latency & storage comparison (relative)
    "FP32 latency" : 100
    "PTQ latency" : 45
    "QAT latency" : 38
    "FP32 storage" : 100
    "PTQ storage" : 22
    "QAT storage" : 24
```
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@

```mermaid
flowchart LR
    subgraph Core [t81::core]
        limb["limb (48 trits)"]
        bigint["bigint (limb slices)"]
    end
    subgraph HighLevel [Umbrella helpers]
        Int[t81::Int]
        Float["t81::Float / FloatN"]
        BigInt[t81::BigInt alias]
        Ratio[t81::Ratio]
        Vector[t81::Vector]
    end
    limb --> Int
    limb --> Float
    bigint --> BigInt
    BigInt --> Ratio
    Vector --> Float
    Float --> Ratio
    Vector --> Int
    subgraph Ops [Arithmetic & GEMM]
        GEMM[t81::linalg::gemm_ternary]
        Fixed["t81::Fixed&lt;N&gt;"]
    end
    Int --> Ops
    Float --> Ops
    Vector --> GEMM
    Fixed --> GEMM
    click Float "docs/api-overview.md" "See the helper summary"
```
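The `limb (48 trits)` node above packs balanced-ternary digits. As a rough illustration of the encoding only (not the library's actual packing or limb layout), a sketch that round-trips an integer through trits in {-1, 0, +1}:

```python
def to_balanced_trits(n: int, width: int = 48) -> list:
    """Encode n as balanced-ternary trits (-1, 0, +1), least significant first."""
    trits = []
    for _ in range(width):
        r = n % 3                # Python's % is non-negative for negative n too
        if r == 2:               # digit 2 becomes -1 with a carry into the next trit
            trits.append(-1)
            n = n // 3 + 1
        else:
            trits.append(r)
            n //= 3
    return trits

def from_balanced_trits(trits) -> int:
    """Decode trits (least significant first) back to an integer."""
    value = 0
    for t in reversed(trits):
        value = value * 3 + t
    return value
```

Balanced ternary makes negation trivial (flip every trit), which is one reason a ternary numerics stack layers cleanly from limbs up to `bigint`.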
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@

```mermaid
graph TD
    Docs[Docs portal]
    GettingStarted[Getting started]
    Specs[Specs & design]
    Examples[Examples & testing]
    Docs --> GettingStarted
    Docs --> Specs
    Docs --> Examples
    GettingStarted --> README
    GettingStarted --> PythonInstall
    GettingStarted --> CLI
    Specs --> Spec
    Specs --> Design
    Specs --> APIOverview
    Examples --> Demos
    Examples --> Tests
    Examples --> Benchmarks
    README[README.md]
    PythonInstall[docs/python-install.md]
    CLI[docs/references/cli-usage.md]
    Spec[docs/t81lib-spec-v1.0.0.md]
    Design[docs/design/]
    APIOverview[docs/api-overview.md]
    Demos[examples/README.md]
    Tests[tests/]
    Benchmarks[BENCHMARKS.md]
```
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@

```mermaid
flowchart LR
    torch[PyTorch tensor] --> extract[Extract metadata]
    numpy[NumPy array] --> extract
    extract --> validate[Validate device/dtype]
    validate --> dispatch[Dispatch to backend]
    dispatch --> cuda[CUDA kernel]
    dispatch --> rocm[ROCm kernel]
    dispatch --> cpu[CPU fallback]
    cuda --> wrap[Wrap GPU tensor]
    rocm --> wrap
    cpu --> wrap
    wrap --> return[Return to caller]
    subgraph Errors
        mismatch[Device mismatch] --> error[Error path]
        unsupported[Unsupported dtype] --> error
    end
    validate --> mismatch
    validate --> unsupported
```
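The validate/dispatch branches in this flowchart amount to a lookup on device and dtype. A Python sketch of that routing, where the backend names, supported-dtype set, and error messages are illustrative assumptions, not `t81lib`'s actual API:

```python
SUPPORTED_DTYPES = {"float32", "bfloat16", "ternary"}  # assumed set for illustration

def dispatch(op: str, device: str, dtype: str) -> str:
    """Route an elementwise op to a backend, mirroring the diagram's branches."""
    if dtype not in SUPPORTED_DTYPES:
        raise TypeError(f"unsupported dtype: {dtype}")   # 'Unsupported dtype' error path
    if device.startswith("cuda"):
        return f"cuda_kernel:{op}"                       # CUDA kernel branch
    if device.startswith("rocm"):
        return f"rocm_kernel:{op}"                       # ROCm kernel branch
    if device == "cpu":
        return f"cpu_fallback:{op}"                      # CPU fallback branch
    raise ValueError(f"device mismatch: {device}")       # 'Device mismatch' error path
```

Keeping validation ahead of dispatch means every backend kernel can assume a well-formed metadata record, which is the structure the diagram encodes.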
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@

```mermaid
flowchart TB
    tryte["Packed trytes (limbs)"]
    load["Load registers (AVX/NEON)"]
    mask[Mask & expand trits]
    multiply[Multiply columns]
    accumulate[Accumulate into FP32/BF16]
    store[Store to output buffer]
    tryte --> load --> mask --> multiply --> accumulate --> store
    style load stroke:#333,stroke-width:1px
    style mask stroke:#f66,stroke-width:1px
    style multiply stroke:#36c,stroke-width:1px
```
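The mask/multiply/accumulate pipeline above works because ternary weights lie in {-1, 0, +1}, so each "multiply" is really an add, a subtract, or a skip. A scalar sketch of that inner loop, with plain lists and no SIMD or tryte packing (the real `gemm_ternary` operates on packed limbs):

```python
def gemm_ternary(a, w):
    """C = A @ W where W holds ternary weights in {-1, 0, +1}.

    Multiplication degenerates to add/subtract/skip, which is what the
    masked SIMD lanes in the diagram exploit.
    """
    rows, inner, cols = len(a), len(w), len(w[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            x = a[i][k]
            for j in range(cols):
                t = w[k][j]
                if t == 1:
                    c[i][j] += x      # accumulate
                elif t == -1:
                    c[i][j] -= x      # accumulate negated
                # t == 0: skip (the masked-out lane)
    return c
```

The vectorized version in the diagram does the same thing lane-wise: expand packed trits into masks, then use the masks to select add/subtract/skip per column before accumulating into FP32/BF16.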
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@

```mermaid
sequenceDiagram
    participant PyTorch
    participant Quantizer
    participant CLI
    participant Runtime

    PyTorch->>Quantizer: export float model
    Quantizer->>Quantizer: `t81.torch` quantizes (TernaryTensor)
    Quantizer->>CLI: pack weights, store GGUF
    CLI->>Runtime: load GGUF
    Runtime->>Runtime: run `gemm_ternary` + accumulators
    Runtime->>PyTorch: return inference results
```
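The Quantizer step in the sequence above can be approximated with threshold-based ternarization. This sketch is an assumption about a typical scheme (threshold at a fraction of the mean magnitude, one scale per tensor), not `t81.torch`'s actual implementation:

```python
def quantize_ternary(weights, threshold_ratio=0.7):
    """Map float weights to ternary values plus one per-tensor scale.

    Weights below the threshold collapse to 0; the rest keep their sign.
    The scale is the mean magnitude of the surviving weights.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    ternary = [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return ternary, scale
```

At inference time the runtime would multiply the ternary GEMM output by the stored scale, which is why the packed GGUF only needs trits plus a handful of scalars per tensor.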
