Balance word-wrap with column-split (issue #35)#36
Merged
Conversation
The default column-width pipeline is now "split first, wrap per page" -
when a table cannot fit on one page width, the page-column-split runs
with each column's *minimum* survivable width as the capacity-planning
input, and each resulting page water-fills its columns within that
page's own horizontal slack. The legacy "wrap whole table, then split
using post-wrap widths" path is preserved as `col_split_strategy =
"wrap_first"` so the two behaviours can be compared empirically before
removing the legacy one.
Concrete example: a 4-column table that splits across 2 pages used to
have page 1 columns crushed to ~0.6" each because the whole-table
water-fill had to make room for the 4th column that ended up on
page 2 anyway. Under the new default, page 1's columns get the full
page 1 slack (~1.45" each); page 2 is unchanged.
API surface:
- `tfl_table(col_split_strategy = c("balanced", "wrap_first"))` - new
default "balanced". "wrap_first" preserves the legacy ordering.
- `tfl_table(row_overflow_max_retries = 5L)` - cap on the new retry
loop. When a row's wrapped height exceeds the page under the
balanced strategy, the bottleneck column's minimum is raised by
0.25" and the width pipeline re-runs. After the cap the final
`paginate_rows()` call goes through the existing `overflow_action`
path so the user-visible error/warn behaviour is unchanged. Set
`row_overflow_max_retries = 0L` to disable the loop entirely.
- `paginate_rows(..., collect_overflows = TRUE)` - new option that
returns `list(pages, overflows)` instead of signalling. Used by the
retry loop in `.tfl_table_to_pagelist_default()`.
Implementation:
- `R/table_columns.R::compute_col_widths()` is now a dispatcher. The
legacy body lives in `.compute_col_widths_wrap_first()` verbatim.
The new body in `.compute_col_widths_balanced()` implements the
Case-A / Case-B / Case-C decision tree from the issue:
A. sum(natural) <= content -> use natural; one page.
B. sum(min) <= content -> single page; water-fill from
natural down to fit.
C. else -> page-split using mins for capacity, then per-page
water-fill from natural down to the page's slack with group
columns pinned at min so data columns absorb the slack.
- `R/wrap.R` gains three helpers:
`.compute_col_min_widths()` extracts the floor measurement that
`.compute_wrapped_widths()` did inline.
`.water_fill_to_budget()` is a pure water-from-top loop that
takes pre-computed minimums.
`.reconcile_page_widths()` flattens per-page width vectors into
one per-column vector (group columns get the MIN across pages,
per the user's design call - group cells frequently flow
across rows via rowspan suppression so data columns benefit
more from the slack).
- `R/table_rows.R::paginate_rows()` gains the `collect_overflows`
param and returns a structured result when set. The
bottleneck column for each overflow is the wrap-eligible column
with the maximum cell height in the offending row.
- `R/table_pagelist.R` reorganises Step 4-6 around a repeat loop that
retries on row overflow. The row-height scratch device's
lifecycle is now fully contained inside each iteration via the
local `.run_pagination_iter()` helper, so the surrounding loop
never holds a viewport across a `compute_col_widths()` call.
- `R/tfl_table.R` adds and validates the two new arguments; both are
persisted on the `tfl_table` object.
All 1095 existing tests still pass. Tests, demos, and docs for the
new behaviour land in follow-up commits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the verification harness and user-facing documentation for the
balanced wrap pipeline:
- tests/testthat/test-wrap.R gains 15+ tests covering:
* `.water_fill_to_budget()` no-op, snap-to-floor, equal-shrink-of-
widest, floor honored when budget too tight, non-wrap cols left
alone;
* `.reconcile_page_widths()` non-group widths per-page, group cols
taking the MIN across pages;
* balanced vs wrap_first on a multi-page input (balanced widths
strictly wider for the wrap-eligible cols);
* balanced == wrap_first when table fits one page;
* group cols pinned at min width across pages under balanced;
* `col_split_strategy` and `row_overflow_max_retries` validation;
* `paginate_rows(collect_overflows = TRUE)` returns pages +
overflow info instead of aborting;
* `row_overflow_max_retries = 0L` disables the retry loop.
- examples/wrap_demos.R gains a "col_split_strategy comparison"
section: scenarios 15-17 render the same input twice (once with
each strategy) to make the per-page width difference visible
side-by-side. Scenario 18 shows the row-overflow retry loop:
with retries disabled the input errors; with retries enabled
(the default) the loop widens the bottleneck column until the
row fits and a PDF is produced.
- A subtle bug found while building scenario 18: the floor-override
mechanism raised `widths_min` but not `widths_natural`, so the
Case-A path (`sum(natural) <= content_width`) ignored the retry's
effect. The fix bumps `widths_natural[j]` whenever a floor
override raises it. Without this fix the retry loop runs but
never increases the rendered column width on Case-A inputs.
- vignettes/v02-tfl_table_intro.Rmd: `col_split_strategy` and
`row_overflow_max_retries` added to the `tfl_table()` argument
table.
- vignettes/v03-tfl_table_styling.Rmd: new "Balanced split-then-wrap"
subsection explaining the strategy choice, the group-column
pin-at-min rule, and the row-overflow retry loop.
- vignettes/v04-troubleshooting.Rmd: "Row wrapped to taller than one
page" updated to mention the retry loop, how to tune it via
`row_overflow_max_retries`, and how to disable it for issue-#28
behaviour.
- design/DECISIONS.md: new entry D-42 capturing the algorithm, the
group-col-at-min design call, the row-overflow retry mechanism,
alternatives considered, and the empirical-comparison rollout
plan with `col_split_strategy = "wrap_first"`.
All 1170+ tests pass. `R CMD check` is 0 errors / 0 warnings / 2 pre-
existing NOTEs (worktree `.git` marker + future-timestamp flake).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
tfl_tablewidth pipeline. The new defaultcol_split_strategy = "balanced"page-splits using each column's minimum survivable width, then water-fills each page's columns within that page's actual slack — so multi-page tables get wider, less-wrapped columns per page.row_overflow_max_retries = 5L, default;0Lto disable) that widens the bottleneck column and re-runs the pipeline when a row's wrapped height exceeds the page.col_split_strategy = "wrap_first"so the two can be compared empirically before removing the legacy path.Concrete example: a 4-column table that splits across 2 pages used to crush page-1 columns to ~0.6″ each because whole-table water-fill had to make room for the 4th column (which ended up on page 2 anyway). Under the new default each page-1 column gets the full page-1 slack (~1.4″ each).
What landed
R/table_columns.Rcompute_col_widths()is now a dispatcher. Body split into.resolve_natural_widths()(shared Pass 1-3 setup),.compute_col_widths_wrap_first()(legacy, preserved verbatim),.compute_col_widths_balanced()(new).R/wrap.R.compute_col_min_widths(),.water_fill_to_budget(),.reconcile_page_widths().R/table_rows.Rpaginate_rows()gainscollect_overflows = TRUEmode for the retry loop.R/table_pagelist.RR/tfl_table.Rcol_split_strategyandrow_overflow_max_retries.tests/testthat/test-wrap.Rexamples/wrap_demos.R_balanced.pdf/_wrap_first.pdfcomparisons and a retry-loop demo.vignettes/v02-tfl_table_intro.Rmd,v03-tfl_table_styling.Rmd,v04-troubleshooting.Rmdcol_split_strategy; updated arg table; new troubleshooting entry on the retry loop.design/DECISIONS.mdGroup columns (design call)
Group columns repeat on every page-column-split page. Under
"balanced"they're pinned at their minimum width on every page so data columns absorb the per-page slack. This pairs well with the issue-#29 rowspan-style label flow — multi-line group labels flow across rows, so the column doesn't need full natural width on every page.Test plan
.water_fill_to_budget(),.reconcile_page_widths(),.compute_col_min_widths(), balanced-vs-wrap_first end-to-end, group cols pinned at min,paginate_rows(collect_overflows = TRUE), retry-loop disableR CMD checkclean (0 errors / 0 warnings / 2 pre-existing NOTEs:.gitworktree marker + future-timestamp flake)examples/wrap_demos.Rscenarios 15-18:_balancedvs_wrap_firstRollout
col_split_strategy = "balanced"is the default in this PR. If empirical comparison favors it, a follow-up can remove"wrap_first"and the legacy code path. If a specific table renders worse under the new default,"wrap_first"remains as an escape hatch.Closes #35.
🤖 Generated with Claude Code