Skip to content

Balance word-wrap with column-split (issue #35)#36

Merged
billdenney merged 2 commits into
mainfrom
claude/issue-35-balance-wrap-split
May 11, 2026
Merged

Balance word-wrap with column-split (issue #35)#36
billdenney merged 2 commits into
mainfrom
claude/issue-35-balance-wrap-split

Conversation

@billdenney
Copy link
Copy Markdown
Member

Summary

  • Reverses the order of text-wrap and page-column-split in the tfl_table width pipeline. The new default col_split_strategy = "balanced" page-splits using each column's minimum survivable width, then water-fills each page's columns within that page's actual slack — so multi-page tables get wider, less-wrapped columns per page.
  • Adds a bounded row-overflow retry loop (row_overflow_max_retries = 5L, default; 0L to disable) that widens the bottleneck column and re-runs the pipeline when a row's wrapped height exceeds the page.
  • Preserves the pre-issue-35 algorithm as col_split_strategy = "wrap_first" so the two can be compared empirically before removing the legacy path.

Concrete example: a 4-column table that splits across 2 pages used to crush page-1 columns to ~0.6″ each because whole-table water-fill had to make room for the 4th column (which ended up on page 2 anyway). Under the new default each page-1 column gets the full page-1 slack (~1.4″ each).

What landed

File Change
R/table_columns.R compute_col_widths() is now a dispatcher. Body split into .resolve_natural_widths() (shared Pass 1-3 setup), .compute_col_widths_wrap_first() (legacy, preserved verbatim), .compute_col_widths_balanced() (new).
R/wrap.R New helpers: .compute_col_min_widths(), .water_fill_to_budget(), .reconcile_page_widths().
R/table_rows.R paginate_rows() gains collect_overflows = TRUE mode for the retry loop.
R/table_pagelist.R Step 4-6 wrapped in a repeat loop; per-iteration scratch-device lifecycle.
R/tfl_table.R New args col_split_strategy and row_overflow_max_retries.
tests/testthat/test-wrap.R 15+ new tests covering the new helpers, balanced vs wrap_first behavior, retry-loop firing, arg validation.
examples/wrap_demos.R Scenarios 15-18: side-by-side _balanced.pdf / _wrap_first.pdf comparisons and a retry-loop demo.
vignettes/v02-tfl_table_intro.Rmd, v03-tfl_table_styling.Rmd, v04-troubleshooting.Rmd New section on col_split_strategy; updated arg table; new troubleshooting entry on the retry loop.
design/DECISIONS.md New decision entry D-42.

Group columns (design call)

Group columns repeat on every page-column-split page. Under "balanced" they're pinned at their minimum width on every page so data columns absorb the per-page slack. This pairs well with the issue-#29 rowspan-style label flow — multi-line group labels flow across rows, so the column doesn't need full natural width on every page.

Test plan

  • All 1170+ existing tests still pass; new tests cover .water_fill_to_budget(), .reconcile_page_widths(), .compute_col_min_widths(), balanced-vs-wrap_first end-to-end, group cols pinned at min, paginate_rows(collect_overflows = TRUE), retry-loop disable
  • R CMD check clean (0 errors / 0 warnings / 2 pre-existing NOTEs: .git worktree marker + future-timestamp flake)
  • Compare examples/wrap_demos.R scenarios 15-18:
    • 15/16: per-page columns visibly wider under _balanced vs _wrap_first
    • 17: identical output (no split needed; both strategies converge)
    • 18: retry-disabled errors; retry-enabled produces a PDF after the loop widens the offending column

Rollout

col_split_strategy = "balanced" is the default in this PR. If empirical comparison favors it, a follow-up can remove "wrap_first" and the legacy code path. If a specific table renders worse under the new default, "wrap_first" remains as an escape hatch.

Closes #35.

🤖 Generated with Claude Code

billdenney and others added 2 commits May 11, 2026 09:48
The default column-width pipeline is now "split first, wrap per page" -
when a table cannot fit on one page width, the page-column-split runs
with each column's *minimum* survivable width as the capacity-planning
input, and each resulting page water-fills its columns within that
page's own horizontal slack.  The legacy "wrap whole table, then split
using post-wrap widths" path is preserved as `col_split_strategy =
"wrap_first"` so the two behaviours can be compared empirically before
removing the legacy one.

Concrete example: a 4-column table that splits across 2 pages used to
have page 1 columns crushed to ~0.6" each because the whole-table
water-fill had to make room for the 4th column that ended up on
page 2 anyway.  Under the new default, page 1's columns get the full
page 1 slack (~1.45" each); page 2 is unchanged.

API surface:

- `tfl_table(col_split_strategy = c("balanced", "wrap_first"))` - new
  default "balanced".  "wrap_first" preserves the legacy ordering.
- `tfl_table(row_overflow_max_retries = 5L)` - cap on the new retry
  loop.  When a row's wrapped height exceeds the page under the
  balanced strategy, the bottleneck column's minimum is raised by
  0.25" and the width pipeline re-runs.  After the cap the final
  `paginate_rows()` call goes through the existing `overflow_action`
  path so the user-visible error/warn behaviour is unchanged.  Set
  `row_overflow_max_retries = 0L` to disable the loop entirely.
- `paginate_rows(..., collect_overflows = TRUE)` - new option that
  returns `list(pages, overflows)` instead of signalling.  Used by the
  retry loop in `.tfl_table_to_pagelist_default()`.

Implementation:

- `R/table_columns.R::compute_col_widths()` is now a dispatcher.  The
  legacy body lives in `.compute_col_widths_wrap_first()` verbatim.
  The new body in `.compute_col_widths_balanced()` implements the
  Case-A / Case-B / Case-C decision tree from the issue:
    A. sum(natural) <= content -> use natural; one page.
    B. sum(min)     <= content -> single page; water-fill from
       natural down to fit.
    C. else -> page-split using mins for capacity, then per-page
       water-fill from natural down to the page's slack with group
       columns pinned at min so data columns absorb the slack.
- `R/wrap.R` gains three helpers:
    `.compute_col_min_widths()` extracts the floor measurement that
       `.compute_wrapped_widths()` did inline.
    `.water_fill_to_budget()` is a pure water-from-top loop that
       takes pre-computed minimums.
    `.reconcile_page_widths()` flattens per-page width vectors into
       one per-column vector (group columns get the MIN across pages,
       per the user's design call - group cells frequently flow
       across rows via rowspan suppression so data columns benefit
       more from the slack).
- `R/table_rows.R::paginate_rows()` gains the `collect_overflows`
  param and returns a structured result when set.  The
  bottleneck column for each overflow is the wrap-eligible column
  with the maximum cell height in the offending row.
- `R/table_pagelist.R` reorganises Step 4-6 around a repeat loop that
  retries on row overflow.  The row-height scratch device's
  lifecycle is now fully contained inside each iteration via the
  local `.run_pagination_iter()` helper, so the surrounding loop
  never holds a viewport across a `compute_col_widths()` call.
- `R/tfl_table.R` adds and validates the two new arguments; both are
  persisted on the `tfl_table` object.

All 1095 existing tests still pass.  Tests, demos, and docs for the
new behaviour land in follow-up commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the verification harness and user-facing documentation for the
balanced wrap pipeline:

- tests/testthat/test-wrap.R gains 15+ tests covering:
  * `.water_fill_to_budget()` no-op, snap-to-floor, equal-shrink-of-
    widest, floor honored when budget too tight, non-wrap cols left
    alone;
  * `.reconcile_page_widths()` non-group widths per-page, group cols
    taking the MIN across pages;
  * balanced vs wrap_first on a multi-page input (balanced widths
    strictly wider for the wrap-eligible cols);
  * balanced == wrap_first when table fits one page;
  * group cols pinned at min width across pages under balanced;
  * `col_split_strategy` and `row_overflow_max_retries` validation;
  * `paginate_rows(collect_overflows = TRUE)` returns pages +
    overflow info instead of aborting;
  * `row_overflow_max_retries = 0L` disables the retry loop.

- examples/wrap_demos.R gains a "col_split_strategy comparison"
  section: scenarios 15-17 render the same input twice (once with
  each strategy) to make the per-page width difference visible
  side-by-side.  Scenario 18 shows the row-overflow retry loop:
  with retries disabled the input errors; with retries enabled
  (the default) the loop widens the bottleneck column until the
  row fits and a PDF is produced.

- A subtle bug found while building scenario 18: the floor-override
  mechanism raised `widths_min` but not `widths_natural`, so the
  Case-A path (`sum(natural) <= content_width`) ignored the retry's
  effect.  The fix bumps `widths_natural[j]` whenever a floor
  override raises it.  Without this fix the retry loop runs but
  never increases the rendered column width on Case-A inputs.

- vignettes/v02-tfl_table_intro.Rmd: `col_split_strategy` and
  `row_overflow_max_retries` added to the `tfl_table()` argument
  table.

- vignettes/v03-tfl_table_styling.Rmd: new "Balanced split-then-wrap"
  subsection explaining the strategy choice, the group-column
  pin-at-min rule, and the row-overflow retry loop.

- vignettes/v04-troubleshooting.Rmd: "Row wrapped to taller than one
  page" updated to mention the retry loop, how to tune it via
  `row_overflow_max_retries`, and how to disable it for issue-#28
  behaviour.

- design/DECISIONS.md: new entry D-42 capturing the algorithm, the
  group-col-at-min design call, the row-overflow retry mechanism,
  alternatives considered, and the empirical-comparison rollout
  plan with `col_split_strategy = "wrap_first"`.

All 1170+ tests pass.  `R CMD check` is 0 errors / 0 warnings / 2 pre-
existing NOTEs (worktree `.git` marker + future-timestamp flake).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@billdenney billdenney merged commit 53bb64e into main May 11, 2026
9 checks passed
@billdenney billdenney deleted the claude/issue-35-balance-wrap-split branch May 11, 2026 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make word wrapping balance with column wrapping

1 participant