Skip to content

[BDP-102028] feat(optimizer): [1/N] Optimizer Database#530

Merged
mkuchenbecker merged 77 commits into
mainfrom
mkuchenb/optimizer-1
May 21, 2026
Merged

[BDP-102028] feat(optimizer): [1/N] Optimizer Database#530
mkuchenbecker merged 77 commits into
mainfrom
mkuchenb/optimizer-1

Conversation

@mkuchenbecker
Copy link
Copy Markdown
Collaborator

@mkuchenbecker mkuchenbecker commented Apr 6, 2026

Optimizer Stack

PR Content
#527 Data Model
#530 (this) Database Repos
#531 REST service
#533 Analyzer app
#534 Scheduler app
#tbd Spark BatchedOFD app
#tbd Infra, docker-compose, smoke test

Summary

PR 1 of N in the optimizer stack.
Overall Project
Service Design doc.

Spring Data JPA repositories for all four optimizer tables with filtered query support, plus tests exercising save/find, filtered queries, upsert semantics, and append-only history.

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

Repositories: TableOperationsRepository, TableOperationsHistoryRepository, TableStatsRepository, TableStatsHistoryRepository — each with JPQL filtered query methods.

Tests: Repository tests for all four tables plus OptimizerServiceContextTest verifying the Spring context loads.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

./gradlew :services:optimizer:test — all tests pass (H2 in MySQL mode).

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

mkuchenbecker and others added 3 commits April 3, 2026 11:00
Introduces the optimizer service module with:
- MySQL/H2 schema for table_operations, table_stats, table_stats_history,
  and table_operations_history
- JPA entities with JSON column support (vladmihalcea hibernate-types)
- All model/DTO/enum types: OperationType, OperationStatus, TableStats,
  CompleteOperationRequest, JobResult, OperationMetrics, etc.
- JPA AttributeConverters for JobResult and OperationMetrics JSON columns
- MapStruct mapper (OptimizerMapper) for entity→DTO conversion
- Spring Boot application shell and build wiring (settings.gradle,
  build.gradle dockerPrereqs)

No repositories, controllers, or service layer yet — those follow in
subsequent PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove OperationMetrics class and converter; stats are read
  directly from table_stats instead of duplicating into operations
- Remove orphanFilesDeleted/orphanBytesDeleted from history entity,
  DTO, and schema; operation-specific data belongs in the result JSON
- Add addedSizeBytes to CommitDelta for tracking write volume
- Fix OperationType javadoc to describe current state, not roadmap
- Fix TableOperationsHistoryRow javadoc: written on operation
  complete, not by Spark app directly
- Add field comments to all DTOs and request objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spring Data JPA repositories for all four optimizer tables with
filtered query support. Includes tests exercising save/find,
filtered queries, upsert semantics, and append-only history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mkuchenbecker and others added 4 commits April 6, 2026 11:35
Address PR review comments: rename findFiltered → find across all repos,
remove redundant findByTableUuid/findByTableUuidSince from history repos,
add explicit assertion to context test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shared JPA entities and repositories for optimizer apps (analyzer,
scheduler). All repos expose a single find method with optional filters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These fields never belonged in the data model — remove them at the
source rather than adding then deleting in a later PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the title feat(optimizer): add repositories and repository tests feat(optimizer): [1/N] Optimizer Repository Apr 6, 2026
@mkuchenbecker mkuchenbecker changed the title feat(optimizer): [1/N] Optimizer Repository feat(optimizer): [2/N] Optimizer Repository Apr 6, 2026
@mkuchenbecker mkuchenbecker marked this pull request as ready for review April 6, 2026 19:46
@mkuchenbecker mkuchenbecker changed the title feat(optimizer): [2/N] Optimizer Repository feat(optimizer): [1/N] Optimizer Repository Apr 6, 2026
Comment thread apps/optimizer/build.gradle Outdated
mkuchenbecker and others added 3 commits April 30, 2026 09:55
- Widen-to-tighten: VARCHAR(255) -> VARCHAR(128) for database_name
  and table_name across all entities and the schema, aligning with
  prod conventions (can always be widened later, not tightened).
- Rename databaseId -> databaseName in TableStatsRow,
  TableStatsHistoryRow, TableStatsDto, TableStatsHistoryDto, and
  UpsertTableStatsRequest for consistency with the operations
  entities and DTOs.
- Drop the unused metrics field from TableOperationsRow,
  TableOperationsDto, and the schema. Add a TODO note in the schema
  that per-operation metric columns will be added as operations are
  onboarded.
- Rename submittedAt -> completedAt in TableOperationsHistoryRow,
  TableOperationsHistoryDto, and the schema (column submitted_at ->
  completed_at, index idx_submitted_at -> idx_completed_at). The
  history row is written when the complete endpoint is called, so
  the timestamp captures completion; submission time is already on
  table_operations.scheduled_at.
- Change TableStatsHistoryRow.id from BIGINT auto-increment to
  VARCHAR(36) UUID, set by the caller, matching the other
  id-bearing entities.
- Add @JsonIgnoreProperties(ignoreUnknown = true) to CommitDelta
  for consistency with TableStats and SnapshotMetrics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mkuchenbecker and others added 3 commits May 14, 2026 17:33
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TableOperation becomes a pure operation record. Consumers (scheduler) look
up TableStats at the point they need it, rather than carrying enrichment
data on the model type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	services/optimizer/src/main/java/com/linkedin/openhouse/optimizer/model/TableOperation.java
#	services/optimizer/src/main/java/com/linkedin/openhouse/optimizer/model/TableStats.java
Comment thread services/optimizer/src/main/resources/db/optimizer-schema.sql
mkuchenbecker and others added 5 commits May 18, 2026 14:44
…tion

markSchedulingBatch returns only a count of rows transitioned; callers
that need to know *which* rows they own must re-query. findClaimedIds
takes the same id list + scheduledAt watermark passed to the UPDATE and
returns the subset whose SCHEDULING transition matches that watermark —
i.e. the rows this caller actually claimed in this call.

Used by the scheduler to subset its bin to actually-claimed operations
before submitting the Spark job; without this the scheduler can launch
a job for ids another instance already owns and then incorrectly mark
all of them SCHEDULED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…review)

Per @abhisheknath2011 review comment 3262776356:
> "We could change all the internal model add Dto suffix something like
> TableOperationsDto. This aligns with the existing services codebase."

Renames (suffix added):
- CompleteOperationRequest    -> CompleteOperationRequestDto
- UpsertTableStatsRequest     -> UpsertTableStatsRequestDto
- OperationType (enum)        -> OperationTypeDto
- OperationStatus (enum)      -> OperationStatusDto
- HistoryStatus (enum)        -> HistoryStatusDto
- TableStats (inner payload)  -> TableStatsPayloadDto
- TableStats.SnapshotMetrics  -> TableStatsPayloadDto.SnapshotMetricsDto
- TableStats.CommitDelta      -> TableStatsPayloadDto.CommitDeltaDto

Cross-reference updates inside api/model. Internal model layer
(services/optimizer/.../model/) is intentionally unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…view)

Per @abhisheknath2011 review comment 3262769497:
> "Can we change the client side API to api.spec instead of api.model?
> This also aligns with existing services."

Mechanical package rename. The 10 api wire types move from
services/optimizer/.../api/model/ to services/optimizer/.../api/spec/.
No type or signature changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mkuchenbecker and others added 7 commits May 20, 2026 14:51
Reversal of an earlier inconsistency surfaced by abhisheknath2011 in
the PR #527 review thread on api/spec/HistoryStatusDto.java. The api
wire types are the canonical contract; they should carry the canonical
name. The internal-model types are transfer objects between layers and
now carry the Dto suffix.

api/spec/ — Dto stripped from class + filename (10 files):
  CompleteOperationRequestDto -> CompleteOperationRequest
  HistoryStatusDto             -> HistoryStatus
  OperationStatusDto           -> OperationStatus
  OperationTypeDto             -> OperationType
  TableOperationsDto           -> TableOperations
  TableOperationsHistoryDto    -> TableOperationsHistory
  TableStatsDto                -> TableStats
  TableStatsHistoryDto         -> TableStatsHistory
  TableStatsPayloadDto         -> TableStatsPayload
  UpsertTableStatsRequestDto   -> UpsertTableStatsRequest

model/ — Dto added to class + filename (8 files):
  HistoryStatus           -> HistoryStatusDto
  OperationStatus         -> OperationStatusDto
  OperationType           -> OperationTypeDto
  Table                   -> TableDto
  TableOperation          -> TableOperationDto
  TableOperationsHistory  -> TableOperationsHistoryDto
  TableStats              -> TableStatsDto
  TableStatsHistory       -> TableStatsHistoryDto

Both renames land on opt-0 because opt-0 owns api/spec/ and model/.
Cascade up the stack in follow-up commits.

Out of scope here: HistoryStatus enum value additions (CANCELED, QUEUED)
also raised in the same review thread; separate semantic change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts:
#	services/optimizer/src/main/java/com/linkedin/openhouse/optimizer/model/TableOperationsHistoryDto.java
model.TableOperationDto grows a jobId field; api.TableOperations
conversions copy it across the api ↔ model boundary. The api DTO
already had the field; the model side was missing it.

Relocated from opt-5 to its proper owner per the model-layer rule.
Model ↔ db plumbing for the same field lands on opt-1 in a follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Companion to the opt-0 jobId field addition: now that
model.TableOperationDto carries jobId, wire it through toRow/fromRow
so the db row's job_id column round-trips through the model layer.

Relocated from opt-5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nRequest

Symbol rename only. The HistoryStatus enum (SUCCESS/FAILED) and the
once-terminal semantics are unchanged; the endpoint's behavior is the
same. Future broadening (CANCELED/QUEUED, idempotency, mid-lifecycle
status changes) is a separate concern.

Method names + URL path will follow on opt-2; Spark-app caller +
docs follow on opt-5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mkuchenbecker added a commit that referenced this pull request May 20, 2026
…527)

## Optimizer Stack

| PR | Content |
|---|---|
| #527 **(this)** | API and
internal models |
| #530 | Database Repos |
| #531 | REST service |
| #533 | Analyzer app |
| #534 | Scheduler app |
| #tbd | Spark BatchedOFD app |
| #tbd | Infra, docker-compose, smoke test |

## Summary

PR 0 of N in the optimizer stack. 
[Overall
Project](https://docs.google.com/document/d/1oGQWkmlVw0HG-D4Nx37q0oUEQd53Ni4q5Q7h1voaZIQ/edit?tab=t.0#heading=h.vtl818e4m9f7)
[Service Design
doc](https://docs.google.com/document/d/1xYOD7iuPjZO05UWfT1Dkmf4lYNw4iOjqY10UT2X1FAg/edit?tab=t.0).

Introduces the optimizer service API and internal model

<img width="631" height="410" alt="image"
src="https://github.com/user-attachments/assets/8833471d-069a-49a2-9a10-5eb7f3b96a72"
/>

## Changes

- [ ] Client-facing API Changes
- [x] Internal API Changes
- [ ] Bug Fixes
- [x] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [ ] Refactoring
- [ ] Documentation
- [ ] Tests

## Testing Done

- [ ] Manually Tested on local docker setup. Please include commands
ran, and their output.
- [ ] Added new tests for the changes made.
- [ ] Updated existing tests to reflect the changes made.
- [x] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

This PR contains only the data model (entities, DTOs, converters).
Repository tests follow in PR 1. Verified:
- `./gradlew :services:optimizer:compileJava` passes
- `./gradlew compileJava` (full project) passes with no regressions
- Spotless formatting passes

# Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [x] Large PR broken into smaller PRs, and PR plan linked in the
description.

---------

Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the base branch from mkuchenb/optimizer-0 to main May 20, 2026 22:48
Copy link
Copy Markdown
Member

@abhisheknath2011 abhisheknath2011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to create the new my schemas in all the DBs before before deploying this version internally.

Repo public API now:
  - find(...) with Optional<T> filters + required Pageable, on all four repos
  - updateBatch(ids, fromStatus, toStatus, Optional<Instant> scheduledAt,
                Optional<String> jobId) — replaces markSchedulingBatch,
    markScheduledBatch, markPendingBatch
  - cancel(ids) — replaces cancelDuplicatePendingBatch; deletes by-id with
    a defensive PENDING-only gate
  - findLatest(opType, Pageable) — was findLatestPerTable
  - history.find(tableUuid, Pageable) — was findByTableUuidOrderByCompletedAtDesc

Side-effect columns on updateBatch use COALESCE with Optional.empty() →
leave-unchanged. scheduledAt is not cleared on SCHEDULING → PENDING revert;
status is the source of truth and the watermark is overwritten on the next
claim.

@Modifying queries get flushAutomatically + clearAutomatically so the L1
cache reflects the change immediately (caught by the unit tests).

Spring Data @query can't share an "IS NULL OR IN :list" pattern (Hibernate
expands the list inline and the IS NULL check turns ungrammatical). The
find path uses two internal queries dispatched by the default method —
one with the ids predicate, one without.

Callers (service, analyzer, scheduler) update on opt-2..opt-4 in
follow-up commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Instant.now() on Linux CI carries nanoseconds; MySQL TIMESTAMP(6) and
H2 in MySQL mode store microseconds. The scheduledAt = :scheduledAt
predicate in find(...) compared nano-resolution param against
micro-resolution stored value and missed. Local (macOS, micro-only)
hid the bug.

Truncate to ChronoUnit.MICROS at write time in the one repo test that
exercises the watermark round-trip.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@abhisheknath2011
Copy link
Copy Markdown
Member

We need to create the new my schemas in all the DBs before before deploying this version internally.

Discussed offline. We should be good for now as we don't have k8s deployment spec for the new service and newly added code does not touch existing codepath.

@mkuchenbecker
Copy link
Copy Markdown
Collaborator Author

We talked offline. Because there is no k8s deployment spec for the optimizer the mysql changes will not be deployed. This assumption would break deployments to test clusters if it is wrong, which will be detected automatically post-commit.

This is strictly preferable because we can get to a working state before deploying and impacting production.

@mkuchenbecker mkuchenbecker merged commit fac85a3 into main May 21, 2026
1 check passed
mkuchenbecker added a commit that referenced this pull request May 22, 2026
## Optimizer Stack

| PR | Content |
|---|---|
| #527 | Data Model |
| #530 | Database Repos |
| #531 **(this)** | REST
service |
| #533 | Analyzer app |
| #534 | Scheduler app |
| #tbd | Spark BatchedOFD app |
| #tbd | Infra, docker-compose, smoke test |

## Summary

PR 2 of N in the optimizer stack. 

Service layer and REST controllers for the optimizer service, plus the
`apps/optimizer` shared module providing lightweight entity/repo copies
for the analyzer and scheduler apps.

## Changes

- [ ] Client-facing API Changes
- [x] Internal API Changes
- [ ] Bug Fixes
- [x] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [ ] Refactoring
- [ ] Documentation
- [x] Tests

**Service layer**: `OptimizerDataService` interface and
`OptimizerDataServiceImpl` — CRUD operations, complete-operation
lifecycle, stats upsert with history double-write, filtered queries.

**Controllers**: `TableOperationsController`,
`TableOperationsHistoryController`, `TableStatsController` — REST
endpoints per the design doc API spec.

**Shared module** (`apps/optimizer`): Lightweight entity and repository
copies used by the analyzer and scheduler apps to read optimizer state
directly from MySQL.

## Testing Done

- [ ] Manually Tested on local docker setup. Please include commands
ran, and their output.
- [x] Added new tests for the changes made.
- [ ] Updated existing tests to reflect the changes made.
- [ ] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

H2 integration tests in `OptimizerDataServiceImplTest` (5 tests):
- `completeOperation_writesHistoryFromOperationRow` — saves SCHEDULED
row, completes it, asserts history DTO fields
- `completeOperation_notFound_returnsEmpty` — completes nonexistent ID,
asserts empty
- `upsertTableStats_createsNewRow` — upserts new table, asserts DTO and
repo row
- `upsertTableStats_updatesExistingRow` — upserts twice, asserts
overwrite with single row
- `upsertTableStats_appendsHistoryOnEveryCall` — upserts twice, asserts
2 history rows

```
./gradlew :services:optimizer:test
# BUILD SUCCESSFUL — all 25 tests pass (repo tests from PR 1 + 5 new service tests)
```

# Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [x] Large PR broken into smaller PRs, and PR plan linked in the
description.

---------

Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants