Skip to content

feat(optimizer): [2/N] Optimizer REST Service and Controller#531

Merged
mkuchenbecker merged 140 commits into
mainfrom
mkuchenb/optimizer-2
May 22, 2026
Merged

feat(optimizer): [2/N] Optimizer REST Service and Controller#531
mkuchenbecker merged 140 commits into
mainfrom
mkuchenb/optimizer-2

Conversation

@mkuchenbecker
Copy link
Copy Markdown
Collaborator

@mkuchenbecker mkuchenbecker commented Apr 6, 2026

Optimizer Stack

PR Content
#527 Data Model
#530 Database Repos
#531 (this) REST service
#533 Analyzer app
#534 Scheduler app
#tbd Spark BatchedOFD app
#tbd Infra, docker-compose, smoke test

Summary

PR 2 of N in the optimizer stack.

Service layer and REST controllers for the optimizer service, plus the apps/optimizer shared module providing lightweight entity/repo copies for the analyzer and scheduler apps.

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

Service layer: OptimizerDataService interface and OptimizerDataServiceImpl — CRUD operations, complete-operation lifecycle, stats upsert with history double-write, filtered queries.

Controllers: TableOperationsController, TableOperationsHistoryController, TableStatsController — REST endpoints per the design doc API spec.

Shared module (apps/optimizer): Lightweight entity and repository copies used by the analyzer and scheduler apps to read optimizer state directly from MySQL.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

H2 integration tests in OptimizerDataServiceImplTest (5 tests):

  • completeOperation_writesHistoryFromOperationRow — saves SCHEDULED row, completes it, asserts history DTO fields
  • completeOperation_notFound_returnsEmpty — completes nonexistent ID, asserts empty
  • upsertTableStats_createsNewRow — upserts new table, asserts DTO and repo row
  • upsertTableStats_updatesExistingRow — upserts twice, asserts overwrite with single row
  • upsertTableStats_appendsHistoryOnEveryCall — upserts twice, asserts 2 history rows
./gradlew :services:optimizer:test
# BUILD SUCCESSFUL — all 25 tests pass (repo tests from PR 1 + 5 new service tests)

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

mkuchenbecker and others added 3 commits April 3, 2026 11:00
Introduces the optimizer service module with:
- MySQL/H2 schema for table_operations, table_stats, table_stats_history,
  and table_operations_history
- JPA entities with JSON column support (vladmihalcea hibernate-types)
- All model/DTO/enum types: OperationType, OperationStatus, TableStats,
  CompleteOperationRequest, JobResult, OperationMetrics, etc.
- JPA AttributeConverters for JobResult and OperationMetrics JSON columns
- MapStruct mapper (OptimizerMapper) for entity→DTO conversion
- Spring Boot application shell and build wiring (settings.gradle,
  build.gradle dockerPrereqs)

No repositories, controllers, or service layer yet — those follow in
subsequent PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove OperationMetrics class and converter; stats are read
  directly from table_stats instead of duplicating into operations
- Remove orphanFilesDeleted/orphanBytesDeleted from history entity,
  DTO, and schema; operation-specific data belongs in the result JSON
- Add addedSizeBytes to CommitDelta for tracking write volume
- Fix OperationType javadoc to describe current state, not roadmap
- Fix TableOperationsHistoryRow javadoc: written on operation
  complete, not by Spark app directly
- Add field comments to all DTOs and request objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spring Data JPA repositories for all four optimizer tables with
filtered query support. Includes tests exercising save/find,
filtered queries, upsert semantics, and append-only history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mkuchenbecker and others added 3 commits April 6, 2026 11:35
Address PR review comments: rename findFiltered → find across all repos,
remove redundant findByTableUuid/findByTableUuidSince from history repos,
add explicit assertion to context test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Service interface and implementation for all optimizer CRUD operations
including complete-operation lifecycle, stats upsert with history
double-write, and filtered queries. Three REST controllers expose the
endpoints. The apps/optimizer shared module provides lightweight
entity/repo copies for the analyzer and scheduler apps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align OptimizerDataServiceImpl with renamed repository methods from
optimizer-1 review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker force-pushed the mkuchenb/optimizer-2 branch from e628426 to ef3260f Compare April 6, 2026 18:37
mkuchenbecker and others added 2 commits April 6, 2026 12:09
Shared JPA entities and repositories for optimizer apps (analyzer,
scheduler). All repos expose a single find method with optional filters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve repo conflicts by taking optimizer-1's clean find-only versions.
Scheduler-specific methods and streamAll removed per review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@mkuchenbecker mkuchenbecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs tests

mkuchenbecker and others added 3 commits April 6, 2026 12:19
These fields never belonged in the data model — remove them at the
source rather than adding then deleting in a later PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the title feat(optimizer): add REST service layer, controllers, and shared module feat(optimizer): add REST service layer Apr 6, 2026
@mkuchenbecker mkuchenbecker changed the title feat(optimizer): add REST service layer feat(optimizer): [3/N] Optimizer REST service layer Apr 6, 2026
H2 integration tests for OptimizerDataServiceImpl covering
completeOperation (write history, not-found) and upsertTableStats
(create, update, history append).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strengthen upsertTableStats test to verify history rows contain the raw
delta stats from each call, not just the row count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mkuchenbecker and others added 6 commits May 20, 2026 15:55
Repo public API now:
  - find(...) with Optional<T> filters + required Pageable, on all four repos
  - updateBatch(ids, fromStatus, toStatus, Optional<Instant> scheduledAt,
                Optional<String> jobId) — replaces markSchedulingBatch,
    markScheduledBatch, markPendingBatch
  - cancel(ids) — replaces cancelDuplicatePendingBatch; deletes by-id with
    a defensive PENDING-only gate
  - findLatest(opType, Pageable) — was findLatestPerTable
  - history.find(tableUuid, Pageable) — was findByTableUuidOrderByCompletedAtDesc

Side-effect columns on updateBatch use COALESCE with Optional.empty() →
leave-unchanged. scheduledAt is not cleared on SCHEDULING → PENDING revert;
status is the source of truth and the watermark is overwritten on the next
claim.

@Modifying queries get flushAutomatically + clearAutomatically so the L1
cache reflects the change immediately (caught by the unit tests).

Spring Data @query can't share an "IS NULL OR IN :list" pattern (Hibernate
expands the list inline and the IS NULL check turns ungrammatical). The
find path uses two internal queries dispatched by the default method —
one with the ids predicate, one without.

Callers (service, analyzer, scheduler) update on opt-2..opt-4 in
follow-up commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
OptimizerDataServiceImpl pipes Optional<T> filters straight through to
the repo (no .orElse(null) at the boundary). Adds the
optimizer.repo.default-limit config property and threads it into the
list-shaped calls. Service-impl test updates the one direct
statsHistoryRepository.find(...) call to pass Optional.empty().

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Instant.now() on Linux CI carries nanoseconds; MySQL TIMESTAMP(6) and
H2 in MySQL mode store microseconds. The scheduledAt = :scheduledAt
predicate in find(...) compared nano-resolution param against
micro-resolution stored value and missed. Local (macOS, micro-only)
hid the bug.

Truncate to ChronoUnit.MICROS at write time in the one repo test that
exercises the watermark round-trip.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mkuchenbecker added a commit that referenced this pull request May 21, 2026
## Optimizer Stack

| PR | Content |
|---|---|
| #527 | Data Model |
| #530 **(this)** | Database
Repos |
| #531 | REST service |
| #533 | Analyzer app |
| #534 | Scheduler app |
| #tbd | Spark BatchedOFD app |
| #tbd | Infra, docker-compose, smoke test |

## Summary

PR 1 of N in the optimizer stack. 
[Overall
Project](https://docs.google.com/document/d/1oGQWkmlVw0HG-D4Nx37q0oUEQd53Ni4q5Q7h1voaZIQ/edit?tab=t.0#heading=h.vtl818e4m9f7)
[Service Design
doc](https://docs.google.com/document/d/1xYOD7iuPjZO05UWfT1Dkmf4lYNw4iOjqY10UT2X1FAg/edit?tab=t.0).


Spring Data JPA repositories for all four optimizer tables with filtered
query support, plus tests exercising save/find, filtered queries, upsert
semantics, and append-only history.

## Changes

- [ ] Client-facing API Changes
- [ ] Internal API Changes
- [ ] Bug Fixes
- [x] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [ ] Refactoring
- [ ] Documentation
- [x] Tests

**Repositories**: `TableOperationsRepository`,
`TableOperationsHistoryRepository`, `TableStatsRepository`,
`TableStatsHistoryRepository` — each with JPQL filtered query methods.

**Tests**: Repository tests for all four tables plus
`OptimizerServiceContextTest` verifying the Spring context loads.

## Testing Done

- [ ] Manually Tested on local docker setup. Please include commands
ran, and their output.
- [x] Added new tests for the changes made.
- [ ] Updated existing tests to reflect the changes made.
- [ ] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

`./gradlew :services:optimizer:test` — all tests pass (H2 in MySQL
mode).

# Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [x] Large PR broken into smaller PRs, and PR plan linked in the
description.

---------

Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the base branch from mkuchenb/optimizer-1 to main May 21, 2026 22:42
All four list-style endpoints now require a caller-supplied limit. No
server-side default, no max — getting the API contract right comes first;
bounds can land separately.

- TableOperationsController.listTableOperations: @RequestParam int limit
- TableStatsController.listTableStats: @RequestParam int limit
- TableStatsController.getStatsHistory: drop defaultValue="100"
- TableOperationsHistoryController.getHistory: drop defaultValue="100"
- OptimizerDataService: listTableOperations / listTableStats gain int limit
- OptimizerDataServiceImpl: drop @value("${optimizer.repo.default-limit}")
  and the defaultLimit field; thread caller-supplied limit straight to
  PageRequest.of(0, limit), which cascades to MySQL LIMIT n.
- application.properties: remove now-unused optimizer.repo.default-limit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@abhisheknath2011
Copy link
Copy Markdown
Member

abhisheknath2011 commented May 22, 2026

Optimizer Stack

PR Content
#527 Data Model
#530 Database Repos
#531 (this) REST service
#533 Analyzer app
#534 Scheduler app
#tbd Spark BatchedOFD app
#tbd Infra, docker-compose, smoke test

Summary

PR 2 of N in the optimizer stack. Overall Project Service Design doc.

Service layer and REST controllers for the optimizer service, plus the apps/optimizer shared module providing lightweight entity/repo copies for the analyzer and scheduler apps.

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

Service layer: OptimizerDataService interface and OptimizerDataServiceImpl — CRUD operations, complete-operation lifecycle, stats upsert with history double-write, filtered queries.

Controllers: TableOperationsController, TableOperationsHistoryController, TableStatsController — REST endpoints per the design doc API spec.

Shared module (apps/optimizer): Lightweight entity and repository copies used by the analyzer and scheduler apps to read optimizer state directly from MySQL.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

H2 integration tests in OptimizerDataServiceImplTest (5 tests):

  • completeOperation_writesHistoryFromOperationRow — saves SCHEDULED row, completes it, asserts history DTO fields
  • completeOperation_notFound_returnsEmpty — completes nonexistent ID, asserts empty
  • upsertTableStats_createsNewRow — upserts new table, asserts DTO and repo row
  • upsertTableStats_updatesExistingRow — upserts twice, asserts overwrite with single row
  • upsertTableStats_appendsHistoryOnEveryCall — upserts twice, asserts 2 history rows
./gradlew :services:optimizer:test
# BUILD SUCCESSFUL — all 25 tests pass (repo tests from PR 1 + 5 new service tests)

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

The PR description looks good with all the design docs including BDP ticket and make the PR comprehensive. But people outside of LinkedIn would not be able to open the links. We are following the convention of not linking internal tickets and doc in the OSS repo. I thought about this in the previous two PRs which are already merged, but did not mention about this. Now I am thinking should we put the internal references, as I am also in the process of raising PRs. Let me know your thoughts on this.

@mkuchenbecker mkuchenbecker changed the title [BDP-102028] feat(optimizer): [2/N] Optimizer REST Service and Controller feat(optimizer): [2/N] Optimizer REST Service and Controller May 22, 2026
@mkuchenbecker
Copy link
Copy Markdown
Collaborator Author

The PR description looks good with all the design docs including BDP ticket and make the PR comprehensive. But people outside of LinkedIn would not be able to open the links. We are following the convention of not linking internal tickets and doc in the OSS repo. I thought about this in the previous two PRs which are already merged, but did not mention about this. Now I am thinking should we put the internal references, as I am also in the process of raising PRs. Let me know your thoughts on this.

agreed and removed up the stack.

Every optimizer endpoint now returns a uniform ApiError body
({code, message, path}) on non-2xx. Reshape updateOperation so the
operationId lives in the URL (and is repeated in the body for
self-describing payloads).

- api/error/ApiError.java — DTO shape.
- api/error/GlobalExceptionHandler.java — @RestControllerAdvice mapping
  framework exceptions to {VALIDATION_ERROR, INVALID_PARAMETER,
  MISSING_PARAMETER, MALFORMED_REQUEST, INTERNAL_ERROR}. Reason of
  ResponseStatusException is parsed as "CODE: message" for endpoint-
  specific 404s.
- Controllers: orElseThrow(ResponseStatusException) replaces bare 404.
  TableOperationsController moves updateOperation to
  POST /v1/optimizer/operations/{id}/update; rejects with 400
  PATH_BODY_MISMATCH when body.operationId != path.id.
- UpdateOperationRequest: @notblank operationId, @NotNull status.
- UpsertTableStatsRequest: @notblank databaseName, tableName.
- spring-boot-starter-validation dep added.
- New ControllerErrorHandlingTest: 13 MockMvc cases covering 404 / 400
  validation / 400 type-mismatch / 400 missing-param / 400 malformed-
  body / 400 path-body-mismatch + happy-path sanity.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mkuchenbecker and others added 4 commits May 22, 2026 10:47
Address review comments on PR #596 — minimal diff, no bean validation,
no per-framework-exception handlers, scoped to the optimizer controllers.

- @RestControllerAdvice scoped to the three optimizer controllers via
  assignableTypes; no longer global.
- Two handlers only: ResponseStatusException → ApiError with code =
  status.name() + reason as message; Exception → 500 INTERNAL_ERROR.
  Drop MethodArgumentNotValidException, MethodArgumentTypeMismatchException,
  MissingServletRequestParameterException, HttpMessageNotReadableException —
  Spring's defaults handle those.
- Drop the "CODE: message" reason-parsing convention.
- Drop @notblank / @NotNull on UpdateOperationRequest and
  UpsertTableStatsRequest; drop @Valid on controllers; drop
  spring-boot-starter-validation dep. Validate operationId / status server-
  side in TableOperationsController.updateOperation — loose-coupling so
  relaxing required fields later doesn't break wire callers.
- String.format throughout; no message concatenation.
- ControllerErrorHandlingTest trimmed from 13 cases to 7: only what the
  controllers actually own (404s, server-side validation on updateOperation,
  happy-path sanity). Framework-level 4xx left to Spring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ng defaults

The custom advice was producing a body shape (ApiError {code, message,
path}) that duplicated Spring Boot's default error JSON ({timestamp,
status, error, message, path}). The only substantive difference was
that Spring Boot 2.7 omits the `message` field by default.

Replace the custom advice with a one-line config:
  server.error.include-message=always

Now ResponseStatusException reasons (e.g. "no operation with id X")
reach the caller via Spring's default error body, no custom code.

- Delete api/error/GlobalExceptionHandler.java
- Delete api/error/ApiError.java
- application.properties: server.error.include-message=always
- ControllerErrorHandlingTest assertions trimmed to status-code-only
  (MockMvc does not trigger Spring's error-dispatch to
  BasicErrorController, so body assertions cannot be made in tests
  even though the body is populated on real HTTP requests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per review — no change needed on this file. The path/body validation
lives in the controller; the DTO carries the same fields as before with
the existing javadoc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mkuchenbecker and others added 2 commits May 22, 2026 12:29
Address review on PR #531 — each endpoint lists the HTTP response codes
it actually returns, in the same "Resource ACTION: STATUS" style used
by services/tables.

Codes per endpoint:
- POST /operations/{id}/update — 201, 400, 404
- GET  /operations/{id}        — 200, 404
- GET  /operations             — 200, 400
- POST /operations-history     — 201
- GET  /operations-history/{u} — 200, 400
- PUT  /stats/{u}              — 200
- GET  /stats/{u}              — 200, 404
- GET  /stats                  — 200, 400
- GET  /stats/{u}/history      — 200, 400

Annotations only — no runtime behavior change, no new tests required.
swagger-annotations 2.1.11 is already on the optimizer classpath via
openhouse.springboot-conventions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@abhisheknath2011 abhisheknath2011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mkuchenbecker. The initial set of APIs look good. We can iterate and update as needed.

@mkuchenbecker mkuchenbecker merged commit 78ba0ab into main May 22, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants