HDDS-10611. Design document for MPU GC Optimization#9793

Merged
spacemonkd merged 12 commits into apache:master from spacemonkd:HDDS-10611-design
Mar 10, 2026

@spacemonkd
Contributor

What changes were proposed in this pull request?

HDDS-10611. Design document for MPU GC Optimization

Please describe your PR in detail:
This PR adds the design doc for reducing the GC pressure on the OM caused by MPU file handling.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10611

How was this patch tested?

N/A

Contributor

@ivandika3 ivandika3 left a comment

Thanks @devabhishekpal for the design on this long overdue issue. I am +1 on the overall direction. Left some comments.

@ivandika3 ivandika3 requested a review from szetszwo February 21, 2026 09:24
@spacemonkd
Contributor Author

Thanks for the exhaustive review and inputs, @ivandika3.
I updated the document with the new details. Please let me know if I have misunderstood anything or if something else could be improved.

FYI, I have a sample/PoC patch created if anybody wants to check the changes.
master...devabhishekpal:ozone:HDDS-10611

@spacemonkd spacemonkd requested a review from ivandika3 February 22, 2026 13:47
Contributor

@errose28 errose28 left a comment

Thanks for the design @devabhishekpal @rakeshadr. Overall LGTM, just a few things we can clarify in the doc.

@spacemonkd spacemonkd requested a review from errose28 February 25, 2026 18:23
@spacemonkd
Contributor Author

@ivandika3 @errose28 could you take another look at the current design doc?

Contributor

@ivandika3 ivandika3 left a comment

Thanks for the update and the effort. LGTM +1.

Since we are introducing a flattened schema, we need to be mindful that the RocksDB tombstone issue might arise if there are many MPU aborts with a large number of parts. So it might be important to add this new MPU metadata table to the list of compacted tables (OZONE_OM_COMPACTION_SERVICE_COLUMNFAMILIES_DEFAULT).
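
To make the tombstone concern concrete, here is a minimal in-memory sketch, not Ozone code. It assumes a flattened part table keyed as `<multipartKey>/<zero-padded part number>`; the class name, key layout, and method names are all hypothetical. A sorted `TreeMap` stands in for RocksDB's ordered keyspace, and the counter shows that aborting an upload with N parts issues N row deletes, each of which would leave a tombstone in RocksDB until compaction runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

/** Hypothetical in-memory model of a flattened MPU part table; not Ozone code. */
final class MpuAbortTombstoneSketch {
    // A sorted map stands in for RocksDB's ordered keyspace.
    private final TreeMap<String, String> partTable = new TreeMap<>();
    // Counts deletes issued so far; in RocksDB each one would leave a tombstone.
    private int deletesIssued = 0;

    /** Assumed key layout: "<multipartKey>/<zero-padded part number>". */
    static String partKey(String multipartKey, int partNumber) {
        return String.format(Locale.ROOT, "%s/%05d", multipartKey, partNumber);
    }

    void commitPart(String multipartKey, int partNumber, String partName) {
        partTable.put(partKey(multipartKey, partNumber), partName);
    }

    /** Deletes every part row of one upload and returns how many rows were removed. */
    int abort(String multipartKey) {
        String from = multipartKey + "/";
        // '0' is the character right after '/', so this bounds the prefix range.
        String to = multipartKey + "0";
        List<String> keys =
            new ArrayList<>(partTable.subMap(from, true, to, false).keySet());
        for (String key : keys) {
            partTable.remove(key);
            deletesIssued++;
        }
        return keys.size();
    }

    int deletesIssued() {
        return deletesIssued;
    }

    int remainingRows() {
        return partTable.size();
    }

    public static void main(String[] args) {
        MpuAbortTombstoneSketch table = new MpuAbortTombstoneSketch();
        for (int part = 1; part <= 1000; part++) {
            table.commitPart("/vol/bucket/key/upload-1", part, "part-" + part);
        }
        System.out.println("rows deleted by abort: " + table.abort("/vol/bucket/key/upload-1"));
    }
}
```

With the pre-existing single-row layout, the same abort would remove one row per upload, which is why the per-part layout makes regular compaction of the new table more important.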

Contributor

@errose28 errose28 left a comment

Looks good, thanks @devabhishekpal for the write-up and iterative improvements. The only open comment I see is this one on using deprecated labels for proto fields but we can make that decision in the implementation.

@spacemonkd
Contributor Author

Hi Ethan, thanks for the approval. I had forgotten about this. However, we have decided to move ahead with approach 1, which plays better with backward-compatibility concerns.

The related protobuf changes have already been merged, and there we have marked the fields as deprecated where applicable.
Ref #9867, which introduces the protobuf changes.

@spacemonkd
Contributor Author

Thanks for the reviews and inputs @errose28 @ivandika3 @ChenSammi @jojochuang.
Merging this design doc.

@spacemonkd spacemonkd merged commit b5db27d into apache:master Mar 10, 2026
16 checks passed
* write `multipartPartTable[multipartPartKey] = OmMultipartPartInfo{openKey, partName, partNumber, dataSize, modificationTime, objectID, updateID, metadata, keyLocationList, fileEncryptionInfo?, fileChecksum?}`
* keep current part open key in `openKeyTable` (needed later by list/complete/abort),
* if overwriting an existing part row, delete old part open key and adjust quota.
* `multipartInfoTable[multipartKey]` is still updated for metadata/updateID.
Contributor

If updating the multipartInfoTable is still necessary here, can we retain some information from the partKeyInfoList, such as the part name and part number, but remove the partKeyInfo? This way, future requests can avoid scanning the multipartPartTable and only need to query a single key.
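
The suggestion above could look roughly like the following sketch. It is an illustration under stated assumptions, not the design's actual schema: the class `MpuPartSummarySketch`, the nested `PartInfo` type, the key layout, and the method names are all hypothetical, and plain Java maps stand in for the two tables. Each part commit writes the full record to the per-part table and only `partNumber -> partName` to the per-upload row, so listing the parts of an upload needs a single key lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of keeping a lightweight part summary per upload; not Ozone code. */
final class MpuPartSummarySketch {
    /** Subset of the per-part fields quoted above, for illustration only. */
    static final class PartInfo {
        final String partName;
        final int partNumber;
        final long dataSize;

        PartInfo(String partName, int partNumber, long dataSize) {
            this.partName = partName;
            this.partNumber = partNumber;
            this.dataSize = dataSize;
        }
    }

    // Stand-in for multipartPartTable: one row per part, full record.
    private final TreeMap<String, PartInfo> multipartPartTable = new TreeMap<>();
    // Stand-in for multipartInfoTable: one row per upload, partNumber -> partName only.
    private final Map<String, TreeMap<Integer, String>> multipartInfoTable = new HashMap<>();

    /** Assumed key layout: "<multipartKey>/<zero-padded part number>". */
    private static String partKey(String multipartKey, int partNumber) {
        return String.format(Locale.ROOT, "%s/%05d", multipartKey, partNumber);
    }

    void commitPart(String multipartKey, PartInfo part) {
        multipartPartTable.put(partKey(multipartKey, part.partNumber), part);
        multipartInfoTable
            .computeIfAbsent(multipartKey, k -> new TreeMap<>())
            .put(part.partNumber, part.partName);
    }

    /** Lists part numbers from the per-upload row alone, without scanning part rows. */
    List<Integer> listPartNumbers(String multipartKey) {
        TreeMap<Integer, String> summary = multipartInfoTable.get(multipartKey);
        return summary == null ? new ArrayList<>() : new ArrayList<>(summary.keySet());
    }

    /** Fetches the full record for one part with a point lookup. */
    PartInfo getPart(String multipartKey, int partNumber) {
        return multipartPartTable.get(partKey(multipartKey, partNumber));
    }
}
```

One trade-off to weigh: the per-upload row still grows with the number of parts, although each entry is much smaller than a full part key info record.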
