Skip to content

#2879 - Remove Associated Storage S3 Objects During Record Deletion#2896

Merged
corsacca merged 9 commits intoDiscipleTools:developfrom
kodinkat:2879-remove-associated-storage-s3-objects-during-record-deletion
Mar 4, 2026
Merged

#2879 - Remove Associated Storage S3 Objects During Record Deletion#2896
corsacca merged 9 commits intoDiscipleTools:developfrom
kodinkat:2879-remove-associated-storage-s3-objects-during-record-deletion

Conversation

@kodinkat
Copy link
Copy Markdown
Collaborator

…deletion

                - Implemented a new private method to remove S3-backed storage objects associated with a post, including image fields in post meta, comment attachments, and user meta.
                - Integrated this method into the delete_post function to ensure storage cleanup occurs during post deletion.
                - Added checks for the availability of DT_Storage_API to prevent errors if the API is not enabled.
@github-actions
Copy link
Copy Markdown

Code Review

Overall the approach is well-structured and the placement of purge_post_storage_objects before the database deletions is correct—S3 keys are fetched while the data still exists, then the DB rows are deleted. The guard clause structure and consistency with the existing delete_comment S3 cleanup logic (same audio_url/image_url meta keys) are both good.

A few issues worth addressing:


1. Silent failure on S3 delete errors (medium)

DT_Storage_API::delete_file() returns null on exception and false when no client is available. Currently all return values are discarded, so if an object fails to delete the post record is still removed from the DB with no trace of the orphaned S3 object.

// current
DT_Storage_API::delete_file( $storage_key );

// suggested: at minimum log failures so they can be investigated
$result = DT_Storage_API::delete_file( $storage_key );
if ( $result === null ) {
    dt_write_log( "purge_post_storage_objects: failed to delete S3 object '{$storage_key}' for post {$post_id}" );
}

This is a best-effort cleanup (acceptable), but without logging there is no way to audit or recover orphaned objects.


2. Empty-value check inconsistency (minor)

The post-meta section uses !empty( $storage_key ) but the dt_post_user_meta section uses $meta_value === '' (strict empty-string comparison). These behave differently for null, '0', etc. S3 keys will never be those values in practice, but aligning them to the same check avoids confusion:

// inconsistent
if ( !empty( $storage_key ) ) { ... }          // post meta
if ( empty( $meta_key ) || $meta_value === '' ) { continue; }  // post_user_meta

// consistent
if ( empty( $meta_key ) || empty( $meta_value ) ) { continue; }

3. No test coverage

The new method has no corresponding tests. The existing test suite already covers delete_post (e.g., tests/unit-test-api.php). A test that mocks or stubs DT_Storage_API and verifies that delete_file is called with the correct keys when a post with image fields is deleted would be valuable, especially to prevent regressions if the S3 purge logic is changed later.


4. Performance note (low priority / informational)

Each file is deleted via a separate deleteObject API call. For records with many image fields or many comments, this will be noticeably slow since deletions happen synchronously before the method returns. S3 supports batch deletion (DeleteObjects, up to 1000 keys per request) which would reduce round-trips. Worth considering if large-scale deletions are expected, but probably not urgent for now.


What looks good

  • Guard clause (class_exists + method_exists + is_enabled) is correct and mirrors the pattern used in delete_comment.
  • Hardcoded audio_url / image_url meta keys match what the existing single-comment deletion logic uses (dt-posts.php:1412–1416), so coverage is consistent.
  • array_unique( array_filter( $comment_storage_keys ) ) correctly deduplicates and filters empty values before looping.
  • get_field_key_from_meta reuse for dt_post_user_meta rows is appropriate.
  • The dt_post_user_meta query uses $wpdb->dt_post_user_meta consistent with the rest of the codebase.

                - Updated the purge_post_storage_objects method to log failures when deleting S3 objects associated with posts, ensuring better tracking of issues during storage cleanup.
                - Added checks for the result of DT_Storage_API::delete_file to log a message if the deletion fails, improving the robustness of the storage management process.
@github-actions
Copy link
Copy Markdown

Code Review

Thanks for addressing issue #2879. The overall approach is correct—reading S3 keys from the DB before deleting the DB records ensures the data needed for cleanup is available. Here are some issues worth addressing:


Bug: Incomplete failure detection in delete_file() return value check

DT_Storage_API::delete_file() has three possible return values:

  • An array on success
  • null on exception (S3 API error)
  • false when the S3 client could not be built

The PR only logs a warning when $result === null, silently swallowing the false case. While is_enabled() is checked at the top, there are scenarios where build_client_and_config() can fail even when storage appears enabled, and those failures would go unlogged.

The existing pattern in dt-posts-endpoints.php (line 966) handles this more robustly:

 = ['file_deleted'] ?? false;

Suggest replacing all three if ( $result === null ) checks with:

if ( \!is_array( $result ) || empty( $result['file_deleted'] ) ) {
    dt_write_log( ... );
}

Hardcoded comment meta keys are fragile

The query filters on meta_key IN ( 'audio_url', 'image_url' ). These are magic strings. If the comment system adds additional attachment types in the future (e.g., video_url, file_url), this code silently misses them. Consider defining these as a constant or providing an apply_filters() hook so plugins and future code can extend the list:

$comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );

No extensibility hook for the overall cleanup

There is no action hook that allows plugins or child post types to register additional S3 cleanup logic. Given that DT is a platform with plugins, a do_action( 'dt_purge_post_storage_objects', $post_type, $post_id ) at the end of the method would be a useful affordance.


Minor: global $wpdb; declared but only used conditionally

global $wpdb; is declared unconditionally at the top of the method. If is_array( $field_settings ) is false and there are no comment meta or post user meta rows, this global is never used. Not a bug, but a slight style nit.


Behaviour question: silent failure on S3 deletion

If DT_Storage_API::delete_file() fails for some keys, the post is still deleted from the database, leaving orphaned S3 objects. The PR logs these failures but doesn't surface them to the caller. Is this intentional? If so, it should be documented in the docblock (e.g., "S3 deletion failures are logged but do not block record deletion"). If orphaned objects are a concern, a dt_post_deleted hook listener could be used to retry or notify an admin.


Missing test coverage

No PHPUnit tests are included. Given that this is a data-deletion path and the existing test suite covers delete_post behaviour, it would be valuable to add at least a test verifying that:

  1. purge_post_storage_objects is a no-op when DT_Storage_API is not available.
  2. The correct S3 keys are collected and passed to delete_file() when image fields exist.

A mock/stub of DT_Storage_API would keep this within the existing PHPUnit setup.


Summary: The logic is correct and the ordering (purge before DB deletion) is right. The main actionable items are the incomplete $result === null check (which can miss false returns) and the hardcoded comment meta keys. The rest are extensibility/documentation suggestions.

@kodinkat
Copy link
Copy Markdown
Collaborator Author

@corsacca review ready....

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 2, 2026

Code Review

Thanks for this PR! Cleaning up S3 objects on post deletion is important for avoiding orphaned storage. Overall the approach is solid, but there are a few issues to address.


Bug: Empty filter array causes invalid SQL

In purge_post_storage_objects, if the dt_post_comment_storage_meta_keys filter returns an empty array (e.g. a plugin removes all keys), this code produces invalid SQL:

$comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
$placeholders = implode( ', ', array_fill( 0, count( $comment_storage_meta_keys ), '%s' ) );
// If empty: AND cm.meta_key IN ( )  <- invalid SQL, causes DB error

Add a guard before the query:

$comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
if ( !empty( $comment_storage_meta_keys ) ) {
    // run the comment meta query
}

Issue: S3 deletion happens before DB deletion (ordering risk)

purge_post_storage_objects is called before the database rows are deleted. If any of the subsequent $wpdb->query() calls fail (e.g. dt_notifications, p2p, comments), the S3 objects are permanently gone but the DB records still exist pointing to deleted objects.

Consider calling purge_post_storage_objects after all DB deletions succeed, or at least after the main post record is deleted. The current dt_post_deleted action hook at the end of delete_post might be a better home for this logic:

// After all DB deletes, before/within dt_post_deleted hook
self::purge_post_storage_objects( $post_type, $post_id );
do_action( 'dt_post_deleted', $post_type, $post_id, $post_title );

Performance: Filter called inside a loop in dt-posts.php

In dt-posts/dt-posts.php, the new filter call is inside the foreach ( $comments_meta as $meta ) loop:

foreach ( $comments_meta as $meta ) {
    // ...
    $comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] ); // called every iteration
    if ( in_array( $meta->meta_key, $comment_storage_meta_keys ) && ... ) {

This runs apply_filters once per comment meta row. Move it outside the loop:

$comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
foreach ( $comments_meta as $meta ) {
    // ...
    if ( in_array( $meta->meta_key, $comment_storage_meta_keys ) && ... ) {

Minor: Incomplete failure logging for delete_file

DT_Storage_API::delete_file() returns false when the S3 client can't be built (e.g. missing class) and null on exception. The current code only logs the null case:

if ( $result === null ) {
    dt_write_log( ... );
}

While the is_enabled() guard at the top reduces the chance of a false return, it's worth using a unified failure check for robustness:

if ( !$result ) { // catches both false and null
    dt_write_log( ... );
}

Minor: Use null coalescing operator

In the dt_post_user_meta loop:

// Current
$meta_key   = isset( $row['meta_key'] ) ? $row['meta_key'] : '';
$meta_value = isset( $row['meta_value'] ) ? $row['meta_value'] : '';

// Preferred (ARRAY_A always sets these keys)
$meta_key   = $row['meta_key'] ?? '';
$meta_value = $row['meta_value'] ?? '';

Good things worth noting

  • The class_exists + method_exists + is_enabled() guard is a solid defensive check for optional functionality.
  • Handling thumbnails automatically via delete_file (which already deletes _thumbnail and _large_thumbnail variants) is clean — no extra work needed in the new method.
  • The docblock is clear and helpful.
  • Using array_unique( array_filter( ... ) ) to deduplicate and strip empty values before the S3 delete loop is a nice touch.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 2, 2026

Code Review

The approach is sound: purging S3 objects before the DB rows are deleted ensures the storage keys are still retrievable when needed. The three-phase cleanup (post meta → comment meta → dt_post_user_meta) is thorough, and the defensive guard (class_exists + method_exists + is_enabled()) is the right pattern for optional functionality.

A few issues are worth addressing before merge:


1. apply_filters called inside the loop (dt-posts.php) — medium

The new $comment_storage_meta_keys variable is assigned inside the foreach body of get_post_comments_meta, so apply_filters() is invoked on every meta row. The value is invariant across iterations and should be hoisted above the loop.

// current — filter runs once per meta row
foreach ( $comments_meta as $meta ) {
    $meta_value = $meta->meta_value;
    $comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
    if ( in_array( $meta->meta_key, $comment_storage_meta_keys ) && ... ) { ... }
}

// preferred — compute once
$comment_storage_meta_keys = apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
foreach ( $comments_meta as $meta ) {
    $meta_value = $meta->meta_value;
    if ( in_array( $meta->meta_key, $comment_storage_meta_keys ) && ... ) { ... }
}

2. delete_file failure check misses the false return — minor

DT_Storage_API::delete_file() returns:

  • an array on success
  • null on a caught exception
  • false when build_client_and_config() can't build the S3 client

The current check $result === null only logs the exception case. Although the is_enabled() guard at the top makes a false return unlikely, it is still possible if the S3 client fails to initialise after the gate check. A unified failure check is more robust:

// current
if ( $result === null ) { dt_write_log( ... ); }

// preferred
if ( !$result ) { dt_write_log( ... ); }

3. delete_post_comment not updated to use the new filter — minor

The new filter dt_post_comment_storage_meta_keys was added to get_post_comments_meta to make the set of storage-backed comment meta keys extensible. However, delete_post_comment (dt-posts.php:1410–1416) still hard-codes the check for audio_url/image_url. A plugin that adds a custom storage meta key via this filter will have it cleaned up on record deletion (via the new method) and on display (via get_post_comments_meta), but not on individual comment deletion. Keeping the three call-sites consistent would be the complete fix.


4. Partial atomicity risk — low / informational

purge_post_storage_objects is called before the DB deletions. If S3 deletions succeed but a subsequent DB query fails (unlikely but possible under DB load or a plugin hook that throws), the DB rows survive pointing to objects that no longer exist in S3. The reverse is already handled — S3 failures are non-fatal and the DB deletions proceed. Worth noting in a code comment so future maintainers understand the intentional trade-off.


5. Missing phpcs:ignore comment on dt_post_user_meta query — minor

The comment meta query has a // phpcs:ignore annotation explaining why the interpolation is safe. The dt_post_user_meta query below it uses the same interpolation pattern (FROM $wpdb->dt_post_user_meta) but lacks the annotation. PHPCS may flag it on CI.


What looks good

  • Correct placement of the call (before DB deletions, after dt_before_post_deleted action).
  • Thumbnails are automatically cleaned up inside DT_Storage_API::delete_file, so no extra work is needed here.
  • array_unique( array_filter( ... ) ) in the comment meta loop is a nice defensive touch.
  • The docblock clearly explains the three storage locations being purged.
  • No permissions are re-checked inside the private method — the public delete_post entry point already handles that, which is correct.

                - Enhanced the error handling for S3 object deletions in both the  and comment processing methods, ensuring that failures are logged correctly.
                - Introduced a filter for comment storage meta keys to streamline the deletion process for associated audio and image URLs.
                - Improved code readability and maintainability by consolidating repeated logic for checking and deleting S3 objects.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 3, 2026

Code Review

This PR correctly addresses #2879 by cleaning up S3 storage objects before the database records are deleted (so keys are still readable). The overall structure is solid and the changes in dt-posts.php are a clear improvement over the original hardcoded audio_url/image_url checks.


What is good

  • Correct ordering: purge_post_storage_objects runs after dt_before_post_deleted but before any DELETE queries, so S3 keys are still readable.
  • Failure detection: The !is_array($result) || empty($result['file_deleted']) check correctly covers all three return states of delete_file() (array on success, null on exception, false when client unavailable).
  • Extensibility: The dt_post_comment_storage_meta_keys filter lets plugins/child themes register additional comment-meta keys without forking core.
  • Strict in_array: Adding true as the third argument in get_post_comments_meta is the right fix.
  • Deduplication: array_unique(array_filter($comment_storage_keys)) is correct for the comment-meta section.
  • Guard clause: class_exists + method_exists + is_enabled() mirrors the pattern used elsewhere in the codebase.

Issues worth addressing

1. N+1 queries in the post-meta image loop (performance)

get_post_meta($post_id, $field_key, true) is called once per image field inside the loop, firing a separate DB query for every image field. A single get_post_meta($post_id) call returns all meta in one query:

$all_post_meta = get_post_meta( $post_id ); // one query upfront
foreach ( $field_settings as $field_key => $settings ) {
    if ( isset( $settings['type'] ) && $settings['type'] === 'image' ) {
        $storage_key = $all_post_meta[ $field_key ][0] ?? '';
        if ( !empty( $storage_key ) ) {
            // ... delete logic
        }
    }
}

The format from get_post_meta($post_id) (no third argument) is ['field_key' => ['value', ...]], so [0] gives the first value, equivalent to get_post_meta($post_id, $field_key, true).

2. dt_post_user_meta query fetches all rows, then does O(n x m) field lookup

The query retrieves every user-meta row for the post regardless of field type, then calls get_field_key_from_meta() on each one. That function iterates all field settings twice (once for direct match, once for link key scan). For a post type with many fields and many users this could become expensive.

A lighter approach: collect image-type field keys first, then add a WHERE meta_key IN (...) clause to the SQL, similar to how comment meta keys are already filtered.

3. No unit tests for purge_post_storage_objects

This method has no test coverage. Given that it is a destructive, side-effectful operation called during delete_post, a regression could silently orphan S3 objects. A test that stubs DT_Storage_API and verifies delete_file is called with the correct keys for a post with image fields and comments would be valuable.

4. Filter applied in three independent locations

apply_filters('dt_post_comment_storage_meta_keys', ...) is now called independently in delete_post_comment, get_post_comments_meta, and purge_post_storage_objects. If a filter callback has side effects or returns state-dependent results the three call-sites could diverge. A shared static helper would ensure consistency and make the default list a single source of truth:

private static function get_comment_storage_meta_keys(): array {
    return apply_filters( 'dt_post_comment_storage_meta_keys', [ 'audio_url', 'image_url' ] );
}

Summary

The core logic is correct and this is a meaningful improvement. Main things to consider before merging:

  1. Replace the get_post_meta loop with a single upfront call (easy win, fewer DB queries).
  2. Add at least a basic test covering the happy path for purge_post_storage_objects.
  3. Consider a shared helper for dt_post_comment_storage_meta_keys to keep the three call-sites consistent.

The dt_post_user_meta O(n x m) issue is worth a follow-up but is unlikely to cause problems at typical scale.

…etion logic

                - Introduced a new method, get_comment_storage_meta_keys, to centralize the retrieval of comment meta keys backed by storage, enhancing code maintainability.
                - Updated the purge_post_storage_objects method to utilize the new retrieval method, streamlining the process of deleting S3 objects associated with comments.
                - Improved SQL query preparation for fetching comment storage keys, ensuring better performance and security.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 3, 2026

Code Review — PR #2879: Remove Associated Storage S3 Objects During Record Deletion

Good addition overall. Cleaning up S3 objects before DB records are deleted is the right approach, and the extensibility points (filter + action) are solid design. Here are my observations:


Issues

1. Hook ordering risk (medium)

purge_post_storage_objects is called after do_action('dt_before_post_deleted', ...). If any plugin hooked into dt_before_post_deleted modifies or clears post meta or comment meta, those storage objects will be silently orphaned.

The docstring says "The purge runs before database deletes so that storage keys can be read" — this is true for the DB deletes, but not for the dt_before_post_deleted hook. Consider:

  • Moving purge_post_storage_objects to before do_action('dt_before_post_deleted'), or
  • Documenting the constraint: hooks in dt_before_post_deleted must not delete storage-key meta if they want proper S3 cleanup.

2. DT_Posts:: vs self:: (minor)

In purge_post_storage_objects, the line:

$field_settings = DT_Posts::get_post_field_settings( $post_type );

Since purge_post_storage_objects is a static method of DT_Posts, this should use self::get_post_field_settings( $post_type ) for consistency with the rest of the class.

3. Visibility inconsistency (minor)

get_comment_storage_meta_keys() is protected (enabling subclass override of the key list), but purge_post_storage_objects() is private (preventing subclass override of the purge logic). If the intent is to let subclasses customize storage cleanup, purge_post_storage_objects should also be protected. If not, get_comment_storage_meta_keys can be private.

4. dt_post_user_meta fetches all rows, filters in PHP (minor)

This fetches all user-meta rows for the post and then filters by image-type field in PHP. At delete time this is acceptable, but it could be more efficient by pre-computing the set of image field keys and adding AND meta_key IN (...) to the SQL. Low priority since posts with many user-meta rows are uncommon.

5. class_exists check inconsistency (minor)

purge_post_storage_objects defensively guards with class_exists('DT_Storage_API'), but the modified delete_post_comment in dt-posts.php still calls DT_Storage_API::is_enabled() directly without that guard (inherited from original code, not introduced here). Worth aligning the patterns for consistency.


Positive notes

  • Correct ordering: S3 purge runs before DB deletes, so keys are still readable.
  • apply_filters('dt_post_comment_storage_meta_keys', ...) with a safe non-array fallback is clean and extensible.
  • do_action('dt_purge_post_storage_objects', ...) at the end lets plugins clean up their own storage objects.
  • Error logging via dt_write_log on failed deletions is the right approach for best-effort cleanup.
  • class_exists / method_exists guard in purge_post_storage_objects is more defensive than the existing pattern.
  • $wpdb->prepare() is correctly used for all SQL; the phpcs:ignore comment is properly scoped and explained.
  • array_unique(array_filter(...)) before iterating comment storage keys avoids redundant API calls.

Summary

The core logic is sound. The main thing worth addressing before merge is issue #1 (hook ordering) — the rest are style/minor improvements. No security concerns identified.

kodinkat added 2 commits March 3, 2026 11:08
…ools-theme into 2879-remove-associated-storage-s3-objects-during-record-deletion
                - Improved the SQL query construction for fetching comment storage keys by consolidating the query into a single prepared statement, enhancing performance and security.
                - Added a PHPCS ignore comment to clarify the safety of the constructed placeholders.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 3, 2026

Code Review

This PR correctly addresses #2879. The three-phase purge (post meta → comment meta → dt_post_user_meta) is thorough, and the placement before DB deletes ensures keys are still readable. It's clear previous feedback has been incorporated — the result-checking pattern (\!is_array($result) || empty($result['file_deleted'])), the empty-array SQL guard, moving the filter call outside the loop, and the phpcs:ignore annotations are all solid.

A few remaining items worth addressing before merge:


1. Visibility inconsistency — minor

get_comment_storage_meta_keys() is protected (allowing subclass override) while purge_post_storage_objects() is private (blocking it). If the intent is to let subclasses customise which keys are treated as storage-backed, they should also be able to override the purge logic. Either both should be protected, or if no subclass extension is needed, get_comment_storage_meta_keys can be private.


2. Hook-ordering constraint not documented — minor

The docblock says "The purge runs before database deletes", which is accurate. However, the call is placed after do_action('dt_before_post_deleted', ...). A plugin hooked into dt_before_post_deleted that modifies or removes storage-key meta (e.g. to swap an S3 key before deletion) could silently orphan S3 objects. Worth adding a sentence to the docblock:

 * Note: this runs after the dt_before_post_deleted action. Plugins hooked into
 * dt_before_post_deleted must not delete or modify storage-key meta if they
 * want S3 cleanup to complete correctly.

3. Parent calling child class — minor, consistent with existing codebase

Inside Disciple_Tools_Posts::purge_post_storage_objects() (in posts.php):

$field_settings = DT_Posts::get_post_field_settings( $post_type );

get_post_field_settings is defined in the child class DT_Posts (dt-posts.php), so the parent is calling down to the child. This works at runtime and matches many existing calls in posts.php, so it's not a blocker — but using self:: would be better and is the same fix mentioned in a previous review. Since self:: in a parent method resolves at the parent level and get_post_field_settings lives in the child, using static:: would give you late-static binding and avoid naming the concrete class explicitly.


4. No unit tests — minor

The new deletion path has no test coverage. Given that purge_post_storage_objects calls DT_Storage_API::delete_file before irreversible DB deletes, a regression here could silently orphan S3 objects. A minimal PHPUnit test that stubs DT_Storage_API and asserts delete_file is called with the right keys when a post with image fields is deleted would be valuable.


What looks good

  • Correct ordering: purge before DB deletes, after dt_before_post_deleted
  • \!is_array($result) || empty($result['file_deleted']) correctly covers all three return states of delete_file()
  • \!empty($comment_storage_meta_keys) guard before the IN (...) query prevents invalid SQL
  • apply_filters('dt_post_comment_storage_meta_keys', ...) hoisted outside the foreach in get_post_comments_meta
  • get_comment_storage_meta_keys() as a shared helper eliminates the duplicated hardcoded arrays
  • do_action('dt_purge_post_storage_objects', ...) at the end gives plugins a clean extension point
  • array_unique(array_filter(...)) before iterating comment keys is a good defensive touch
  • phpcs:ignore annotations on both SQL interpolations are correctly scoped and explained

…y retrieval

                                - Disabled PHPCS warning for interpolated SQL queries to clarify the safety of constructed placeholders.
                                - Re-enabled the warning after the SQL query preparation to maintain code quality standards.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 3, 2026

PR Review: Remove Associated Storage S3 Objects During Record Deletion

Overall this is a well-structured implementation that addresses a real data hygiene gap. The approach is sound and follows existing codebase patterns. Below are my findings.


Positive aspects

  • Correct sequencing — calling purge_post_storage_objects() before the DB deletes (but after dt_before_post_deleted) is the right order so storage keys are still readable when the deletion runs. The docstring makes this clear.
  • Extensibility — both the new filter (dt_post_comment_storage_meta_keys) and the action (dt_purge_post_storage_objects) give plugins/extensions clean hooks to participate in the cleanup.
  • Defensive guard in purge_post_storage_objects — checking class_exists, method_exists, and is_enabled() at the top is robust and prevents errors on sites that don't have the storage library loaded.
  • array_unique( array_filter( ... ) ) on comment storage keys is good defensive coding.
  • Thumbnail cleanup is implicitDT_Storage_API::delete_file() already handles thumbnail variants for image extensions, so those are cleaned up automatically without extra calls here.
  • Filter return validation in get_comment_storage_meta_keys() (falls back to the default if filter returns a non-array) is a good safety net.

Issues / Suggestions

1. delete_file return value: null vs false vs array

DT_Storage_API::delete_file() has three possible return values:

  • array with file_deleted => true — success
  • null — exception during S3 call (genuine failure)
  • false — S3 client could not be constructed (configuration issue)

The failure check used throughout the new code:

if ( \!is_array(  ) || empty( ['file_deleted'] ) ) {
    dt_write_log( "... failed to delete S3 object ..." );
}

treats both null and false as failures, which is correct. In practice the early guard in purge_post_storage_objects (DT_Storage_API::is_enabled()) should prevent the false path, but a short inline comment explaining this distinction would help future readers.

2. Thumbnails for image post-meta fields not explicitly cleaned up (minor — probably fine)

For image fields in post meta, delete_file is called with the raw S3 key. Since delete_file checks the file extension and deletes thumbnail variants when the extension is an image type, this should work correctly — but only if the stored meta value is a bare S3 key (e.g., site_id/random.jpg) rather than a full URL. Looking at how get_file_url is used in get_post_comments_meta, the pattern is consistent: keys are stored, URLs are generated on read. This should be fine, but worth confirming is consistently the case for all image post-meta fields too.

3. private vs protected on purge_post_storage_objects

The method is declared private, which means subclasses of Disciple_Tools_Posts cannot override the storage purge behaviour without also overriding delete_post. If any plugin extends this class and has custom storage needs, they'd have to hook dt_purge_post_storage_objects instead. That's acceptable since the action hook covers the use case, but it's worth being intentional about the access level choice.

4. PHPCS disable comment scope

// phpcs:disable WordPress.DB.PreparedSQL.InterpolatedNotPrepared
 = ->get_col(
    ->prepare(
        "...WHERE cm.meta_key IN (  )...",
        
    )
);
// phpcs:enable WordPress.DB.PreparedSQL.InterpolatedNotPrepared

The `` variable is a safe implode of `%s` literals, so suppressing the warning is justified. This is fine. Just a nit: the scope of the suppress is tight and correct.

5. No cleanup of activity log attachments

The dt_activity_log table is deleted in delete_post, but purge_post_storage_objects doesn't scan it for storage references. If activity log entries can ever carry S3-backed attachments (audio/image), those would leak. If that's currently not possible, it's worth adding a code comment to that effect so future developers don't miss it when they add activity attachments.

6. Shared S3 keys across records

If two posts could ever reference the same S3 key (e.g., an image copied from one record to another), deleting one post would orphan the other's attachment. This is probably not possible given how uploads generate unique random keys, but if any code path copies a key reference rather than re-uploading, this would be a silent data loss. Worth confirming the upload flow always produces unique keys per post.


Summary

The core logic is correct and the implementation is clean. The main things worth addressing before merge:

  1. Confirm that stored meta values for image post-meta fields are always raw S3 keys (not full URLs).
  2. Consider whether dt_activity_log storage attachments are in scope (even if just to document they're excluded).

The filter + action hook design is a nice addition for extensibility.

@kodinkat
Copy link
Copy Markdown
Collaborator Author

kodinkat commented Mar 3, 2026

@corsacca review ready....

@corsacca
Copy link
Copy Markdown
Member

corsacca commented Mar 4, 2026

Thanks @kodinkat!

@corsacca corsacca merged commit b22ffba into DiscipleTools:develop Mar 4, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove Associated Storage S3 Objects During Record Deletion

2 participants