Skip to content

Disable delete optimization and exit ref loop faster#6219

Open
ddanielr wants to merge 2 commits intoapache:2.1from
ddanielr:bugfix/disable-delete-optimization
Open

Disable delete optimization and exit ref loop faster#6219
ddanielr wants to merge 2 commits intoapache:2.1from
ddanielr:bugfix/disable-delete-optimization

Conversation

@ddanielr
Copy link
Contributor

This PR adds a property to skip scanning the metadata table for table deletes.

DeleteMarker Creation Optimization

When the manager deletes a table it performs an optimization step by creating a batch scanner with 8 threads (not configurable) to scan all the other table file references on the metadata table and ensure that no file references are found for the given table volume.

The manager then directly deletes the volumes as opposed to writing delete markers and allowing the GC to handle the tablet file deletions.

This is a nice optimization to have when dealing with a small static set of tables. However, when table creation is dynamic these scans can cause unnecessary delays and/or hanging fate processes as all metadata tablets must be scanned in order to process a single table delete.

Unnecessary File Ref Counting

The batch scanner only needs to produce a single shared file ref result (refCount) in order to trigger delete markers to be created as the code only checks if refCount is equal to zero.

However, the existing code needlessly counts all of the found refs first.
This is unnecessary and a fast break was added to the iterator loop for the batch scanner.

When delete table is called, the delete marker code checks to see if
any file references exist in other tables.
However, only a single reference has to exist for delete markers to be
created.

Added break out of for loop once a single entry was found.
Fixed log lines and removed a `getLogger` call.
Removed a nested try block in favor of a single try-with-resources
Adds a property to allow the scan of the metadata table to be skipped
for table deletes. This forces delete markers to always be created when
deleting tables instead of the manager deleting the volumes immediately.
@ddanielr ddanielr added this to the 2.1.5 milestone Mar 14, 2026
@ddanielr
Copy link
Contributor Author

@keith-turner & @ctubbsii I'm also leaning towards just removing this delete marker optimization entirely.

I wanted to be a bit more careful about it in 2.1 but if you don't see a reason to keep this code around then I'll push a change to just remove the scan and volume deletion shortcut entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant