To reduce the size and number of files in a Delta Parquet table with a long version history and a six-month retention policy, which maintenance operation should be run?


Multiple Choice


Explanation:
In Delta Lake, every write can create new Parquet data files, and updates or deletes leave the superseded files in place so that earlier table versions remain readable. With a long version history and a six-month retention policy, many such files accumulate, and some of them are needed neither for current queries nor for time travel within the retention window. The maintenance operation that addresses this is VACUUM: it physically deletes data files that are no longer referenced by the current table version and that have aged past the retention threshold. Running VACUUM under Maintenance reclaims storage while preserving the ability to time travel within the six-month window.

The other options target different goals. Organizing data for faster queries or compacting small files improves read performance, but neither removes older, unreferenced files, so neither reduces the table's overall storage footprint. Deleting the whole table would remove all data, which isn’t appropriate here.
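As a rough illustration of the operation described above, the following Python sketch builds a Delta Lake `VACUUM` statement with an explicit retention window. It rests on several assumptions that are not in the question itself: "six months" is approximated as 180 days, the table name `sales_orders` is hypothetical, and the helpers `retention_hours`, `build_vacuum_sql`, and `vacuum_table` are invented for this example. The `vacuum_table` helper expects an existing SparkSession with Delta Lake enabled and is not called here.

```python
def retention_hours(days: int) -> int:
    """Convert a retention window in days to the hours that VACUUM expects."""
    if days < 0:
        raise ValueError("retention must be non-negative")
    return days * 24


def build_vacuum_sql(table_name: str, retention_days: int, dry_run: bool = False) -> str:
    """Build a Delta Lake VACUUM statement for the given retention window."""
    sql = f"VACUUM {table_name} RETAIN {retention_hours(retention_days)} HOURS"
    # DRY RUN lists the files that would be deleted without deleting them.
    return f"{sql} DRY RUN" if dry_run else sql


def vacuum_table(spark, table_name: str, retention_days: int) -> None:
    """Run VACUUM through an existing SparkSession (assumed; not created here)."""
    spark.sql(build_vacuum_sql(table_name, retention_days))


# Assumption: six months is treated as 180 days; the table name is hypothetical.
statement = build_vacuum_sql("sales_orders", 180)
print(statement)  # VACUUM sales_orders RETAIN 4320 HOURS
```

Passing `dry_run=True` first is a cautious way to preview which files would be removed before committing to the deletion.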
