Actions are severely delayed during the Duplicate cleanup process: deleting a set with only 2 or 3 images takes 30+ seconds, there are multiple “pinwheel” moments, and navigating between duplicate groups pauses several times mid-scroll.
Expected behavior
Responsive and informative user experiences
Steps required to reproduce
Load a large library containing duplicates (mine is >2 TB, with ~500K photos)
Select “Find Duplicate Photos” from the Library menu
Click “Delete This Set”
Observe the time it takes to complete the delete process
Observe the lag effect while moving to the next duplicate group
Operating system/Hardware used
MacBook Pro M1 Max
64 GB Memory
macOS 15.4 (24E248)
…
Note that I have a short video demonstrating the reported problems. At 480p it is only 1.5 MB, but I cannot upload it because mp4 is not an accepted extension.
Quick update: this may be related to the “Analyzing images” resource usage and not specific to duplicates. The application as a whole is very sluggish. I will post another update when Analyzing Images completes.
The just-released 1.0.0-rc.16 should improve this quite a bit. For large libraries, though, there are still some hiccups while the library catalog is being stored and while a new revision is being stored, which is triggered by the duplicate removal.
Sorry for the delayed response. Here are my observations:
RC15 - I noticed that when I paused the analysis task, the UI was dramatically more responsive, including when navigating duplicates. However, even when paused, there was a significant “think time” after selecting delete for any given duplicate.
RC16 - Analysis had completed by the time I updated to RC16, so I cannot comment on the improvements, but the aforementioned think time remains.
Naive Suggestions:
Analysis - could this be spun out to a separate, low-priority process? That would prevent it from cannibalizing resources from the higher-priority UI thread, but obviously comes with IPC and maintenance overhead…
Duplicate Think Time - This may have to do with the workspace refresh behind the duplicates dialog. Could the workspace/library refresh be paused until after all duplicate processing has been completed?
I’ll have to look into the performance behavior of the image analysis process in more detail – on the machines that I tested on, the only notable effect is that thumbnails are loading slower, but the overall UI remains responsive. For the record, the process is already running in low-priority worker thread(s) (one thread as long as the application is in active use and one thread per hardware thread otherwise). So there must be some resource contention with the main thread going on – this could also be the garbage collector performing collections, which starts to get noticeable for large libraries.
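For readers following along, the thread-count policy described above can be pictured roughly like this. This is only a minimal Python sketch, not the application's code: the function name `analysis_worker_count` and the `app_in_active_use` flag are hypothetical.

```python
import os


def analysis_worker_count(app_in_active_use: bool) -> int:
    """Hypothetical helper: how many analysis worker threads to run."""
    if app_in_active_use:
        # Keep background analysis to a single thread while the user is working.
        return 1
    # Otherwise use one thread per hardware thread.
    # os.cpu_count() can return None, so fall back to 1 in that case.
    return os.cpu_count() or 1
```

Even with a single low-priority thread, shared resources such as disk I/O, memory bandwidth, or garbage collection can still affect the main thread, which would be consistent with the contention suspected above.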
This may have to do with the workspace refresh behind the duplicates dialog
You are spot on here, this is what I’ve observed on RC-15 to be the main factor. In my case the performance improvements in RC-16 made the removal process fast enough to not trigger a refresh between each deleted image, but at the end of the whole set instead. However, that was probably just because the test images were particularly small.
The library organization process, as well as creating a new revision, is also triggered eventually and also contributes to the sluggishness. I spent the last two days tracking down places where the main thread can get blocked for more than a few ms and have eliminated most of the main offenders, so that should also improve this a lot.
There is also some other housekeeping overhead, such as eliminating any references to the removed files and updating the keyword tree. I’ll see if there is some low-hanging fruit in terms of optimization here.
Could the workspace/library refresh be paused until after all duplicate processing has been completed?
Even with the optimizations, I’d agree that this is necessary to properly fix the issue. Depending on the storage device and file size, just computing a full hash of the contents prior to deleting a file can take long enough to trigger most of the mentioned activities.
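To give a sense of why the hashing step scales with file size and storage speed, here is a minimal Python sketch of a "verify by full content hash, then delete" step. The function names are hypothetical and SHA-256 is an assumption chosen for illustration; the thread does not say which hash the application uses.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash the entire file contents in 1 MiB chunks (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Every byte is read, so the cost grows with file size and
        # depends on how fast the storage device can deliver the data.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def delete_if_identical(keep_path, duplicate_path):
    """Delete duplicate_path only if its contents match keep_path exactly."""
    if file_digest(keep_path) != file_digest(duplicate_path):
        return False  # contents differ, leave the file alone
    os.remove(duplicate_path)
    return True
```

Because both files are read in full before anything is removed, a set of large RAW files on a slow external drive could plausibly take seconds per file, which is enough time for other background activity to start in between.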
1.0.0-rc.17 has this fixed now. There will be a view update and a library organization triggered after processing a complete batch, but not between each file. Also, some of the other overhead has been reduced (e.g. searching the library for references to the deleted duplicates and replacing them with references to a remaining copy of the file).
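The batching behavior described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the application's implementation: the `Library` class, its `references` mapping, and the `deferred_refresh` context manager are all hypothetical stand-ins.

```python
from contextlib import contextmanager


class Library:
    """Hypothetical stand-in for a photo library; not the application's API."""

    def __init__(self, files):
        self.files = set(files)
        self.references = {}      # e.g. album entry -> file path
        self.refresh_count = 0    # counts expensive view/organization updates
        self._defer_depth = 0
        self._refresh_pending = False

    def refresh(self):
        if self._defer_depth > 0:
            # Inside a batch: remember that a refresh is needed, but do not run it.
            self._refresh_pending = True
            return
        self.refresh_count += 1

    @contextmanager
    def deferred_refresh(self):
        """Coalesce all refresh requests made inside the block into one."""
        self._defer_depth += 1
        try:
            yield
        finally:
            self._defer_depth -= 1
            if self._defer_depth == 0 and self._refresh_pending:
                self._refresh_pending = False
                self.refresh()

    def remove_duplicate(self, duplicate, keep):
        self.files.discard(duplicate)
        # Point any references to the removed file at the remaining copy.
        for key, target in self.references.items():
            if target == duplicate:
                self.references[key] = keep
        self.refresh()


lib = Library(["a.jpg", "a_copy1.jpg", "a_copy2.jpg"])
lib.references["album/cover"] = "a_copy1.jpg"
with lib.deferred_refresh():
    for dup in ["a_copy1.jpg", "a_copy2.jpg"]:
        lib.remove_duplicate(dup, keep="a.jpg")
```

In this sketch, deleting two duplicates inside the `with` block triggers a single refresh at the end instead of one per file, and the album reference ends up pointing at the remaining copy.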