I’m not sure if this is a feature bug, a documentation bug or simple user error, but I have yet to successfully sync files from my MBP M1 Max to my Debian server. I’ve tried many different iterations, starting as far back as preview 41 and as recently as RC9. I can get the machines to connect and accept the library invitation, but the only thing that happens on the server is that the library folder and .aspectnode file are created. On the MBP it immediately goes to “Up to date”. My library is ~2TB, so it should definitely take more than a second to verify that everything has been copied over.
What have I tried (Always starting from the library on the MBP):
Scenario 1: Preload the files on the server in the library path, exactly matching the MBP structure then add a photo on a. the MBP (didn’t sync to server) or b. the server (didn’t sync to the MBP)
Scenario 2: Start from a completely empty directory on the server (no files sync’d)
I’ve tried both scenarios running natively as root, natively as myself and natively as an aspect-specific user, always with the same results.
I’ve also tried both scenarios running in a custom Docker container that I built, again using different users, but again nothing.
I seem to have the server up and working correctly as the systems do connect and I see the library folder/file appear on the server. I just can’t get it to actually do anything.
Please let me know what I might be doing wrong or what other information I can provide to help understand what I’m experiencing
The current web UI doesn’t display the clone progress, yet*, and it takes a while until the actual files will be copied (the first two phases are copying library revisions and cached metadata/thumbnails). So my guess would be that it was still in one of the two earlier stages, especially if the library is rather large.
* The output (stdout/stderr) of the server process should show the current progress, though.
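For anyone watching the server output to see which clone phase is active, here is a minimal sketch of a parser for the progress lines. The pattern is modeled on lines such as "Cloning library revisions count: 0/26 bytes 0/0" from this thread; that other phases (for example metadata) use exactly the same shape is an assumption, and the function names are hypothetical.

```python
import re

# Modeled on "Cloning library revisions count: 0/26 bytes 0/0".
# Assumption: every clone phase prints its progress in this same shape.
PROGRESS_RE = re.compile(
    r"^Cloning library (?P<phase>\w+) count: (?P<done>\d+)/(?P<total>\d+)"
    r" bytes (?P<bytes_done>\d+)/(?P<bytes_total>\d+)$"
)


def parse_progress(line):
    """Return (phase, done, total) for a clone progress line, else None."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    return match["phase"], int(match["done"]), int(match["total"])


def latest_progress(lines):
    """Return the last progress tuple found in an iterable of log lines."""
    latest = None
    for line in lines:
        parsed = parse_progress(line)
        if parsed is not None:
            latest = parsed
    return latest
```

Feeding the captured stdout/stderr of aspect-web through latest_progress gives the most recent phase and counters, which makes it easier to tell a slow phase from a stalled one.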
My most recent attempt has been running since yesterday. It failed with a segmentation fault so I restarted aspect-web on my server and aspect on my MBP but all I’ve seen in stdout/err is a warning and an error. Here’s my console output since initially connecting and accepting the invitation:
Listening for requests on https://0.0.0.0:37629/
Listening for requests on http://0.0.0.0:8083/
Recived request showInvitation library='Family Aspect Library'.
Request accepted.
Clone library to '/home/chris/.local/lib/aspect-web/libraries/Family Aspect Library'
Cloning library revisions count: 0/0 bytes 0/0
[main(Kmud) ERR] Failed to test library instance membership: Unknown library instance ID: 01JNVJ3BDRJR8CBQEEAKPYCH94
Storing library structure...
done.
Acquiring library lock...
Loading cache...
Loading revisions...
Loading library...
Loading existing library in file:///home/chris/.local/lib/aspect-web/libraries/Family%20Aspect%20Library
Preloading cached metadata...
Loading missing metadata...
Loading weak checksums...
Setting up duplicate detector...
Computing reduced initial file relations...
Triggering full initial file relations...
Cloning library revisions count: 0/0 bytes 0/0
Cloning library revisions count: 0/26 bytes 0/0
Segmentation fault
chris@nas-02:~/aspect/aspect-web$ ./aspect-web -p ./public -b 0.0.0.0:8083 --user=aspect --group=media
Listening for requests on https://0.0.0.0:44201/
Acquiring library lock...
Loading cache...
Loading revisions...
[main(iC2f) WRN] No HEAD revision found in /home/chris/.local/lib/aspect-web/libraries/Family Aspect Library/.revs! Assuming that no change history exists.
Loading library...
Loading existing library in file:///home/chris/.local/lib/aspect-web/libraries/Family%20Aspect%20Library
Preloading cached metadata...
Loading missing metadata...
Loading weak checksums...
Setting up duplicate detector...
Computing reduced initial file relations...
Triggering full initial file relations...
Listening for requests on http://0.0.0.0:8083/
[main(M6t+) WRN] Failed to get remote address for TCP connection
[main(M6t+) ERR] HTTP connection handler has thrown at the peer <UNSPEC>: Accepting SSL tunnel returned an error: non-recoverable socket I/O error: 0 (Success)
[main(IJg9) WRN] Failed to get remote address for TCP connection
[main(IJg9) ERR] HTTP connection handler has thrown at the peer <UNSPEC>: Accepting SSL tunnel returned an error: non-recoverable socket I/O error: 0 (Success)
edit: I just quit aspect-web to restart again and this was written to output
^CReceived signal 2. Shutting down.
Stopped to listen for HTTP requests on 0.0.0.0:8083
Shutting down sync server...
Stopped to listen for HTTPS requests on 0.0.0.0:44201
Suspending organization activities...
Unloading all libraries...
Storing library structure...
done.
closing request queue...
joining load workers...
load workers done
shutting down request queue...
destroying metadata cache...
destroying thumbnail cache...
image cache shut down.
Waiting for running library stats tasks...
Shutting down organization activities...
Local libraries dispose complete.
closing request queue...
joining load workers...
load workers done
shutting down request queue...
destroying metadata cache...
destroying thumbnail cache...
image cache shut down.
Warning (thread: main): leaking eventcore driver because there are still active handles
FD 165 (streamSocket)
Use '-debug=EventCoreLeakTrace' to show where the instantiation happened
Warning (thread: main): leaking eventcore driver because there are still active handles
FD 165 (streamSocket)
Use '-debug=EventCoreLeakTrace' to show where the instantiation happened
Did you possibly encounter a crash on the server side? I tested with a larger library and encountered a crash right after accepting the library invitation (fixed for the next release). This caused the cloned library to be left in a rudimentary state where the synchronization settings hadn’t been set up yet. Restarting the server would then successfully load the library, but would never start to synchronize anything.
I can’t say for sure, but based on the timing at least it seems very likely.
In addition to the crash, there also turned out to be another issue that resulted in newly added files not getting synchronized to the server. After fixing that, I’ve taken the opportunity to also add activities to the web UI, so that the clone progress, as well as later synchronization activity is now also visible there.
Following up here, sync is starting so that’s a big step forward! Thanks for addressing things so promptly. Unfortunately I still have not managed to sync to my server. I’ve made it to varying stages of the sync process, but every time the cloning gets interrupted. Mostly the process has been “Killed” during the “Cloning library revisions count” phase, but this last time it completed the revisions and moved on to metadata. Unfortunately it then just stopped without any indication of an error during “Cloning library metadata count”, specifically at 11055/174053.
The issue I am experiencing is that when the aspect-web service stops, the cloning process doesn’t appear to resume or recover where it left off. See below for complete stderr once service restarted. Note that nothing is written to the server’s file system after service is started again.
Listening for requests on https://0.0.0.0:41823/
Acquiring library lock...
Loading cache...
Loading revisions...
Loading library...
Loading existing library in file:///home/chris/.local/lib/aspect-web/libraries/Family%20Aspect%20Library
Preloading cached metadata...
Loading missing metadata...
Loading weak checksums...
Setting up duplicate detector...
Computing reduced initial file relations...
Triggering full initial file relations...
ENABLE FOR file:///home/chris/.local/lib/aspect-web/libraries/Family%20Aspect%20Library
Listening for requests on http://0.0.0.0:8083/
Updating export collections...
Edit: Just to mention my main concern here is that I can’t get the process to resume. I’ve tried restarting the service, selecting Synchronize Now in Aspect and even adding a new image to the library. Nothing seems to trigger any new activity on the server
I just realized that a fix that I made for the event that a crash happens during synchronization doesn’t apply to the initial synchronization during the clone process. With that extended to the initial clone, at least it should continue to synchronize after the restart.
A killed process usually means that the system’s out-of-memory killer was active. I’ve generated a library with 200k images for testing and observed a huge memory usage during the clone process (up to around 30GB), so that was probably indeed the case. An in-depth analysis brought up two places that accumulated a lot of memory, as well as a very sneaky memory leak caused by a bad interaction of a low-level library and the garbage collector. With all of those fixed, the memory usage now stays around 3 GB after the clone process, which is still more than it should be, but it’s a lot closer to the ideal.
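To confirm that the out-of-memory killer was responsible, the kernel log (for example the output of dmesg or journalctl -k) can be scanned for kill messages. The sketch below assumes the usual Linux message shape "Out of memory: Killed process <pid> (<name>)"; the exact wording varies between kernel versions, and the function name is hypothetical.

```python
import re

# Assumed kernel message shape, for example:
#   "Out of memory: Killed process 4321 (aspect-web) total-vm:..."
# The sample PID is made up; wording differs between kernel versions.
OOM_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def find_oom_kills(kernel_log_lines, process_name):
    """Return the PIDs of processes named process_name killed by the OOM killer."""
    pids = []
    for line in kernel_log_lines:
        match = OOM_RE.search(line)
        if match and match["name"] == process_name:
            pids.append(int(match["pid"]))
    return pids
```

An empty result for "aspect-web" does not rule out memory pressure, but a non-empty one is a strong hint that the "Killed" message came from the OOM killer.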
The speed at which revisions get transferred has also been considerably improved along the way - still not really fast, but that is because all revisions are getting validated during the transfer. Eventually, this additional check will be removed from the release version, or at least replaced with a cheaper verification, which should then speed up the process by a lot.
Took me a while to get back to a point of testing the latest changes. Seems something deleted all but 60 of my photos so had to rebuild the disk and recreate the library.
I have my laptop and server syncing now (currently on metadata phase) but I notice there’s a fairly large discrepancy in numbers.
As you can see from the screenshots, the “master” library on my MBP reports 174,164 photos totaling 1.4TB in the photo stream, while the server only lists 106,851 totaling 197MB.
While the server is still syncing, perhaps explaining the size discrepancy, I would expect the total count of photos to match. Am I misunderstanding the meaning of these numbers?
Seems something deleted all but 60 of my photos so had to rebuild the disk and recreate the library.
Could it be that at some point a lot of photos have been detected as “removed from the file system” and you removed them from the catalog? In that case it would synchronize the removal from the catalog with other instances and the removed files would be moved to the “remotely-deleted” folder within the library folder.
In general, I really hope that the application did not actually delete any files, the system is designed so that it only actually deletes files when explicitly requested to do so (e.g. using Edit → Delete).
While the server is still syncing, perhaps explaining the size discrepancy, I would expect the total count of photos to match. Am I misunderstanding the meaning of these numbers?
The statistics haven’t been updated in all cases where they should have. 1.0.0-rc.13 fixes this, although an update can take up to 2 minutes.
But the numbers should behave like you described. The number is the number of files in the library catalog, while the size is the size of the files that are actually stored locally. Eventually this is going to be a bit more transparent, displaying something like “130 GB of 1.6 TB”.
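A minimal sketch of how such a combined display could be produced, assuming decimal units (1 GB = 10^9 bytes) and hypothetical function names. It only illustrates the distinction between the catalog count and the locally stored size; it is not the application's actual formatting code.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def format_size(num_bytes):
    """Format a byte count with decimal units and at most one decimal place."""
    value = float(num_bytes)
    unit = UNITS[0]
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            break
        value /= 1000
    # Drop a trailing ".0" so that 130.0 becomes "130" while 1.6 stays "1.6".
    text = f"{value:.1f}".rstrip("0").rstrip(".")
    return f"{text} {unit}"


def format_library_stats(catalog_count, local_bytes, total_bytes):
    """Combine the catalog file count with the local and total sizes."""
    return (
        f"{catalog_count:,} files, "
        f"{format_size(local_bytes)} of {format_size(total_bytes)} stored locally"
    )
```

With this shape, a server that is still downloading shows the full catalog count next to a local size that is smaller than the total.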
By the way, another issue has now been fixed where downloading files wasn’t re-triggered properly after the remote peer went offline, which might explain why the clone progress got stuck.
Following up here now that I’ve had a chance to play with rc14 a bit. Unfortunately the latest build didn’t seem to fix any of my issues. aspect-web still closes with exit code 0, and I still see “Image shaders not initialized. Falling back to CPU” and “Checksum for JIT cache file not found”. There are also a few new messages, such as “Failed to load metadata for getFileInformation API: Failed to locate file aspect-file” and “Failed to load metadata for 'aspect-file:”, etc.
Thinking the instability may have resulted from the previous runs on older versions, I deleted the entire aspect-web library, re-invited the server to join the library and attempted to sync again. I’m not sure if it was this action or something else but I noticed that nearly all image files were deleted from my “master” aspect library again.
I’m currently rebuilding my image drive to start over from scratch. Will report back with updates once I kick that off again.
I still haven’t found a place where the server could exit with code 0 without performing a proper shutdown, nor any way the command line usage hint could be printed without the process exiting immediately afterwards, so this part remains a mystery. Is it possible that the output comes from two different processes?
Image shaders not initialized. Falling back to CPU: Have you set up any export collections? I just realized that export collections are currently also processed on the server, which really is unnecessary. This shouldn’t really be an issue, though, except for the performance degradation.
Failed to load metadata for getFileInformation API: Failed to locate file aspect-file and Failed to load metadata for 'aspect-file:: These are (unnecessary) warnings that can be caused by files that are in the catalog but missing from the file system (either genuinely missing or not present due to sync storage settings). They will be gone from the log in the next release, since they don’t add any value.
One thing in the current synchronization behavior of the server version that might lead to files being removed in the desktop library instance is that files deleted from the file system will automatically be removed from the catalog. If, for example, the contents of the library folder would get deleted while aspect-web is still running, this could result in all files being considered deleted. However, in this case, the files should still be found in the remotely-deleted folder on the desktop, since files will never actually be deleted during the synchronization process as a safety measure*.
But looking at this again, even though this also used to be the default behavior of the desktop version up to 1.0.0-rc.6, it doesn’t really make sense for the server version, since modifying the library on the server within the file system is not really considered a use case and the possibility to accidentally remove files this way is a rather strong argument against it. I’ll change this for the next release to always retain catalog entries.
* Having the files in remotely-deleted will also make it possible to revert changes in which files have been deleted (regarding the revision history feature request).
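The two safety measures discussed here can be illustrated with a small sketch: keeping catalog entries for files that are missing on disk, and moving remotely removed files aside instead of deleting them. The set-based catalog model, the function names and the layout inside the remotely-deleted folder are all assumptions made for illustration, not the application's actual implementation.

```python
import shutil
from pathlib import Path


def reconcile_catalog(catalog, files_on_disk, retain_missing=True):
    """Return the catalog entries to keep after a file system scan.

    Hypothetical model: both arguments are sets of relative paths.
    With retain_missing=True, files missing on disk stay in the catalog.
    """
    if retain_missing:
        return set(catalog)
    return set(catalog) & set(files_on_disk)


def move_to_remotely_deleted(library_root, relative_path):
    """Move a file into the remotely-deleted folder instead of deleting it.

    Assumption: the original relative path is preserved inside that folder.
    """
    root = Path(library_root)
    source = root / relative_path
    target = root / "remotely-deleted" / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

With retain_missing=True, an emptied library folder on the server no longer translates into catalog removals that would then propagate to other instances.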
I’m not sure what an export collection is, but I don’t think I have anything like that set up.
As far as the log, I’m realizing now that the sample I attached was very misleading. I’m very sorry and hope I didn’t waste too much of your time. The help print out was the result of my incorrectly specifying the verbosity flag; I corrected it but failed to delete the log file before restarting. Attached is a more representative version from the rc14 build before I completely deleted the library. Unfortunately it doesn’t have any detailed logs but you can see the repeating pattern and the healthy exit. aspect-web.txt (8.0 KB)
As for the deleted files:
1. I can confirm that aspect-web was not running when I deleted the library (I rm -rf’d the entire folder within the libraries root directory) but I did restart it afterwards to invite it to join again. Worth emphasizing, I reused the previously established connection to the server and simply sent the library invitation again. Not sure if that is a contributor. I’m happy to clarify if I’m not using the right terminology here.
2. Unfortunately I only thought to check the source “aspect” library for files after reconnecting, as no images were syncing.
a. At that point there were still a handful of raw (*.NEF) files in the source library but they did not sync over. I didn’t think to see if there were any notable patterns for these surviving files, neither indicating why they weren’t deleted nor why they weren’t picked up for syncing. This could be a symptom of a different, and alarming, issue whereby these files were never added to the library to begin with.
I’ve considered creating a new issue for this but it all seems interrelated so I’m adding to this existing issue. If you’d prefer I report a new issue for tracking please let me know.
Using RC14 I’ve now wiped and reimaged my external drive 3 times and created a new aspect library from scratch each time. Unfortunately I continue to get errors syncing to the server (see logs from latest attempt attached). This last time I attempted to close aspect normally as aspect-web could no longer communicate but, after waiting several hours, I ended up force closing it. When I reopened the library it all seemed ok as I saw the usual images at the top. However, after giving it a few minutes to load, the library had completely emptied itself and moved all* image and movie files to the remotely-deleted folder.
* I still see quite a few .tmp and even some .NEF and .HEIC files left behind, though I also notice the original folder (before Aspect) has a ton of non-image files in it, including a .photoslibrary folder. I wonder if that could be a contributor to the stability problems I’ve experienced.
Okay, that should be fine, the server version doesn’t keep track of the list of libraries, but will simply search through all sub folders of the “libraries” folder at each start, so that it can’t affect the library in any way at that point.
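A minimal sketch of this kind of start-up discovery, assuming that a sub folder counts as a library when it contains the .aspectnode file mentioned earlier in the thread. The marker check and the function name are assumptions; the server may use different criteria.

```python
from pathlib import Path


def discover_libraries(libraries_root, marker=".aspectnode"):
    """Return the sub folders of libraries_root that look like libraries.

    Assumption: a library folder is recognized by its marker file.
    """
    root = Path(libraries_root)
    return sorted(
        child
        for child in root.iterdir()
        if child.is_dir() and (child / marker).is_file()
    )
```

Because nothing is persisted between runs, removing a library folder while the server is stopped simply means it is no longer found at the next start.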
Very strange, I really wish there was some hint as to how this can be reproduced. One thing that could shed some light on this would be to take a look at the .revs folder to pinpoint on which side the files have been removed from the catalog. I’ll also include the current revision browser in the next release, which can then be enabled using View → Open Console and entering invoke("app.showRevisionBrowser").
I’d agree that this all seems to be related and it makes sense to keep it together.
It would be very interesting to see which activity was running on each side. It sounds like there may be a place where a communication failure isn’t handled correctly and leads to the server erroneously treating the affected files as actively deleted from the file system (combined with the existing behavior of then removing them from the catalog). Alternatively, there might be some part of the algorithm where terminating it at just the right time leaves a lot of files in a state where they are recognized as actively deleted.
Both possibilities should be fixed with the change to default to keeping catalog entries for deleted files, but I will set up some automated tests where the synchronization process gets terminated randomly mid-way to ensure that this is handled robustly in all cases.
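A toy version of such a test could look like the sketch below. It models the two library instances as dictionaries, interrupts a simplified copy loop at random points, restarts it, and then checks that the target converges while the source is never modified. Everything here (the dictionary model, the names, the interruption mechanism) is a hypothetical stand-in for the real synchronization code.

```python
import random


class Interrupted(Exception):
    """Simulates the process being killed in the middle of a sync pass."""


def sync_step(source, target, fail_after=None):
    """Copy entries missing from target; raise after fail_after copies."""
    copied = 0
    for name in sorted(source):
        if name in target:
            continue  # already transferred in an earlier, interrupted pass
        if fail_after is not None and copied >= fail_after:
            raise Interrupted(name)
        target[name] = source[name]
        copied += 1


def run_with_random_interruptions(source, seed, max_rounds=1000):
    """Restart sync_step after each simulated kill until it completes."""
    rng = random.Random(seed)
    target = {}
    for _ in range(max_rounds):
        try:
            sync_step(source, target, fail_after=rng.randint(0, 3))
        except Interrupted:
            continue  # simulate a restart; progress so far is kept
        return target
    raise RuntimeError("sync did not converge")
```

The invariant worth asserting in the real tests is the same as in this toy: no matter where the process is terminated, nothing is ever removed from the source side and a restart eventually completes the transfer.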
When synchronized from macOS, it should ignore the .photoslibrary folder as long as it is recognized as a bundle by the OS, but I’ll experiment with that locally to see what happens.