The concurrency problem
There is a problem with the current version of iiifify.sh. Two of its properties conflict:
it is intended to be run concurrently (hence the “run once a minute in cron” instruction)
its for-loop builds a list of multiple barcodes, but it can only process those barcodes one at a time
This leads dibsiiif.py to throw errors in cases where it shouldn't. Assume you have 3 books to process and each book takes longer than a minute to process. Now suppose one iiifify.sh job runs in cron and decides to loop over books 1, 2, and 3, and a minute later a second iiifify.sh job runs in cron and decides to loop over books 2 and 3.
When the first job finishes book 1 and tries to start book 2, dibsiiif.py will throw an error because the 2-processing file already exists (the second job created it when it started book 2). It will also create a 2-problem file, so users of the app will see a red exclamation mark icon in the dibs item listing for book 2. The first job will then move on to book 3 (assuming no other job has picked up book 3 by now).
When the second job finishes book 2 and tries to start book 3, some other job may have already created the 3-processing file. In that situation, dibsiiif.py will throw an error (because the 3-processing file already exists) and create a 3-problem file, so users of the app will see a red exclamation mark icon in the dibs item listing for book 3.
As a result, users of the dibs web app must ask their system administrators to remove the problem and processing files for those items before they can proceed.
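The scenario above can be replayed with a small simulation. This is only an illustrative sketch, not code from dibsiiif: the function name `simulate`, the in-memory set standing in for the processing files, and the 1.5-minute processing time are all assumptions made for the example. It also assumes that a failed attempt returns immediately and that a barcode stays claimed once any job has started it.

```python
def simulate(jobs, duration):
    """Replay overlapping cron jobs that each loop over a fixed barcode list.

    jobs: list of (start_minute, [barcodes]) tuples (hypothetical input shape).
    duration: minutes needed to process one barcode (assumed constant).
    Returns (started, problems): barcodes that were processed, and barcodes
    for which a job found an existing claim and would write a problem file.
    """
    processing = set()  # stands in for the <barcode>-processing files
    started = []
    problems = []       # stands in for the <barcode>-problem files
    # Per-job state: [minute at which the job is next free, remaining queue]
    state = [[start, list(books)] for start, books in jobs]
    while any(queue for _, queue in state):
        # Advance whichever job with remaining work becomes free earliest.
        job = min((s for s in state if s[1]), key=lambda s: s[0])
        barcode = job[1].pop(0)
        if barcode in processing:
            # Another job already claimed this barcode: error, move on at once.
            problems.append(barcode)
        else:
            processing.add(barcode)
            started.append(barcode)
            job[0] += duration
    return started, problems


# First job starts at minute 0 with books 1, 2, 3; second at minute 1 with 2, 3.
started, problems = simulate([(0, [1, 2, 3]), (1, [2, 3])], duration=1.5)
print("processed:", started)
print("problem files for:", problems)
```

Under these assumptions every book is processed exactly once, yet books 2 and 3 both end up flagged as problems, which matches the behavior described above.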
The concurrency solution
In an upcoming PR, we propose the following changes to the way dibsiiif processes multiple barcodes:
each invocation of iiifify.sh via cron processes only a single barcode directory
the single barcode chosen will be the one with the oldest barcode-initiated file, by timestamp, in the status/files location, so that the directories are processed in the order requested instead of alphabetically
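The selection rule in the second point could look roughly like the sketch below. It is not the code from the proposed PR. The function name `oldest_initiated_barcode` is hypothetical, the `<barcode>-initiated` file naming is inferred from the wording above, and using the file's modification time as "the timestamp" is an assumption.

```python
from pathlib import Path
from typing import Optional


def oldest_initiated_barcode(status_dir: str) -> Optional[str]:
    """Return the barcode whose '<barcode>-initiated' file is oldest, or None.

    Assumes status files are named '<barcode>-initiated' and that the file's
    modification time reflects when processing was requested.
    """
    suffix = "-initiated"
    candidates = [p for p in Path(status_dir).glob("*" + suffix) if p.is_file()]
    if not candidates:
        return None  # nothing is waiting to be processed
    # Oldest modification time first, so requests are served in order.
    oldest = min(candidates, key=lambda p: p.stat().st_mtime)
    return oldest.name[: -len(suffix)]
```

Each cron invocation would then handle at most the one barcode returned here. Note that choosing a single barcode does not by itself stop two overlapping invocations from choosing the same one; the existing processing-file check would still need to cover that case.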