…e islandora model in the model map.
ctgraham left a comment
Initial readthrough with some questions and suggestions.
    logger.info(f"File is model: {imodel}, TID: {field_model}")

    # Process any .tif files.
    if (file_ext.lower() == ".tif"):
Special processing by type would be a good candidate to break out into separate functions for readability.
The pattern of "Handle top level files" ... "Build row data" is also heavily repeated here.
Agreed on separate functions for readability. That would probably be a next step; I was building these out as I went along.
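One possible shape for that refactor, as a minimal sketch rather than the script's actual code: each file type gets its own row-building function, and a lookup table keyed on the lowercased extension picks the handler. Every name here (`build_tif_row`, `build_pdf_row`, `build_default_row`, `HANDLERS`, `process_file`) and the contents of the returned rows are hypothetical and not taken from the PR.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical row type; the real script's row layout is not shown in this thread.
Row = Dict[str, str]


def build_tif_row(path: Path) -> Row:
    """Build row data for a .tif file (placeholder logic)."""
    return {"file": str(path), "handler": "tif"}


def build_pdf_row(path: Path) -> Row:
    """Build row data for a .pdf file (placeholder logic)."""
    return {"file": str(path), "handler": "pdf"}


def build_default_row(path: Path) -> Row:
    """Fallback for extensions without a dedicated handler."""
    return {"file": str(path), "handler": "default"}


# Map lowercased extensions to their handler instead of chaining if-blocks.
HANDLERS: Dict[str, Callable[[Path], Row]] = {
    ".tif": build_tif_row,
    ".pdf": build_pdf_row,
}


def process_file(path: Path) -> Row:
    """Dispatch to the handler registered for the file's extension."""
    handler = HANDLERS.get(path.suffix.lower(), build_default_row)
    return handler(path)
```

With this layout, the repeated "Handle top level files" / "Build row data" steps could live in shared helpers that each handler calls, and supporting a new file type means adding one function and one table entry.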
        'resouce_type': 'Text',
        'child': 'File',
    },
    'Publication Issue 1': {
Can "1" and "2" be given semantically meaningful names?
Would love to. Any suggestions? Maybe "Publication Issue Paged" vs. "Publication Issue PDF"? Not sure whether MAD would want different names.
    "field_weight","field_model","model","field_resource_type","transcript"]

    # Global file patterns to skip over.
    globals()['skip'] = ["ignore",".jp2",".metadata","meta",".opex",".fits",
Are these skip patterns documented outside of this code?
Probably not yet. I was thinking of adding the list to the config file to allow customization.
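A rough sketch of how a configurable skip list might look, assuming the config file has already been parsed into a dictionary. The `skip_patterns` key, the function names, and the case-insensitive substring matching are all assumptions for illustration; the default list contains only the patterns visible in the truncated diff line above, so the real list may be longer.

```python
from typing import Any, Iterable, List, Mapping

# Only the patterns visible in the diff hunk; the original list continues past ".fits".
DEFAULT_SKIP: List[str] = ["ignore", ".jp2", ".metadata", "meta", ".opex", ".fits"]


def load_skip_patterns(config: Mapping[str, Any]) -> List[str]:
    """Return skip patterns from the parsed config, or the defaults if unset.

    The "skip_patterns" key is a hypothetical name, not an existing setting.
    """
    patterns = config.get("skip_patterns")
    if patterns is None:
        return list(DEFAULT_SKIP)
    return [str(p) for p in patterns]


def should_skip(filename: str, patterns: Iterable[str]) -> bool:
    """Assumed matching rule: case-insensitive substring match on the file name."""
    name = filename.lower()
    return any(p.lower() in name for p in patterns)
```

Passing the patterns in as an argument, instead of reading them from `globals()`, would also make the skip logic easier to test and to document alongside the config file.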
…as being the first value from the return column.
This PR addresses issue #1.
This is a full rewrite of the original scan-batch-dir script that makes it more modular, so that future changes should be easier to implement. It also adds the ability to use PDF files as newspaper issues.