Conversation
|
Nice, that is a useful fallback, as the other fallback (to search based on the filesystem and standard naming scheme) is really hacky.. If this works fine then I would suggest removing the other fallback.. Btw these imaris datasets with no OME metadata were from how Shaz was running the stitching, and hopefully shouldn't come up in the future, but good to have a patch for this.. |
There was a problem hiding this comment.
Pull request overview
Adds a third, fallback metadata-extraction strategy for Imaris .ims files by reading native HDF5 attributes when OME-XML (in-file) and associated .ome.tif metadata are unavailable.
Changes:
- Added
h5_attr_to_str()and a new native-Imaris fallbackbuild_bids_metadata_from_native_imaris()usingExtMin/ExtMax+X/Y/Zto compute voxel sizes. - Refactored extraction flow into three explicit strategies (OME-in-file → tif fallback → native HDF5), replacing bare
except:with more specific exception handling.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def h5_attr_to_str(attr): | ||
| """Convert an HDF5 byte-array attribute to a Python string. | ||
|
|
||
| Imaris stores attributes as arrays of single-byte values, e.g. | ||
| [b'4', b'.', b'0'] → '4.0' | ||
| """ | ||
| if attr is None: | ||
| return None | ||
| return "".join(b.decode("utf-8") for b in attr) |
There was a problem hiding this comment.
h5_attr_to_str assumes the attribute is an iterable of 1-byte values (each with .decode()), but h5py attributes are often returned as a scalar bytes/str, a NumPy scalar (e.g. np.bytes_), or a NumPy array of uint8. In those cases this will raise (e.g. iterating a bytes yields ints). Consider making this helper handle: str (return as-is), bytes/np.bytes_ (single decode), NumPy ndarray (decode bytes dtype or convert uint8 via .tobytes()), and strip any trailing \x00 padding if present.
| for attr_name in ["Name", "LSMExcitationWavelength", "LSMEmissionWavelength", | ||
| "Color", "Description"]: |
There was a problem hiding this comment.
The for attr_name in [...] list is formatted in a way that likely won’t pass the repo’s Black formatting checks (the line is far beyond typical line length and Black will reflow it). Please reformat this list in Black-compliant style (e.g., one item per line) to avoid CI/lint failures.
| for attr_name in ["Name", "LSMExcitationWavelength", "LSMEmissionWavelength", | |
| "Color", "Description"]: | |
| for attr_name in [ | |
| "Name", | |
| "LSMExcitationWavelength", | |
| "LSMEmissionWavelength", | |
| "Color", | |
| "Description", | |
| ]: |
| custom_attrs = xml_dict["OME"]["Image"]["ca:CustomAttributes"] | ||
| print(custom_attrs) | ||
| bids_metadata = build_bids_metadata(custom_attrs) | ||
| except (IndexError, KeyError, TypeError, ValueError) as e: |
There was a problem hiding this comment.
Strategy 2’s tif fallback only catches IndexError, KeyError, TypeError, ValueError, but opening/reading a found .ome.tif can also raise OSError/FileNotFoundError (e.g., corrupt file, permissions). In that case the script will crash instead of continuing to the native Imaris HDF5 fallback. Consider including OSError (and/or FileNotFoundError) in this except clause so Strategy 3 can still run when the tif path exists but can’t be read.
| except (IndexError, KeyError, TypeError, ValueError) as e: | |
| except (IndexError, KeyError, TypeError, ValueError, OSError) as e: |
| except (KeyError, TypeError, ValueError) as e: | ||
| print( | ||
| f"Warning: cannot find OME metadata from imaris file ({e}), " | ||
| "trying tif fallback..." | ||
| ) |
There was a problem hiding this comment.
Strategy 1 only catches KeyError, TypeError, ValueError, but reading the embedded OME dataset (hdf5_file[...][:]) can also raise OSError (e.g., dataset exists but can’t be read). If you want to continue on to strategies 2/3 in that case, consider adding OSError to this except list.
Added a fallback strategy for extracting metadata from Imaris native HDF5 attributes. The existing two strategies are unchanged and still tried first. The new fallback only runs if both fail. Also cleans up the exception handling (proper
try/exceptwith specific exception types instead of bareexcept:).Tried on
/nfs/trident3/lightsheet/prado/mouse_app_vaccine_batch/bids/sub-AS177M4/using: