Description
Describe the bug
When size is unspecified and using the OpenSlide backend, WSIReader.get_data() assumes by default that the size of the slide at a given magnification level is 2**level times smaller than the full-resolution size (see here). However, the magnification factors are not necessarily consecutive powers of 2, in which case the reader can return arrays of vastly incorrect size.
For example, I've encountered this issue with slides from the TCGA dataset. At the highest level, the image is downsampled by 32x instead of the assumed 8x, meaning the reader returns an image array that is 4x larger than necessary in each dimension (filled with 1/16th data, the rest padded with zeros).
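The arithmetic behind those numbers can be checked with a short sketch. It uses only the downsampling factors quoted in this issue; the variable names are illustrative and nothing here calls MONAI or OpenSlide.

```python
# Downsampling factors of the example slide, as listed in this issue.
actual_downsamples = [1, 4, 16, 32]

level = len(actual_downsamples) - 1   # highest level, index 3
assumed = 2 ** level                  # factor implied by the power-of-2 assumption
actual = actual_downsamples[level]    # factor actually stored in the file

# How much too large the returned array is along each axis,
# and what fraction of it can hold real image data.
oversize_per_dim = actual // assumed
data_fraction = 1 / oversize_per_dim ** 2

print(f"assumed {assumed}x, actual {actual}x")
print(f"array is {oversize_per_dim}x too large per dimension")
print(f"{data_fraction:.1%} data, {1 - data_fraction:.1%} zero padding")
```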
To Reproduce
- Download a pathology slide file with inhomogeneous downsampling levels (e.g. file "TCGA-2A-A8VL-01Z-00-DX1.2C2BD6EF-EC17-4117-AE89-A22B67AFB233.svs" from TCGA, available here on the GDC portal; has [1x, 4x, 16x, 32x] downsampling).
- Install OpenSlide and its Python wrapper.
- Instantiate WSIReader(backend='openslide') and read the slide at its highest level:
from monai.data.image_reader import WSIReader
reader = WSIReader(backend='openslide')
slide = reader.read("<your slide file>")
level = slide.level_count - 1 # pick highest level for demonstration
image = reader.get_data(slide, level=level)
# image is 4x larger than expected in each dimension and is 15/16 (93.8%) zeros
Expected behavior
The WSIReader should consider the actual magnification levels and corresponding dimensions encoded in the file headers. This information is readily available in the OpenSlide object's level_dimensions attribute (docs).
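As a rough sketch of the difference, the two helpers below compare a size derived from level_dimensions with one derived from the power-of-2 assumption. This is not MONAI's actual implementation: the helper names are hypothetical, and the slide is a stand-in object with made-up dimensions chosen to match the [1x, 4x, 16x, 32x] pattern, so the snippet runs without OpenSlide installed. The only OpenSlide behaviour relied on is that level_dimensions holds one (width, height) tuple per level.

```python
from types import SimpleNamespace


def size_from_level_dimensions(slide, level):
    """Return (height, width) of a level as recorded in the slide itself."""
    width, height = slide.level_dimensions[level]  # OpenSlide order: (width, height)
    return height, width


def size_from_power_of_two(slide, level):
    """Return (height, width) assuming each level halves the previous one."""
    width, height = slide.level_dimensions[0]
    factor = 2 ** level
    return height // factor, width // factor


# Stand-in for an opened slide; the dimensions are hypothetical.
fake_slide = SimpleNamespace(
    level_dimensions=((32768, 24576), (8192, 6144), (2048, 1536), (1024, 768))
)

top = len(fake_slide.level_dimensions) - 1
print("from headers:   ", size_from_level_dimensions(fake_slide, top))
print("power-of-2 rule:", size_from_power_of_two(fake_slide, top))
```

With a real slide, the object returned by reader.read() in the reproduction steps could be passed in place of fake_slide, since it exposes the same attribute.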
Environment
- monai v0.6.0 (but the same behaviour is still implemented in the latest v0.8)
- openslide v3.4.1
- openslide-python v1.1.2