Datasets API

`amid.amos.dataset.AMOS`

AMOS provides 500 CT and 100 MRI scans collected from multi-center, multi-vendor, multi-modality, multi-phase, multi-disease patients, each with voxel-level annotations of 15 abdominal organs, providing challenging examples and test-bed for studying robust segmentation algorithms under diverse targets and scenarios. [1]

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	Absolute path to the root containing the downloaded archive and meta. If not provided, the cache is assumed to be already populated.	required

Notes

Download link: https://zenodo.org/record/7262581/files/amos22.zip

Examples:

>>> # Download the archive and meta to any folder and pass the path to the constructor:
>>> ds = AMOS(root='/path/to/the/downloaded/files')
>>> print(len(ds.ids))
# 961
>>> print(ds.image(ds.ids[0]).shape)
# (768, 768, 90)
>>> print(ds.mask(ds.ids[26]).shape)
# (512, 512, 124)

References

.. [1] JI YUANFENG. (2022). Amos: A large-scale abdominal multi-organ benchmark for versatile medical image segmentation [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7262581

`birth_date(id: str)`

`sex(id: str)`

`age(id: str)`

`manufacturer_model(id: str)`

`manufacturer(id: str)`

`acquisition_date(id: str)`

`site(id: str)`

`ids()`

`image(id: str)`

Corresponding 3D image.

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`mask(id: str)`

`image_modality(id: str)`

Returns image modality, CT or MRI.

`amid.bimcv.BIMCVCovid19`

BIMCV COVID-19 dataset, CT images only. It includes the BIMCV COVID-19 positive partition (https://arxiv.org/pdf/2006.01174.pdf) and the negative partition (https://ieee-dataport.org/open-access/bimcv-covid-19-large-annotated-dataset-rx-and-ct-images-covid-19-patients-0).

PCR tests are not used

GitHub page: https://github.com/BIMCV-CSUSP/BIMCV-COVID-19

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the downloaded and parsed data.	required

Notes

The dataset has 2 partitions: bimcv-covid19-positive and bimcv-covid19-negative. Each partition is spread over 81 different tgz archives. The archives include metadata about subjects, sessions, and labels. There are also tgz archives with NIfTI images in nii.gz format.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = BIMCVCovid19(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 201
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 163)
>>> print(ds.is_positive(ds.ids[0]))
# True
>>> print(ds.subject_info(ds.ids[80]))
# {'modality_dicom': "['CT']",
#  'body_parts': "[['chest']]",
#  'age': '[80]',
#  'gender': 'M'}

References

.. [1] Maria De La Iglesia Vayá, Jose Manuel Saborit, Joaquim Angel Montell, Antonio Pertusa, Aurelia Bustos, Miguel Cazorla, Joaquin Galant, Xavier Barber, Domingo Orozco-Beltrán, Francisco Garcia, Marisa Caparrós, Germán González, and Jose María Salinas. BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients. arXiv:2006.01174, 2020.

.. [2] Maria de la Iglesia Vayá, Jose Manuel Saborit-Torres, Joaquim Angel Montell Serrano, Elena Oliver-Garcia, Antonio Pertusa, Aurelia Bustos, Miguel Cazorla, Joaquin Galant, Xavier Barber, Domingo Orozco-Beltrán, Francisco García-García, Marisa Caparrós, Germán González, Jose María Salinas, 2021. BIMCV COVID-19-: a large annotated dataset of RX and CT images from COVID-19 patients. Available at: https://dx.doi.org/10.21227/m4j2-ap59.

`ids()`

`session_id(id: str)`

`subject_id(id: str)`

`is_positive(id: str)`

`image(id: str)`

`affine(id: str)`

`tags(id: str) -> dict`

DICOM tags.

`label_info(id: str) -> dict`

labelCUIS, Report, LocalizationsCUIS etc.

`subject_info(id: str) -> dict`

modality_dicom (=[CT]), body_parts(=[chest]), age, gender

`age(id: str) -> int`

Minimum of (possibly two) available ages. The maximum difference between max and min age for every patient is 1 year.
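
The rule above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: it assumes the raw age field is a string holding a Python-style list, as in the `subject_info` example output (`'[80]'`), and `min_age` is a hypothetical helper name.

```python
import ast


def min_age(raw_age: str) -> int:
    """Return the smallest age from a string such as '[80]' or '[80, 81]'.

    Assumption: the raw field is a string with a Python-style list of
    integers, as in the `subject_info` example output.
    """
    ages = ast.literal_eval(raw_age)  # safely parse the list literal
    if not ages:
        raise ValueError('no ages available')
    return min(int(a) for a in ages)


print(min_age('[80]'))      # 80
print(min_age('[80, 81]'))  # 80
```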

`sex(id: str) -> str`

`session_info(id: str) -> dict`

study_date, medical_evaluation

`amid.brats2021.BraTS2021`

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: 2021: http://www.braintumorsegmentation.org/

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = BraTS2021(root='/path/to/archives/root')
>>> print(len(ds.ids))
# 5880
>>> print(ds.image(ds.ids[0]).shape)
# (240, 240, 155)

`ids()`

`fold(id: str)`

`mapping21_17(id: str) -> pd.DataFrame`

`subject_id(id: str) -> str`

`modality(id: str) -> str`

`image(id: str)`

`mask(id: str)`

`spacing(id: str)`

Returns the voxel spacing along axes (x, y, z).

`affine(id: str)`

Returns 4x4 matrix that gives the image's spatial orientation.
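
As an illustration of how `spacing` and `affine` relate, the sketch below derives the voxel spacing from a 4x4 affine and maps a voxel index to world coordinates. It assumes the common NIfTI-style convention (the upper-left 3x3 block holds direction times spacing, the last column holds the origin). The matrix values and helper names are made up for the example and are not taken from the dataset.

```python
import math

# Hypothetical affine: 1 x 1 x 2.5 mm voxels, origin at (-120, -120, -80).
affine = [
    [1.0, 0.0, 0.0, -120.0],
    [0.0, 1.0, 0.0, -120.0],
    [0.0, 0.0, 2.5, -80.0],
    [0.0, 0.0, 0.0, 1.0],
]


def spacing_from_affine(a):
    """Voxel spacing along (x, y, z): Euclidean norms of the first three columns."""
    return tuple(math.sqrt(sum(a[r][c] ** 2 for r in range(3))) for c in range(3))


def voxel_to_world(a, ijk):
    """Map a voxel index (i, j, k) to world coordinates via homogeneous coordinates."""
    i, j, k = ijk
    vec = (i, j, k, 1.0)
    return tuple(sum(a[r][c] * vec[c] for c in range(4)) for r in range(3))


print(spacing_from_affine(affine))          # (1.0, 1.0, 2.5)
print(voxel_to_world(affine, (10, 20, 4)))  # (-110.0, -100.0, -70.0)
```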

`amid.cc359.dataset.CC359`

A (C)algary-(C)ampinas public brain MR dataset with (359) volumetric images [1]_.

There are three segmentation tasks on this dataset: (i) brain, (ii) hippocampus, and (iii) White-Matter (WM), Gray-Matter (GM), and Cerebrospinal Fluid (CSF) segmentation.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

homepage (upd): https://sites.google.com/view/calgary-campinas-dataset/home

homepage (old): https://miclab.fee.unicamp.br/calgary-campinas-359-updated-05092017

To obtain MR images and brain and hippocampus segmentation masks, please, follow the instructions at the download platform: https://portal.conp.ca/dataset?id=projects/calgary-campinas.

Using the datalad library, download three zip archives:

- Original.zip (the original MR images)
- hippocampus_staple.zip (Silver-standard hippocampus masks generated using STAPLE)
- Silver-standard-machine-learning.zip (Silver-standard brain masks generated using a machine learning method)

Currently, the WM, GM, and CSF masks can be downloaded only from Google Drive: https://drive.google.com/drive/u/0/folders/0BxLb0NB2MjVZNm9JY1pWNFp6WTA?resourcekey=0-2sXMr8q-n2Nn6iY3PbBAdA.

From the Google Drive root above, manually download the folder CC359/Reconstructed/CC359/WM-GM-CSF/

So the root folder to pass to this dataset class should contain four objects:

- three zip archives (Original.zip, hippocampus_staple.zip, and Silver-standard-machine-learning.zip)
- one folder WM-GM-CSF with the original structure:

<...>/WM-GM-CSF/CC0319_ge_3_45_M.nii.gz
<...>/WM-GM-CSF/CC0324_ge_3_56_M.nii.gz
...
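
Before constructing the dataset, it can help to check that the root folder has the expected layout. The following is a minimal sketch using only the archive and folder names listed above; `missing_cc359_items` is a hypothetical helper and is not part of the library.

```python
from pathlib import Path

# Names taken from the notes above.
EXPECTED_ARCHIVES = (
    'Original.zip',
    'hippocampus_staple.zip',
    'Silver-standard-machine-learning.zip',
)
EXPECTED_FOLDER = 'WM-GM-CSF'


def missing_cc359_items(root):
    """Return the names of expected items that are absent from `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_ARCHIVES if not (root / name).is_file()]
    if not (root / EXPECTED_FOLDER).is_dir():
        missing.append(EXPECTED_FOLDER)
    return missing
```

An empty list means that all four objects were found; otherwise the list names what still has to be downloaded.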

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> cc359 = CC359(root='/path/to/downloaded/data/folder/')
>>> print(len(cc359.ids))
# 359
>>> print(cc359.image(cc359.ids[0]).shape)
# (171, 256, 256)
>>> print(cc359.wm_gm_csf(cc359.ids[80]).shape)
# (180, 240, 240)

References

.. [1] Souza, Roberto, et al. "An open, multi-vendor, multi-field-strength brain MR dataset and analysis of publicly available skull stripping methods agreement." NeuroImage 170 (2018): 482-494. https://www.sciencedirect.com/science/article/pii/S1053811917306687

`ids()`

`vendor(id: str)`

`field(id: str)`

`age(id: str)`

`sex(id: str)`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str)`

`spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`brain(id: str)`

`hippocampus(id: str)`

`wm_gm_csf(id: str)`

`amid.cl_detection.CLDetection2023`

The data for the "Cephalometric Landmark Detection in Lateral X-ray Images" Challenge, held with the MICCAI-2023 conference.

Notes

The data can only be obtained by contacting the organizers by email. See the challenge home page for details.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded and unarchived data. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = CLDetection2023(root='/path/to/data/root/folder')
>>> print(len(ds.ids))
# 400
>>> print(ds.image(ds.ids[0]).shape)
# (2400, 1935)

`ids()`

`image(id: str)`

`points(id: str)`

`spacing(id: str)`

`amid.crlm.CRLM`

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=89096268#89096268b2cc35fce0664a2b875b5ec675ba9446

This collection consists of DICOM images and DICOM Segmentation Objects (DSOs) for 197 patients with Colorectal Liver Metastases (CRLM). It comprises the original DICOM CTs and segmentations for each subject. The segmentations include 'Liver', 'Liver_Remnant' (liver that will remain after surgery based on a preoperative CT plan), 'Hepatic' and 'Portal' veins, and 'Tumor_x', where 'x' denotes the various tumor occurrences in the case.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = CRLM(root='/path/to/archives/root')
>>> print(len(ds.ids))
# 197
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 52)

`ids()`

`image(id: str)`

`mask(id: str) -> Dict[str, np.ndarray]`

Returns dict: {'liver': ..., 'hepatic': ..., 'tumor_x': ...}
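
Because the number of tumor entries varies per case, it may be convenient to pick them out of the returned dict by name. The sketch below assumes the tumor keys look like 'tumor_<n>' with an integer suffix (the notes above write 'Tumor_x', so the match is case-insensitive); `tumor_keys` is a hypothetical helper.

```python
def tumor_keys(masks):
    """Return the keys of `masks` that name tumor occurrences, sorted by index.

    Assumption: tumor entries are named 'tumor_<n>' (case-insensitive),
    where <n> is an integer.
    """
    keys = [k for k in masks if k.lower().startswith('tumor_')]
    return sorted(keys, key=lambda k: int(k.split('_')[1]))


# Made-up dict with placeholder values instead of real arrays.
example = {'liver': None, 'hepatic': None, 'tumor_2': None, 'tumor_10': None, 'tumor_1': None}
print(tumor_keys(example))  # ['tumor_1', 'tumor_2', 'tumor_10']
```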

`spacing(id: str)`

Returns the voxel spacing along axes (x, y, z).

`slice_locations(id: str)`

`affine(id: str)`

Returns 4x4 matrix that gives the image's spatial orientation.

`amid.ct_ich.CT_ICH`

(C)omputed (T)omography Images for (I)ntracranial (H)emorrhage Detection and (S)egmentation.

This dataset contains 75 head CT scans including 36 scans for patients diagnosed with intracranial hemorrhage with the following types: Intraventricular, Intraparenchymal, Subarachnoid, Epidural and Subdural.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Data can be downloaded here: https://physionet.org/content/ct-ich/1.3.1/. Then, the folder with raw downloaded data should contain folders ct_scans and masks along with other files.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = CT_ICH(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 75
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 39)
>>> print(ds.mask(ds.ids[0]).shape)
# (512, 512, 39)

`ids()`

`image(id: str)`

`mask(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str)`

`spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`age(id: str) -> float`

`sex(id: str) -> str`

`intraventricular_hemorrhage(id: str)`

Returns True if hemorrhage exists and its type is intraventricular.

`intraparenchymal_hemorrhage(id: str)`

Returns True if hemorrhage was diagnosed and its type is intraparenchymal.

`subarachnoid_hemorrhage(id: str)`

Returns True if hemorrhage was diagnosed and its type is subarachnoid.

`epidural_hemorrhage(id: str)`

Returns True if hemorrhage was diagnosed and its type is epidural.

`subdural_hemorrhage(id: str)`

Returns True if hemorrhage was diagnosed and its type is subdural.

`fracture(id: str)`

Returns True if skull fracture was diagnosed.

`notes(id: str)`

Returns special notes if they exist.

`hemorrhage_diagnosis_raw_metadata(id: str)`

`amid.crossmoda.CrossMoDA`

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: 2021 & 2022: https://zenodo.org/record/6504722#.YsgwnNJByV4

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = CrossMoDA(root='/path/to/archives/root')
>>> print(len(ds.ids))
# 484
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 214)

`ids()`

`train_source_df(id: str)`

`image(id: str) -> Union[np.ndarray, None]`

`pixel_spacing(id: str)`

`spacing(id: str)`

Returns pixel spacing along axes (x, y, z)

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation

`split(id: str) -> str`

The split in which this entry is contained: training_source, training_target, validation

`year(id: str) -> int`

The year in which this entry was published: 2021 or 2022

`masks(id: str) -> Union[np.ndarray, None]`

Combined mask of schwannoma and cochlea (1 and 2 respectively)

`koos_grade(id: str)`

Vestibular schwannoma (VS) tumour characteristic according to the Koos grading scale: 1 to 4, or -1 for post-operative cases.

`amid.deeplesion.DeepLesion`

DeepLesion is composed of 33,688 bookmarked radiology images from 10,825 studies of 4,477 unique patients. For every bookmarked image, a bounding box is created to cover the target lesion based on its measured diameters [1].

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing `DL_info.csv` file and a subfolder `Images_nifti` with 20094 nii.gz files.	required

Notes

Dataset is available at https://nihcc.app.box.com/v/DeepLesion

To download the data, we recommend using the Python script batch_download_zips.py provided by the authors. After downloading and unpacking all 56 zip archives, run DL_save_nifti.py, also provided by the authors, to convert the 2D PNGs into 20094 nii.gz files.

Example

>>> ds = DeepLesion(root='/path/to/folder')
>>> print(len(ds.ids))
# 20094

References

.. [1] Yan, Ke, Xiaosong Wang, Le Lu, and Ronald M. Summers. "Deeplesion: Automated deep mining, categorization and detection of significant radiology image findings using large-scale clinical lesion annotations." arXiv preprint arXiv:1710.01766 (2017).

`ids()`

`patient_id(id: str)`

`study_id(id: str)`

`series_id(id: str)`

`sex(id: str)`

`age(id: str)`

Patient Age might be different for different studies (dataset contains longitudinal records).

`ct_window(id: str)`

CT window extracted from DICOMs. Note that these are the min and max values for windowing, not width and level.
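
To make the distinction concrete, the sketch below converts a (min, max) window into the width and level form and applies the window to a single intensity. It uses the simple convention width = max - min and level = (max + min) / 2. The window values and helper names are made up for illustration.

```python
def window_to_width_level(w_min, w_max):
    """Convert a (min, max) display window to (width, level)."""
    return w_max - w_min, (w_max + w_min) / 2


def apply_window(value, w_min, w_max):
    """Clip an intensity to the window and rescale it to [0, 1]."""
    clipped = min(max(value, w_min), w_max)
    return (clipped - w_min) / (w_max - w_min)


print(window_to_width_level(-175, 275))  # (450, 50.0)
print(apply_window(50, -175, 275))       # 0.5
```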

`affine(id: str)`

`spacing(id: str)`

`image(id: str)`

Some 3D volumes are stored as separate subvolumes, e.g. ds.ids[15000] and ds.ids[15001].

`train_val_test(id: str)`

Patient-level data split defined by the authors (randomly generated): train=1, validation=2, test=3, with a 70/15/15 ratio.

`lesion_position(id: str)`

Lesion measurements as they appear in DL_info.csv. For details, see https://nihcc.app.box.com/v/DeepLesion/file/306056134060

`mask(id: str)`

Mask of the provided bounding boxes. Note that the bounding-box annotation is very coarse: it covers only a single 2D slice.

`amid.egd.EGD`

The Erasmus Glioma Database (EGD): Structural MRI scans, WHO 2016 subtypes, and segmentations of 774 patients with glioma [1]_.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Access to the dataset can be requested at the XNAT portal [https://xnat.bmia.nl/data/archive/projects/egd].

To download the data in a compatible structure, we recommend using the egd-downloader script [https://zenodo.org/record/4761089#.YtZpLtJBxhF]. Please refer to its README for further information.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> egd = EGD(root='/path/to/downloaded/data/folder/')
>>> print(len(egd.ids))
# 774
>>> print(egd.t1gd(egd.ids[215]).shape)
# (197, 233, 189)
>>> print(egd.manufacturer(egd.ids[444]))
# Philips Medical Systems

References

.. [1] van der Voort, Sebastian R., et al. "The Erasmus Glioma Database (EGD): Structural MRI scans, WHO 2016 subtypes, and segmentations of 774 patients with glioma." Data in brief 37 (2021): 107191. https://www.sciencedirect.com/science/article/pii/S2352340921004753

`ids()`

`brain_mask(id: str)`

`deface_mask(id: str)`

`modality(id: str)`

`subject_id(id: str)`

`affine(id: str)`

`voxel_spacing(id: str)`

`spacing(id: str)`

`image(id: str)`

`genetic_and_histological_label_idh(id: str)`

`genetic_and_histological_label_1p19q(id: str)`

`genetic_and_histological_label_grade(id: str)`

`age(id: str)`

`sex(id: str)`

`observer(id: str)`

`original_scan(id: str)`

`manufacturer(id: str)`

`system(id: str)`

`field(id: str)`

`mask(id: str)`

`amid.flare2022.FLARE2022`

An abdominal organ segmentation dataset for semi-supervised learning [1]_.

The dataset was used at the MICCAI FLARE 2022 challenge.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required

Notes

Download link: https://flare22.grand-challenge.org/Dataset/

The root folder should contain the two downloaded folders, namely: "Training" and "Validation".

Examples:

>>> # Place the downloaded folders in any folder and pass the path to the constructor:
>>> ds = FLARE2022(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 2100
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 110)
>>> print(ds.mask(ds.ids[25]).shape)
# (512, 512, 104)

References

.. [1] Ma, Jun, et al. "Fast and Low-GPU-memory abdomen CT organ segmentation: The FLARE challenge." Medical Image Analysis 82 (2022): 102616.

`ids()`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation

`mask(id: str)`

`amid.hcp.HCP`

`ids()`

`image(id: str)`

`affine(id: str)`

`spacing(id: str)`

`amid.kits.KiTS23`

Kidney and Kidney Tumor Segmentation Challenge. The 2023 Kidney and Kidney Tumor Segmentation challenge (abbreviated KiTS23) is a competition in which teams compete to develop the best system for automatic semantic segmentation of kidneys, renal tumors, and renal cysts.

Competition page is https://kits-challenge.org/kits23/, official competition repository is https://github.com/neheller/kits23/.

For usage, clone the repository https://github.com/neheller/kits23/, install and run kits23_download_data.

Parameters:

Name	Type	Description	Default
`root`			required

`ids()`

`image(id: str)`

`mask(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`amid.lidc.dataset.LIDC`

The (L)ung (I)mage (D)atabase (C)onsortium image collection (LIDC-IDRI) [1]_ consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. It supports a lung nodule segmentation task. Scans contain multiple expert annotations.

Number of CT scans: 1018.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=1966254.

Then, the folder with raw downloaded data should contain folder LIDC-IDRI, which contains folders LIDC-IDRI-*.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = LIDC(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 1018
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 194)
>>> print(ds.cancer(ds.ids[0]).shape)
# (512, 512, 194)

References

.. [1] Armato III, McLennan, et al. "The lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans." Medical physics 38(2) (2011): 915–931. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041807/

`ids()`

`image(id: str)`

`study_uid(id: str)`

`series_uid(id: str)`

`patient_id(id: str)`

`sop_uids(id: str)`

`pixel_spacing(id: str)`

`slice_locations(id: str)`

`voxel_spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`spacing(id: str)`

Volumetric spacing of the image. The maximum relative difference in slice_locations is below 1e-3 (except for the 4 images listed below), so a common spacing is used for the whole 3D image.

Note

The slice_locations attribute typically (but not always!) has a constant step. In the LIDC dataset, only 4 images have a difference in slice_locations > 1e-3:

1.3.6.1.4.1.14519.5.2.1.6279.6001.526570782606728516388531252230
1.3.6.1.4.1.14519.5.2.1.6279.6001.329334252028672866365623335798
1.3.6.1.4.1.14519.5.2.1.6279.6001.245181799370098278918756923992
1.3.6.1.4.1.14519.5.2.1.6279.6001.103115201714075993579787468219

These differences appear in at most 3 slices, so we consider their impact negligible.
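
A check of this kind can be sketched as follows. The exact criterion used for the dataset is not stated here, so the definition below (largest deviation of a slice step from the median step, relative to the median) is an assumption, and `max_relative_step_deviation` is a hypothetical helper.

```python
import statistics


def max_relative_step_deviation(slice_locations):
    """Largest relative deviation of a slice step from the median step.

    Assumption: this is one reasonable reading of "relative difference";
    the criterion actually used for the dataset may differ.
    """
    locs = sorted(slice_locations)
    steps = [b - a for a, b in zip(locs, locs[1:])]
    if not steps:
        raise ValueError('need at least two slice locations')
    median = statistics.median(steps)
    return max(abs(s - median) / median for s in steps)


print(max_relative_step_deviation([0.0, 2.5, 5.0, 7.5]))  # 0.0
print(max_relative_step_deviation([0.0, 1.0, 2.0, 3.5]))  # 0.5
```

A scan would then be treated as having a constant step when the returned value is below a chosen threshold such as 1e-3.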

`contrast_used(id: str)`

If the DICOM file for the scan had any Contrast tag, this is marked as True.

`is_from_initial(id: str)`

Indicates whether or not this PatientID was tagged as part of the initial 399 release.

`orientation_matrix(id: str)`

`sex(id: str)`

`age(id: str)`

`conv_kernel(id: str)`

`kvp(id: str)`

`tube_current(id: str)`

`study_date(id: str)`

`accession_number(id: str)`

`nodules(id: str)`

`nodules_masks(id: str)`

`cancer(id: str)`

`amid.lits.dataset.LiTS`

A (Li)ver (T)umor (S)egmentation dataset [1] from Medical Segmentation Decathlon [2]

There are two segmentation tasks on this dataset: liver and liver tumor segmentation.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://competitions.codalab.org/competitions/17094.

Then, the folder with raw downloaded data should contain two zip archives with the train data (Training_Batch1.zip and Training_Batch2.zip) and a folder with the test data (LITS-Challenge-Test-Data).

The folder with test data should have the original structure:

<...>/LITS-Challenge-Test-Data/test-volume-0.nii
<...>/LITS-Challenge-Test-Data/test-volume-1.nii
...

P.S. Organ bounding boxes are also provided from a separate source: https://github.com/superxuang/caffe_3d_faster_rcnn.

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = LiTS(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 201
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 163)
>>> print(ds.tumor_mask(ds.ids[80]).shape)
# (512, 512, 771)

References

.. [1] Bilic, Patrick, et al. "The liver tumor segmentation benchmark (lits)." arXiv preprint arXiv:1901.04056 (2019).

.. [2] Antonelli, Michela, et al. "The medical segmentation decathlon." arXiv preprint arXiv:2106.05735 (2021).

`ids()`

`fold(id: str)`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str)`

`spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`mask(id: str)`

`amid.liver_medseg.LiverMedseg`

LiverMedseg is a public CT segmentation dataset with 50 annotated images: a case collection of 50 livers with their segments. The images were obtained from the Medical Segmentation Decathlon competition.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: https://www.medseg.ai/database/liver-segments-50-cases

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = LiverMedseg(root='/path/to/archives/root')
>>> print(len(ds.ids))
# 50
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 38)

`ids()`

`image(id: str) -> np.ndarray`

`affine(id: str) -> np.ndarray`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str) -> tuple`

`spacing(id: str) -> tuple`

`mask(id: str) -> np.ndarray`

`amid.midrc.MIDRC`

MIDRC-RICORD dataset 1a is a public COVID-19 CT segmentation dataset with 120 scans.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=80969742 and download both Images and Annotations to the same folder.

Then, the folder with downloaded data should contain two paths with the data

The folder should have this structure:

<...>//MIDRC-RICORD-1A
<...>//MIDRC-RICORD-1a_annotations_labelgroup_all_2020-Dec-8.json

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = MIDRC(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 155
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 112)
>>> print(ds.mask(ds.ids[80]).shape)
# (6, 512, 512, 450)

`ids()`

`image(id: str)`

`image_meta(id: str)`

`spacing(id: str)`

`labels(id: str)`

`mask(id: str)`

`amid.mood.MOOD`

A (M)edical (O)ut-(O)f-(D)istribution analysis challenge [1]_

This dataset contains raw brain MRI and abdominal CT images.

Number of training samples:

- Brain: 800 scans (256 x 256 x 256)
- Abdominal: 550 scans (512 x 512 x 512)

For each setup there are 4 toy test samples with OOD cases.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://www.synapse.org/#!Synapse:syn21343101/wiki/599515.

Then, the folder with raw downloaded data should contain four zip archives with data (abdom_toy.zip, abdom_train.zip, brain_toy.zip and brain_train.zip).

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = MOOD(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 1358
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 512)
>>> print(ds.pixel_label(ds.ids[0]).shape)
# (512, 512, 512)

References

.. [1] Zimmerer, Petersen, et al. "Medical Out-of-Distribution Analysis Challenge 2022." doi: 10.5281/zenodo.6362313 (2022).

`ids()`

`fold(id: str)`

Returns fold: train or toy (test).

`task(id: str)`

Returns task: brain (MRI) or abdominal (CT).

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str)`

`spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`sample_label(id: str)`

Returns sample-level OOD score for toy examples and None otherwise. 0 indicates no abnormality and 1 indicates abnormal input.

`pixel_label(id: str)`

Returns voxel-level OOD scores for toy examples and None otherwise. 0 indicates no abnormality and 1 indicates abnormal input.

`amid.msd.MSD`

MSD is the Medical Segmentation Decathlon Challenge with 10 tasks.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Data can be downloaded here: http://medicaldecathlon.com/ or here: https://msd-for-monai.s3-us-west-2.amazonaws.com/ or here: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2/

Then, the folder with raw downloaded data should contain a tar archive with data and masks (Task03_Liver.tar).

`ids() -> tuple`

`train_test(id: str) -> str`

`task(id: str) -> str`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`image_modality(id: str) -> str`

`segmentation_labels(id: str) -> dict`

Returns segmentation labels for the task

`mask(id: str)`

`amid.mslub.dataset.MSLUB`

`ids()`

`image(id: str)`

`mask(id: str)`

`patient(id: str)`

`affine(id: str)`

`amid.medseg9.Medseg9`

Medseg9 is a public COVID-19 CT segmentation dataset with 9 annotated images.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Data can be downloaded here: http://medicalsegmentation.com/covid19/.

Then, the folder with raw downloaded data should contain three zip archives with data and masks (rp_im.zip, rp_lung_msk.zip, rp_msk.zip).

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = Medseg9(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 9
>>> print(ds.image(ds.ids[0]).shape)
# (630, 630, 45)
>>> print(ds.covid(ds.ids[0]).shape)
# (630, 630, 45)

`ids()`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.

`voxel_spacing(id: str)`

`spacing(id: str)`

Returns voxel spacing along axes (x, y, z).

`lungs(id: str)`

`covid(id: str)`

int16 mask. 0 - normal, 1 - ground-glass opacities, 2 - consolidation.
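
The label encoding can be used to summarize a mask per class. The sketch below counts voxels in a flat iterable of integer labels; the input sequence is made up, and `label_counts` is a hypothetical helper. For a NumPy array, one could pass `mask.ravel()` as the input.

```python
from collections import Counter

# Label values as documented above.
LABEL_NAMES = {0: 'normal', 1: 'ground-glass opacities', 2: 'consolidation'}


def label_counts(voxels):
    """Count voxels per class in a flat iterable of integer labels."""
    counts = Counter(int(v) for v in voxels)
    return {name: counts.get(label, 0) for label, name in LABEL_NAMES.items()}


# Tiny made-up label sequence standing in for a flattened mask.
print(label_counts([0, 0, 1, 2, 2, 2]))
# {'normal': 2, 'ground-glass opacities': 1, 'consolidation': 3}
```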

`amid.cancer_500.dataset.MoscowCancer500`

The Moscow Radiology Cancer-500 dataset.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded files. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: https://mosmed.ai/en/datasets/mosmeddata-kt-s-priznakami-raka-legkogo-tip-viii/

After pressing the download button you will have to provide an email address to which further instructions will be sent.

Examples:

>>> # Place the downloaded files in any folder and pass the path to the constructor:
>>> ds = MoscowCancer500(root='/path/to/files/root')
>>> print(len(ds.ids))
# 979
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 67)

`ids()`

`image(id: str)`

`study_uid(id: str)`

`series_uid(id: str)`

`sop_uids(id: str)`

`pixel_spacing(id: str)`

`slice_locations(id: str)`

`orientation_matrix(id: str)`

`instance_numbers(id: str)`

`conv_kernel(id: str)`

`kvp(id: str)`

`patient_id(id: str)`

`study_date(id: str)`

`accession_number(id: str)`

`nodules(id: str)`

`amid.covid_1110.MoscowCovid1110`

The Moscow Radiology COVID-19 dataset.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded files. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links: https://mosmed.ai/en/datasets/covid191110/

Examples:

>>> # Place the downloaded files in any folder and pass the path to the constructor:
>>> ds = MoscowCovid1110(root='/path/to/files/root')
>>> print(len(ds.ids))
# 1110
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 43)

`ids()`

`image(id: str)`

`affine(id: str)`

`label(id: str)`

`mask(id: str)`

`amid.nlst.NLST`

Dataset with low-dose CT scans of 26,254 patients acquired during the National Lung Screening Trial.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder (usually called NLST) containing the patient subfolders (like 101426). If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://wiki.cancerimagingarchive.net/display/NLST/National+Lung+Screening+Trial. The DICOM files should be placed in the following folder structure: <...>//////*.dcm

Examples:

>>> ds = NLST(root='/path/to/NLST/')
>>> print(len(ds.ids))
 ...
>>> print(ds.image(ds.ids[0]).shape)
 ...
>>> print(ds.mask(ds.ids[80]).shape)
 ...

`ids()`

`image(id: str)`

`study_uid(id: str)`

`series_uid(id: str)`

`sop_uids(id: str)`

`pixel_spacing(id: str)`

`slice_locations(id: str)`

`orientation_matrix(id: str)`

`conv_kernel(id: str)`

`kvp(id: str)`

`patient_id(id: str)`

`study_date(id: str)`

`accession_number(id: str)`

`amid.nsclc.NSCLC`

NSCLC-Radiomics is a public non-small cell lung cancer segmentation dataset with 422 patients.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics

The folder with downloaded data should contain two paths

The folder should have this structure: <...>//NSCLC-Radiomics/LUNG1-XXX

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = NSCLC(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 422
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 134)
>>> print(ds.mask(ds.ids[80]).shape)
# (512, 512, 108)

`ids()`

`image(id: str)`

`image_meta(id: str)`

`sex(id: str) -> str`

Sex of the patient.

`age(id: str) -> Union[int, None]`

Age of the patient. The dataset contains 97 patients with unknown age.
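
Because `age` may return `None`, any aggregate has to skip the missing values explicitly. A minimal sketch follows; the helper `mean_known_age` and the toy values are hypothetical, and with a real dataset the input could be `(ds.age(i) for i in ds.ids)`.

```python
from statistics import mean
from typing import Iterable, Optional

def mean_known_age(ages: Iterable[Optional[int]]) -> Optional[float]:
    """Average the ages that are present; return None if all are missing."""
    known = [age for age in ages if age is not None]
    return mean(known) if known else None

# Toy values standing in for per-patient ages (not real data).
toy_ages = [63, None, 71, 58, None]
print(mean_known_age(toy_ages))
```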

`spacing(id: str)`

`mask(id: str)`

`lung_left(id: str)`

`lung_right(id: str)`

`lungs_total(id: str)`

`heart(id: str)`

`esophagus(id: str)`

`spinal_cord(id: str)`

`amid.rsna_bc.dataset.RSNABreastCancer`

`site_id(id: str)`

`patient_id(id: str)`

`image_id(id: str)`

`laterality(id: str)`

`view(id: str)`

`age(id: str)`

`cancer(id: str)`

`biopsy(id: str)`

`invasive(id: str)`

`BIRADS(id: str)`

`implant(id: str)`

`density(id: str)`

`machine_id(id: str)`

`prediction_id(id: str)`

`difficult_negative_case(id: str)`

`ids()`

`image(id: str)`

`padding_value(id: str)`

`intensity_sign(id: str)`

`amid.ribfrac.dataset.RibFrac`

RibFrac dataset is a benchmark for developing algorithms on rib fracture detection, segmentation and classification. We hope this large-scale dataset could facilitate both clinical research for automatic rib fracture detection and diagnoses, and engineering research for 3D detection, segmentation and classification.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required

Notes

Data downloaded from here:
https://doi.org/10.5281/zenodo.3893507 -- train Part1 (300 images)
https://doi.org/10.5281/zenodo.3893497 -- train Part2 (120 images)
https://doi.org/10.5281/zenodo.3893495 -- val (80 images)
https://zenodo.org/record/3993380 -- test (160 images without annotation)

References

Jiancheng Yang, Liang Jin, Bingbing Ni, & Ming Li. (2020). RibFrac Dataset: A Benchmark for Rib Fracture Detection, Segmentation and Classification

`ids()`

`image(id: str)`

`label(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation

`amid.stanford_coca.StanfordCoCa`

Stanford AIMI's Co(ronary) Ca(lcium) dataset.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Follow the download instructions at https://stanfordaimi.azurewebsites.net/datasets/e8ca74dc-8dd4-4340-815a-60b41f6cb2aa. You'll need to register and accept the terms of use. After that, copy the files from Azure:

azcopy copy 'some-generated-access-link' /path/to/downloaded/data/ --recursive=true

Then, the folder with raw downloaded data should contain two subfolders: a subset with gated coronary CT scans and corresponding coronary calcium segmentation masks (Gated_release_final) and a folder with the non-gated CT scans with corresponding coronary artery calcium scores (deidentified_nongated).

The folder with gated data should have the original structure:
./Gated_release_final/patient/0/folder-with-dcms/
./Gated_release_final/calcium_xml/0.xml
...

The folder with nongated data should have the original structure:
./deidentified_nongated/0/folder-with-dcms/
...

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = StanfordCoCa(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 971
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 57)

`ids()`

`image(id: str)`

`series_uid(id: str)`

`study_uid(id: str)`

`pixel_spacing(id: str)`

`slice_locations(id: str)`

`orientation_matrix(id: str)`

`calcifications(id: str)`

Returns list of Calcifications

`score(id: str)`

`amid.tbad.TBAD`

A dataset of 3D Computed Tomography (CT) images for Type-B Aortic Dissection segmentation.

Notes

The data can only be obtained by contacting the authors by email. See the dataset home page for details.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded files. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Examples:

>>> # Place the downloaded files in any folder and pass the path to the constructor:
>>> ds = TBAD(root='/path/to/files/root')
>>> print(len(ds.ids))
# 100
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 327)

References

.. [1] Yao, Zeyang & Xie, Wen & Zhang, Jiawei & Dong, Yuhao & Qiu, Hailong & Haiyun, Yuan & Jia, Qianjun & Tianchen, Wang & Shi, Yiyi & Zhuang, Jian & Que, Lifeng & Xu, Xiaowei & Huang, Meiping. (2021). ImageTBAD: A 3D Computed Tomography Angiography Image Dataset for Automatic Segmentation of Type-B Aortic Dissection. Frontiers in Physiology. 12. 732711. 10.3389/fphys.2021.732711.

`ids()`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation.
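
For illustration, a 4x4 affine of this kind is conventionally applied to a voxel index in homogeneous coordinates to obtain a position in scanner space. The sketch below assumes that NIfTI-style convention; the helper `voxel_to_world` and the toy matrix are hypothetical and not taken from the library or the dataset.

```python
def voxel_to_world(affine, index):
    """Map a voxel index (i, j, k) to world coordinates using a 4x4 affine."""
    i, j, k = index
    homogeneous = (i, j, k, 1.0)
    return tuple(
        sum(affine[row][col] * homogeneous[col] for col in range(4))
        for row in range(3)
    )

# Hypothetical affine: 2 x 2 x 3 mm voxels with a translated origin.
toy_affine = [
    [2.0, 0.0, 0.0, -10.0],
    [0.0, 2.0, 0.0, -20.0],
    [0.0, 0.0, 3.0, 5.0],
    [0.0, 0.0, 0.0, 1.0],
]

print(voxel_to_world(toy_affine, (1, 2, 3)))
# (-8.0, -16.0, 14.0)
```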

`mask(id: str)`

`amid.totalsegmentator.dataset.Totalsegmentator`

In 1204 CT images we segmented 104 anatomical structures (27 organs, 59 bones, 10 muscles, 8 vessels) covering a majority of relevant classes for most use cases.

The CT images were randomly sampled from clinical routine, thus representing a real world dataset which generalizes to clinical application.

The dataset contains a wide range of different pathologies, scanners, sequences and institutions. [1]

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	absolute path to the downloaded archive. If not provided, the cache is assumed to be already populated.	required

Notes

Download link: https://zenodo.org/record/6802614/files/Totalsegmentator_dataset.zip

Examples:

>>> # Download the archive to any folder and pass the path to the constructor:
>>> ds = Totalsegmentator(root='/path/to/the/downloaded/archive')
>>> print(len(ds.ids))
# 1204
>>> print(ds.image(ds.ids[0]).shape)
# (294, 192, 179)
>>> print(ds.aorta(ds.ids[25]).shape)
# (320, 320, 145)

References

.. [1] Jakob Wasserthal (2022) Dataset with segmentations of 104 important anatomical structures in 1204 CT images. Available at: https://zenodo.org/record/6802614#.Y6M2MxXP1D8

`ids()`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation

`amid.upenn_gbm.upenn_gbm.UPENN_GBM`

Multi-parametric magnetic resonance imaging (mpMRI) scans for de novo Glioblastoma (GBM) patients from the University of Pennsylvania Health System (UPENN-GBM). Dataset contains 630 patients.

All samples are registered to a common atlas (SRI) using uniform preprocessing, and the segmentations are aligned with them.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required

Notes

Follow the download instructions at https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70225642. Download the NIfTI images and metadata to the root folder. Organise the folder as follows:

<...>//NIfTI-files/images_segm/UPENN-GBM-00054_11_segm.nii.gz <...>//NIfTI-files/...

<...>//UPENN-GBM_clinical_info_v1.0.csv <...>//UPENN-GBM_acquisition.csv

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = UPENN_GBM(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 671
>>> print(ds.image(ds.ids[215]).shape)
# (4, 240, 240, 155)
>>> print(ds.acqusition_info(ds.ids[215]).manufacturer)
# SIEMENS

References

.. [1] Bakas, S., Sako, C., Akbari, H., Bilello, M., Sotiras, A., Shukla, G., Rudie, J. D., Flores Santamaria, N., Fathi Kazerooni, A., Pati, S., Rathore, S., Mamourian, E., Ha, S. M., Parker, W., Doshi, J., Baid, U., Bergman, M., Binder, Z. A., Verma, R., … Davatzikos, C. (2021). Multi-parametric magnetic resonance imaging (mpMRI) scans for de novo Glioblastoma (GBM) patients from the University of Pennsylvania Health System (UPENN-GBM) (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.709X-DN49

`ids()`

`modalities(id: str)`

`dsc_modalities(id: str)`

`dti_modalities(id: str)`

`mask(id: str)`

`is_mask_automated(id: str)`

`image(id: str)`

`image_unstripped(id: str)`

`image_DTI(id: str)`

`image_DSC(id: str)`

`clinical_info(id: str) -> ClinicalInfo`

`acqusition_info(id: str) -> AcquisitionInfo`

`subject_id(id: str)`

`affine(id: str)`

`spacing(id: str)`

`amid.vs_seg.dataset.VSSEG`

Segmentation of vestibular schwannoma from MRI, an open annotated dataset ... (VS-SEG) [1]_.

The dataset contains 250 pairs of T1c and T2 images of the brain with the vestibular schwannoma segmentation task.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

The dataset and corresponding metadata can be downloaded from the TCIA page: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70229053.

To download the DICOM images using the .tcia file, we used a public build of the TCIA downloader: https://github.com/ygidtu/NBIA_data_retriever_CLI.

Then, download the rest of the metadata from the TCIA page:
- DirectoryNamesMappingModality.csv
- Vestibular-Schwannoma-SEG_matrices Mar 2021.zip
- Vestibular-Schwannoma-SEG contours Mar 2021.zip

and unzip the latter two .zip archives.

So the root folder should contain 3 folders and 1 .csv file:
<...>/DirectoryNamesMappingModality.csv
<...>/Vestibular-Schwannoma-SEG/
├── VS-SEG-001/...
├── VS-SEG-002/...
└── ...
<...>/contours/
<...>/registration_matrices/

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = VSSEG(root='/path/to/downloaded/data/folder/')
>>> print(len(ds.ids))
# 484
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 120)
>>> print(ds.schwannoma(ds.ids[1]).shape)
# (384, 384, 80)

References

.. [1] Shapey, Jonathan, et al. "Segmentation of vestibular schwannoma from MRI, an open annotated dataset and baseline algorithm." Scientific Data 8.1 (2021): 1-6. https://www.nature.com/articles/s41597-021-01064-w

`ids()`

`modality(id: str)`

`subject_id(id: str)`

`image(id: str)`

`spacing(id: str)`

The maximum relative difference in slice_locations is below 1e-12, so a common spacing is used for the whole 3D image.
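
One way such a uniformity check could look is sketched below. How the library actually measures the relative difference is not stated here, so the definition used (deviation of each slice gap from the mean gap, divided by the mean gap) is an assumption, and `max_relative_spacing_deviation` is a hypothetical helper.

```python
def max_relative_spacing_deviation(slice_locations):
    """Largest relative deviation of a slice gap from the mean gap."""
    locations = sorted(slice_locations)
    if len(locations) < 2:
        raise ValueError("need at least two slice locations")
    gaps = [b - a for a, b in zip(locations, locations[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return max(abs(gap - mean_gap) / mean_gap for gap in gaps)

# Toy slice locations in millimetres (not real data).
uniform = [0.0, 1.5, 3.0, 4.5]
irregular = [0.0, 1.0, 3.0]

print(max_relative_spacing_deviation(uniform) < 1e-12)
print(max_relative_spacing_deviation(irregular) < 1e-12)
```

When the deviation is negligible, a single z spacing describes the whole volume; otherwise the slices would need resampling or per-slice handling.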

`schwannoma(id: str)`

`cochlea(id: str)`

`meningioma(id: str)`

`study_uid(id: str)`

`series_uid(id: str)`

`patient_id(id: str)`

`study_date(id: str)`

`amid.verse.VerSe`

A Vertebral Segmentation Dataset with Fracture Grading [1]_

The dataset was used in the MICCAI-2019 and MICCAI-2020 Vertebrae Segmentation Challenges.

Parameters:

Name	Type	Description	Default
`root`	`(str, Path)`	path to the folder containing the raw downloaded archives. If not provided, the cache is assumed to be already populated.	required
`version`	`str`	the data version. Only has effect if the library was installed from a cloned git repository.	required

Notes

Download links:
2019: https://osf.io/jtfa5/
2020: https://osf.io/4skx2/

Examples:

>>> # Place the downloaded archives in any folder and pass the path to the constructor:
>>> ds = VerSe(root='/path/to/archives/root')
>>> print(len(ds.ids))
# 374
>>> print(ds.image(ds.ids[0]).shape)
# (512, 512, 214)

References

.. [1] Löffler MT, Sekuboyina A, Jacob A, et al. A Vertebral Segmentation Dataset with Fracture Grading. Radiol Artif Intell. 2020;2(4):e190138. Published 2020 Jul 29. doi:10.1148/ryai.2020190138

`ids()`

`image(id: str)`

`affine(id: str)`

The 4x4 matrix that gives the image's spatial orientation

`split(id: str)`

The split in which this entry is contained: training, validate, test

`patient(id: str)`

The unique patient id

`year(id: str)`

The year in which this entry was published: 2019, 2020

`centers(id: str)`

Vertebrae centers in format {label: [x, y, z]}
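
As a small illustration of working with this format, the sketch below computes distances between centers of consecutive labels, in whatever units the coordinates are stored. The helper `consecutive_center_distances` and the toy dictionary are hypothetical; real label values and coordinates come from `ds.centers(id)`.

```python
import math

def consecutive_center_distances(centers):
    """Distance between centers of neighbouring labels, keyed by label pair."""
    labels = sorted(centers)
    return {
        (a, b): math.dist(centers[a], centers[b])
        for a, b in zip(labels, labels[1:])
    }

# Toy centers in the {label: [x, y, z]} format (not real data).
toy_centers = {
    20: [100.0, 120.0, 40.0],
    21: [100.0, 124.0, 43.0],
    22: [100.0, 124.0, 55.0],
}

print(consecutive_center_distances(toy_centers))
# {(20, 21): 5.0, (21, 22): 12.0}
```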

`masks(id: str) -> Union[np.ndarray, None]`

Vertebrae masks