heudiconv-1.3.2/LICENSE

Copyright [2014-2024] [HeuDiConv developers]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Some parts of the codebase/documentation are borrowed from other sources:
- HeuDiConv tutorial from https://bitbucket.org/dpat/neuroimaging_core_docs/src
Copyright 2023 Dianne Patterson
heudiconv-1.3.2/PKG-INFO

Metadata-Version: 2.1
Name: heudiconv
Version: 1.3.2
Summary: Heuristic DICOM Converter
Author: HeuDiConv team and contributors
License: Apache 2.0
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.9
License-File: LICENSE
Requires-Dist: dcmstack>=0.8
Requires-Dist: etelemetry
Requires-Dist: filelock>=3.0.12
Requires-Dist: nibabel
Requires-Dist: nipype>=1.2.3
Requires-Dist: pydicom>=1.0.0
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: tinydb; extra == "tests"
Requires-Dist: inotify; extra == "tests"
Provides-Extra: extras
Requires-Dist: duecredit; extra == "extras"
Provides-Extra: datalad
Requires-Dist: datalad>=0.13.0; extra == "datalad"
Provides-Extra: all
Requires-Dist: pytest; extra == "all"
Requires-Dist: tinydb; extra == "all"
Requires-Dist: inotify; extra == "all"
Requires-Dist: duecredit; extra == "all"
Requires-Dist: datalad>=0.13.0; extra == "all"
Convert DICOM dirs based on heuristic info - HeuDiConv
uses the dcmstack package and dcm2niix tool to convert DICOM directories or
tarballs into collections of NIfTI files following pre-defined heuristic(s).
heudiconv-1.3.2/README.rst

=============
**HeuDiConv**
=============
`a heuristic-centric DICOM converter`
.. image:: https://joss.theoj.org/papers/10.21105/joss.05839/status.svg
:target: https://doi.org/10.21105/joss.05839
:alt: JOSS Paper
.. image:: https://img.shields.io/badge/docker-nipy/heudiconv:latest-brightgreen.svg?logo=docker&style=flat
:target: https://hub.docker.com/r/nipy/heudiconv/tags/
:alt: Our Docker image
.. image:: https://github.com/nipy/heudiconv/actions/workflows/test.yml/badge.svg?event=push
:target: https://github.com/nipy/heudiconv/actions/workflows/test.yml
:alt: GitHub Actions (test)
.. image:: https://codecov.io/gh/nipy/heudiconv/branch/master/graph/badge.svg
:target: https://codecov.io/gh/nipy/heudiconv
:alt: CodeCoverage
.. image:: https://readthedocs.org/projects/heudiconv/badge/?version=latest
:target: http://heudiconv.readthedocs.io/en/latest/?badge=latest
:alt: Readthedocs
.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1012598.svg
:target: https://doi.org/10.5281/zenodo.1012598
:alt: Zenodo (latest)
.. image:: https://repology.org/badge/version-for-repo/debian_unstable/heudiconv.svg?header=Debian%20Unstable
:target: https://repology.org/project/heudiconv/versions
:alt: Debian Unstable
.. image:: https://repology.org/badge/version-for-repo/gentoo_ovl_science/python:heudiconv.svg?header=Gentoo%20%28%3A%3Ascience%29
:target: https://repology.org/project/python:heudiconv/versions
:alt: Gentoo (::science)
.. image:: https://repology.org/badge/version-for-repo/pypi/python:heudiconv.svg?header=PyPI
:target: https://repology.org/project/python:heudiconv/versions
:alt: PyPI
.. image:: https://img.shields.io/badge/RRID-SCR__017427-blue
:target: https://identifiers.org/RRID:SCR_017427
:alt: RRID
About
-----
``heudiconv`` is a flexible DICOM converter for organizing brain imaging data
into structured directory layouts.
- It allows flexible directory layouts and naming schemes through customizable heuristic implementations.
- It only converts the necessary DICOMs and ignores everything else in a directory.
- You can keep links to DICOM files in the participant layout.
- Using `dcm2niix `_ under the hood, it's fast.
- It can track the provenance of the conversion from DICOM to NIfTI in W3C PROV format.
- It provides assistance in converting to `BIDS `_.
- It integrates with `DataLad `_ to place converted and original data under git/git-annex
version control while automatically annotating files with sensitive information (e.g., non-defaced anatomicals, etc).
Heudiconv can be inserted into your workflow to provide automatic conversion as part of a data acquisition pipeline, as seen in the figure below:
.. image:: figs/environment.png
Installation
------------
See our `installation page `_
on heudiconv.readthedocs.io.
HOWTO 101
---------
In a nutshell -- ``heudiconv`` is given a file tree of DICOMs, and it produces a restructured file tree of NIfTI files (conversion handled by `dcm2niix`_) with accompanying metadata files.
The input and output structure is as flexible as your data, which is accomplished by using a Python file called a ``heuristic`` that knows how to read your input structure and decides how to name the resultant files.
You can run your conversion automatically (which will produce a ``.heudiconv`` directory storing the used parameters), or generate the default parameters, edit them to customize file naming, and continue conversion via an additional invocation of `heudiconv`:
.. image:: figs/workflow.png
``heudiconv`` comes with `existing heuristics `_ which can be used as is, or as examples.
For instance, the heuristic `convertall `_ extracts standard metadata from all matching DICOMs.
``heudiconv`` creates mapping files, ``.edit.text``, which let researchers easily establish their own conversion mapping.
In most use-cases of retrospective study data conversion, you would need to create your custom heuristic following the examples and the `"Heuristic" section `_ in the documentation.
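
As an illustration only, a minimal custom heuristic written in the spirit of the bundled
examples could look like the sketch below; the matched protocol names (``mprage``, ``rest``)
and the output templates are hypothetical and must be adapted to your own acquisitions::

    def create_key(template, outtype=("nii.gz",), annotation_classes=None):
        if not template:
            raise ValueError("Template must be a valid format string")
        return (template, outtype, annotation_classes)

    def infotodict(seqinfo):
        """Map each DICOM series to a BIDS-style output name."""
        t1w = create_key("sub-{subject}/anat/sub-{subject}_T1w")
        rest = create_key("sub-{subject}/func/sub-{subject}_task-rest_run-{item:02d}_bold")
        info = {t1w: [], rest: []}
        for s in seqinfo:
            if "mprage" in s.protocol_name.lower():
                info[t1w].append(s.series_id)
            elif "rest" in s.protocol_name.lower():
                info[rest].append(s.series_id)
        return info
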
**Note** that the `ReproIn heuristic `_ is
generic and powerful enough to be adopted for virtually *any* study: for prospective studies, you would just need
to name your sequences following the `ReproIn convention `_, and for
retrospective conversions, you can often create a new versatile heuristic by simply providing
remappings into ReproIn as shown in `this issue (documentation is coming) `_.
Having decided on a heuristic, you could use the command line::
heudiconv -f HEURISTIC-FILE-OR-NAME -o OUTPUT-PATH --files INPUT-PATHs
with various additional options (see ``heudiconv --help`` or
`"Usage" in documentation `__) to tune its behavior to
convert your data.
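
Equivalently, since the CLI is a thin wrapper around ``heudiconv.main.workflow``
(see ``heudiconv/cli/run.py``), the same conversion can be scripted from Python.
This is only a sketch -- the paths are placeholders and just a few of the accepted
keyword arguments (which mirror the CLI options) are shown::

    from heudiconv.main import workflow

    workflow(
        files=["/incoming/dicoms/subj01"],  # placeholder input path(s)
        outdir="/data/bids",                # placeholder output path
        heuristic="reproin",                # bundled heuristic name or path to your own .py
        converter="dcm2niix",
        bids_options=[],                    # request BIDS output (as --bids does)
    )
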
For detailed examples and guides, please check out `ReproIn conversion invocation examples `_
and the `user tutorials `_ in the documentation.
How to cite
-----------
Please use `Zenodo record `_ for
your specific version of HeuDiConv. We also support gathering
all relevant citations via `DueCredit `_.
How to contribute
-----------------
For a detailed intro, see our `contributing guide `_.
Our releases are packaged using Intuit auto, with the corresponding workflow (including
Docker image preparation) found in ``.github/workflows/release.yml``.
3rd-party heuristics
---------------------
- https://github.com/courtois-neuromod/ds_prep/blob/main/mri/convert/heuristics_unf.py
Support
-------
All bugs, concerns and enhancement requests for this software can be submitted here:
https://github.com/nipy/heudiconv/issues.
If you have a problem or would like to ask a question about how to use ``heudiconv``,
please submit a question to `NeuroStars.org `_ with a ``heudiconv`` tag.
NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics.
All previous ``heudiconv`` questions are available here:
http://neurostars.org/tags/heudiconv/
heudiconv-1.3.2/heudiconv.egg-info/PKG-INFO

Metadata-Version: 2.1
Name: heudiconv
Version: 1.3.2
Summary: Heuristic DICOM Converter
Author: HeuDiConv team and contributors
License: Apache 2.0
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.9
License-File: LICENSE
Requires-Dist: dcmstack>=0.8
Requires-Dist: etelemetry
Requires-Dist: filelock>=3.0.12
Requires-Dist: nibabel
Requires-Dist: nipype>=1.2.3
Requires-Dist: pydicom>=1.0.0
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: tinydb; extra == "tests"
Requires-Dist: inotify; extra == "tests"
Provides-Extra: extras
Requires-Dist: duecredit; extra == "extras"
Provides-Extra: datalad
Requires-Dist: datalad>=0.13.0; extra == "datalad"
Provides-Extra: all
Requires-Dist: pytest; extra == "all"
Requires-Dist: tinydb; extra == "all"
Requires-Dist: inotify; extra == "all"
Requires-Dist: duecredit; extra == "all"
Requires-Dist: datalad>=0.13.0; extra == "all"
Convert DICOM dirs based on heuristic info - HeuDiConv
uses the dcmstack package and dcm2niix tool to convert DICOM directories or
tarballs into collections of NIfTI files following pre-defined heuristic(s).
heudiconv-1.3.2/heudiconv.egg-info/SOURCES.txt

LICENSE
README.rst
pyproject.toml
setup.py
heudiconv/__init__.py
heudiconv/_version.py
heudiconv/bids.py
heudiconv/convert.py
heudiconv/dicoms.py
heudiconv/due.py
heudiconv/info.py
heudiconv/main.py
heudiconv/parser.py
heudiconv/py.typed
heudiconv/queue.py
heudiconv/utils.py
heudiconv.egg-info/PKG-INFO
heudiconv.egg-info/SOURCES.txt
heudiconv.egg-info/dependency_links.txt
heudiconv.egg-info/entry_points.txt
heudiconv.egg-info/requires.txt
heudiconv.egg-info/top_level.txt
heudiconv/cli/__init__.py
heudiconv/cli/monitor.py
heudiconv/cli/run.py
heudiconv/external/__init__.py
heudiconv/external/dlad.py
heudiconv/external/tests/__init__.py
heudiconv/external/tests/test_dlad.py
heudiconv/heuristics/__init__.py
heudiconv/heuristics/banda-bids.py
heudiconv/heuristics/bids_ME.py
heudiconv/heuristics/bids_PhoenixReport.py
heudiconv/heuristics/bids_with_ses.py
heudiconv/heuristics/cmrr_heuristic.py
heudiconv/heuristics/convertall.py
heudiconv/heuristics/convertall_custom.py
heudiconv/heuristics/example.py
heudiconv/heuristics/multires_7Tbold.py
heudiconv/heuristics/reproin.py
heudiconv/heuristics/studyforrest_phase2.py
heudiconv/heuristics/test_b0dwi_for_fmap.py
heudiconv/heuristics/test_reproin.py
heudiconv/heuristics/uc_bids.py
heudiconv/tests/__init__.py
heudiconv/tests/anonymize_script.py
heudiconv/tests/conftest.py
heudiconv/tests/test_archives.py
heudiconv/tests/test_bids.py
heudiconv/tests/test_convert.py
heudiconv/tests/test_dicoms.py
heudiconv/tests/test_heuristics.py
heudiconv/tests/test_main.py
heudiconv/tests/test_monitor.py
heudiconv/tests/test_queue.py
heudiconv/tests/test_regression.py
heudiconv/tests/test_tarballs.py
heudiconv/tests/test_utils.py
heudiconv/tests/utils.py
heudiconv/tests/data/MRI_102TD_PHA_S.MR.Chen_Matthews_1.3.1.2022.11.16.15.50.20.357.31204541.dcm
heudiconv/tests/data/axasc35.dcm
heudiconv/tests/data/phantom.dcm
heudiconv/tests/data/sample_nifti.nii.gz
heudiconv/tests/data/sample_nifti_params.txt
heudiconv/tests/data/01-anat-scout/0001.dcm
heudiconv/tests/data/01-fmap_acq-3mm/1.3.12.2.1107.5.2.43.66112.2016101409263663466202201.dcm
heudiconv/tests/data/Phoenix/01+AA/01+AA+00001.dcm
heudiconv/tests/data/Phoenix/99+PhoenixDocument/99+PhoenixDocument+00001.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00001.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00002.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00003.dcm

heudiconv-1.3.2/heudiconv.egg-info/dependency_links.txt
heudiconv-1.3.2/heudiconv.egg-info/entry_points.txt

[console_scripts]
heudiconv = heudiconv.cli.run:main
heudiconv_monitor = heudiconv.cli.monitor:main
heudiconv-1.3.2/heudiconv.egg-info/requires.txt

dcmstack>=0.8
etelemetry
filelock>=3.0.12
nibabel
nipype>=1.2.3
pydicom>=1.0.0
[all]
pytest
tinydb
inotify
duecredit
datalad>=0.13.0
[datalad]
datalad>=0.13.0
[extras]
duecredit
[tests]
pytest
tinydb
inotify
heudiconv-1.3.2/heudiconv.egg-info/top_level.txt

heudiconv
heudiconv-1.3.2/heudiconv/__init__.py

import logging
from ._version import __version__
from .info import __packagename__
__all__ = ["__packagename__", "__version__"]
lgr = logging.getLogger(__name__)
lgr.debug("Starting the abomination") # just to "run-test" logging
heudiconv-1.3.2/heudiconv/_version.py

__version__ = "1.3.2"
heudiconv-1.3.2/heudiconv/bids.py

"""Handle BIDS specific operations"""
from __future__ import annotations
__docformat__ = "numpy"
from collections import OrderedDict
import csv
import errno
from glob import glob
import hashlib
import logging
import os
import os.path as op
from pathlib import Path
import re
from typing import Any, Optional
import warnings
import numpy as np
import pydicom as dcm
from . import __version__, dicoms
from .parser import find_files
from .utils import (
create_file_if_missing,
is_readonly,
json_dumps,
load_json,
remove_prefix,
remove_suffix,
save_json,
set_readonly,
strptime_bids,
update_json,
)
lgr = logging.getLogger(__name__)
# Fields to be populated in _scans files. Order matters
SCANS_FILE_FIELDS = OrderedDict(
[
("filename", OrderedDict([("Description", "Name of the nifti file")])),
(
"acq_time",
OrderedDict(
[
("LongName", "Acquisition time"),
("Description", "Acquisition time of the particular scan"),
]
),
),
("operator", OrderedDict([("Description", "Name of the operator")])),
(
"randstr",
OrderedDict(
[("LongName", "Random string"), ("Description", "md5 hash of UIDs")]
),
),
]
)
#: JSON Key where we will embed our version in the newly produced .json files
HEUDICONV_VERSION_JSON_KEY = "HeudiconvVersion"
class BIDSError(Exception):
pass
BIDS_VERSION = "1.8.0"
# List defining allowed parameter matching for fmap assignment:
SHIM_KEY = "ShimSetting"
AllowedFmapParameterMatching = [
"Shims",
"ImagingVolume",
"ModalityAcquisitionLabel",
"CustomAcquisitionLabel",
"PlainAcquisitionLabel",
"Force",
]
# Key info returned by get_key_info_for_fmap_assignment when
# matching_parameter = "Force"
KeyInfoForForce = "Forced"
# List defining allowed criteria to assign a given fmap to a non-fmap run
# among the different fmaps with matching parameters:
AllowedCriteriaForFmapAssignment = [
"First",
"Closest",
]
def maybe_na(val: Any) -> str:
"""Return 'n/a' if non-None value represented as str is not empty
Primarily for the consistent use of lower case 'n/a' so 'N/A' and 'NA'
are also treated as 'n/a'
"""
if val is not None:
valstr = str(val).strip()
return "n/a" if (not valstr or valstr in ("N/A", "NA")) else valstr
else:
return "n/a"
def treat_age(age: str | float | None) -> str | None:
"""Age might encounter 'Y' suffix or be a float"""
if age is None:
return None # might be converted to N/A by maybe_na
agestr = str(age)
if agestr.endswith("M"):
agestr = agestr.rstrip("M")
ageflt = float(agestr) / 12
agestr = ("%.2f" if ageflt != int(ageflt) else "%d") % ageflt
else:
agestr = agestr.rstrip("Y")
if agestr:
# strip all leading 0s but allow to scan a newborn (age 0Y)
agestr = "0" if not agestr.lstrip("0") else agestr.lstrip("0")
if agestr.startswith("."):
# we had float point value, let's prepend 0
agestr = "0" + agestr
return agestr
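# Illustrative (hypothetical) values for the two helpers above, shown as comments rather
# than executable code:
#   maybe_na(None)    -> "n/a";  maybe_na(" NA ")  -> "n/a";  maybe_na(25)      -> "25"
#   treat_age("031Y") -> "31";   treat_age("006M") -> "0.50"; treat_age("000Y") -> "0"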
def populate_bids_templates(
path: str, defaults: Optional[dict[str, Any]] = None
) -> None:
"""Premake BIDS text files with templates"""
lgr.info("Populating template files under %s", path)
descriptor = op.join(path, "dataset_description.json")
if defaults is None:
defaults = {}
if not op.lexists(descriptor):
save_json(
descriptor,
OrderedDict(
[
("Name", "TODO: name of the dataset"),
("BIDSVersion", BIDS_VERSION),
(
"License",
defaults.get(
"License",
"TODO: choose a license, e.g. PDDL "
"(http://opendatacommons.org/licenses/pddl/)",
),
),
(
"Authors",
defaults.get(
"Authors", ["TODO:", "First1 Last1", "First2 Last2", "..."]
),
),
(
"Acknowledgements",
defaults.get(
"Acknowledgements", "TODO: whom you want to acknowledge"
),
),
(
"HowToAcknowledge",
"TODO: describe how to acknowledge -- either cite a "
"corresponding paper, or just in acknowledgement "
"section",
),
("Funding", ["TODO", "GRANT #1", "GRANT #2"]),
("ReferencesAndLinks", ["TODO", "List of papers or websites"]),
("DatasetDOI", "TODO: eventually a DOI for the dataset"),
]
),
)
sourcedata_README = op.join(path, "sourcedata", "README")
if op.exists(op.dirname(sourcedata_README)):
create_file_if_missing(
sourcedata_README,
(
"TODO: Provide description about source data, e.g. \n"
"Directory below contains DICOMS compressed into tarballs per "
"each sequence, replicating directory hierarchy of the BIDS dataset"
" itself."
),
)
create_file_if_missing(
op.join(path, "CHANGES"),
"0.0.1 Initial data acquired\n"
"TODOs:\n\t- verify and possibly extend information in participants.tsv"
" (see for example http://datasets.datalad.org/?dir=/openfmri/ds000208)"
"\n\t- fill out dataset_description.json, README, sourcedata/README"
" (if present)\n\t- provide _events.tsv file for each _bold.nii.gz with"
" onsets of events (see '8.5 Task events' of BIDS specification)",
)
create_file_if_missing(
op.join(path, "README"),
"TODO: Provide description for the dataset -- basic details about the "
"study, possibly pointing to pre-registration (if public or embargoed)",
)
create_file_if_missing(
op.join(path, "scans.json"), json_dumps(SCANS_FILE_FIELDS, sort_keys=False)
)
create_file_if_missing(op.join(path, ".bidsignore"), ".duecredit.p")
if op.lexists(op.join(path, ".git")):
create_file_if_missing(op.join(path, ".gitignore"), ".duecredit.p")
populate_aggregated_jsons(path)
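# Sketch of a typical call (the path and defaults are hypothetical): run once per study to
# seed dataset_description.json, README, CHANGES, scans.json, etc. with TODO templates:
#   populate_bids_templates("/data/bids", defaults={"Authors": ["First Last"]})
# Files that already exist are left untouched (see the lexists/create_file_if_missing
# checks above).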
def populate_aggregated_jsons(path: str) -> None:
"""Aggregate across the entire BIDS dataset ``.json``\\s into top level ``.json``\\s
Top level .json files would contain only the fields which are
common to all ``subject[/session]/type/*_modality.json``\\s.
ATM aggregating only for ``*_task*_bold.json`` files. Only the task- and
OPTIONAL _acq- field is retained within the aggregated filename. The other
BIDS _key-value pairs are "aggregated over".
Parameters
----------
path: str
Path to the top of the BIDS dataset
"""
# TODO: collect all task- .json files for func files to
tasks = {}
# way too many -- let's just collect all which are the same!
# FIELDS_TO_TRACK = {'RepetitionTime', 'FlipAngle', 'EchoTime',
# 'Manufacturer', 'SliceTiming', ''}
for fpath in find_files(
r".*_task-.*\_bold\.json",
topdir=glob(op.join(path, "sub-*")),
exclude_vcs=True,
exclude=r"/\.(datalad|heudiconv)/",
):
#
# According to BIDS spec I think both _task AND _acq (may be more?
# _rec, _dir, ...?) should be retained?
# TODO: if we are to fix it, then old ones (without _acq) should be
# removed first
task = re.sub(r".*_(task-[^_\.]*(_acq-[^_\.]*)?)_.*", r"\1", fpath)
json_ = load_json(fpath, retry=100)
if task not in tasks:
tasks[task] = json_
else:
rec = tasks[task]
# let's retain only those fields which have the same value
for field in sorted(rec):
if field not in json_ or json_[field] != rec[field]:
del rec[field]
# create a stub onsets file for each one of those
suf = "_bold.json"
assert fpath.endswith(suf)
# specify the name of the '_events.tsv' file:
if "_echo-" in fpath:
# multi-echo sequence: bids (1.1.0) specifies just one '_events.tsv'
# file, common for all echoes. The name will not include _echo-.
# TODO: RF to use re.match for better readability/robustness
# So, find out the echo number:
fpath_split = fpath.split("_echo-", 1) # split fpath using '_echo-'
fpath_split_2 = fpath_split[1].split(
"_", 1
) # split the second part of fpath_split using '_'
echoNo = fpath_split_2[0] # get echo number
if echoNo == "1":
if len(fpath_split_2) != 2:
raise ValueError("Found no trailer after _echo-")
# we modify fpath to exclude '_echo-' + echoNo:
fpath = fpath_split[0] + "_" + fpath_split_2[1]
else:
# for echoNo greater than 1, don't create the events file, so go to
# the next for loop iteration:
continue
events_file = remove_suffix(fpath, suf) + "_events.tsv"
# do not touch any existing thing, it may be precious
if not op.lexists(events_file):
lgr.debug("Generating %s", events_file)
with open(events_file, "w") as fp:
fp.write(
"onset\tduration\ttrial_type\tresponse_time\tstim_file"
"\tTODO -- fill in rows and add more tab-separated "
"columns if desired"
)
# extract tasks files stubs
for task_acq, fields in tasks.items():
task_file = op.join(path, task_acq + "_bold.json")
# Since we are pulling all unique fields we have to possibly
# rewrite this file to guarantee consistency.
# See https://github.com/nipy/heudiconv/issues/277 for a usecase/bug
# when we didn't touch existing one.
# But the fields we enter (TaskName and CogAtlasID) might need need
# to be populated from the file if it already exists
placeholders = {
"TaskName": (
"TODO: full task name for %s" % task_acq.split("_")[0].split("-")[1]
),
"CogAtlasID": "http://www.cognitiveatlas.org/task/id/TODO",
}
if op.lexists(task_file):
j = load_json(task_file, retry=100)
# Retain possibly modified placeholder fields
for f in placeholders:
if f in j:
placeholders[f] = j[f]
act = "Regenerating"
else:
act = "Generating"
lgr.debug("%s %s", act, task_file)
fields.update(placeholders)
save_json(task_file, fields, sort_keys=True, pretty=True)
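# Sketch of the aggregation above with hypothetical inputs: given
#   sub-01/func/sub-01_task-rest_bold.json and sub-02/func/sub-02_task-rest_bold.json,
# the fields shared by both sidecars are written to <dataset>/task-rest_bold.json (plus
# TaskName and CogAtlasID placeholders), and a stub *_events.tsv is created next to each
# first-echo/single-echo bold run.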
def tuneup_bids_json_files(json_files: list[str]) -> None:
"""Given a list of BIDS .json files, e.g."""
if not json_files:
return
# Harmonize generic .json formatting
for jsonfile in json_files:
json_ = load_json(jsonfile)
# sanitize!
for f1 in ["Acquisition", "Study", "Series"]:
for f2 in ["DateTime", "Date"]:
json_.pop(f1 + f2, None)
# TODO: should actually be placed into series file which must
# go under annex (not under git) and marked as sensitive
# MG - Might want to replace with flag for data sensitivity
# related - https://github.com/nipy/heudiconv/issues/92
if "Date" in str(json_):
# Let's hope no word 'Date' comes within a study name or smth like
# that
raise ValueError("There must be no dates in .json sidecar")
# Those files should not have our version field already - should have been
# freshly produced
assert HEUDICONV_VERSION_JSON_KEY not in json_
json_[HEUDICONV_VERSION_JSON_KEY] = str(__version__)
save_json(jsonfile, json_)
# Load the beast
seqtype = op.basename(op.dirname(jsonfile))
# MG - want to expand this for other _epi
# possibly add IntendedFor automatically as well?
if seqtype == "fmap":
json_basename = "_".join(jsonfile.split("_")[:-1])
# if we got by now all needed .json files -- we can fix them up
# unfortunately order of "items" is not guaranteed atm
json_phasediffname = json_basename + "_phasediff.json"
json_mag = json_basename + "_magnitude*.json"
if op.exists(json_phasediffname) and len(glob(json_mag)) >= 1:
json_ = load_json(json_phasediffname)
# TODO: we might want to reorder them since ATM
# the one for shorter TE is the 2nd one!
# For now just save truthfully by loading magnitude files
lgr.debug("Placing EchoTime fields into phasediff file")
for i in 1, 2:
try:
json_["EchoTime%d" % i] = load_json(
json_basename + "_magnitude%d.json" % i
)["EchoTime"]
except IOError as exc:
lgr.error("Failed to open magnitude file: %s", exc)
# might have been made R/O already, but if not -- it will be set
# only later in the pipeline, so we must not make it read-only yet
was_readonly = is_readonly(json_phasediffname)
if was_readonly:
set_readonly(json_phasediffname, False)
save_json(json_phasediffname, json_)
if was_readonly:
set_readonly(json_phasediffname)
def add_participant_record(
studydir: str, subject: str, age: str | None, sex: str | None
) -> None:
participants_tsv = op.join(studydir, "participants.tsv")
participant_id = "sub-%s" % subject
if not create_file_if_missing(
participants_tsv, "\t".join(["participant_id", "age", "sex", "group"]) + "\n"
):
# check if may be subject record already exists
with open(participants_tsv) as f:
f.readline()
known_subjects = {ln.split("\t")[0] for ln in f.readlines()}
if participant_id in known_subjects:
return
else:
# Populate participants.json (an optional file to describe column names in
# participant.tsv). This auto generation will make BIDS-validator happy.
participants_json = op.join(studydir, "participants.json")
if not op.lexists(participants_json):
save_json(
participants_json,
OrderedDict(
[
(
"participant_id",
OrderedDict([("Description", "Participant identifier")]),
),
(
"age",
OrderedDict(
[
(
"Description",
"Age in years (TODO - verify) as in the initial"
" session, might not be correct for other sessions",
)
]
),
),
(
"sex",
OrderedDict(
[
(
"Description",
"self-rated by participant, M for male/F for "
"female (TODO: verify)",
)
]
),
),
(
"group",
OrderedDict(
[
(
"Description",
"(TODO: adjust - by default everyone is in "
"control group)",
)
]
),
),
]
),
sort_keys=False,
)
# Add a new participant
with open(participants_tsv, "a") as f:
f.write(
"\t".join(
map(
str,
[
participant_id,
maybe_na(treat_age(age)),
maybe_na(sex),
"control",
],
)
)
+ "\n"
)
def find_subj_ses(f_name: str) -> tuple[Optional[str], Optional[str]]:
"""Given a path to the bids formatted filename parse out subject/session"""
# we will allow the match at either directories or within filename
# assuming that bids layout is "correct"
regex = re.compile("sub-(?P[a-zA-Z0-9]*)([/_]ses-(?P[a-zA-Z0-9]*))?")
regex_res = regex.search(f_name)
res = regex_res.groupdict() if regex_res else {}
return res.get("subj", None), res.get("ses", None)
def save_scans_key(
item: tuple[str, tuple[str, ...], list[str]], bids_files: list[str]
) -> None:
"""
Parameters
----------
item:
bids_files: list of str
Returns
-------
"""
rows = {}
assert bids_files, "we do expect some files since it was called"
# we will need to deduce subject and session from the bids_filename
# and if there is a conflict, we would just blow since this function
# should be invoked only on a result of a single item conversion as far
# as I see it, so should have the same subject/session
subj: Optional[str] = None
ses: Optional[str] = None
for bids_file in bids_files:
# get filenames
f_name = "/".join(bids_file.split("/")[-2:])
f_name = f_name.replace("json", "nii.gz")
rows[f_name] = get_formatted_scans_key_row(item[-1][0])
subj_, ses_ = find_subj_ses(f_name)
if not subj_:
lgr.warning(
"Failed to detect fulfilled BIDS layout. "
"No scans.tsv file(s) will be produced for %s",
", ".join(bids_files),
)
return
if subj and subj_ != subj:
raise ValueError(
"We found before subject %s but now deduced %s from %s"
% (subj, subj_, f_name)
)
subj = subj_
if ses and ses_ != ses:
raise ValueError(
"We found before session %s but now deduced %s from %s"
% (ses, ses_, f_name)
)
ses = ses_
# where should we store it?
output_dir = op.dirname(op.dirname(bids_file))
# save
ses = "_ses-%s" % ses if ses else ""
add_rows_to_scans_keys_file(
op.join(output_dir, "sub-{0}{1}_scans.tsv".format(subj, ses)), rows
)
def add_rows_to_scans_keys_file(fn: str, newrows: dict[str, list[str]]) -> None:
"""Add new rows to the _scans file.
Parameters
----------
fn: str
filename
newrows: dict
extra rows to add (acquisition time, referring physician, random string)
"""
if op.lexists(fn):
with open(fn, "r") as csvfile:
reader = csv.reader(csvfile, delimiter="\t")
existing_rows = [row for row in reader]
# skip header
fnames2info = {row[0]: row[1:] for row in existing_rows[1:]}
newrows_key = newrows.keys()
newrows_toadd = list(set(newrows_key) - set(fnames2info.keys()))
for key_toadd in newrows_toadd:
fnames2info[key_toadd] = newrows[key_toadd]
# remove
os.unlink(fn)
else:
fnames2info = newrows
header = list(SCANS_FILE_FIELDS.keys())
# prepare all the data rows
data_rows = [[k] + v for k, v in fnames2info.items()]
# sort by the date/filename
try:
data_rows_sorted = sorted(data_rows, key=lambda x: (x[1], x[0]))
except TypeError as exc:
lgr.warning("Sorting scans by date failed: %s", str(exc))
data_rows_sorted = sorted(data_rows)
# save
with open(fn, "a") as csvfile:
writer = csv.writer(csvfile, delimiter="\t")
writer.writerows([header] + data_rows_sorted)
def get_formatted_scans_key_row(dcm_fn: str | Path) -> list[str]:
"""
Parameters
----------
dcm_fn: str
Returns
-------
row: list
[ISO acquisition time, performing physician name, random string]
"""
dcm_data = dcm.dcmread(dcm_fn, stop_before_pixels=True, force=True)
# we need to store filenames and acquisition datetimes
acq_datetime = dicoms.get_datetime_from_dcm(dcm_data=dcm_data)
# add random string
# But let's make it reproducible by using all UIDs
# (might change across versions?)
randcontent = "".join(
[getattr(dcm_data, f) or "" for f in sorted(dir(dcm_data)) if f.endswith("UID")]
)
randstr = hashlib.md5(randcontent.encode()).hexdigest()[:8]
try:
perfphys = dcm_data.PerformingPhysicianName
except AttributeError:
perfphys = ""
row = [acq_datetime.isoformat() if acq_datetime else "", perfphys, randstr]
# empty entries should be 'n/a'
# https://github.com/dartmouth-pbs/heudiconv/issues/32
row = ["n/a" if not str(e) else e for e in row]
return row
def convert_sid_bids(subject_id: str) -> str:
"""Shim for stripping any non-BIDS compliant characters within subject_id
Parameters
----------
subject_id : string
Returns
-------
sid : string
New subject ID
"""
warnings.warn(
"convert_sid_bids() is deprecated, please use sanitize_label() instead.",
DeprecationWarning,
stacklevel=2,
)
return sanitize_label(subject_id)
def get_shim_setting(json_file: str) -> Any:
"""
Gets the "ShimSetting" field from a json_file.
If no "ShimSetting" present, return error
Parameters
----------
json_file : str
Returns
-------
str with "ShimSetting" value
"""
data = load_json(json_file)
try:
shims = data[SHIM_KEY]
except KeyError:
lgr.error(
'File %s does not have "%s". '
'Please use a different "matching_parameters" in your heuristic file',
json_file,
SHIM_KEY,
)
raise
return shims
def find_fmap_groups(fmap_dir: str) -> dict[str, list[str]]:
"""
Finds the different fmap groups in a fmap directory.
By groups here we mean fmaps that are intended to go together
(with reversed PE polarity, magnitude/phase, etc.)
Parameters
----------
fmap_dir : str
path to the session folder (or to the subject folder, if there are no
sessions).
Returns
-------
fmap_groups : dict
key: prefix common to the group (e.g. no "dir" entity, "_phase"/"_magnitude", ...)
value: list of all fmap paths in the group
"""
if op.basename(fmap_dir) != "fmap":
lgr.error("%s is not a fieldmap folder", fmap_dir)
# Get a list of all fmap json files in the session:
fmap_jsons = sorted(glob(op.join(fmap_dir, "*.json")))
# RegEx to remove fmap-specific substrings from fmap file names
# "_phase[1,2]", "_magnitude[1,2]", "_phasediff", "_dir-", ...
fmap_regex = re.compile(
"(_dir-[0-9,a-z,A-Z]*)*" # for pepolar case
"(_phase[12])*" # for phase images
"(_phasediff)*" # for phasediff images
"(_magnitude[12])*" # for magnitude images
"(_fieldmap)*" # for actual fieldmap images
)
# Find the unique prefixes ('splitext' removes the extension):
prefixes = sorted(
set(
fmap_regex.sub("", remove_suffix(op.basename(fm), ".json"))
for fm in fmap_jsons
)
)
return {
k: [
fm
for fm in fmap_jsons
if fmap_regex.sub("", remove_suffix(op.basename(fm), ".json")) == k
]
for k in prefixes
}
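# Example grouping (hypothetical file names): an fmap/ folder containing
#   sub-01_acq-bold_dir-AP_epi.json, sub-01_acq-bold_dir-PA_epi.json,
#   sub-01_acq-gre_magnitude1.json, sub-01_acq-gre_magnitude2.json,
#   sub-01_acq-gre_phasediff.json
# yields two groups, keyed "sub-01_acq-bold_epi" and "sub-01_acq-gre".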
def get_key_info_for_fmap_assignment(
json_file: str, matching_parameter: str
) -> list[Any]:
"""
Gets key information needed to assign fmaps to other modalities.
(Note: It is the responsibility of the calling function to make sure
the arguments are OK)
Parameters
----------
json_file : str
path to the json file
matching_parameter : str in AllowedFmapParameterMatching
matching_parameter that will be used to match runs
Returns
-------
key_info : list
part of the json file that will need to match between the fmap and
the other image
"""
if not op.exists(json_file):
raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), json_file)
# loop through the possible criteria and extract the info needed
if matching_parameter == "Shims":
key_info = [get_shim_setting(json_file)]
elif matching_parameter == "ImagingVolume":
from nibabel import load as nb_load
from nibabel.nifti1 import Nifti1Header
nifti_files = glob(remove_suffix(json_file, ".json") + ".nii*")
assert len(nifti_files) == 1
nifti_file = nifti_files[0]
nifti_header = nb_load(nifti_file).header
assert isinstance(nifti_header, Nifti1Header)
key_info = [nifti_header.get_best_affine(), nifti_header.get_data_shape()[:3]]
elif matching_parameter == "ModalityAcquisitionLabel":
# Check the acq label for the fmap and the modality for others:
modality = op.basename(op.dirname(json_file))
if modality == "fmap":
# extract the entity:
acq_label = BIDSFile.parse(op.basename(json_file))["acq"]
assert acq_label is not None
if any(s in acq_label.lower() for s in ["fmri", "bold", "func"]):
key_info = ["func"]
elif any(s in acq_label.lower() for s in ["diff", "dwi"]):
key_info = ["dwi"]
elif any(s in acq_label.lower() for s in ["anat", "struct"]):
key_info = ["anat"]
else:
key_info = [modality]
elif matching_parameter == "CustomAcquisitionLabel":
modality = op.basename(op.dirname(json_file))
if modality == "func":
# extract the entity:
custom_label = BIDSFile.parse(op.basename(json_file))["task"]
else:
# extract the entity:
custom_label = BIDSFile.parse(op.basename(json_file))["acq"]
# Get the custom acquisition label, acq_label is None if no custom field found
key_info = [custom_label]
elif matching_parameter == "PlainAcquisitionLabel":
# always base the decision on label
plain_label = BIDSFile.parse(op.basename(json_file))["acq"]
key_info = [plain_label]
elif matching_parameter == "Force":
# We want to force the matching, so just return some string
# regardless of the image
key_info = [KeyInfoForForce]
else:
# fallback:
key_info = []
return key_info
def find_compatible_fmaps_for_run(
json_file: str, fmap_groups: dict[str, list[str]], matching_parameters: list[str]
) -> dict[str, list[str]]:
"""
Finds compatible fmaps for a given run, for populate_intended_for.
(Note: It is the responsibility of the calling function to make sure
the arguments are OK)
Parameters
----------
json_file : str
path to the json file
fmap_groups : dict
key: prefix common to the group
value: list of all fmap paths in the group
matching_parameters : list of str from AllowedFmapParameterMatching
matching_parameters that will be used to match runs
Returns
-------
compatible_fmap_groups : dict
Subset of the fmap_groups which match json_file, according
to the matching_parameters.
key: prefix common to the group
value: list of all fmap paths in the group
"""
lgr.debug("Looking for fmaps for %s", json_file)
json_info = {}
for param in matching_parameters:
json_info[param] = get_key_info_for_fmap_assignment(json_file, param)
compatible_fmap_groups = {}
for fm_key, fm_group in fmap_groups.items():
# check the key_info (for all parameters) for one (the first) of
# the fmaps in the group:
compatible = False
for param in matching_parameters:
json_info_1st_item = json_info[param][0]
fm_info = get_key_info_for_fmap_assignment(fm_group[0], param)
# for the case in which key_info is a list of strings:
if isinstance(json_info_1st_item, str):
compatible = json_info[param] == fm_info
# for the case when no key info was found (e.g. "acq" field does not exist)
elif json_info_1st_item is None:
compatible = False
else:
# allow for tiny differences between the affines etc
compatible = all(
np.allclose(x, y) for x, y in zip(json_info[param], fm_info)
)
if not compatible:
continue # don't bother checking more params
if compatible:
compatible_fmap_groups[fm_key] = fm_group
return compatible_fmap_groups
def find_compatible_fmaps_for_session(
path_to_bids_session: str, matching_parameters: list[str]
) -> Optional[dict[str, dict[str, list[str]]]]:
"""
Finds compatible fmaps for all non-fmap runs in a session.
(Note: It is the responsibility of the calling function to make sure
the arguments are OK)
Parameters
----------
path_to_bids_session : str
path to the session folder (or to the subject folder, if there are no
sessions).
matching_parameters : list of str from AllowedFmapParameterMatching
matching_parameters that will be used to match runs
Returns
-------
compatible_fmap : dict
Dict of compatible_fmaps_groups (values) for each non-fmap run (keys)
"""
lgr.debug("Looking for fmaps for session: %s", path_to_bids_session)
# Resolve path (eliminate '..')
path_to_bids_session = op.abspath(path_to_bids_session)
# find the different groups of fmaps:
fmap_dir = op.join(path_to_bids_session, "fmap")
if not op.exists(fmap_dir):
lgr.warning(
"We cannot add the IntendedFor field: no fmap/ in %s", path_to_bids_session
)
return None
fmap_groups = find_fmap_groups(fmap_dir)
# Get a set with all non-fmap json files in the session (exclude SBRef files).
session_jsons = [
j
for j in glob(op.join(path_to_bids_session, "*/*.json"))
if not (
op.basename(op.dirname(j)) == "fmap"
or remove_suffix(j, ".json").endswith("_sbref")
)
]
# Loop through session_jsons and find the compatible fmap_groups for each
compatible_fmaps = {
j: find_compatible_fmaps_for_run(j, fmap_groups, matching_parameters)
for j in session_jsons
}
return compatible_fmaps
def select_fmap_from_compatible_groups(
json_file: str, compatible_fmap_groups: dict[str, list[str]], criterion: str
) -> Optional[str]:
"""
Selects the fmap that will be used to correct for distortions in json_file
from the compatible fmap_groups list, based on the given criterion
(Note: It is the responsibility of the calling function to make sure
the arguments are OK)
Parameters
----------
json_file : str
path to the json file
compatible_fmap_groups : dict
fmap_groups that are compatible with the specific json_file
criterion : str in ['First', 'Closest']
matching_parameters that will be used to decide which fmap to use
Returns
-------
selected_fmap_key : str
key from the compatible_fmap_groups for the selected fmap group
"""
if len(compatible_fmap_groups) == 0:
return None
# if compatible_fmap_groups has only one entry, that's it:
elif len(compatible_fmap_groups) == 1:
return list(compatible_fmap_groups.keys())[0]
# get the modality folders, then session folder:
modality_folders = set(
op.dirname(fmap) for v in compatible_fmap_groups.values() for fmap in v
) # there should be only one value, ending in 'fmap'
sess_folders = set(op.dirname(k) for k in modality_folders)
if len(sess_folders) > 1:
# for now, we only deal with single sessions:
raise RuntimeError
# if we made it here, we have only one session:
sess_folder = list(sess_folders)[0]
# get acquisition times from '_scans.tsv':
try:
scans_tsv = glob(op.join(sess_folder, "*_scans.tsv"))[0]
except IndexError:
raise FileNotFoundError("No '*_scans' file found for session %s" % sess_folder)
with open(scans_tsv) as f:
# read the contents, splitting by lines and by tab separators:
scans_tsv_content = [line.split("\t") for line in f.read().splitlines()]
# get column indices for filename and acq_time from the first line:
(fname_idx, time_idx) = (
scans_tsv_content[0].index(k) for k in ["filename", "acq_time"]
)
acq_times = {line[fname_idx]: line[time_idx] for line in scans_tsv_content[1:]}
# acq_times for the compatible fmaps:
acq_times_fmaps = {
k: acq_times[
# remove session folder and '.json', add '.nii.gz':
remove_suffix(remove_prefix(v[0], sess_folder + op.sep), ".json")
+ ".nii.gz"
]
for k, v in compatible_fmap_groups.items()
}
if criterion == "First":
# find the first acquired fmap_group from the compatible_fmap_groups:
first_acq_time = sorted(acq_times_fmaps.values())[0]
selected_fmap_key = [
k for k, v in acq_times_fmaps.items() if v == first_acq_time
][0]
elif criterion == "Closest":
json_acq_time = strptime_bids(
acq_times[
# remove session folder and '.json', add '.nii.gz':
remove_suffix(remove_prefix(json_file, sess_folder + op.sep), ".json")
+ ".nii.gz"
]
)
# differences in acquisition time (abs value):
diff_fmaps_acq_times = {
k: abs(strptime_bids(v) - json_acq_time) for k, v in acq_times_fmaps.items()
}
min_diff_acq_times = sorted(diff_fmaps_acq_times.values())[0]
selected_fmap_key = [
k for k, v in diff_fmaps_acq_times.items() if v == min_diff_acq_times
][0]
else:
raise ValueError(f"Invalid 'criterion' value: {criterion!r}")
return selected_fmap_key
def populate_intended_for(
path_to_bids_session: str, matching_parameters: str | list[str], criterion: str
) -> None:
"""
Adds the 'IntendedFor' field to the fmap .json files in a session folder.
It goes through the session folders and for every json file, it finds
compatible_fmaps: fmaps that have the same matching_parameters as the json
file (e.g., same 'Shims').
If there are more than one compatible_fmaps, it will use the criterion
specified by the user (default: 'Closest' in time).
Because fmaps come in groups (with reversed PE polarity, or magnitude/
phase), we work with fmap_groups.
Parameters
----------
path_to_bids_session : str
path to the session folder (or to the subject folder, if there are no
sessions).
matching_parameters : list of str from AllowedFmapParameterMatching
matching_parameters that will be used to match runs
criterion : str in ['First', 'Closest']
matching_parameters that will be used to decide which of the matching
fmaps to use
"""
if not isinstance(matching_parameters, list):
assert isinstance(matching_parameters, str), (
"matching_parameters must be a str or a list, got %s" % matching_parameters
)
matching_parameters = [matching_parameters]
for param in matching_parameters:
if param not in AllowedFmapParameterMatching:
raise ValueError("Fmap matching_parameter %s not allowed." % param)
if criterion not in AllowedCriteriaForFmapAssignment:
raise ValueError("Fmap assignment criterion '%s' not allowed." % criterion)
lgr.info('Adding "IntendedFor" to the fieldmaps in %s.', path_to_bids_session)
# Resolve path (eliminate '..')
path_to_bids_session = op.abspath(path_to_bids_session)
# Get the subject folder (if "path_to_bids_session" includes the session,
# remove it). "IntendedFor" paths will be relative to it.
if op.basename(path_to_bids_session).startswith("ses-"):
subj_folder = op.dirname(path_to_bids_session)
else:
subj_folder = path_to_bids_session
fmap_dir = op.join(path_to_bids_session, "fmap")
if not op.exists(fmap_dir):
lgr.warning(
"We cannot add the IntendedFor field: no fmap/ in %s", path_to_bids_session
)
return
compatible_fmaps = find_compatible_fmaps_for_session(
path_to_bids_session, matching_parameters=matching_parameters
)
assert compatible_fmaps is not None
selected_fmaps = {}
for json_file, fmap_groups in compatible_fmaps.items():
if not op.dirname(json_file).endswith("fmap"):
selected_fmaps[json_file] = select_fmap_from_compatible_groups(
json_file, fmap_groups, criterion=criterion
)
# Loop through all the unique fmap_groups in compatible_fmaps:
unique_fmap_groups = {}
for cf in compatible_fmaps.values():
for key, values in cf.items():
if key not in unique_fmap_groups:
unique_fmap_groups[key] = values
for fmap_group in unique_fmap_groups:
intended_for = []
for json_file, selected_fmap_group in selected_fmaps.items():
if selected_fmap_group and (fmap_group in selected_fmap_group):
intended_for.append(
op.relpath(
remove_suffix(json_file, ".json") + ".nii.gz", start=subj_folder
)
)
if intended_for:
intended_for = sorted(str(f) for f in intended_for)
# Add this intended_for to all fmap files in the fmap_group:
for fm_json in unique_fmap_groups[fmap_group]:
update_json(fm_json, {"IntendedFor": intended_for}, pretty=True)
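# Sketch of a typical call (hypothetical path; see AllowedFmapParameterMatching and
# AllowedCriteriaForFmapAssignment above for accepted values):
#   populate_intended_for("/data/bids/sub-01/ses-pre",
#                         matching_parameters=["Shims"], criterion="Closest")
# This writes an "IntendedFor" list (paths relative to the subject folder) into every
# fmap .json of each selected fmap group.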
class BIDSFile:
"""as defined in https://bids-specification.readthedocs.io/en/stable/99-appendices/04-entity-table.html
which might soon become machine readable
order matters
"""
_known_entities = [
"sub",
"ses",
"task",
"acq",
"ce",
"rec",
"dir",
"run",
"mod",
"echo",
"flip",
"inv",
"mt",
"part",
"recording",
]
def __init__(
self, entities: dict[str, str], suffix: str, extension: Optional[str]
) -> None:
self._entities = entities
self._suffix = suffix
self._extension = extension
def __eq__(self, other: Any) -> bool:
if not isinstance(other, self.__class__):
return False
if (
all([other[k] == v for k, v in self._entities.items()])
and self.extension == other.extension
and self.suffix == other.suffix
):
return True
else:
return False
@classmethod
def parse(cls, filename: str) -> BIDSFile:
"""Parse the filename for BIDS entities, suffix and extension"""
# use re.findall to find all lower-case-letters + '-' + alphanumeric + '_' pairs:
entities_list = re.findall("([a-z]+)-([a-zA-Z0-9]+)[_]*", filename)
# keep only those in the _known_entities list:
entities = {k: v for k, v in entities_list if k in BIDSFile._known_entities}
# get whatever comes after the last key-value pair, and remove any '_' that
# might come in front:
ending = filename.split("-".join(entities_list[-1]))[-1]
ending = remove_prefix(ending, "_")
# the first dot ('.') separates the suffix from the extension:
if "." in ending:
suffix, extension = ending.split(".", 1)
else:
suffix, extension = ending, None
return BIDSFile(entities, suffix, extension)
def __str__(self) -> str:
"""reconstitute in a legit BIDS filename using the order from entity table"""
if "sub" not in self._entities:
raise ValueError("The 'sub-' entity is mandatory")
# reconstitute the ending for the filename:
suffix = "_" + self.suffix if self.suffix else ""
extension = "." + self.extension if self.extension else ""
return (
"_".join(
[
"-".join([e, self._entities[e]])
for e in self._known_entities
if e in self._entities
]
)
+ suffix
+ extension
)
def __getitem__(self, entity: str) -> Optional[str]:
return self._entities[entity] if entity in self._entities else None
def __setitem__(
self, entity: str, value: str
) -> None: # would puke with some exception if already known
return self.set(entity, value, overwrite=False)
def set(self, entity: str, value: str, overwrite: bool = True) -> None:
if entity not in self._entities:
# just set it; no complains here
self._entities[entity] = value
elif overwrite:
lgr.warning(
"Overwriting the entity %s from %s to %s for file %s",
str(entity),
str(self[entity]),
str(value),
self.__str__(),
)
self._entities[entity] = value
else:
# if it already exists, and overwrite is false:
lgr.warning(
"Setting the entity %s to %s for file %s failed",
str(entity),
str(value),
self.__str__(),
)
@property # as needed make them RW
def suffix(self) -> str:
return self._suffix
@property
def extension(self) -> Optional[str]:
return self._extension
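# Example round-trip through BIDSFile (hypothetical filename):
#   bf = BIDSFile.parse("sub-01_ses-pre_task-rest_run-1_bold.nii.gz")
#   bf["task"] -> "rest";  bf.suffix -> "bold";  bf.extension -> "nii.gz"
#   str(bf)    -> "sub-01_ses-pre_task-rest_run-1_bold.nii.gz"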
def sanitize_label(label: str) -> str:
"""Strips any non-BIDS compliant characters within label
Parameters
----------
label : string
Returns
-------
clean_label : string
New, sanitized label
"""
clean_label = "".join(x for x in label if x.isalnum())
if not clean_label:
raise ValueError(
"Label became empty after cleanup. Please manually provide "
"a suitable alphanumeric label."
)
if clean_label != label:
lgr.warning(
"%r label contained non-alphanumeric character(s), it "
"was cleaned to be %r",
label,
clean_label,
)
return clean_label
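# Example: sanitize_label("01-ab_c") -> "01abc" (logged with a warning); an all-symbol
# label such as "___" becomes empty and raises ValueError.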
heudiconv-1.3.2/heudiconv/cli/__init__.py

heudiconv-1.3.2/heudiconv/cli/monitor.py

#!/usr/bin/env python
from __future__ import annotations
import argparse
from collections import OrderedDict
import json
import logging
import os
import os.path as op
from pathlib import Path
import re
import shlex
import subprocess
import time
from typing import Any, Optional
import inotify.adapters
from inotify.constants import IN_CREATE, IN_ISDIR, IN_MODIFY
from tinydb import TinyDB
_DEFAULT_LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
_LOGGER = logging.getLogger(__name__)
MASK = IN_MODIFY | IN_CREATE
MASK_NEWDIR = IN_CREATE | IN_ISDIR
WAIT_TIME = 86400 # in seconds
# def _configure_logging():
_LOGGER.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
formatter = logging.Formatter(_DEFAULT_LOG_FORMAT)
ch.setFormatter(formatter)
_LOGGER.addHandler(ch)
def run_heudiconv(args: list[str]) -> tuple[str, dict[str, Any]]:
info_dict: dict[str, Any] = dict()
proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
cmd = " ".join(map(shlex.quote, args))
return_code = proc.wait()
if return_code == 0:
_LOGGER.info("Done running %s", cmd)
info_dict["success"] = 1
else:
_LOGGER.error("%s failed", cmd)
info_dict["success"] = 0
# get info on what we run
stdout = proc.communicate()[0].decode("utf-8")
match = re.match("INFO: PROCESSING STARTS: (.*)", stdout)
info_dict_ = json.loads(match.group(1) if match else "{}")
info_dict.update(info_dict_)
return stdout, info_dict
def process(
paths2process: dict[str, float],
db: Optional[TinyDB],
wait: int | float = WAIT_TIME,
logdir: str = "log",
) -> None:
# if paths2process and
# time.time() - os.path.getmtime(paths2process[0]) > WAIT_TIME:
processed: list[str] = []
for path, mod_time in paths2process.items():
if time.time() - mod_time > wait:
# process_me = paths2process.popleft().decode('utf-8')
process_me = path
process_dict: dict[str, Any] = {
"input_path": process_me,
"accession_number": op.basename(process_me),
}
print("Time to process {0}".format(process_me))
stdout, run_dict = run_heudiconv(["ls", "-l", process_me])
process_dict.update(run_dict)
if db is not None:
db.insert(process_dict)
# save log
log = Path(logdir, process_dict["accession_number"] + ".log")
log.write_text(stdout)
# if we processed it, or it failed,
# we need to remove it to avoid running it again
processed.append(path)
for processed_path in processed:
del paths2process[processed_path]
def monitor(
topdir: str = "/tmp/new_dir",
check_ptrn: str = "/20../../..",
db: Optional[TinyDB] = None,
wait: int | float = WAIT_TIME,
logdir: str = "log",
) -> None:
# make logdir if not existent
try:
os.makedirs(logdir)
except OSError:
pass
# paths2process = deque()
paths2process: dict[str, float] = OrderedDict()
# watch only today's folder
path_re = re.compile("(%s%s)/?$" % (topdir, check_ptrn))
i = inotify.adapters.InotifyTree(topdir.encode()) # , mask=MASK)
for event in i.event_gen():
if event is not None:
(header, type_names, watch_path, filename) = event
_LOGGER.info(
"WD=(%d) MASK=(%d) COOKIE=(%d) LEN=(%d) MASK->NAMES=%s"
" WATCH-PATH=[%s] FILENAME=[%s]",
header.wd,
header.mask,
header.cookie,
header.len,
type_names,
watch_path.decode("utf-8"),
filename.decode("utf-8"),
)
if header.mask == MASK_NEWDIR and path_re.match(watch_path.decode("utf-8")):
# we got our directory, now let's do something on it
newpath2process = op.join(watch_path, filename).decode("utf-8")
# paths2process.append(newpath2process)
# update time
paths2process[newpath2process] = time.time()
print(newpath2process, time.time())
# check if we need to update the time
for path in paths2process.keys():
if path in watch_path.decode("utf-8"):
paths2process[path] = time.time()
print("Updating {0}: {1}".format(path, paths2process[path]))
# check if there's anything to process
process(paths2process, db, wait=wait, logdir=logdir)
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
prog="monitor.py",
description=(
"Small monitoring script to detect new directories and " "process them"
),
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument("path", help="Which directory to monitor")
parser.add_argument(
"--check_ptrn",
"-p",
help="regexp pattern for which subdirectories to check",
default="/20../../..",
)
parser.add_argument(
"--database", "-d", help="database location", default="database.json"
)
parser.add_argument(
"--wait_time",
"-w",
help="After how long should we start processing datasets? (in seconds)",
default=86400,
type=float,
)
parser.add_argument(
"--logdir", "-l", help="Where should we save the logs?", default="log"
)
return parser.parse_args()
def main() -> None:
parsed = parse_args()
print("Got {0}".format(parsed))
# open database
db = TinyDB(parsed.database, default_table="heudiconv")
monitor(
parsed.path, parsed.check_ptrn, db, wait=parsed.wait_time, logdir=parsed.logdir
)
if __name__ == "__main__":
main()
heudiconv-1.3.2/heudiconv/cli/run.py

#!/usr/bin/env python
from __future__ import annotations
from argparse import ArgumentParser
import logging
import os
import sys
from typing import Optional
from .. import __version__
from ..main import workflow
lgr = logging.getLogger(__name__)
def main(argv: Optional[list[str]] = None) -> None:
logging.basicConfig(
format="%(levelname)s: %(message)s",
level=getattr(logging, os.environ.get("HEUDICONV_LOG_LEVEL", "INFO")),
)
parser = get_parser()
args = parser.parse_args(argv)
# exit if nothing to be done
if not args.files and not args.dicom_dir_template and not args.command:
lgr.warning("Nothing to be done - displaying usage help")
parser.print_help()
sys.exit(1)
kwargs = vars(args)
workflow(**kwargs)
def get_parser() -> ArgumentParser:
docstr = """Example:
heudiconv -d 'rawdata/{subject}' -o . -f heuristic.py -s s1 s2 s3"""
parser = ArgumentParser(description=docstr)
parser.add_argument("--version", action="version", version=__version__)
group = parser.add_mutually_exclusive_group()
group.add_argument(
"-d",
"--dicom_dir_template",
dest="dicom_dir_template",
help="Location of dicomdir that can be indexed with subject id "
"{subject} and session {session}. Tarballs (can be compressed) "
"are supported in addition to directory. All matching tarballs "
"for a subject are extracted and their content processed in a "
"single pass. If multiple tarballs are found, each is assumed to "
"be a separate session and the --ses argument is ignored. Note "
"that you might need to surround the value with quotes to avoid "
"{...} being considered by shell",
)
group.add_argument(
"--files",
nargs="*",
help="Files (tarballs, dicoms) or directories containing files to "
"process. Cannot be provided if using --dicom_dir_template.",
)
parser.add_argument(
"-s",
"--subjects",
dest="subjs",
type=str,
nargs="*",
help="List of subjects - required for dicom template. If not "
'provided, DICOMS would first be "sorted" and subject IDs '
"deduced by the heuristic.",
)
parser.add_argument(
"-c",
"--converter",
choices=("dcm2niix", "none"),
default="dcm2niix",
help='Tool to use for DICOM conversion. Setting to "none" disables '
"the actual conversion step -- useful for testing heuristics.",
)
parser.add_argument(
"-o",
"--outdir",
default=os.getcwd(),
help="Output directory for conversion setup (for further "
"customization and future reference. This directory will refer "
"to non-anonymized subject IDs.",
)
parser.add_argument(
"-l",
"--locator",
default=None,
help="Study path under outdir. If provided, it overloads the value "
"provided by the heuristic. If --datalad is enabled, every "
"directory within locator becomes a super-dataset thus "
'establishing a hierarchy. Setting to "unknown" will skip that '
"dataset.",
)
parser.add_argument(
"-a",
"--conv-outdir",
default=None,
help="Output directory for converted files. By default this is "
"identical to --outdir. This option is most useful in "
"combination with --anon-cmd.",
)
parser.add_argument(
"--anon-cmd",
default=None,
help="Command to run to convert subject IDs used for DICOMs to "
"anonymized IDs. Such command must take a single argument and "
"return a single anonymized ID. Also see --conv-outdir.",
)
parser.add_argument(
"-f",
"--heuristic",
dest="heuristic",
help="Name of a known heuristic or path to the Python script "
"containing heuristic.",
)
parser.add_argument(
"-p",
"--with-prov",
action="store_true",
help="Store additional provenance information. Requires python-rdflib.",
)
parser.add_argument(
"-ss",
"--ses",
dest="session",
default=None,
help="Session for longitudinal study_sessions. Default is None.",
)
parser.add_argument(
"-b",
"--bids",
nargs="*",
metavar=("BIDSOPTION1", "BIDSOPTION2"),
choices=["notop"],
dest="bids_options",
help="Flag for output into BIDS structure. Can also take BIDS-"
"specific options, e.g., --bids notop. The only currently "
'supported options is "notop", which skips creation of '
"top-level BIDS files. This is useful when running in batch "
"mode to prevent possible race conditions.",
)
parser.add_argument(
"--overwrite",
action="store_true",
default=False,
help="Overwrite existing converted files.",
)
parser.add_argument(
"--datalad",
action="store_true",
help="Store the entire collection as DataLad dataset(s). Small files "
"will be committed directly to git, while large to annex. New "
'version (6) of annex repositories will be used in a "thin" '
"mode so it would look to mortals as just any other regular "
"directory (i.e. no symlinks to under .git/annex). For now just "
"for BIDS mode.",
)
parser.add_argument(
"--dbg",
action="store_true",
dest="debug",
help="Do not catch exceptions and show exception traceback.",
)
parser.add_argument(
"--command",
choices=(
"heuristics",
"heuristic-info",
"ls",
"populate-templates",
"sanitize-jsons",
"treat-jsons",
"populate-intended-for",
),
help="Custom action to be performed on provided files instead of "
"regular operation.",
)
parser.add_argument(
"-g",
"--grouping",
default="studyUID",
choices=("studyUID", "accession_number", "all", "custom"),
help="How to group dicoms (default: by studyUID).",
)
parser.add_argument(
"--minmeta",
action="store_true",
help="Exclude dcmstack meta information in sidecar jsons.",
)
parser.add_argument(
"--random-seed", type=int, default=None, help="Random seed to initialize RNG."
)
parser.add_argument(
"--dcmconfig",
default=None,
help="JSON file for additional dcm2niix configuration.",
)
submission = parser.add_argument_group("Conversion submission options")
submission.add_argument(
"-q",
"--queue",
choices=("SLURM", None),
default=None,
help="Batch system to submit jobs in parallel.",
)
submission.add_argument(
"--queue-args",
dest="queue_args",
default=None,
help="Additional queue arguments passed as a single string of "
"space-separated Argument=Value pairs.",
)
return parser
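# A minimal sketch (made-up paths; assumes the bundled "convertall" heuristic is
# available) of driving the parser programmatically rather than via the console
# script. The resulting namespace is what main() forwards to workflow(**vars(args)).
def _example_get_parser_usage() -> tuple:
    args = get_parser().parse_args(
        ["--files", "/data/dicoms", "-f", "convertall", "-o", "/tmp/bids", "-b"]
    )
    return args.heuristic, args.outdir, args.bids_options  # ('convertall', '/tmp/bids', [])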
if __name__ == "__main__":
main()
heudiconv-1.3.2/heudiconv/convert.py 0000664 0000000 0000000 00000112603 14715167373 0017540 0 ustar 00root root 0000000 0000000 from __future__ import annotations
__docformat__ = "numpy"
from collections.abc import Callable
import logging
import os
import os.path as op
import random
import re
import shutil
import sys
from types import ModuleType
from typing import TYPE_CHECKING, Any, List, Optional, cast
import filelock
from nipype import Node
from nipype.interfaces.base import TraitListObject
from .bids import (
BIDS_VERSION,
BIDSError,
add_participant_record,
populate_bids_templates,
populate_intended_for,
sanitize_label,
save_scans_key,
tuneup_bids_json_files,
)
from .dicoms import (
compress_dicoms,
embed_metadata_from_dicoms,
group_dicoms_into_seqinfos,
)
from .due import Doi, due
from .utils import (
SeqInfo,
TempDirs,
assure_no_file_exists,
clear_temp_dicoms,
file_md5sum,
load_json,
read_config,
safe_copyfile,
safe_movefile,
save_json,
set_readonly,
treat_infofile,
write_config,
)
if TYPE_CHECKING:
if sys.version_info >= (3, 8):
from typing import TypedDict
else:
from typing_extensions import TypedDict
class PopulateIntendedForOpts(TypedDict, total=False):
matching_parameters: str | list[str]
criterion: str
LOCKFILE = "heudiconv.lock"
DW_IMAGE_IN_FMAP_FOLDER_WARNING = (
"Diffusion-weighted image saved in non dwi folder ({folder})"
)
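# A minimal sketch (illustrative values, not defaults) of the dict a heuristic
# may expose as POPULATE_INTENDED_FOR_OPTS; it mirrors the PopulateIntendedForOpts
# shape declared above and is consumed by convert() further below.
_EXAMPLE_POPULATE_INTENDED_FOR_OPTS = {
    "matching_parameters": ["ImagingVolume"],
    "criterion": "Closest",
}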
lgr = logging.getLogger(__name__)
def conversion_info(
subject: str,
outdir: str,
info: dict[tuple[str, tuple[str, ...], None], list],
filegroup: dict[str, list[str]],
ses: Optional[str],
) -> list[tuple[str, tuple[str, ...], list[str]]]:
convert_info: list[tuple[str, tuple[str, ...], list[str]]] = []
for key, items in info.items():
if not items:
continue
template, outtype = key[0], key[1]
# So no annotation_classes of any kind! so if not used -- what was the
# intention???? XXX
outpath = outdir
for idx, itemgroup in enumerate(items):
if not isinstance(itemgroup, list):
itemgroup = [itemgroup]
for subindex, item in enumerate(itemgroup):
parameters = {}
if isinstance(item, dict):
parameters = {k: v for k, v in item.items()}
item = parameters["item"]
del parameters["item"]
# some helper meta-variables
parameters.update(
dict(
item=idx + 1,
subject=subject,
seqitem=item,
subindex=subindex + 1,
session="ses-" + str(ses),
bids_subject_session_prefix="sub-%s" % subject
+ (("_ses-%s" % ses) if ses else ""),
bids_subject_session_dir="sub-%s" % subject
+ (("/ses-%s" % ses) if ses else ""),
# referring_physician_name
# study_description
)
)
try:
files = filegroup[item]
except KeyError:
files = filegroup[str(item)]
outprefix = template.format(**parameters)
convert_info.append((op.join(outpath, outprefix), outtype, files))
return convert_info
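# A minimal sketch (hypothetical template, series id and DICOM paths) of how
# conversion_info() expands a heuristic's info mapping into
# (output prefix, output types, DICOM files) triplets.
def _example_conversion_info() -> list:
    info = {("sub-{subject}/anat/sub-{subject}_T1w", ("nii.gz",), None): ["2-t1_mprage"]}
    filegroup = {"2-t1_mprage": ["/data/dicoms/t1/0001.dcm", "/data/dicoms/t1/0002.dcm"]}
    # -> [('/tmp/out/sub-01/anat/sub-01_T1w', ('nii.gz',), [...the two dicoms...])]
    return conversion_info("01", "/tmp/out", info, filegroup, ses=None)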
def prep_conversion(
sid: Optional[str],
dicoms: Optional[list[str]],
outdir: str,
heuristic: ModuleType,
converter: str,
anon_sid: Optional[str],
anon_outdir: Optional[str],
with_prov: bool,
ses: Optional[str],
bids_options: Optional[str],
seqinfo: Optional[dict[SeqInfo, list[str]]],
min_meta: bool,
overwrite: bool,
dcmconfig: Optional[str],
grouping: str,
) -> None:
if dicoms:
lgr.info("Processing %d dicoms", len(dicoms))
elif seqinfo:
lgr.info("Processing %d pre-sorted seqinfo entries", len(seqinfo))
else:
raise ValueError("neither dicoms nor seqinfo dict was provided")
if bids_options is not None:
if not sid:
raise ValueError(
"BIDS requires alphanumeric subject ID. Got an empty value"
)
sid = sanitize_label(sid)
if ses:
ses = sanitize_label(ses)
if not anon_sid:
if sid is None:
raise ValueError("Neither 'sid' nor 'anon_sid' is true")
anon_sid = sid
if not anon_outdir:
anon_outdir = outdir
# Generate heudiconv info folder
idir = op.join(outdir, ".heudiconv", anon_sid)
if bids_options is not None and ses:
idir = op.join(idir, "ses-%s" % str(ses))
if anon_outdir == outdir:
idir = op.join(idir, "info")
if not op.exists(idir):
os.makedirs(idir)
ses_suffix = "_ses-%s" % ses if ses is not None else ""
info_file = op.join(idir, "%s%s.auto.txt" % (sid, ses_suffix))
edit_file = op.join(idir, "%s%s.edit.txt" % (sid, ses_suffix))
filegroup_file = op.join(idir, "filegroup%s.json" % ses_suffix)
# if conversion table(s) do not exist -- we need to prepare them
# (the *prepare* stage in https://github.com/nipy/heudiconv/issues/134)
# if overwrite - recalculate this anyways
reuse_conversion_table = op.exists(edit_file)
# We also might need to redo it if changes in the heuristic file
# are detected
# ref: https://github.com/nipy/heudiconv/issues/84#issuecomment-330048609
# for more automagical wishes
target_heuristic_filename = op.join(idir, "heuristic.py")
# facilitates change - TODO: remove in 1.0
old_heuristic_filename = op.join(idir, op.basename(heuristic.filename))
if op.exists(old_heuristic_filename):
assure_no_file_exists(target_heuristic_filename)
safe_copyfile(old_heuristic_filename, target_heuristic_filename)
assure_no_file_exists(old_heuristic_filename)
# TODO:
# 1. add a test
# 2. possibly extract into a dedicated function for easier logic flow here
# and a dedicated unittest
if op.exists(target_heuristic_filename) and file_md5sum(
target_heuristic_filename
) != file_md5sum(heuristic.filename):
# remake conversion table
reuse_conversion_table = False
lgr.info(
"Will not reuse existing conversion table files because heuristic "
"has changed"
)
info: dict[tuple[str, tuple[str, ...], None], list]
if reuse_conversion_table:
lgr.info("Reloading existing filegroup.json " "because %s exists", edit_file)
info = read_config(edit_file)
filegroup = load_json(filegroup_file)
# XXX Yarik finally understood why basedir was dragged along!
# So we could reuse the same PATHs definitions possibly consistent
# across re-runs... BUT that wouldn't work anyways if e.g.
# DICOMs dumped with SOP UUIDs thus differing across runs etc
# So either it would need to be brought back or reconsidered altogether
# (since no sample data to test on etc)
else:
assure_no_file_exists(target_heuristic_filename)
safe_copyfile(heuristic.filename, target_heuristic_filename)
if dicoms:
seqinfo = group_dicoms_into_seqinfos(
dicoms,
grouping,
file_filter=getattr(heuristic, "filter_files", None),
dcmfilter=getattr(heuristic, "filter_dicom", None),
flatten=True,
custom_grouping=getattr(heuristic, "grouping", None),
# callable which will be provided dcminfo and returned
# structure extend seqinfo
custom_seqinfo=getattr(heuristic, "custom_seqinfo", None),
)
elif seqinfo is None:
raise ValueError("Neither 'dicoms' nor 'seqinfo' is given")
seqinfo_list = list(seqinfo.keys())
filegroup = {si.series_id: x for si, x in seqinfo.items()}
dicominfo_file = op.join(idir, "dicominfo%s.tsv" % ses_suffix)
# allow to overwrite even if was present under git-annex already
assure_no_file_exists(dicominfo_file)
with open(dicominfo_file, "wt") as fp:
fp.write("\t".join(SeqInfo._fields) + "\n")
for seq in seqinfo_list:
fp.write("\t".join([str(val) for val in seq]) + "\n")
lgr.debug("Calling out to %s.infodict", heuristic)
info = heuristic.infotodict(seqinfo_list)
lgr.debug("Writing to {}, {}, {}".format(info_file, edit_file, filegroup_file))
assure_no_file_exists(info_file)
write_config(info_file, info)
assure_no_file_exists(edit_file)
write_config(edit_file, info)
save_json(filegroup_file, filegroup)
if bids_options is not None:
# the other portion of the path would mimic BIDS layout
# so we don't need to worry here about sub, ses at all
tdir = anon_outdir
else:
tdir = op.join(anon_outdir, anon_sid)
if converter.lower() != "none":
lgr.info("Doing conversion using %s", converter)
cinfo = conversion_info(anon_sid, tdir, info, filegroup, ses)
convert(
cinfo,
converter=converter,
scaninfo_suffix=getattr(heuristic, "scaninfo_suffix", ".json"),
custom_callable=getattr(heuristic, "custom_callable", None),
populate_intended_for_opts=getattr(
heuristic, "POPULATE_INTENDED_FOR_OPTS", None
),
with_prov=with_prov,
bids_options=bids_options,
outdir=tdir,
min_meta=min_meta,
overwrite=overwrite,
dcmconfig=dcmconfig,
)
for item_dicoms in filegroup.values():
clear_temp_dicoms(item_dicoms)
if bids_options is not None and "notop" not in bids_options:
lockfile = op.join(anon_outdir, LOCKFILE)
if op.exists(lockfile):
lgr.warning(
"Existing lockfile found in {0} - waiting for the "
"lock to be released. To set a timeout limit, set "
"the HEUDICONV_FILELOCK_TIMEOUT environmental variable "
"to a value in seconds. If this process hangs, it may "
"require a manual deletion of the {0}.".format(lockfile)
)
timeout = float(os.getenv("HEUDICONV_LOCKFILE_TIMEOUT", -1))
with filelock.SoftFileLock(lockfile, timeout=timeout):
if seqinfo:
keys = list(seqinfo)
add_participant_record(
anon_outdir, anon_sid, keys[0].patient_age, keys[0].patient_sex
)
populate_bids_templates(
anon_outdir, getattr(heuristic, "DEFAULT_FIELDS", {})
)
def update_complex_name(metadata: dict[str, Any], filename: str) -> str:
"""
Insert `_part-` entity into filename if data are from a
sequence with magnitude/phase part.
Parameters
----------
metadata : dict
Scan metadata dictionary from BIDS sidecar file.
filename : str
Incoming filename
Returns
-------
filename : str
Updated filename with part entity added in appropriate position.
"""
# Some scans separate magnitude/phase differently
# A small note: _phase is deprecated, but this may add part-mag to
# magnitude data while leaving phase data with a separate suffix,
# depending on how one sets up their heuristic.
unsupported_types = [
"_phase",
"_magnitude",
"_magnitude1",
"_magnitude2",
"_phasediff",
"_phase1",
"_phase2",
]
if any(ut in filename for ut in unsupported_types):
return filename
# Check to see if it is magnitude or phase part:
img_type = cast(List[str], metadata.get("ImageType", []))
if "M" in img_type:
mag_or_phase = "mag"
elif "P" in img_type:
mag_or_phase = "phase"
else:
raise RuntimeError("Data type could not be inferred from the metadata.")
# Determine scan suffix
filetype = "_" + filename.split("_")[-1]
# Insert part label
if not ("_part-%s" % mag_or_phase) in filename:
# If "_part-" is specified, prepend the 'mag_or_phase' value.
if "_part-" in filename:
raise BIDSError(
"Part label for images will be automatically set, "
"remove from heuristic"
)
# Insert it **before** the following string(s), whichever appears first.
# https://bids-specification.readthedocs.io/en/stable/99-appendices/09-entities.html
entities_after_part = [
"_proc",
"_hemi",
"_space",
"_split",
"_recording",
"_chunk",
"_res",
"_den",
"_label",
"_desc",
filetype,
]
for label in entities_after_part:
if (label == filetype) or (label in filename):
filename = filename.replace(label, "_part-%s%s" % (mag_or_phase, label))
break
return filename
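# A minimal sketch (hypothetical sidecar metadata and filename) of the '_part-'
# entity insertion performed above for a magnitude image.
def _example_update_complex_name() -> str:
    meta = {"ImageType": ["ORIGINAL", "PRIMARY", "M", "ND"]}
    # "M" in ImageType -> part-mag; returns 'sub-01_task-rest_part-mag_bold'
    return update_complex_name(meta, "sub-01_task-rest_bold")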
def update_multiecho_name(
metadata: dict[str, Any], filename: str, echo_times: list[float]
) -> str:
"""
Insert `_echo-` entity into filename if data are from a multi-echo
sequence.
Parameters
----------
metadata : dict
Scan metadata dictionary from BIDS sidecar file.
filename : str
Incoming filename
echo_times : list
List of all echo times from scan. Used to determine the echo *number*
(i.e., index) if field is missing from metadata.
Returns
-------
filename : str
Updated filename with echo entity added, if appropriate.
"""
# Field maps separate echoes differently, so do not attempt to update any filenames with these
# suffixes
unsupported_types = [
"_magnitude",
"_magnitude1",
"_magnitude2",
"_phasediff",
"_phase1",
"_phase2",
"_fieldmap",
]
if any(ut in filename for ut in unsupported_types):
return filename
if not isinstance(echo_times, list):
raise TypeError(
f'Argument "echo_times" must be a list, not a {type(echo_times)}'
)
# Get the EchoNumber from json file info. If not present, use EchoTime.
if "EchoNumber" in metadata.keys():
echo_number = metadata["EchoNumber"]
assert isinstance(echo_number, int)
elif "EchoTime" in metadata.keys():
echo_number = echo_times.index(metadata["EchoTime"]) + 1
else:
raise KeyError(
'Either "EchoNumber" or "EchoTime" must be in metadata keys. '
f"Keys detected: {metadata.keys()}"
)
# Determine scan suffix
filetype = "_" + filename.split("_")[-1]
# Insert it **before** the following string(s), whichever appears first.
# https://bids-specification.readthedocs.io/en/stable/99-appendices/09-entities.html
entities_after_echo = [
"_flip",
"_inv",
"_mt",
"_part",
"_proc",
"_hemi",
"_space",
"_split",
"_recording",
"_chunk",
"_res",
"_den",
"_label",
"_desc",
filetype,
]
for label in entities_after_echo:
if (label == filetype) or (label in filename):
filename = filename.replace(label, "_echo-%s%s" % (echo_number, label))
break
return filename
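# A minimal sketch (hypothetical metadata, filename and echo times) of the
# '_echo-' entity insertion; with no EchoNumber present, the echo index is
# derived from the position of EchoTime in the echo_times list.
def _example_update_multiecho_name() -> str:
    meta = {"EchoTime": 0.030}
    # returns 'sub-01_task-rest_echo-2_bold'
    return update_multiecho_name(meta, "sub-01_task-rest_bold", [0.015, 0.030, 0.045])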
def update_uncombined_name(
metadata: dict[str, Any], filename: str, channel_names: list[str]
) -> str:
"""
Insert `_ch-` entity into filename if data are from a sequence
with "save uncombined".
Parameters
----------
metadata : dict
Scan metadata dictionary from BIDS sidecar file.
filename : str
Incoming filename
channel_names : list
List of all channel names from scan. Used to determine the channel
*number* (i.e., index) if field is missing from metadata.
Returns
-------
filename : str
Updated filename with ch entity added, if appropriate.
"""
# In case any scan types separate channels differently
unsupported_types: list[str] = []
if any(ut in filename for ut in unsupported_types):
return filename
if not isinstance(channel_names, list):
raise TypeError(
f'Argument "channel_names" must be a list, not a {type(channel_names)}'
)
# Determine the channel number
coil_string = metadata["CoilString"]
assert isinstance(coil_string, str)
channel_number = "".join(c for c in coil_string if c.isdigit())
if not channel_number:
channel_number = str(channel_names.index(coil_string) + 1)
channel_number = channel_number.zfill(2)
# Determine scan suffix
filetype = "_" + filename.split("_")[-1]
# Insert it **before** the following string(s), whichever appears first.
# Choosing to put channel near the end since it's not in the specification yet.
# See https://bids-specification.readthedocs.io/en/stable/99-appendices/09-entities.html
entities_after_ch = [
"_proc",
"_hemi",
"_space",
"_split",
"_recording",
"_chunk",
"_res",
"_den",
"_label",
"_desc",
filetype,
]
for label in entities_after_ch:
if (label == filetype) or (label in filename):
filename = filename.replace(label, "_ch-%s%s" % (channel_number, label))
break
return filename
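# A minimal sketch (hypothetical CoilString and channel list) of the '_ch-'
# entity insertion for uncombined (per-coil) data.
def _example_update_uncombined_name() -> str:
    meta = {"CoilString": "H32"}
    # digits of CoilString become the zero-padded channel number:
    # returns 'sub-01_task-rest_ch-32_bold'
    return update_uncombined_name(meta, "sub-01_task-rest_bold", ["H31", "H32"])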
def convert(
items: list[tuple[str, tuple[str, ...], list[str]]],
converter: str,
scaninfo_suffix: str,
custom_callable: Optional[Callable[[str, tuple[str, ...], list[str]], Any]],
with_prov: bool,
bids_options: Optional[str],
outdir: str,
min_meta: bool,
overwrite: bool,
symlink: bool = True,
prov_file: Optional[str] = None,
dcmconfig: Optional[str] = None,
populate_intended_for_opts: Optional[PopulateIntendedForOpts] = None,
) -> None:
"""Perform actual conversion (calls to converter etc) given info from
heuristic's `infotodict`
"""
prov_files: list[str] = []
tempdirs = TempDirs()
if bids_options is not None:
due.cite(
# doi matches the BIDS_VERSION
Doi("10.5281/zenodo.4085321"),
description="Brain Imaging Data Structure (BIDS) Specification",
path="bids",
version=BIDS_VERSION,
tags=["implementation"],
)
due.cite(
Doi("10.1038/sdata.2016.44"),
description="Brain Imaging Data Structure (BIDS), Original paper",
path="bids",
tags=["documentation"],
)
for item in items:
prefix, outtypes, item_dicoms = item
if isinstance(outtypes, str): # type: ignore[unreachable]
lgr.warning( # type: ignore[unreachable]
"Provided output types %r of type 'str' instead "
"of a tuple for prefix %r. Likely need to fix-up your heuristic. "
"Meanwhile we are 'manually' converting to 'tuple'",
outtypes,
prefix,
)
outtypes = (outtypes,)
prefix_dirname = op.dirname(prefix)
outname_bids = prefix + ".json"
bids_outfiles = []
# set empty outname and scaninfo in case we only want dicoms
outname = ""
scaninfo = ""
lgr.info(
"Converting %s (%d DICOMs) -> %s . Converter: %s . Output types: %s",
prefix,
len(item_dicoms),
prefix_dirname,
converter,
outtypes,
)
# We want to create this dir only if we are converting it to nifti,
# or if we're using BIDS
dicom_only = outtypes == ("dicom",)
if not (dicom_only and (bids_options is not None)) and not op.exists(
prefix_dirname
):
os.makedirs(prefix_dirname)
for outtype in outtypes:
lgr.debug(
"Processing %d dicoms for output type %s. Overwrite=%s",
len(item_dicoms),
outtype,
overwrite,
)
lgr.debug("Includes the following dicoms: %s", item_dicoms)
if outtype == "dicom":
convert_dicom(
item_dicoms,
bids_options,
prefix,
outdir,
tempdirs,
symlink,
overwrite,
)
elif outtype in ["nii", "nii.gz"]:
assert converter == "dcm2niix", f"Invalid converter {converter}"
due.cite(
Doi("10.1016/j.jneumeth.2016.03.001"),
path="dcm2niix",
description="DICOM to NIfTI + .json sidecar conversion utility",
tags=["implementation"],
)
outname, scaninfo = (prefix + "." + outtype, prefix + scaninfo_suffix)
if not op.exists(outname) or overwrite:
tmpdir = tempdirs("dcm2niix")
# run conversion through nipype
res, prov_file = nipype_convert(
item_dicoms, prefix, with_prov, bids_options, tmpdir, dcmconfig
)
bids_outfiles = save_converted_files(
res,
item_dicoms,
bids_options,
outtype,
prefix,
outname_bids,
overwrite=overwrite,
)
# save acquisition time information if it's BIDS
# at this point we still have acquisition date
if bids_options is not None:
save_scans_key(item, bids_outfiles)
# Fix up and unify BIDS files
tuneup_bids_json_files(bids_outfiles)
if prov_file:
prov_files.append(prov_file)
tempdirs.rmtree(tmpdir)
else:
raise RuntimeError(
"was asked to convert into %s but destination already exists"
% (outname)
)
# add the taskname field to the json file(s):
add_taskname_to_infofile(bids_outfiles)
if len(bids_outfiles) > 1:
lgr.warning(
"For now not embedding BIDS and info generated "
".nii.gz itself since sequence produced "
"multiple files"
)
elif not bids_outfiles:
lgr.debug("No BIDS files were produced, nothing to embed to then")
elif outname and not min_meta:
embed_metadata_from_dicoms(
bids_options,
item_dicoms,
outname,
outname_bids,
prov_file,
scaninfo,
tempdirs,
with_prov,
)
if scaninfo and op.exists(scaninfo):
lgr.info("Post-treating %s file", scaninfo)
treat_infofile(scaninfo)
# this may not always be the case: ex. fieldmap1, fieldmap2
# will address after refactor
if outname and op.exists(outname):
set_readonly(outname)
if custom_callable is not None:
custom_callable(*item)
# Populate "IntendedFor" for fmap files if requested in heuristic
if populate_intended_for_opts is not None:
# Because fmap files can only be used to correct for distortions in images
# collected within the same scanning session, find unique subject/session
# combinations from the outname in each item:
outnames = [item[0] for item in items]
# - grab "sub-[/ses-]", and keep only unique ones:
sessions: set[str] = set()
for oname in outnames:
m = re.search(
"sub-(?P[a-zA-Z0-9]*)([{0}_]ses-(?P[a-zA-Z0-9]*))?".format(
op.sep
),
oname,
)
if m:
sessions.add(m.group(0))
else:
# "sub-[/ses-]" is not present, so this is not BIDS
# compliant and it doesn't make sense to add "IntendedFor":
sessions.clear()
break
for ses in sessions:
session_path = op.join(outdir, ses)
populate_intended_for(session_path, **populate_intended_for_opts)
def convert_dicom(
item_dicoms: list[str],
bids_options: Optional[str],
prefix: str,
outdir: str,
tempdirs: TempDirs,
_symlink: bool,
overwrite: bool,
) -> None:
"""Save DICOMs as output (default is by symbolic link)
Parameters
----------
item_dicoms : list of filenames
DICOMs to save
bids_options : str or None
If not None then save to BIDS format. String may be empty
or contain bids specific options
prefix : string
Conversion outname
outdir : string
Output directory
tempdirs : TempDirs instance
Object to handle temporary directories created
TODO: remove
_symlink : bool
Create softlink to DICOMs - if False, create hardlink instead.
Currently unused: the non-BIDS branch below copies the files (see TODO).
overwrite : bool
If True, allows overwriting of previous conversion
"""
if bids_options is not None:
# mimic the same hierarchy location as the prefix
# although it could all have been done probably
# within heuristic really
sourcedir = op.join(outdir, "sourcedata")
sourcedir_ = op.join(sourcedir, op.dirname(op.relpath(prefix, outdir)))
if not op.exists(sourcedir_):
os.makedirs(sourcedir_)
compress_dicoms(
item_dicoms, op.join(sourcedir_, op.basename(prefix)), tempdirs, overwrite
)
else:
dicomdir = prefix + "_dicom"
if op.exists(dicomdir):
lgr.info(
"Found existing DICOM directory {}, " "removing...".format(dicomdir)
)
shutil.rmtree(dicomdir)
os.mkdir(dicomdir)
for filename in item_dicoms:
outfile = op.join(dicomdir, op.basename(filename))
if not op.islink(outfile):
# TODO: add option to enable hardlink?
# if symlink:
# os.symlink(filename, outfile)
# else:
# os.link(filename, outfile)
shutil.copyfile(filename, outfile)
def nipype_convert(
item_dicoms: list[str],
prefix: str,
with_prov: bool,
bids_options: Optional[str],
tmpdir: str,
dcmconfig: Optional[str] = None,
) -> tuple[Node, Optional[str]]:
"""
Converts DICOMs grouped from heuristic using Nipype's Dcm2niix interface.
Parameters
----------
item_dicoms : list
DICOM files to convert
prefix : str
Heuristic output path
with_prov : bool
Store provenance information
bids_options : str or None
If not None then output BIDS sidecar JSONs
String may contain bids specific options
tmpdir : str
Conversion working directory
dcmconfig : str, optional
JSON file used for additional Dcm2niix configuration
"""
import nipype
if with_prov:
from nipype import config
config.enable_provenance()
from nipype.interfaces.dcm2nii import Dcm2niix
#
item_dicoms = list(map(op.abspath, item_dicoms)) # type: ignore[arg-type]
fromfile = dcmconfig if dcmconfig else None
if fromfile:
lgr.info("Using custom config file %s", fromfile)
convertnode = Node(Dcm2niix(from_file=fromfile), name="convert")
convertnode.base_dir = tmpdir
convertnode.inputs.source_names = item_dicoms
convertnode.inputs.out_filename = op.basename(
prefix
) + "_heudiconv%03d" % random.randint(0, 999)
prefix_dir = op.dirname(prefix)
# if provided prefix had a path in it -- pass is as output_dir instead of default curdir
if prefix_dir:
convertnode.inputs.output_dir = prefix_dir
if nipype.__version__.split(".")[0] == "0":
# deprecated since 1.0, might be needed(?) before
convertnode.inputs.terminal_output = "allatonce"
else:
convertnode.terminal_output = "allatonce"
convertnode.inputs.bids_format = bids_options is not None
eg = convertnode.run()
# prov information
prov_file = prefix + "_prov.ttl" if with_prov else None
if prov_file:
safe_movefile(
op.join(convertnode.base_dir, convertnode.name, "provenance.ttl"), prov_file
)
return eg, prov_file
def save_converted_files(
res: Node,
item_dicoms: list[str],
bids_options: Optional[str],
outtype: str,
prefix: str,
outname_bids: str,
overwrite: bool,
) -> list[str]:
"""Copy converted files from tempdir to output directory.
Will rename files if necessary.
Parameters
----------
res : Node
Nipype conversion Node with results
item_dicoms: list
Filenames of converted DICOMs
bids_options : str or None
If not None, save to BIDS; the string may contain BIDS-specific options
prefix : str
Conversion output prefix (path and basename without extension)
Returns
-------
bids_outfiles
Converted BIDS files
"""
from nipype.interfaces.base import isdefined
prefix_dirname, prefix_basename = op.split(prefix)
bids_outfiles: list[str] = []
res_files = res.outputs.converted_files
if not len(res_files):
lgr.debug("DICOMs {} were not converted".format(item_dicoms))
return []
if isdefined(res.outputs.bvecs) and isdefined(res.outputs.bvals):
bvals, bvecs = res.outputs.bvals, res.outputs.bvecs
bvals = list(bvals) if isinstance(bvals, TraitListObject) else bvals
bvecs = list(bvecs) if isinstance(bvecs, TraitListObject) else bvecs
if prefix_dirname.endswith("dwi"):
outname_bvecs, outname_bvals = prefix + ".bvec", prefix + ".bval"
safe_movefile(bvecs, outname_bvecs, overwrite)
safe_movefile(bvals, outname_bvals, overwrite)
else:
if bvals_are_zero(bvals):
to_remove = bvals + bvecs if isinstance(bvals, list) else [bvals, bvecs]
for ftr in to_remove:
os.remove(ftr)
lgr.debug("%s and %s were removed since not dwi", bvecs, bvals)
else:
lgr.warning(
DW_IMAGE_IN_FMAP_FOLDER_WARNING.format(folder=prefix_dirname)
)
lgr.warning(
".bvec and .bval files will be generated. This is NOT BIDS compliant"
)
outname_bvecs, outname_bvals = prefix + ".bvec", prefix + ".bval"
safe_movefile(bvecs, outname_bvecs, overwrite)
safe_movefile(bvals, outname_bvals, overwrite)
if isinstance(res_files, list):
res_files = sorted(res_files)
# we should provide specific handling for fmap,
# dwi etc which might spit out multiple files
suffixes = (
[str(i + 1) for i in range(len(res_files))]
if (bids_options is not None)
else None
)
if not suffixes:
lgr.warning(
"Following series files likely have "
"multiple (%d) volumes (orientations?) "
"generated: %s ...",
len(res_files),
item_dicoms[0],
)
suffixes = [str(-i - 1) for i in range(len(res_files))]
# Also copy BIDS files although they might need to
# be merged/postprocessed later
bids_files = (
sorted(res.outputs.bids)
if len(res.outputs.bids) == len(res_files)
else [None] * len(res_files)
)
# preload since will be used in multiple spots
bids_metas = [load_json(b) for b in bids_files if b]
### Do we have a multi-echo series? ###
# Some Siemens sequences (e.g. CMRR's MB-EPI) set the label 'TE1',
# 'TE2', etc. in the 'ImageType' field. However, other seqs do not
# (e.g. MGH ME-MPRAGE). They do set a 'EchoNumber', but not for the
# first echo. To compound the problem, the echoes are NOT in order,
# so the first NIfTI file does not correspond to echo-1, etc. So, we
# need to know, beforehand, whether we are dealing with a multi-echo
# series. To do that, the most straightforward way is to read the
# echo times for all bids_files and see if they are all the same or not.
# Collect some metadata across all images
echo_times: set[float] = set()
channel_names: set[str] = set()
image_types: set[str] = set()
for metadata in bids_metas:
if not metadata:
continue
try:
echo_times.add(metadata["EchoTime"])
except KeyError:
pass
try:
channel_names.add(metadata["CoilString"])
except KeyError:
pass
try:
image_types.update(metadata["ImageType"])
except KeyError:
pass
is_multiecho = (
len(set(filter(bool, echo_times))) > 1
) # Check for varying echo times
is_uncombined = (
len(set(filter(bool, channel_names))) > 1
) # Check for uncombined data
is_complex = (
"M" in image_types and "P" in image_types
) # Determine if data are complex (magnitude + phase)
echo_times_lst = sorted(echo_times) # also converts to list
channel_names_lst = sorted(channel_names) # also converts to list
### Loop through the bids_files, set the output name and save files
for fl, suffix, bids_file, bids_meta in zip(
res_files, suffixes, bids_files, bids_metas
):
# TODO: monitor conversion duration
# set the prefix basename for this specific file (we'll modify it,
# and we don't want to modify it for all the bids_files):
this_prefix_basename = prefix_basename
# Update name for certain criteria
if bids_file:
if is_multiecho:
this_prefix_basename = update_multiecho_name(
bids_meta, this_prefix_basename, echo_times_lst
)
if is_complex:
this_prefix_basename = update_complex_name(
bids_meta, this_prefix_basename
)
if is_uncombined:
this_prefix_basename = update_uncombined_name(
bids_meta, this_prefix_basename, channel_names_lst
)
# Fallback option:
# If we have failed to modify this_prefix_basename, because it didn't fall
# into any of the options above, just add the suffix at the end:
if this_prefix_basename == prefix_basename:
this_prefix_basename += suffix
# Finally, form the outname by stitching the directory and outtype:
outname = op.join(prefix_dirname, this_prefix_basename)
outfile = outname + "." + outtype
# Write the files needed:
safe_movefile(fl, outfile, overwrite)
if bids_file:
outname_bids_file = "%s.json" % (outname)
safe_movefile(bids_file, outname_bids_file, overwrite)
bids_outfiles.append(outname_bids_file)
# res_files is not a list
else:
outname = "{}.{}".format(prefix, outtype)
safe_movefile(res_files, outname, overwrite)
if isdefined(res.outputs.bids):
try:
safe_movefile(res.outputs.bids, outname_bids, overwrite)
bids_outfiles.append(outname_bids)
except TypeError: ##catch lists
raise TypeError("Multiple BIDS sidecars detected.")
return bids_outfiles
def add_taskname_to_infofile(infofiles: str | list[str]) -> None:
"""Add the "TaskName" field to json files with _task- entity in the name.
Note: _task- entity could be present not only in functional data
but in many other modalities now.
Parameters
----------
infofiles: list or str
json filenames or a single filename.
"""
# in case they pass a string with a path:
if isinstance(infofiles, str):
infofiles = [infofiles]
for infofile in infofiles:
meta_info = load_json(infofile)
m = re.search(r"(?<=_task-)\w+", op.basename(infofile))
if m:
meta_info["TaskName"] = m.group(0).split("_")[0]
else:
# leave it to bids-validator to validate/inform about presence
# of required entities/fields.
continue
# write to outfile
save_json(infofile, meta_info)
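# A minimal sketch using a throwaway sidecar in a temporary directory to show
# how the task label is extracted from a '_task-' filename by the function above.
def _example_add_taskname_to_infofile() -> dict:
    import tempfile

    tmpdir = tempfile.mkdtemp()
    try:
        sidecar = op.join(tmpdir, "sub-01_task-rest_bold.json")
        save_json(sidecar, {})
        add_taskname_to_infofile(sidecar)
        return load_json(sidecar)  # {'TaskName': 'rest'}
    finally:
        shutil.rmtree(tmpdir)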
def bvals_are_zero(bval_file: str | list) -> bool:
"""Checks if all entries in a bvals file are zero (or 5, for Siemens files).
Parameters
----------
bval_file : str
file with the bvals
Returns
-------
True if all values are 0, or all are 5; False otherwise.
"""
# GE hyperband multi-echo containing diffusion info
if isinstance(bval_file, list):
return all(map(bvals_are_zero, bval_file))
with open(bval_file) as f:
bvals = f.read().split()
bvals_unique = set(float(b) for b in bvals)
return bvals_unique == {0.0} or bvals_unique == {5.0}
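# A minimal sketch writing a throwaway .bval file to exercise bvals_are_zero();
# a file of all zeros (or all fives) is treated as non-diffusion data.
def _example_bvals_are_zero() -> bool:
    import tempfile

    with tempfile.NamedTemporaryFile("w", suffix=".bval", delete=False) as f:
        f.write("0 0 0 0")
        path = f.name
    try:
        return bvals_are_zero(path)  # True: every bval in the file is zero
    finally:
        os.remove(path)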
heudiconv-1.3.2/heudiconv/dicoms.py 0000664 0000000 0000000 00000064621 14715167373 0017344 0 ustar 00root root 0000000 0000000 # dicom operations
from __future__ import annotations
from collections.abc import Callable
import datetime
import logging
import os
import os.path as op
from pathlib import Path
import sys
import tarfile
from typing import (
TYPE_CHECKING,
Any,
Dict,
Hashable,
List,
NamedTuple,
Optional,
Protocol,
Union,
overload,
)
from unittest.mock import patch
import warnings
import pydicom as dcm
from .utils import (
SeqInfo,
TempDirs,
get_typed_attr,
load_json,
set_readonly,
strptime_dcm_da_tm,
strptime_dcm_dt,
)
if TYPE_CHECKING:
if sys.version_info >= (3, 8):
from typing import Literal
else:
from typing_extensions import Literal
with warnings.catch_warnings():
warnings.simplefilter("ignore")
# suppress warning
import nibabel.nicom.dicomwrappers as dw
# TODO: remove the kludge whenever
# https://github.com/moloney/dcmstack/pull/90 is merged and released
if not hasattr(dcm, "read_file"):
dcm.read_file = dcm.dcmread
lgr = logging.getLogger(__name__)
total_files = 0
# Might be monkey patched by user heuristic to tune desired compression level.
# Preferably do not move/rename.
compresslevel = 9
class CustomSeqinfoT(Protocol):
def __call__(self, wrapper: dw.Wrapper, series_files: list[str]) -> Hashable:
...
def create_seqinfo(
mw: dw.Wrapper,
series_files: list[str],
series_id: str,
custom_seqinfo: CustomSeqinfoT | None = None,
) -> SeqInfo:
"""Generate sequence info
Parameters
----------
mw: Wrapper
series_files: list
series_id: str
"""
dcminfo = mw.dcm_data
accession_number = dcminfo.get("AccessionNumber")
# TODO: do not group echoes by default
size: list[int] = list(mw.image_shape) + [len(series_files)]
if len(size) < 4:
size.append(1)
# parse DICOM for seqinfo fields
TR = get_typed_attr(dcminfo, "RepetitionTime", float, -1000) / 1000
TE = get_typed_attr(dcminfo, "EchoTime", float, -1)
refphys = get_typed_attr(dcminfo, "ReferringPhysicianName", str, "")
image_type = get_typed_attr(dcminfo, "ImageType", tuple, ())
is_moco = "MOCO" in image_type
series_desc = get_typed_attr(dcminfo, "SeriesDescription", str, "")
protocol_name = get_typed_attr(dcminfo, "ProtocolName", str, "")
for k, m in (
([0x18, 0x24], "GE and Philips"),
([0x19, 0x109C], "Siemens"),
([0x18, 0x9005], "Siemens XA"),
):
if v := dcminfo.get(k):
sequence_name = v.value
lgr.debug(
"Identified sequence name as %s coming from the %r family of MR scanners",
sequence_name,
m,
)
break
else:
sequence_name = ""
# initialized in `group_dicoms_to_seqinfos`
global total_files
total_files += len(series_files)
custom_seqinfo_data = (
custom_seqinfo(wrapper=mw, series_files=series_files)
if custom_seqinfo
else None
)
try:
hash(custom_seqinfo_data)
except TypeError:
raise RuntimeError(
"Data returned by the heuristics custom_seqinfo is not hashable. "
"See https://heudiconv.readthedocs.io/en/latest/heuristics.html#custom_seqinfo for more "
"details."
)
return SeqInfo(
total_files_till_now=total_files,
example_dcm_file=op.basename(series_files[0]),
series_id=series_id,
dcm_dir_name=op.basename(op.dirname(series_files[0])),
series_files=len(series_files),
unspecified="",
dim1=size[0],
dim2=size[1],
dim3=size[2],
dim4=size[3],
TR=TR,
TE=TE,
protocol_name=protocol_name,
is_motion_corrected=is_moco,
is_derived="derived" in [x.lower() for x in image_type],
patient_id=dcminfo.get("PatientID"),
study_description=dcminfo.get("StudyDescription"),
referring_physician_name=refphys,
series_description=series_desc,
sequence_name=sequence_name,
image_type=image_type,
accession_number=accession_number,
# For demographics to populate BIDS participants.tsv
patient_age=dcminfo.get("PatientAge"),
patient_sex=dcminfo.get("PatientSex"),
date=dcminfo.get("AcquisitionDate"),
series_uid=dcminfo.get("SeriesInstanceUID"),
time=dcminfo.get("AcquisitionTime"),
custom=custom_seqinfo_data,
)
def validate_dicom(
fl: str, dcmfilter: Optional[Callable[[dcm.dataset.Dataset], Any]]
) -> Optional[tuple[dw.Wrapper, tuple[int, str], Optional[str]]]:
"""
Parse DICOM attributes. Returns None if not valid.
"""
mw = dw.wrapper_from_file(fl, force=True, stop_before_pixels=True)
# clean series signature
for sig in ("iop", "ICE_Dims", "SequenceName"):
try:
del mw.series_signature[sig]
except KeyError:
pass
# Workaround for protocol name in private siemens csa header
if not getattr(mw.dcm_data, "ProtocolName", "").strip():
mw.dcm_data.ProtocolName = (
parse_private_csa_header(mw.dcm_data, "ProtocolName", "tProtocolName")
if mw.is_csa
else ""
)
try:
protocol_name = mw.dcm_data.ProtocolName
assert isinstance(protocol_name, str)
series_id = (int(mw.dcm_data.SeriesNumber), protocol_name)
except AttributeError as e:
lgr.warning('Ignoring %s since not quite a "normal" DICOM: %s', fl, e)
return None
if dcmfilter is not None and dcmfilter(mw.dcm_data):
lgr.warning("Ignoring %s because of DICOM filter", fl)
return None
if mw.dcm_data[0x0008, 0x0016].repval in (
"Raw Data Storage",
"GrayscaleSoftcopyPresentationStateStorage",
):
return None
try:
file_studyUID = mw.dcm_data.StudyInstanceUID
assert isinstance(file_studyUID, str)
except AttributeError:
lgr.info("File {} is missing any StudyInstanceUID".format(fl))
file_studyUID = None
return mw, series_id, file_studyUID
class SeriesID(NamedTuple):
series_number: int
protocol_name: str
file_studyUID: Optional[str] = None
def __str__(self) -> str:
s = f"{self.series_number}-{self.protocol_name}"
if self.file_studyUID is not None:
s += f"-{self.file_studyUID}"
return s
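# A minimal illustration (made-up values) of how a SeriesID renders: the study
# UID, when attached, becomes a third '-'-separated component.
def _example_series_id_str() -> str:
    return str(SeriesID(2, "t1_mprage", "1.2.840.1"))  # '2-t1_mprage-1.2.840.1'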
@overload
def group_dicoms_into_seqinfos(
files: list[str],
grouping: str,
file_filter: Optional[Callable[[str], Any]] = None,
dcmfilter: Optional[Callable[[dcm.dataset.Dataset], Any]] = None,
flatten: Literal[False] = False,
custom_grouping: str
| Callable[
[list[str], Optional[Callable[[dcm.dataset.Dataset], Any]], type[SeqInfo]],
dict[SeqInfo, list[str]],
]
| None = None,
custom_seqinfo: CustomSeqinfoT | None = None,
) -> dict[Optional[str], dict[SeqInfo, list[str]]]:
...
@overload
def group_dicoms_into_seqinfos(
files: list[str],
grouping: str,
file_filter: Optional[Callable[[str], Any]] = None,
dcmfilter: Optional[Callable[[dcm.dataset.Dataset], Any]] = None,
*,
flatten: Literal[True],
custom_grouping: str
| Callable[
[list[str], Optional[Callable[[dcm.dataset.Dataset], Any]], type[SeqInfo]],
dict[SeqInfo, list[str]],
]
| None = None,
custom_seqinfo: CustomSeqinfoT | None = None,
) -> dict[SeqInfo, list[str]]:
...
def group_dicoms_into_seqinfos(
files: list[str],
grouping: str,
file_filter: Optional[Callable[[str], Any]] = None,
dcmfilter: Optional[Callable[[dcm.dataset.Dataset], Any]] = None,
flatten: Literal[False, True] = False,
custom_grouping: str
| Callable[
[list[str], Optional[Callable[[dcm.dataset.Dataset], Any]], type[SeqInfo]],
dict[SeqInfo, list[str]],
]
| None = None,
custom_seqinfo: CustomSeqinfoT | None = None,
) -> dict[Optional[str], dict[SeqInfo, list[str]]] | dict[SeqInfo, list[str]]:
"""Process list of dicoms and return seqinfo and file group
`seqinfo` contains per-sequence extract of fields from DICOMs which
will be later provided into heuristics to decide on filenames
Parameters
----------
files : list of str
List of files to consider
grouping : {'studyUID', 'accession_number', 'all', 'custom'}
How to group DICOMs for conversion. If 'custom', see `custom_grouping`
parameter.
file_filter : callable, optional
Applied to each item of filenames. Should return True if file needs to be
kept, False otherwise.
dcmfilter : callable, optional
Called on each file's dcm_data; if it returns True, that file is ignored.
flatten : bool, optional
Creates a flattened `seqinfo` with corresponding DICOM files. True when
invoked with `dicom_dir_template`.
custom_grouping: str or callable, optional
grouping key defined within heuristic. Can be a string of a
DICOM attribute, or a method that handles more complex groupings.
custom_seqinfo: callable, optional
A callable which will be provided the MosaicWrapper, giving the possibility
to extract any custom DICOM metadata of interest.
Returns
-------
seqinfos : dict
When `flatten` is False: a dict keyed by the grouping value (e.g. studyUID),
each value being a dict that maps a SeqInfo entry to the list of files
belonging to that sequence. When `flatten` is True: a single dict mapping
SeqInfo entries directly to their files.
"""
allowed_groupings = ["studyUID", "accession_number", "all", "custom"]
if grouping not in allowed_groupings:
raise ValueError("I do not know how to group by {0}".format(grouping))
per_studyUID = grouping == "studyUID"
# per_accession_number = grouping == 'accession_number'
lgr.info("Analyzing %d dicoms", len(files))
group_keys: list[SeriesID] = []
group_values: list[int] = []
mwgroup: list[dw.Wrapper] = []
studyUID: Optional[str] = None
if file_filter:
nfl_before = len(files)
files = list(filter(file_filter, files))
nfl_after = len(files)
lgr.info(
"Filtering out {0} dicoms based on their filename".format(
nfl_before - nfl_after
)
)
if grouping == "custom":
if custom_grouping is None:
raise RuntimeError("Custom grouping is not defined in heuristic")
if callable(custom_grouping):
return custom_grouping(files, dcmfilter, SeqInfo)
grouping = custom_grouping
study_customgroup = None
removeidx = []
for idx, filename in enumerate(files):
mwinfo = validate_dicom(filename, dcmfilter)
if mwinfo is None:
removeidx.append(idx)
continue
mw, series_id_, file_studyUID = mwinfo
series_id = SeriesID(series_id_[0], series_id_[1])
if per_studyUID:
series_id = series_id._replace(file_studyUID=file_studyUID)
if flatten:
if per_studyUID:
if studyUID is None:
studyUID = file_studyUID
assert (
studyUID == file_studyUID
), "Conflicting study identifiers found [{}, {}].".format(
studyUID, file_studyUID
)
elif custom_grouping:
file_customgroup = mw.dcm_data.get(grouping)
if study_customgroup is None:
study_customgroup = file_customgroup
assert (
study_customgroup == file_customgroup
), "Conflicting {0} found: [{1}, {2}]".format(
grouping, study_customgroup, file_customgroup
)
ingrp = False
# check if same series was already converted
for idx in range(len(mwgroup)):
if mw.is_same_series(mwgroup[idx]):
if grouping != "all":
assert (
mwgroup[idx].dcm_data.get("StudyInstanceUID") == file_studyUID
), "Same series found for multiple different studies"
ingrp = True
series_id = SeriesID(
mwgroup[idx].dcm_data.SeriesNumber,
mwgroup[idx].dcm_data.ProtocolName,
)
if per_studyUID:
series_id = series_id._replace(file_studyUID=file_studyUID)
group_keys.append(series_id)
group_values.append(idx)
if not ingrp:
mwgroup.append(mw)
group_keys.append(series_id)
group_values.append(len(mwgroup) - 1)
group_map = dict(zip(group_keys, group_values))
if removeidx:
# remove non DICOMS from files
for idx in sorted(removeidx, reverse=True):
del files[idx]
seqinfos: dict[Optional[str], dict[SeqInfo, list[str]]] = {}
flat_seqinfos: dict[SeqInfo, list[str]] = {}
# for the next line to make any sense the series_id needs to
# be sortable in a way that preserves the series order
for series_id, mwidx in sorted(group_map.items()):
mw = mwgroup[mwidx]
series_files = [files[i] for i, s in enumerate(group_keys) if s == series_id]
if per_studyUID:
studyUID = series_id.file_studyUID
series_id = series_id._replace(file_studyUID=None)
series_id_str = str(series_id)
if mw.image_shape is None:
# this whole thing has no image data (maybe just PSg DICOMs)
# If this is a Siemens PhoenixZipReport or PhysioLog, keep it:
if mw.dcm_data.get("SeriesDescription") == "PhoenixZIPReport":
# give it a dummy shape, so that we can continue:
mw.image_shape = (0, 0, 0)
else:
# nothing to see here, just move on
continue
seqinfo = create_seqinfo(mw, series_files, series_id_str, custom_seqinfo)
key: Optional[str]
if per_studyUID:
key = studyUID
elif grouping == "accession_number":
key = mw.dcm_data.get("AccessionNumber")
elif grouping == "all":
key = "all"
elif custom_grouping:
key = mw.dcm_data.get(custom_grouping)
else:
key = ""
lgr.debug(
"%30s %30s %27s %27s %5s nref=%-2d nsrc=%-2d %s"
% (
key,
seqinfo.series_id,
seqinfo.series_description,
mw.dcm_data.ProtocolName,
seqinfo.is_derived,
len(mw.dcm_data.get("ReferencedImageSequence", "")),
len(mw.dcm_data.get("SourceImageSequence", "")),
seqinfo.image_type,
)
)
if not flatten:
seqinfos.setdefault(key, {})[seqinfo] = series_files
else:
flat_seqinfos[seqinfo] = series_files
if not flatten:
entries = len(seqinfos)
subentries = sum(map(len, seqinfos.values()))
else:
entries = len(flat_seqinfos)
subentries = sum(map(len, flat_seqinfos.values()))
if per_studyUID:
lgr.info(
"Generated sequence info for %d studies with %d entries total",
entries,
subentries,
)
elif grouping == "accession_number":
lgr.info(
"Generated sequence info for %d accession numbers with %d entries total",
entries,
subentries,
)
else:
lgr.info("Generated sequence info with %d entries", entries)
if not flatten:
return seqinfos
else:
return flat_seqinfos
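# A minimal sketch (hypothetical directory of *.dcm files) of the flattened
# grouping used when converting with a dicom_dir_template.
def _example_group_dicoms_into_seqinfos(dicom_dir: str) -> dict:
    from glob import glob

    files = sorted(glob(op.join(dicom_dir, "*.dcm")))
    # maps each SeqInfo to the list of DICOM files belonging to that series
    return group_dicoms_into_seqinfos(files, grouping="studyUID", flatten=True)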
def get_reproducible_int(dicom_list: list[str]) -> int:
"""Get integer that can be used to reproducibly sort input DICOMs, which is based on when they were acquired.
Parameters
----------
dicom_list : list[str]
Paths to existing DICOM files
Returns
-------
int
An integer relating to when the DICOM was acquired
Raises
------
AssertionError
Notes
-----
1. When date and time can be read (see :func:`get_datetime_from_dcm`), return
that value as time in seconds since epoch (i.e., Jan 1 1970).
2. In cases where a date/time/datetime is not available (e.g., anonymization stripped this info), return
epoch + AcquisitionNumber (in seconds), which is AcquisitionNumber as an integer
3. If 1 and 2 are not possible, then raise AssertionError and provide message about missing information
Cases are based on only the first element of the dicom_list.
"""
import calendar
dicom = dcm.dcmread(dicom_list[0], stop_before_pixels=True, force=True)
dicom_datetime = get_datetime_from_dcm(dicom)
if dicom_datetime:
return calendar.timegm(dicom_datetime.timetuple())
acquisition_number = dicom.get("AcquisitionNumber")
if acquisition_number:
return int(acquisition_number)
raise AssertionError(
"No metadata found that can be used to sort DICOMs reproducibly. Was header information erased?"
)
def get_datetime_from_dcm(dcm_data: dcm.FileDataset) -> Optional[datetime.datetime]:
"""Extract datetime from filedataset, or return None is no datetime information found.
Parameters
----------
dcm_data : dcm.FileDataset
DICOM with header, e.g., as read by pydicom.dcmread.
Objects that support __getitem__ and expose those keys with properly formatted values may also work
Returns
-------
Optional[datetime.datetime]
One of several datetimes that are related to when the scan occurred, or None if no datetime can be found
Notes
------
The following fields are checked in order
1. AcquisitionDate & AcquisitionTime (0008,0022); (0008,0032)
2. AcquisitionDateTime (0008,002A);
3. SeriesDate & SeriesTime (0008,0021); (0008,0031)
"""
def check_tag(x: str) -> bool:
return bool(x in dcm_data and dcm_data[x].value.strip())
if check_tag("AcquisitionDate") and check_tag("AcquisitionTime"):
return strptime_dcm_da_tm(dcm_data, "AcquisitionDate", "AcquisitionTime")
if check_tag("AcquisitionDateTime"):
return strptime_dcm_dt(dcm_data, "AcquisitionDateTime")
if check_tag("SeriesDate") and check_tag("SeriesTime"):
return strptime_dcm_da_tm(dcm_data, "SeriesDate", "SeriesTime")
return None
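# A minimal sketch building an in-memory pydicom Dataset (made-up values) to
# show which header fields drive the returned datetime; AcquisitionDate/Time
# take precedence over AcquisitionDateTime and SeriesDate/Time.
def _example_get_datetime_from_dcm() -> Optional[datetime.datetime]:
    ds = dcm.Dataset()
    ds.AcquisitionDate = "20240102"
    ds.AcquisitionTime = "131415"
    return get_datetime_from_dcm(ds)  # e.g. datetime.datetime(2024, 1, 2, 13, 14, 15)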
def compress_dicoms(
dicom_list: list[str], out_prefix: str, tempdirs: TempDirs, overwrite: bool
) -> Optional[str]:
"""Archives DICOMs into a tarball
Also tries to do it reproducibly, so takes the date for files
and target tarball based on the series time (within the first file)
Parameters
----------
dicom_list : list of str
list of dicom files
out_prefix : str
output path prefix, including the portion of the output file name
before .dicom.tgz suffix
tempdirs : TempDirs
TempDirs object to handle multiple tmpdirs
overwrite : bool
Overwrite existing tarfiles
Returns
-------
filename : str
Result tarball
"""
tmpdir = tempdirs(prefix="dicomtar")
outtar = out_prefix + ".dicom.tgz"
if op.exists(outtar) and not overwrite:
lgr.info("File {} already exists, will not overwrite".format(outtar))
return None
# tarfile encodes current time.time inside making those non-reproducible
# so we should choose which date to use.
# Solution from DataLad although ugly enough:
dicom_list = sorted(dicom_list)
dcm_time = get_reproducible_int(dicom_list)
def _assign_dicom_time(ti: tarfile.TarInfo) -> tarfile.TarInfo:
# Reset the date to match the one from the dicom, not from the
# filesystem so we could sort reproducibly
ti.mtime = dcm_time
return ti
with patch("time.time", lambda: dcm_time):
try:
if op.lexists(outtar):
os.unlink(outtar)
with tarfile.open(
outtar, "w:gz", compresslevel=compresslevel, dereference=True
) as tar:
for filename in dicom_list:
outfile = op.join(tmpdir, op.basename(filename))
if not op.islink(outfile):
os.symlink(op.realpath(filename), outfile)
# place into archive stripping any lead directories and
# adding the one corresponding to prefix
tar.add(
outfile,
arcname=op.join(op.basename(out_prefix), op.basename(outfile)),
recursive=False,
filter=_assign_dicom_time,
)
finally:
tempdirs.rmtree(tmpdir)
return outtar
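# A minimal sketch (not used by heudiconv itself) of the reproducibility trick
# compress_dicoms() relies on: pin both the member mtimes and the gzip header
# timestamp so repeated runs over the same inputs yield byte-identical tarballs.
def _example_reproducible_tar(files: list[str], outtar: str, mtime: int) -> None:
    def _pin(ti: tarfile.TarInfo) -> tarfile.TarInfo:
        ti.mtime = mtime
        return ti

    with patch("time.time", lambda: mtime):
        with tarfile.open(outtar, "w:gz", compresslevel=compresslevel) as tar:
            for f in files:
                tar.add(f, arcname=op.basename(f), recursive=False, filter=_pin)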
# Note: This function is passed to nipype by `embed_metadata_from_dicoms()`,
# and nipype reparses the function source in a clean namespace that does not
# have `from __future__ import annotations` enabled. Thus, we need to use
# Python 3.7-compatible annotations on this function, and any non-builtin types
# used in the annotations need to be included by import statements passed to
# the `nipype.Function` constructor.
def embed_dicom_and_nifti_metadata(
dcmfiles: List[str],
niftifile: str,
infofile: Union[str, Path],
bids_info: Optional[Dict[str, Any]],
) -> None:
"""Embed metadata from nifti (affine etc) and dicoms into infofile (json)
`niftifile` should exist. Its affine's orientation information is used while
establishing new `NiftiImage` out of dicom stack and together with `bids_info`
(if provided) is dumped into json `infofile`
Parameters
----------
dcmfiles
niftifile
infofile
bids_info: dict
Additional metadata to be embedded. `infofile` is overwritten if exists,
so here you could pass some metadata which would overload (at the first
level of the dict structure, no recursive fancy updates) what is obtained
from nifti and dicoms
"""
# These imports need to be within the body of the function so that they
# will be available when executed by nipype:
import json
import os.path
import dcmstack as ds
import nibabel as nb
from heudiconv.utils import save_json
stack = ds.parse_and_stack(dcmfiles, force=True).values()
if len(stack) > 1:
raise ValueError("Found multiple series")
# may be odict now - iter to be safe
stack = next(iter(stack))
if not os.path.exists(niftifile):
raise NotImplementedError(
"%s does not exist. "
"We are not producing new nifti files here any longer. "
"Use dcm2niix directly or .convert.nipype_convert helper ." % niftifile
)
orig_nii = nb.load(niftifile)
aff = orig_nii.affine # type: ignore[attr-defined]
ornt = nb.orientations.io_orientation(aff)
axcodes = nb.orientations.ornt2axcodes(ornt)
new_nii = stack.to_nifti(voxel_order="".join(axcodes), embed_meta=True)
meta_info_str = ds.NiftiWrapper(new_nii).meta_ext.to_json()
meta_info = json.loads(meta_info_str)
assert isinstance(meta_info, dict)
if bids_info:
meta_info.update(bids_info)
# write to outfile
save_json(infofile, meta_info)
def embed_metadata_from_dicoms(
bids_options: Optional[str],
item_dicoms: list[str],
outname: str,
outname_bids: str,
prov_file: Optional[str],
scaninfo: str,
tempdirs: TempDirs,
with_prov: bool,
) -> None:
"""
Enhance sidecar information file with more information from DICOMs
Parameters
----------
bids_options
item_dicoms
outname
outname_bids
prov_file
scaninfo
tempdirs
with_prov
Returns
-------
"""
from nipype import Function, Node
tmpdir = tempdirs(prefix="embedmeta")
# We need to assure that paths are absolute if they are relative
#
item_dicoms = list(map(op.abspath, item_dicoms)) # type: ignore[arg-type]
embedfunc = Node(
Function(
input_names=[
"dcmfiles",
"niftifile",
"infofile",
"bids_info",
],
function=embed_dicom_and_nifti_metadata,
imports=[
"from pathlib import Path",
"from typing import Any, Dict, List, Optional, Union",
],
),
name="embedder",
)
embedfunc.inputs.dcmfiles = item_dicoms
embedfunc.inputs.niftifile = op.abspath(outname)
embedfunc.inputs.infofile = op.abspath(scaninfo)
embedfunc.inputs.bids_info = (
load_json(op.abspath(outname_bids)) if (bids_options is not None) else None
)
embedfunc.base_dir = tmpdir
cwd = os.getcwd()
lgr.debug(
"Embedding into %s based on dicoms[0]=%s for nifti %s",
scaninfo,
item_dicoms[0],
outname,
)
try:
if op.lexists(scaninfo):
# TODO: handle annexed file case
if not op.islink(scaninfo):
set_readonly(scaninfo, False)
res = embedfunc.run()
set_readonly(scaninfo)
if with_prov:
assert isinstance(prov_file, str)
g = res.provenance.rdf()
g.parse(prov_file, format="turtle")
g.serialize(prov_file, format="turtle")
set_readonly(prov_file)
except Exception as exc:
lgr.error("Embedding failed: %s", str(exc))
os.chdir(cwd)
def parse_private_csa_header(
dcm_data: dcm.dataset.Dataset,
_public_attr: str,
private_attr: str,
default: Optional[str] = None,
) -> str:
"""
Parses CSA header in cases where value is not defined publicly
Parameters
----------
dcm_data : pydicom Dataset object
DICOM metadata
public_attr : string
non-private DICOM attribute
private_attr : string
private DICOM attribute
default (optional)
default value if private_attr not found
Returns
-------
val (default: empty string)
private attribute value or default
"""
# TODO: provide mapping to private_attr from public_attr
import dcmstack.extract as dsextract
from nibabel.nicom import csareader
try:
# TODO: test with attr besides ProtocolName
csastr = csareader.get_csa_header(dcm_data, "series")["tags"][
"MrPhoenixProtocol"
]["items"][0]
csastr = csastr.replace("### ASCCONV BEGIN", "### ASCCONV BEGIN ### ")
parsedhdr = dsextract.parse_phoenix_prot("MrPhoenixProtocol", csastr)
val = parsedhdr[private_attr].replace(" ", "")
except Exception as e:
lgr.debug("Failed to parse CSA header: %s", str(e))
val = default or ""
assert isinstance(val, str)
return val
heudiconv-1.3.2/heudiconv/due.py 0000664 0000000 0000000 00000003745 14715167373 0016643 0 ustar 00root root 0000000 0000000 # emacs: at the end of the file
# ex: set sts=4 ts=4 sw=4 et:
# ## ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### #
"""
Stub file for a guaranteed safe import of duecredit constructs: if duecredit
is not available.
To use it, place it into your project codebase to be imported, e.g. copy as
cp stub.py /path/tomodule/module/due.py
Note that it might be better to avoid naming it duecredit.py to avoid shadowing
installed duecredit.
Then use in your code as
from .due import due, Doi, BibTeX, Text
See https://github.com/duecredit/duecredit/blob/master/README.md for examples.
Origin: Originally a part of the duecredit
Copyright: 2015-2019 DueCredit developers
License: BSD-2
"""
__version__ = "0.0.8"
class InactiveDueCreditCollector(object):
"""Just a stub at the Collector which would not do anything"""
def _donothing(self, *args, **kwargs):
"""Perform no good and no bad"""
pass
def dcite(self, *_args, **_kwargs):
"""If I could cite I would"""
def nondecorating_decorator(func):
return func
return nondecorating_decorator
active = False
activate = add = cite = dump = load = _donothing
def __repr__(self):
return self.__class__.__name__ + "()"
def _donothing_func(*args, **kwargs):
"""Perform no good and no bad"""
pass
try:
from duecredit import BibTeX, Doi, Text, Url, due
if "due" in locals() and not hasattr(due, "cite"):
raise RuntimeError("Imported due lacks .cite. DueCredit is now disabled")
except Exception as e:
if not isinstance(e, ImportError):
import logging
logging.getLogger("duecredit").error(
"Failed to import duecredit due to %s" % str(e)
)
# Initiate due stub
due = InactiveDueCreditCollector()
BibTeX = Doi = Url = Text = _donothing_func
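# A minimal sketch (placeholder DOI) showing the intended call pattern: the same
# line is safe whether the real duecredit or the inactive stub above was imported.
def _example_due_cite() -> None:
    due.cite(Doi("10.0000/placeholder"), description="Example citation", path="heudiconv")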
# Emacs mode definitions
# Local Variables:
# mode: python
# py-indent-offset: 4
# tab-width: 4
# indent-tabs-mode: nil
# End:
heudiconv-1.3.2/heudiconv/external/ 0000775 0000000 0000000 00000000000 14715167373 0017325 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/external/__init__.py 0000664 0000000 0000000 00000000000 14715167373 0021424 0 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/external/dlad.py 0000664 0000000 0000000 00000016340 14715167373 0020607 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from glob import glob
import inspect
import logging
import os
import os.path as op
from typing import TYPE_CHECKING, Optional
from ..info import MIN_DATALAD_VERSION as MIN_VERSION
from ..utils import SeqInfo, create_file_if_missing
lgr = logging.getLogger(__name__)
if TYPE_CHECKING:
from datalad.api import Dataset
def prepare_datalad(
studydir: str,
outdir: str,
sid: Optional[str],
session: str | int | None,
seqinfo: Optional[dict[SeqInfo, list[str]]],
dicoms: Optional[list[str]],
bids: Optional[str],
) -> str:
"""Prepare data for datalad"""
from datalad.api import Dataset
datalad_msg_suf = " %s" % sid
if session:
datalad_msg_suf += ", session %s" % session
if seqinfo:
datalad_msg_suf += ", %d sequences" % len(seqinfo)
datalad_msg_suf += ", %d dicoms" % len(sum(seqinfo.values(), []))
else:
assert dicoms is not None
datalad_msg_suf += ", %d dicoms" % len(dicoms)
ds = Dataset(studydir)
if not op.exists(outdir) or not ds.is_installed():
add_to_datalad(
outdir, studydir, msg="Preparing for %s" % datalad_msg_suf, bids=bids
)
return datalad_msg_suf
def add_to_datalad(
topdir: str, studydir: str, msg: Optional[str], bids: Optional[str] # noqa: U100
) -> None:
"""Do all necessary preparations (if were not done before) and save"""
import datalad.api as dl
from datalad.api import Dataset
from datalad.support.annexrepo import AnnexRepo
from datalad.support.external_versions import external_versions
assert external_versions["datalad"] >= MIN_VERSION, "Need datalad >= {}".format(
MIN_VERSION
) # add to reqs
studyrelpath = op.relpath(studydir, topdir)
assert not studyrelpath.startswith(op.pardir) # so we are under
# now we need to test and initiate a DataLad dataset all along the path
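# Illustrative layout (paths are made up): with topdir=/data/bids and
# studydir=/data/bids/PI/1002_study, datasets get created (if missing) at
# /data/bids, /data/bids/PI and /data/bids/PI/1002_study, with git-annex
# enabled only at the bottom (study) repository -- see annex=... below.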
curdir_ = topdir
superds = None
subdirs = [""] + [d for d in studyrelpath.split(op.sep) if d != os.curdir]
for isubdir, subdir in enumerate(subdirs):
curdir_ = op.join(curdir_, subdir)
ds = Dataset(curdir_)
if not ds.is_installed():
lgr.info("Initiating %s", ds)
# would require annex > 20161018 for correct operation on annex v6
# need to add .gitattributes first anyways
ds_ = dl.create(
curdir_,
dataset=superds,
force=True,
# initiate annex only at the bottom repository
annex=isubdir == (len(subdirs) - 1),
fake_dates=True,
# shared_access='all',
)
assert ds == ds_
assert ds.is_installed()
superds = ds
# TODO: we need a helper (in DataLad ideally) to ease adding such
# specifications
gitattributes_path = op.join(studydir, ".gitattributes")
# We will just make sure that all our desired rules are present in it
desired_attrs = """\
* annex.largefiles=(largerthan=100kb)
*.json annex.largefiles=nothing
*.txt annex.largefiles=nothing
*.tsv annex.largefiles=nothing
*.nii.gz annex.largefiles=anything
*.tgz annex.largefiles=anything
*_scans.tsv annex.largefiles=anything
"""
if op.exists(gitattributes_path):
with open(gitattributes_path, "rb") as f:
known_attrs = [line.decode("utf-8").rstrip() for line in f.readlines()]
else:
known_attrs = []
for attr in desired_attrs.split("\n"):
if attr not in known_attrs:
known_attrs.append(attr)
with open(gitattributes_path, "wb") as f:
f.write("\n".join(known_attrs).encode("utf-8"))
# ds might have memories of having ds.repo GitRepo
superds = Dataset(topdir)
assert op.realpath(ds.path) == op.realpath(studydir)
assert isinstance(ds.repo, AnnexRepo)
# Add doesn't have all the options of save such as msg and supers
ds.save(path=[".gitattributes"], message="Custom .gitattributes", to_git=True)
dsh = dsh_path = None
if op.lexists(op.join(ds.path, ".heudiconv")):
dsh_path = op.join(ds.path, ".heudiconv")
dsh = Dataset(dsh_path)
if not dsh.is_installed():
# Previously we did not have it as a submodule, and since no
# automagic migration is implemented, we just need to check first
# if any path under .heudiconv is already under git control
if any(x.startswith(".heudiconv/") for x in ds.repo.get_files()):
lgr.warning(
"%s has .heudiconv not as a submodule from previous"
" versions of heudiconv. No automagic migration is "
"yet provided",
ds,
)
else:
dsh = ds.create(
path=".heudiconv",
force=True,
# shared_access='all'
)
# Since .heudiconv could contain sensitive information
# we place all files under annex and then add
if create_file_if_missing(
op.join(dsh_path, ".gitattributes"), """* annex.largefiles=anything"""
):
ds.save(
".heudiconv/.gitattributes",
to_git=True,
message="Added gitattributes to place all .heudiconv content"
" under annex",
)
save_res = ds.save(
".",
recursive=True
# not in effect! ?
# annex_add_opts=['--include-dotfiles']
)
annexed_files = [sr["path"] for sr in save_res if sr.get("key", None)]
# Provide metadata for sensitive information
sensitive_patterns = [
"sourcedata/**",
"*_scans.tsv", # top level
"*/*_scans.tsv", # within subj
"*/*/*_scans.tsv", # within sess/subj
"*/anat/*", # within subj
"*/*/anat/*", # within ses/subj
]
for sp in sensitive_patterns:
mark_sensitive(ds, sp, annexed_files)
if dsh_path:
mark_sensitive(ds, ".heudiconv") # entire .heudiconv!
superds.save(path=ds.path, message=msg, recursive=True)
assert not ds.repo.dirty
# TODO: they are still appearing as native annex symlinked beasts
"""
TODOs:
it needs
- unlock (thin will be in effect)
- save/commit (does modechange 120000 => 100644
- could potentially somehow automate that all:
http://git-annex.branchable.com/tips/automatically_adding_metadata/
- possibly even make separate sub-datasets for originaldata, derivatives ?
"""
def mark_sensitive(ds: Dataset, path_glob: str, files: list[str] | None = None) -> None:
"""
Parameters
----------
ds : Dataset to operate on
path_glob : str
glob of the paths within dataset to work on
files : list[str]
subset of files to mark
Returns
-------
None
"""
paths = glob(op.join(ds.path, path_glob))
if files:
paths = [p for p in paths if p in files]
if not paths:
return
lgr.debug("Marking %d files with distribution-restrictions field", len(paths))
# set_metadata can be a bloody generator
res = ds.repo.set_metadata(
paths, add=dict([("distribution-restrictions", "sensitive")]), recursive=True
)
if inspect.isgenerator(res):
res = list(res)
heudiconv-1.3.2/heudiconv/external/tests/ 0000775 0000000 0000000 00000000000 14715167373 0020467 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/external/tests/__init__.py 0000664 0000000 0000000 00000000000 14715167373 0022566 0 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/external/tests/test_dlad.py 0000664 0000000 0000000 00000002657 14715167373 0023016 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from pathlib import Path
import pytest
from ..dlad import mark_sensitive
from ...utils import create_tree
dl = pytest.importorskip("datalad.api")
def test_mark_sensitive(tmp_path: Path) -> None:
ds = dl.Dataset(tmp_path).create(force=True)
create_tree(
str(tmp_path),
{
"f1": "d1",
"f2": "d2",
"g1": "d3",
"g2": "d1",
},
)
ds.save(".")
mark_sensitive(ds, "f*")
all_meta = dict(ds.repo.get_metadata("."))
target_rec = {"distribution-restrictions": ["sensitive"]}
# g2 is also marked because it has the same content (same annex key) as f1
assert not all_meta.pop("g1", None) # nothing or empty record
assert all_meta == {"f1": target_rec, "f2": target_rec, "g2": target_rec}
def test_mark_sensitive_subset(tmp_path: Path) -> None:
ds = dl.Dataset(tmp_path).create(force=True)
create_tree(
str(tmp_path),
{
"f1": "d1",
"f2": "d2",
"g1": "d3",
"g2": "d1",
},
)
ds.save(".")
mark_sensitive(ds, "f*", [str(tmp_path / "f1")])
all_meta = dict(ds.repo.get_metadata("."))
target_rec = {"distribution-restrictions": ["sensitive"]}
# g2 is also marked because it has the same content (same annex key) as f1
assert not all_meta.pop("g1", None) # nothing or empty record
assert not all_meta.pop("f2", None) # nothing or empty record
assert all_meta == {"f1": target_rec, "g2": target_rec}
heudiconv-1.3.2/heudiconv/heuristics/ 0000775 0000000 0000000 00000000000 14715167373 0017665 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/heuristics/__init__.py 0000664 0000000 0000000 00000000000 14715167373 0021764 0 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/heuristics/banda-bids.py 0000664 0000000 0000000 00000011355 14715167373 0022230 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz", "dicom"),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(seqinfo: list[SeqInfo]) -> dict[tuple[str, tuple[str, ...], None], list]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
t1 = create_key("sub-{subject}/anat/sub-{subject}_T1w")
t2 = create_key("sub-{subject}/anat/sub-{subject}_T2w")
rest = create_key("sub-{subject}/func/sub-{subject}_task-rest_run-{item:02d}_bold")
rest_sbref = create_key(
"sub-{subject}/func/sub-{subject}_task-rest_run-{item:02d}_sbref"
)
face = create_key("sub-{subject}/func/sub-{subject}_task-face_run-{item:02d}_bold")
face_sbref = create_key(
"sub-{subject}/func/sub-{subject}_task-face_run-{item:02d}_sbref"
)
gamble = create_key(
"sub-{subject}/func/sub-{subject}_task-gambling_run-{item:02d}_bold"
)
gamble_sbref = create_key(
"sub-{subject}/func/sub-{subject}_task-gambling_run-{item:02d}_sbref"
)
conflict = create_key(
"sub-{subject}/func/sub-{subject}_task-conflict_run-{item:02d}_bold"
)
conflict_sbref = create_key(
"sub-{subject}/func/sub-{subject}_task-conflict_run-{item:02d}_sbref"
)
dwi = create_key("sub-{subject}/dwi/sub-{subject}_run-{item:02d}_dwi")
dwi_sbref = create_key("sub-{subject}/dwi/sub-{subject}_run-{item:02d}_sbref")
fmap = create_key("sub-{subject}/fmap/sub-{subject}_dir-{dir}_run-{item:02d}_epi")
info: dict[tuple[str, tuple[str, ...], None], list] = {
t1: [],
t2: [],
rest: [],
face: [],
gamble: [],
conflict: [],
dwi: [],
rest_sbref: [],
face_sbref: [],
gamble_sbref: [],
conflict_sbref: [],
dwi_sbref: [],
fmap: [],
}
for s in seqinfo:
# T1 and T2 scans
if (s.dim3 == 208) and (s.dim4 == 1) and ("T1w" in s.protocol_name):
info[t1] = [s.series_id]
if (s.dim3 == 208) and ("T2w" in s.protocol_name):
info[t2] = [s.series_id]
# diffusion scans
if "dMRI_dir9" in s.protocol_name:
key = None
if s.dim4 >= 99:
key = dwi
elif (s.dim4 == 1) and ("SBRef" in s.series_description):
key = dwi_sbref
if key:
info[key].append({"item": s.series_id})
# functional scans
if "fMRI" in s.protocol_name:
tasktype = s.protocol_name.split("fMRI")[1].split("_")[1]
key = None
if s.dim4 in [420, 215, 338, 280]:
if "rest" in tasktype:
key = rest
if "face" in tasktype:
key = face
if "conflict" in tasktype:
key = conflict
if "gambling" in tasktype:
key = gamble
if (s.dim4 == 1) and ("SBRef" in s.series_description):
if "rest" in tasktype:
key = rest_sbref
if "face" in tasktype:
key = face_sbref
if "conflict" in tasktype:
key = conflict_sbref
if "gambling" in tasktype:
key = gamble_sbref
if key:
info[key].append({"item": s.series_id})
if (s.dim4 == 3) and ("SpinEchoFieldMap" in s.protocol_name):
dirtype = s.protocol_name.split("_")[-1]
info[fmap].append({"item": s.series_id, "dir": dirtype})
# You can even put checks in place for your protocol
msg = []
if len(info[t1]) != 1:
msg.append("Missing correct number of t1 runs")
if len(info[t2]) != 1:
msg.append("Missing correct number of t2 runs")
if len(info[dwi]) != 4:
msg.append("Missing correct number of dwi runs")
if len(info[rest]) != 4:
msg.append("Missing correct number of resting runs")
if len(info[face]) != 2:
msg.append("Missing correct number of faceMatching runs")
if len(info[conflict]) != 4:
msg.append("Missing correct number of conflict runs")
if len(info[gamble]) != 2:
msg.append("Missing correct number of gamble runs")
if msg:
raise ValueError("\n".join(msg))
return info
heudiconv-1.3.2/heudiconv/heuristics/bids_ME.py 0000664 0000000 0000000 00000003271 14715167373 0021544 0 ustar 00root root 0000000 0000000 """Heuristic demonstrating conversion of the Multi-Echo sequences.
It only cares about converting sequences which have _ME_ in their
series_description and outputs to BIDS.
"""
from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
bold = create_key("sub-{subject}/func/sub-{subject}_task-test_run-{item}_bold")
megre_mag = create_key("sub-{subject}/anat/sub-{subject}_part-mag_MEGRE")
megre_phase = create_key("sub-{subject}/anat/sub-{subject}_part-phase_MEGRE")
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {
bold: [],
megre_mag: [],
megre_phase: [],
}
for s in seqinfo:
if "_ME_" in s.series_description:
info[bold].append(s.series_id)
if "GRE_QSM" in s.series_description:
if s.image_type[2] == "M":
info[megre_mag].append(s.series_id)
elif s.image_type[2] == "P":
info[megre_phase].append(s.series_id)
return info
heudiconv-1.3.2/heudiconv/heuristics/bids_PhoenixReport.py 0000664 0000000 0000000 00000003603 14715167373 0024050 0 ustar 00root root 0000000 0000000 """Heuristic demonstrating conversion of the PhoenixZIPReport from Siemens.
It only cares about converting series which have PhoenixZIPReport in their
series_description and outputs **only to sourcedata**.
"""
from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[dict[str, str]]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
sbref = create_key(
"sub-{subject}/func/sub-{subject}_task-QA_sbref",
outtype=(
"nii.gz",
"dicom",
),
)
scout = create_key(
"sub-{subject}/anat/sub-{subject}_T1w",
outtype=(
"nii.gz",
"dicom",
),
)
phoenix_doc = create_key(
"sub-{subject}/misc/sub-{subject}_phoenix", outtype=("dicom",)
)
info: dict[tuple[str, tuple[str, ...], None], list[dict[str, str]]] = {
sbref: [],
scout: [],
phoenix_doc: [],
}
for s in seqinfo:
if (
"PhoenixZIPReport" in s.series_description
and s.image_type[3] == "CSA REPORT"
):
info[phoenix_doc].append({"item": s.series_id})
if "scout" in s.series_description.lower():
info[scout].append({"item": s.series_id})
return info
heudiconv-1.3.2/heudiconv/heuristics/bids_with_ses.py 0000664 0000000 0000000 00000007215 14715167373 0023072 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
session: scan index for longitudinal acq
"""
# for this example, we want to include copies of the DICOMs just for our T1
# and functional scans
outdicom = ("dicom", "nii.gz")
t1 = create_key(
"{bids_subject_session_dir}/anat/{bids_subject_session_prefix}_T1w",
outtype=outdicom,
)
t2 = create_key("{bids_subject_session_dir}/anat/{bids_subject_session_prefix}_T2w")
dwi_ap = create_key(
"{bids_subject_session_dir}/dwi/{bids_subject_session_prefix}_dir-AP_dwi"
)
dwi_pa = create_key(
"{bids_subject_session_dir}/dwi/{bids_subject_session_prefix}_dir-PA_dwi"
)
rs = create_key(
"{bids_subject_session_dir}/func/{bids_subject_session_prefix}_task-rest_run-{item:02d}_bold",
outtype=outdicom,
)
boldt1 = create_key(
"{bids_subject_session_dir}/func/{bids_subject_session_prefix}_task-bird1back_run-{item:02d}_bold",
outtype=outdicom,
)
boldt2 = create_key(
"{bids_subject_session_dir}/func/{bids_subject_session_prefix}_task-letter1back_run-{item:02d}_bold",
outtype=outdicom,
)
boldt3 = create_key(
"{bids_subject_session_dir}/func/{bids_subject_session_prefix}_task-letter2back_run-{item:02d}_bold",
outtype=outdicom,
)
info: dict[tuple[str, tuple[str, ...], None], list] = {
t1: [],
t2: [],
dwi_ap: [],
dwi_pa: [],
rs: [],
boldt1: [],
boldt2: [],
boldt3: [],
}
for s in seqinfo:
if (
(s.dim3 == 176 or s.dim3 == 352)
and (s.dim4 == 1)
and ("MEMPRAGE" in s.protocol_name)
):
info[t1] = [s.series_id]
elif (s.dim4 == 1) and ("MEMPRAGE" in s.protocol_name):
info[t1] = [s.series_id]
elif (
(s.dim3 == 176 or s.dim3 == 352)
and (s.dim4 == 1)
and ("T2_SPACE" in s.protocol_name)
):
info[t2] = [s.series_id]
elif (s.dim4 >= 70) and ("DIFFUSION_HighRes_AP" in s.protocol_name):
info[dwi_ap].append([s.series_id])
elif "DIFFUSION_HighRes_PA" in s.protocol_name:
info[dwi_pa].append([s.series_id])
elif (s.dim4 == 144) and ("resting" in s.protocol_name):
if not s.is_motion_corrected:
info[rs].append([(s.series_id)])
elif (s.dim4 == 183 or s.dim4 == 366) and ("localizer" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt1].append([s.series_id])
elif (s.dim4 == 227 or s.dim4 == 454) and ("transfer1" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt2].append([s.series_id])
elif (s.dim4 == 227 or s.dim4 == 454) and ("transfer2" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt3].append([s.series_id])
return info
heudiconv-1.3.2/heudiconv/heuristics/cmrr_heuristic.py 0000664 0000000 0000000 00000011053 14715167373 0023261 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz", "dicom"),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
t1 = create_key("anat/sub-{subject}_T1w")
t2 = create_key("anat/sub-{subject}_T2w")
rest = create_key("func/sub-{subject}_dir-{acq}_task-rest_run-{item:02d}_bold")
face = create_key("func/sub-{subject}_task-face_run-{item:02d}_acq-{acq}_bold")
gamble = create_key(
"func/sub-{subject}_task-gambling_run-{item:02d}_acq-{acq}_bold"
)
conflict = create_key(
"func/sub-{subject}_task-conflict_run-{item:02d}_acq-{acq}_bold"
)
dwi = create_key("dwi/sub-{subject}_dir-{acq}_run-{item:02d}_dwi")
fmap_rest = create_key(
"fmap/sub-{subject}_acq-func{acq}_dir-{dir}_run-{item:02d}_epi"
)
fmap_dwi = create_key(
"fmap/sub-{subject}_acq-dwi{acq}_dir-{dir}_run-{item:02d}_epi"
)
info: dict[tuple[str, tuple[str, ...], None], list] = {
t1: [],
t2: [],
rest: [],
face: [],
gamble: [],
conflict: [],
dwi: [],
fmap_rest: [],
fmap_dwi: [],
}
for idx, s in enumerate(seqinfo):
if (s.dim3 == 208) and (s.dim4 == 1) and ("T1w" in s.protocol_name):
info[t1] = [s.series_id]
if (s.dim3 == 208) and ("T2w" in s.protocol_name):
info[t2] = [s.series_id]
if (s.dim4 >= 99) and (
("dMRI_dir98_AP" in s.protocol_name) or ("dMRI_dir99_AP" in s.protocol_name)
):
acq = s.protocol_name.split("dMRI_")[1].split("_")[0] + "AP"
info[dwi].append({"item": s.series_id, "acq": acq})
if (s.dim4 >= 99) and (
("dMRI_dir98_PA" in s.protocol_name) or ("dMRI_dir99_PA" in s.protocol_name)
):
acq = s.protocol_name.split("dMRI_")[1].split("_")[0] + "PA"
info[dwi].append({"item": s.series_id, "acq": acq})
if (s.dim4 == 1) and (
("dMRI_dir98_AP" in s.protocol_name) or ("dMRI_dir99_AP" in s.protocol_name)
):
acq = s.protocol_name.split("dMRI_")[1].split("_")[0]
info[fmap_dwi].append({"item": s.series_id, "dir": "AP", "acq": acq})
if (s.dim4 == 1) and (
("dMRI_dir98_PA" in s.protocol_name) or ("dMRI_dir99_PA" in s.protocol_name)
):
acq = s.protocol_name.split("dMRI_")[1].split("_")[0]
info[fmap_dwi].append({"item": s.series_id, "dir": "PA", "acq": acq})
if (s.dim4 == 420) and ("rfMRI_REST_AP" in s.protocol_name):
info[rest].append({"item": s.series_id, "acq": "AP"})
if (s.dim4 == 420) and ("rfMRI_REST_PA" in s.protocol_name):
info[rest].append({"item": s.series_id, "acq": "PA"})
if (s.dim4 == 1) and ("rfMRI_REST_AP" in s.protocol_name):
if seqinfo[idx + 1][9] != 420:
continue
info[fmap_rest].append({"item": s.series_id, "dir": "AP", "acq": ""})
if (s.dim4 == 1) and ("rfMRI_REST_PA" in s.protocol_name):
info[fmap_rest].append({"item": s.series_id, "dir": "PA", "acq": ""})
if (s.dim4 == 346) and ("tfMRI_faceMatching_AP" in s.protocol_name):
info[face].append({"item": s.series_id, "acq": "AP"})
if (s.dim4 == 346) and ("tfMRI_faceMatching_PA" in s.protocol_name):
info[face].append({"item": s.series_id, "acq": "PA"})
if (s.dim4 == 288) and ("tfMRI_conflict_AP" in s.protocol_name):
info[conflict].append({"item": s.series_id, "acq": "AP"})
if (s.dim4 == 288) and ("tfMRI_conflict_PA" in s.protocol_name):
info[conflict].append({"item": s.series_id, "acq": "PA"})
if (s.dim4 == 223) and ("tfMRI_gambling_AP" in (s.protocol_name)):
info[gamble].append({"item": s.series_id, "acq": "AP"})
if (s.dim4 == 223) and ("tfMRI_gambling_PA" in s.protocol_name):
info[gamble].append({"item": s.series_id, "acq": "PA"})
return info
heudiconv-1.3.2/heudiconv/heuristics/convertall.py 0000664 0000000 0000000 00000003047 14715167373 0022414 0 ustar 00root root 0000000 0000000 from __future__ import annotations
import logging
from typing import Optional
from heudiconv.utils import SeqInfo
lgr = logging.getLogger("heudiconv")
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
data = create_key("run{item:03d}")
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {data: []}
for s in seqinfo:
"""
The namedtuple `s` contains the following fields:
* total_files_till_now
* example_dcm_file
* series_id
* dcm_dir_name
* unspecified2
* unspecified3
* dim1
* dim2
* dim3
* dim4
* TR
* TE
* protocol_name
* is_motion_corrected
* is_derived
* patient_id
* study_description
* referring_physician_name
* series_description
* image_type
"""
info[data].append(s.series_id)
return info
heudiconv-1.3.2/heudiconv/heuristics/convertall_custom.py 0000664 0000000 0000000 00000002045 14715167373 0024003 0 ustar 00root root 0000000 0000000 """A demo convertall heuristic with custom_seqinfo extracting affine and sample DICOM path
This heuristic also demonstrates how to create a "derived" heuristic which augments the
behavior of an already existing heuristic without a complete rewrite. Such an approach could be
useful for heuristics like reproin to overload mappings etc.
"""
from __future__ import annotations
from typing import Any
import nibabel.nicom.dicomwrappers as dw
from .convertall import * # noqa: F403
def custom_seqinfo(
series_files: list[str], wrapper: dw.Wrapper, **kw: Any # noqa: U100
) -> tuple[str | None, str]:
"""Demo for extracting custom header fields into custom_seqinfo field
Operates on already loaded DICOM data.
Origin: https://github.com/nipy/heudiconv/pull/333
"""
from nibabel.nicom.dicomwrappers import WrapperError
try:
affine = str(wrapper.affine)
except WrapperError:
lgr.exception("Errored out while obtaining/converting affine") # noqa: F405
affine = None
return affine, series_files[0]
heudiconv-1.3.2/heudiconv/heuristics/example.py 0000664 0000000 0000000 00000012471 14715167373 0021677 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
# Dictionary to specify options for the `populate_intended_for`.
# Valid options are defined in 'bids.py' (for 'matching_parameters':
# ['Shims', 'ImagingVolume',]; for 'criterion': ['First', 'Closest']
POPULATE_INTENDED_FOR_OPTS = {
"matching_parameters": "ImagingVolume",
"criterion": "Closest",
}
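# With the options above (as described in the comment preceding the dict), the
# IntendedFor field of each fieldmap would be populated by matching imaging
# volumes and picking the closest run; see 'bids.py' for the valid options.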
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
rs = create_key("rsfmri/rest_run{item:03d}/rest", outtype=("dicom", "nii.gz"))
boldt1 = create_key("BOLD/task001_run{item:03d}/bold")
boldt2 = create_key("BOLD/task002_run{item:03d}/bold")
boldt3 = create_key("BOLD/task003_run{item:03d}/bold")
boldt4 = create_key("BOLD/task004_run{item:03d}/bold")
boldt5 = create_key("BOLD/task005_run{item:03d}/bold")
boldt6 = create_key("BOLD/task006_run{item:03d}/bold")
boldt7 = create_key("BOLD/task007_run{item:03d}/bold")
boldt8 = create_key("BOLD/task008_run{item:03d}/bold")
fm1 = create_key("fieldmap/fm1_{item:03d}")
fm2 = create_key("fieldmap/fm2_{item:03d}")
fmrest = create_key("fieldmap/fmrest_{item:03d}")
dwi = create_key("dmri/dwi_{item:03d}", outtype=("dicom", "nii.gz"))
t1 = create_key("anatomy/T1_{item:03d}")
asl = create_key("rsfmri/asl_run{item:03d}/asl")
aslcal = create_key("rsfmri/asl_run{item:03d}/cal_{subindex:03d}")
info: dict[tuple[str, tuple[str, ...], None], list] = {
rs: [],
boldt1: [],
boldt2: [],
boldt3: [],
boldt4: [],
boldt5: [],
boldt6: [],
boldt7: [],
boldt8: [],
fm1: [],
fm2: [],
fmrest: [],
dwi: [],
t1: [],
asl: [],
aslcal: [[]],
}
last_run = len(seqinfo)
for s in seqinfo:
series_num_str = s.series_id.split("-", 1)[0]
if not series_num_str.isdecimal():
raise ValueError(
f"This heuristic can operate only on data when series_id has form -, "
f"and is a numeric number. Got series_id={s.series_id}"
)
series_num: int = int(series_num_str)
sl, nt = (s.dim3, s.dim4)
if (sl == 176) and (nt == 1) and ("MPRAGE" in s.protocol_name):
info[t1] = [s.series_id]
elif (nt > 60) and ("ge_func_2x2x2_Resting" in s.protocol_name):
if not s.is_motion_corrected:
info[rs].append(s.series_id)
elif (
(nt == 156)
and ("ge_functionals_128_PACE_ACPC-30" in s.protocol_name)
and series_num < last_run
):
if not s.is_motion_corrected:
info[boldt1].append(s.series_id)
last_run = series_num
elif (nt == 155) and ("ge_functionals_128_PACE_ACPC-30" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt2].append(s.series_id)
elif (nt == 222) and ("ge_functionals_128_PACE_ACPC-30" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt3].append(s.series_id)
elif (nt == 114) and ("ge_functionals_128_PACE_ACPC-30" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt4].append(s.series_id)
elif (nt == 156) and ("ge_functionals_128_PACE_ACPC-30" in s.protocol_name):
if not s.is_motion_corrected and (series_num > last_run):
info[boldt5].append(s.series_id)
elif (nt == 324) and ("ge_func_3.1x3.1x4_PACE" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt6].append(s.series_id)
elif (nt == 250) and ("ge_func_3.1x3.1x4_PACE" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt7].append(s.series_id)
elif (nt == 136) and ("ge_func_3.1x3.1x4_PACE" in s.protocol_name):
if not s.is_motion_corrected:
info[boldt8].append(s.series_id)
elif (nt == 101) and ("ep2d_pasl_FairQuipssII" in s.protocol_name):
if not s.is_motion_corrected:
info[asl].append(s.series_id)
elif (nt == 1) and ("ep2d_pasl_FairQuipssII" in s.protocol_name):
info[aslcal][0].append(s.series_id)
elif (sl > 1) and (nt == 70) and ("DIFFUSION" in s.protocol_name):
info[dwi].append(s.series_id)
elif "field_mapping_128" in s.protocol_name:
info[fm1].append(s.series_id)
elif "field_mapping_3.1" in s.protocol_name:
info[fm2].append(s.series_id)
elif "field_mapping_Resting" in s.protocol_name:
info[fmrest].append(s.series_id)
else:
pass
return info
heudiconv-1.3.2/heudiconv/heuristics/multires_7Tbold.py 0000664 0000000 0000000 00000005556 14715167373 0023331 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
import pydicom as dcm
from heudiconv.utils import SeqInfo
scaninfo_suffix = ".json"
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def filter_dicom(dcmdata: dcm.dataset.Dataset) -> bool:
"""Return True if a DICOM dataset should be filtered out, else False"""
comments = getattr(dcmdata, "ImageComments", "")
if len(comments):
if "reference volume" in comments.lower():
print("Filter out image with comment '%s'" % comments)
return True
return False
def extract_moco_params(
basename: str, _outypes: tuple[str, ...], dicoms: list[str]
) -> None:
if "_rec-dico" not in basename:
return
from pydicom import dcmread
# get acquisition time for all dicoms
dcm_times = [
(d, float(dcmread(d, stop_before_pixels=True).AcquisitionTime)) for d in dicoms
]
# store MoCo info from image comments sorted by acquisition time
moco = [
"\t".join(
[
str(float(i))
for i in dcmread(fn, stop_before_pixels=True)
.ImageComments.split()[1]
.split(",")
]
)
for fn, t in sorted(dcm_times, key=lambda x: x[1])
]
outname = basename[:-4] + "recording-motion_physio.tsv"
with open(outname, "wt") as fp:
for m in moco:
fp.write("%s\n" % (m,))
custom_callable = extract_moco_params
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {}
for s in seqinfo:
if "_bold_" not in s.protocol_name:
continue
if "_coverage" not in s.protocol_name:
label = "orientation%s_run-{item:02d}"
else:
label = "coverage%s"
resolution = s.protocol_name.split("_")[5][:-3]
assert float(resolution)
if s.is_motion_corrected:
label = label % ("_rec-dico",)
else:
label = label % ("",)
templ = "ses-%smm/func/{subject}_ses-%smm_task-%s_bold" % (
resolution,
resolution,
label,
)
key = create_key(templ)
if key not in info:
info[key] = []
info[key].append(s.series_id)
return info
heudiconv-1.3.2/heudiconv/heuristics/reproin.py 0000664 0000000 0000000 00000115301 14715167373 0021716 0 ustar 00root root 0000000 0000000 """
(AKA dbic-bids) Flexible heuristic to establish BIDS DataLad datasets hierarchy
Initially developed and deployed at Dartmouth Brain Imaging Center
(http://dbic.dartmouth.edu) using Siemens Prisma 3T under the umbrellas of the
Center of Reproducible Neuroimaging Computation (ReproNim, http://repronim.org)
and Center for Open Neuroscience (CON, http://centerforopenneuroscience.org).
## Dataset ownership/location
Datasets will be arranged in a hierarchy similar to how study/exam cards are
arranged at the scanner console. You should have
- "region" defined per each PI,
- on the first level most probably as PI_StudentOrRA/ (e.g., Gobbini_Matteo)
- StudyID_StudyName/ (e.g. 1002_faceangles)
- Arbitrary name for the exam card -- it doesn't get into Study Description.
Selecting specific exam card would populate Study Description field using
aforementioned levels, which will be used by this heuristic to decide on the
location of the dataset.
In case of multiple sessions, it is recommended to generate separate "cards"
per each session.
## Sequence naming on the scanner console
Sequence names on the scanner must follow this specification to avoid manual
conversion/handling:
[PREFIX:][WIP ]<datatype[-<suffix>]>[_ses-<SESID>][_task-<TASKID>][_acq-<ACQLABEL>][_run-<RUNID>][_dir-<DIR>][<more BIDS>][__<custom>]
where
[PREFIX:] - leading capital letters followed by : are stripped/ignored
[WIP ] - prefix is stripped/ignored (added by Philips for patch sequences)
<...> - value to be entered
[...] - optional -- might be nearly mandatory for some modalities (e.g.,
run for functional) and very optional for others
*ID - alpha-numerical identifier (e.g. 01,02, pre, post, pre01) for a run,
task, session. Note that it makes more sense to use numerical values for
RUNID (e.g., _run-01, _run-02) for obvious sorting and possibly
descriptive ones for e.g. SESID (_ses-movie, _ses-localizer)
<datatype[-<suffix>]>
a known BIDS sequence datatype which is usually a name of the folder under
subject's directory. And (optional) suffix is a specific sequence type
(e.g., "bold" for func, or "T1w" for "anat"), which could often
(but not always) be deduced from DICOM. Known to ReproIn BIDS modalities
are:
anat - anatomical data. Might also be collected multiple times across
runs (e.g. if subject is taken out of magnet etc), so could
(optionally) have "_run" definition attached. For "standard anat"
suffixes, please consult "8.3 Anatomy imaging data", but most
common are 'T1w', 'T2w', 'angio'.
beh - behavioral data. known but not "treated".
func - functional (AKA task, including resting state) data.
Typically contains multiple runs, and might have multiple different
tasks different per each run
(e.g. _task-memory_run-01, _task-oddball_run-02)
fmap - field maps
dwi - diffusion weighted imaging (also can as well have runs)
The other BIDS modalities are not known ATM and their data will not be
converted and will be just skipped (with a warning). Full list of datatypes
can be found at
https://github.com/bids-standard/bids-specification/blob/v1.7.0/src/schema/objects/datatypes.yaml
and their corresponding suffixes at
https://github.com/bids-standard/bids-specification/tree/v1.7.0/src/schema/rules/datatypes
_ses-<SESID> (optional)
a session. Having even a single sequence with a _ses- field within a study would make that study
follow the "multi-session" layout. A common practice is to have a _ses specifier
within the scout sequence name. You can either specify explicit session
identifier (SESID) or just say to maintain, create (starts with 1).
You can also use _ses-{date} in case of scanning phantoms or non-human
subjects and wanting sessions to be coded by the acquisition date.
_task-<TASKID> (optional)
a short name for a task performed during that run. If not provided and it
is a func sequence, _task-UNKNOWN will be automatically added to comply with
BIDS. Consult http://www.cognitiveatlas.org/tasks on known tasks.
_acq-<ACQLABEL> (optional)
a short custom label to distinguish a different set of parameters used for
acquiring the same modality (e.g. _acq-highres, _acq-lowres etc)
_run-<RUNID> (optional)
a (typically functional) run. The same idea as with SESID.
_dir-[AP,PA,LR,RL,VD,DV] (optional)
to be used for fmap images, whenever a pair of the SE images is collected
to be used to estimate the fieldmap
<more BIDS> (optional)
any other fields (e.g. _acq-) from BIDS acquisition
__<custom> (optional)
after two underscores, any arbitrary comment which will not matter to how
files are laid out in BIDS. But that one theoretically should not be necessary,
and (ab)use of it would just signal lack of thought while preparing sequence
name to start with since everything could have been expressed in BIDS fields.
## Last moment checks/FAQ:
- Functional runs should have _task- field defined
- Do not use "+", "_" or "-" within SESID, TASKID, ACQLABEL, RUNID, so we
could detect "canceled" runs.
- If run was canceled -- just copy canceled run (with the same index) and re-run
it. Files with overlapping name will be considered duplicate/canceled session
and only the last one would remain. The others would acquire
__dup-0<number> suffix (e.g. __dup-01).
Although we still support "-" and "+" used within SESID and TASKID, their use is
not recommended, thus not listed here
## Scanner specifics
We perform the following actions regardless of the type of scanner; they are applied
generally to accommodate limitations imposed by different manufacturers/models:
### Philips
- We replace all ( with { and ) with } to be able e.g. to specify session {date}
- "WIP " prefix unconditionally added by the scanner is stripped
"""
from __future__ import annotations
from collections.abc import Iterable
from glob import glob
import hashlib
import logging
import os.path
import re
from typing import Any, Optional, TypeVar
import pydicom as dcm
from heudiconv.due import Doi, due
from heudiconv.utils import SeqInfo, StudySessionInfo
lgr = logging.getLogger("heudiconv")
T = TypeVar("T")
# Terminology to harmonise and use to name variables etc
# experiment
# subject
# [session]
# exam (AKA scanning session) - currently seqinfo, unless brought together from multiple
# series (AKA protocol?)
# - series_spec - deduced from fields the spec (literal value)
# - series_info - the dictionary with fields parsed from series_spec
# Which fields in seqinfo (in this order) to check for the ReproIn spec
series_spec_fields = ("protocol_name", "series_description")
# dictionary from accession-number to runs that need to be marked as bad
# NOTE: even if filename has number that is 0-padded, internally no padding
# is done
fix_accession2run: dict[str, list[str]] = {
# e.g.:
# 'A000035': ['^8-', '^9-'],
}
# A dictionary containing fixes/remapping for sequence names per study.
# Keys are md5sum of study_description from DICOMs, in the form of PI-Experimenter^protocolname
# You can use `heudiconv -f reproin --command ls --files PATH`
# to list the "study hash".
# Values are list of tuples in the form (regex_pattern, substitution).
# If the key is an empty string '', it would apply to any study.
protocols2fix: dict[str | re.Pattern[str], list[tuple[str, str]]] = {
# e.g., QA:
# '43b67d9139e8c7274578b7451ab21123':
# [
# ('BOLD_p2_s4_3\.5mm', 'func_task-rest_acq-p2-s4-3.5mm'),
# ('BOLD_', 'func_task-rest'),
# ('_p2_s4', '_acq-p2-s4'),
# ('_p2', '_acq-p2'),
# ],
# '': # for any study example with regexes used
# [
# ('AAHead_Scout_.*', 'anat-scout'),
# ('^dti_.*', 'dwi'),
# ('^.*_distortion_corr.*_([ap]+)_([12])', r'fmap-epi_dir-\1_run-\2'),
# ('^(.+)_ap.*_r(0[0-9])', r'func_task-\1_run-\2'),
# ('^t1w_.*', 'anat-T1w'),
# # problematic case -- multiple identically named pepolar fieldmap runs
# # I guess we will just sacrifice ability to detect canceled runs here.
# # And we cannot just use _run+ since it would increment independently
# # for ap and then for pa. We will rely on having ap preceding pa.
# # Added _acq-mb8 so they match the one in funcs
# ('func_task-discorr_acq-ap', r'fmap-epi_dir-ap_acq-mb8_run+'),
# ('func_task-discorr_acq-pa', r'fmap-epi_dir-pa_acq-mb8_run='),
# ]
}
# list containing StudyInstanceUID to skip -- hopefully doesn't happen too often
dicoms2skip: list[str] = [
# e.g.
# '1.3.12.2.1107.5.2.43.66112.30000016110117002435700000001',
]
DEFAULT_FIELDS = {
# Let it just be in each json file extracted
"Acknowledgements": "We thank Terry Sacket and the rest of the DBIC (Dartmouth Brain Imaging "
"Center) personnel for assistance in data collection, and "
"Yaroslav O. Halchenko for preparing BIDS dataset. "
"TODO: adjust to your case.",
}
POPULATE_INTENDED_FOR_OPTS = {
"matching_parameters": ["ImagingVolume", "Shims"],
"criterion": "Closest",
}
KNOWN_DATATYPES = {"anat", "func", "dwi", "behav", "fmap"}
def _delete_chars(from_str: str, deletechars: str) -> str:
return from_str.translate(str.maketrans("", "", deletechars))
def filter_dicom(dcmdata: dcm.dataset.Dataset) -> bool:
"""Return True if a DICOM dataset should be filtered out, else False"""
return True if dcmdata.StudyInstanceUID in dicoms2skip else False
def filter_files(_fn: str) -> bool:
"""Return True if a file should be kept, else False.
ATM reproin does not do any filtering. Override if you need to add some
"""
return True
def create_key(
subdir: Optional[str],
file_suffix: str,
outtype: tuple[str, ...] = ("nii.gz", "dicom"),
annotation_classes: None = None,
prefix: str = "",
) -> tuple[str, tuple[str, ...], None]:
if not subdir:
raise ValueError("subdir must be a valid format string")
# may be even add "performing physician" if defined??
template = os.path.join(
prefix,
"{bids_subject_session_dir}",
subdir,
"{bids_subject_session_prefix}_%s" % file_suffix,
)
return template, outtype, annotation_classes
def md5sum(string: Optional[str]) -> str:
"""Computes md5sum of a string"""
if not string:
return "" # not None so None was not compared to strings
m = hashlib.md5(string.encode())
return m.hexdigest()
def get_study_description(seqinfo: list[SeqInfo]) -> str:
# Centralized so we could fix/override
v = get_unique(seqinfo, "study_description")
assert isinstance(v, str)
return v
def get_study_hash(seqinfo: list[SeqInfo]) -> str:
# XXX: ad hoc hack
return md5sum(get_study_description(seqinfo))
def fix_canceled_runs(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
"""Function that adds cancelme_ to known bad runs which were forgotten"""
if not fix_accession2run:
return seqinfo # nothing to do
for i, curr_seqinfo in enumerate(seqinfo):
accession_number = curr_seqinfo.accession_number
if accession_number and accession_number in fix_accession2run:
lgr.info(
"Considering some runs possibly marked to be "
"canceled for accession %s",
accession_number,
)
# This code is reminiscent of prior logic when operating on
# a single accession, but left as is for now
badruns = fix_accession2run[accession_number]
badruns_pattern = "|".join(badruns)
if re.match(badruns_pattern, curr_seqinfo.series_id):
lgr.info("Fixing bad run {0}".format(curr_seqinfo.series_id))
fixedkwargs = dict()
for key in series_spec_fields:
fixedkwargs[key] = "cancelme_" + getattr(curr_seqinfo, key)
seqinfo[i] = curr_seqinfo._replace(**fixedkwargs)
return seqinfo
def fix_dbic_protocol(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
"""Ad-hoc fixup for existing protocols.
It will operate in 3 stages on `protocols2fix` records.
1. consider a record which has md5sum of study_description
2. apply all substitutions, where key is a regular expression which
successfully searches (not necessarily matches, so anchor appropriately)
study_description
3. apply "catch all" substitutions in the key containing an empty string
Stage 3 is somewhat redundant since `re.compile('.*')` could match any, but is
kept for simplicity of its specification.
"""
study_hash = get_study_hash(seqinfo)
study_description = get_study_description(seqinfo)
# We will consider first study specific (based on hash)
if study_hash in protocols2fix:
_apply_substitutions(
seqinfo, protocols2fix[study_hash], "study (%s) specific" % study_hash
)
# Then go through all regexps returning regex "search" result
# on study_description
for sub, substitutions in protocols2fix.items():
if isinstance(sub, re.Pattern) and sub.search(study_description):
_apply_substitutions(
seqinfo, substitutions, "%r regex matching" % sub.pattern
)
# and at the end - global
if "" in protocols2fix:
_apply_substitutions(seqinfo, protocols2fix[""], "global")
return seqinfo
def _apply_substitutions(
seqinfo: list[SeqInfo], substitutions: list[tuple[str, str]], subs_scope: str
) -> None:
lgr.info("Considering %s substitutions", subs_scope)
for i, curr_seqinfo in enumerate(seqinfo):
fixed_kwargs = dict()
# need to replace both protocol_name series_description
for key in series_spec_fields:
oldvalue = value = getattr(curr_seqinfo, key)
# replace all I need to replace
for substring, replacement in substitutions:
value = re.sub(substring, replacement, value)
if oldvalue != value:
lgr.info(" %s: %r -> %r", key, oldvalue, value)
fixed_kwargs[key] = value
# namedtuples are immutable
seqinfo[i] = curr_seqinfo._replace(**fixed_kwargs)
def fix_seqinfo(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
"""Just a helper on top of both fixers"""
# add cancelme to known bad runs
seqinfo = fix_canceled_runs(seqinfo)
seqinfo = fix_dbic_protocol(seqinfo)
return seqinfo
def ls(_study_session: StudySessionInfo, seqinfo: list[SeqInfo]) -> str:
"""Additional ls output for a seqinfo"""
# assert len(sequences) <= 1 # expecting only a single study here
# seqinfo = sequences.keys()[0]
return " study hash: %s" % get_study_hash(seqinfo)
# XXX we killed session indicator! what should we do now?!!!
# WE DON'T NEED IT -- it will be provided into conversion_info as `session`
# So we just need subdir and file_suffix!
@due.dcite(
Doi("10.5281/zenodo.1207117"),
path="heudiconv.heuristics.reproin",
description="ReproIn heudiconv heuristic for turnkey conversion into BIDS",
)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
session: scan index for longitudinal acq
"""
seqinfo = fix_seqinfo(seqinfo)
lgr.info("Processing %d seqinfo entries", len(seqinfo))
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {}
skipped: list[str] = []
skipped_unknown: list[str] = []
current_run = 0
run_label: Optional[str] = None # run-
dcm_image_iod_spec: Optional[str] = None
skip_derived = False
for curr_seqinfo in seqinfo:
# XXX: skip derived sequences, we don't store them to avoid polluting
# the directory, unless it is the motion corrected ones
# (will get _rec-moco suffix)
if skip_derived and curr_seqinfo.is_derived and not curr_seqinfo.is_motion_corrected:
skipped.append(curr_seqinfo.series_id)
lgr.debug("Ignoring derived data %s", curr_seqinfo.series_id)
continue
# possibly apply present formatting in the series_description or protocol name
for f in "series_description", "protocol_name":
curr_seqinfo = curr_seqinfo._replace(
**{f: getattr(curr_seqinfo, f).format(**curr_seqinfo._asdict())}
)
template = None
suffix = ""
# seq = []
# figure out type of image from curr_seqinfo.image_info -- just for checking ATM
# since we primarily rely on encoded in the protocol name information
prev_dcm_image_iod_spec = dcm_image_iod_spec
if len(curr_seqinfo.image_type) > 2:
# https://dicom.innolitics.com/ciods/cr-image/general-image/00080008
# 0 - ORIGINAL/DERIVED
# 1 - PRIMARY/SECONDARY
# 3 - Image IOD specific specialization (optional)
dcm_image_iod_spec = curr_seqinfo.image_type[2]
image_type_datatype = {
# Note: P and M are too generic to make a decision here, could be
# for different datatypes (bold, fmap, etc)
"FMRI": "func",
"MPR": "anat",
"DIFFUSION": "dwi",
"MIP_SAG": "anat", # angiography
"MIP_COR": "anat", # angiography
"MIP_TRA": "anat", # angiography
}.get(dcm_image_iod_spec, None)
else:
dcm_image_iod_spec = image_type_datatype = None
series_info = {} # For please lintian and its friends
for sfield in series_spec_fields:
svalue = getattr(curr_seqinfo, sfield)
series_info = parse_series_spec(svalue)
if series_info: # looks like a valid spec - we are done
series_spec = svalue
break
else:
lgr.debug("Failed to parse reproin spec in .%s=%r", sfield, svalue)
if not series_info:
series_spec = None # we cannot know better
lgr.warning(
"Could not determine the series name by looking at %s fields",
", ".join(series_spec_fields),
)
skipped_unknown.append(curr_seqinfo.series_id)
continue
if dcm_image_iod_spec and dcm_image_iod_spec.startswith("MIP"):
series_info["acq"] = series_info.get("acq", "") + sanitize_str(
dcm_image_iod_spec
)
datatype = series_info.pop("datatype")
datatype_suffix = series_info.pop("datatype_suffix", None)
if image_type_datatype and datatype != image_type_datatype:
lgr.warning(
"Deduced datatype to be %s from DICOM, but got %s out of %s",
image_type_datatype,
datatype,
series_spec,
)
# if curr_seqinfo.is_derived:
# # Let's for now stash those close to original images
# # TODO: we might want a separate tree for all of this!?
# # so more of a parameter to the create_key
# #datatype += '/derivative'
# # just keep it lower case and without special characters
# # XXXX what for???
# #seq.append(curr_seqinfo.series_description.lower())
# prefix = os.path.join('derivatives', 'scanner')
# else:
# prefix = ''
prefix = ""
#
# Figure out the datatype_suffix (BIDS _suffix)
#
# If none was provided -- let's deduce it from the information we find:
# analyze curr_seqinfo.protocol_name (series_id is based on it) for full name mapping etc
if not datatype_suffix:
if datatype == "func":
if "_pace_" in series_spec:
datatype_suffix = "pace" # or should it be part of seq-
elif "P" in curr_seqinfo.image_type:
datatype_suffix = "phase"
elif "M" in curr_seqinfo.image_type:
datatype_suffix = "bold"
else:
# assume bold by default
datatype_suffix = "bold"
elif datatype == "fmap":
# TODO: support phase1 phase2 like in "Case 2: Two phase images ..."
if not dcm_image_iod_spec:
raise ValueError("Do not know image data type yet to make decision")
datatype_suffix = {
# might want explicit {file_index} ?
# _epi for pepolar fieldmaps, see
# https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#case-4-multiple-phase-encoded-directions-pepolar
"M": "epi" if "dir" in series_info else "magnitude",
"P": "phasediff",
"DIFFUSION": "epi", # according to KODI those DWI are the EPIs we need
}[dcm_image_iod_spec]
elif datatype == "dwi":
# label for dwi as well
datatype_suffix = "dwi"
#
# Even if datatype_suffix was provided, for some data we might need to override,
# since they are complementary files produced along-side with original
# ones.
#
if curr_seqinfo.series_description.endswith("_SBRef"):
datatype_suffix = "sbref"
if not datatype_suffix:
# Might be provided by the bids ending within series_spec, we would
# just want to check if that the last element is not _key-value pair
bids_ending = series_info.get("bids", None)
if not bids_ending or "-" in bids_ending.split("_")[-1]:
lgr.warning(
"We ended up with an empty label/suffix for %r", series_spec
)
run = series_info.get("run")
if run is not None:
# so we have an indicator for a run
if run == "+":
# some sequences, e.g. fmap, would generate two (or more?)
# sequences -- e.g. one for magnitude(s) and other ones for
# phases. In those we must not increment run!
if dcm_image_iod_spec and dcm_image_iod_spec == "P":
if prev_dcm_image_iod_spec != "M":
# XXX if we have a known earlier study, we need to always
# increase the run counter for phasediff because magnitudes
# were not acquired
if get_study_hash([curr_seqinfo]) == "9d148e2a05f782273f6343507733309d":
current_run += 1
else:
raise RuntimeError(
"Was expecting phase image to follow magnitude "
"image, but previous one was %r",
prev_dcm_image_iod_spec,
)
# else we do nothing special
else: # and otherwise we go to the next run
current_run += 1
elif run == "=":
if not current_run:
current_run = 1
elif run.isdigit():
current_run_ = int(run)
if current_run_ < current_run:
lgr.warning(
"Previous run (%s) was larger than explicitly specified %s",
current_run,
current_run_,
)
current_run = current_run_
else:
raise ValueError(
"Don't know how to deal with run specification %s" % repr(run)
)
run_label = "run-%02d" % current_run
else:
# if there is no _run -- no run label added
run_label = None
# yoh: had a wrong assumption
# if curr_seqinfo.is_motion_corrected:
# assert curr_seqinfo.is_derived, "Motion corrected images must be 'derived'"
if curr_seqinfo.is_motion_corrected and "rec-" in series_info.get("bids", ""):
raise NotImplementedError(
"want to add _rec-moco but there is _rec- already"
)
def from_series_info(name: str) -> Optional[str]:
"""A little helper to provide _name-value if series_info knows it
Returns None otherwise
"""
if series_info.get(name): # noqa: B023
return "%s-%s" % (name, series_info[name]) # noqa: B023
else:
return None
# TODO: get order from schema, do not hardcode. ATM could be checked at
# https://bids-specification.readthedocs.io/en/stable/99-appendices/04-entity-table.html
# https://github.com/bids-standard/bids-specification/blob/HEAD/src/schema/rules/entities.yaml
# ATM we at large rely on possible (re)ordering according to schema to be done
# by heudiconv, not reproin here.
filename_suffix_parts = [
from_series_info("task"),
from_series_info("acq"),
# But we want to add an indicator in case it was motion corrected
# in the magnet. ref sample /2017/01/03/qa
None if not curr_seqinfo.is_motion_corrected else "rec-moco",
from_series_info("dir"),
series_info.get("bids"),
run_label,
datatype_suffix,
]
# filter those which are None, and join with _
suffix = "_".join(filter(bool, filename_suffix_parts)) # type: ignore[arg-type]
# # .series_description in case of
# sdesc = curr_seqinfo.study_description
# # temporary aliases for those phantoms which we already collected
# # so we rename them into this
# #MAPPING
#
# # the idea ias to have sequence names in the format like
# # bids__bidsrecord
# # in bids record we could have _run[+=]
# # which would say to either increment run number from already encountered
# # or reuse the last one
# if seq:
# suffix += 'seq-%s' % ('+'.join(seq))
# For scouts -- we want only dicoms
# https://github.com/nipy/heudiconv/issues/145
outtype: tuple[str, ...]
if (
"_Scout" in curr_seqinfo.series_description
or (
datatype == "anat"
and datatype_suffix
and datatype_suffix.startswith("scout")
)
or (
curr_seqinfo.series_description.lower()
== curr_seqinfo.protocol_name.lower() + "_setter"
)
):
outtype = ("dicom",)
else:
outtype = ("nii.gz", "dicom")
template = create_key(datatype, suffix, prefix=prefix, outtype=outtype)
# we wanted ordered dict for consistent demarcation of dups
if template not in info:
info[template] = []
info[template].append(curr_seqinfo.series_id)
if skipped:
lgr.info("Skipped %d sequences: %s" % (len(skipped), skipped))
if skipped_unknown:
lgr.warning(
"Could not figure out where to stick %d sequences: %s"
% (len(skipped_unknown), skipped_unknown)
)
info = get_dups_marked(info) # mark duplicate ones with __dup-0x suffix
return info
def get_dups_marked(
info: dict[tuple[str, tuple[str, ...], None], list[T]], per_series: bool = True
) -> dict[tuple[str, tuple[str, ...], None], list[T]]:
"""
Parameters
----------
info
per_series: bool
If set to False, it would create growing index through all series. That
could lead to non-desired effects if some "multi file" scans (such as
fmap with magnitude{1,2} and phasediff) would not be able to associate
multiple files for the same acquisition. By default (True) dup indices
would be per each series (change introduced in 0.5.2)
Returns
-------
"""
# analyze for "cancelled" runs, if run number was explicitly specified and
# thus we ended up with multiple entries which would mean that older ones
# were "cancelled"
info = info.copy()
dup_id = 0
for template, series_ids in list(info.items()):
if len(series_ids) > 1:
lgr.warning(
"Detected %d duplicated run(s) for template %s: %s",
len(series_ids) - 1,
template[0],
series_ids[:-1],
)
# copy the duplicate ones into separate ones
if per_series:
dup_id = 0 # reset since declared per series
for dup_series_id in series_ids[:-1]:
dup_id += 1
dup_template = ("%s__dup-%02d" % (template[0], dup_id),) + template[1:]
# There must have not been such a beast before!
if dup_template in info:
raise AssertionError(
"{} is already known to info={}. "
"May be a bug for per_series=True handling?"
"".format(dup_template, info)
)
info[dup_template] = [dup_series_id]
info[template] = series_ids[-1:]
assert len(info[template]) == 1
return info
def get_unique(seqinfos: list[SeqInfo], attr: str) -> Any:
"""Given a list of seqinfos, which must have come from a single study,
get specific attr, which must be unique across all of the entries
If not -- fail!
"""
values = set(getattr(si, attr) for si in seqinfos)
if len(values) != 1:
raise AssertionError(
f"Was expecting a single value for attribute {attr!r} "
f"but got: {', '.join(sorted(values))}"
)
return values.pop()
# TODO: might need to do grouping per each session and return here multiple
# hits, or maybe we could just somehow demarcate that it will be a multi-session
# one and so then the later value parsed (again) in infotodict would be used???
def infotoids(seqinfos: Iterable[SeqInfo], outdir: str) -> dict[str, Optional[str]]:
seqinfo_lst = list(seqinfos)
# decide on subjid and session based on patient_id
lgr.info("Processing sequence infos to deduce study/session")
study_description = get_study_description(seqinfo_lst)
study_description_hash = md5sum(study_description)
subject = fixup_subjectid(get_unique(seqinfo_lst, "patient_id"))
# TODO: fix up subject id if missing some 0s
if study_description:
# Generally it is a ^ but if entered manually, people place a space in it
split = re.split("[ ^]", study_description, maxsplit=1)
# split first one even more, since could be PI_Student or PI-Student
split = re.split("[-_]", split[0], maxsplit=1) + split[1:]
# locator = study_description.replace('^', '/')
locator = "/".join(split)
else:
locator = "unknown"
# TODO: actually check if given study is study we would care about
# and if not -- we should throw some ???? exception
# So -- use `outdir` and locator etc to see if for a given locator/subject
# and possible ses+ in the sequence names, so we would provide a sequence
# So might need to go through parse_series_spec(curr_seqinfo.protocol_name)
# to figure out presence of sessions.
ses_markers: list[str] = []
# there might be fixups needed so we could deduce session etc
# this copy is not replacing original one, so the same fix_seqinfo
# might be called later
seqinfo_lst = fix_seqinfo(seqinfo_lst)
for s in seqinfo_lst:
if s.is_derived:
continue
session_ = parse_series_spec(s.protocol_name).get("session", None)
if session_ and "{" in session_:
# there was a marker for something we could provide from our seqinfo
# e.g. {date}
session_ = session_.format(**s._asdict())
if session_:
ses_markers.append(session_)
session: Optional[str] = None
if ses_markers:
# we have a session or possibly more than one even
# let's figure out which case we have
nonsign_vals = set(ses_markers).difference("+=")
# although we might want an explicit '=' to note the same session as
# mentioned before?
if len(nonsign_vals) > 1:
lgr.warning( # raise NotImplementedError(
"Cannot deal with multiple sessions in the same study yet!"
" We will process until the end of the first session"
)
if nonsign_vals:
# get only unique values
ses_markers = list(set(ses_markers))
if set(ses_markers).intersection("+="):
raise NotImplementedError(
"Should not mix hardcoded session markers with incremental ones (+=)"
)
if not len(ses_markers) == 1:
raise NotImplementedError(
"Should have got a single session marker. Got following: %s"
% ", ".join(map(repr, ses_markers))
)
session = ses_markers[0]
else:
# TODO - I think we are doomed to go through the sequence and split
# ... actually the same as with nonsign_vals, we just would need to figure
# out initial one if sign ones, and should make use of knowing
# outdir
# raise NotImplementedError()
# we need to look at what sessions we already have
sessions_dir = os.path.join(outdir, locator, "sub-" + subject)
prior_sessions = sorted(glob(os.path.join(sessions_dir, "ses-*")))
# TODO: more complicated logic
# For now just increment session if + and keep the same number if =
# and otherwise just give it 001
# Note: this disables our safety blanket which would refuse to process
# what was already processed before since it would try to override,
# BUT there is no other way besides only if heudiconv was storing
# its info based on some UID
if ses_markers == ["+"]:
session = "%03d" % (len(prior_sessions) + 1)
elif ses_markers == ["="]:
session = (
os.path.basename(prior_sessions[-1])[4:]
if prior_sessions
else "001"
)
else:
session = "001"
if study_description_hash == "9d148e2a05f782273f6343507733309d":
session = "siemens1"
lgr.info("Imposing session {0}".format(session))
return {
# TODO: request info on study from the JedCap
"locator": locator,
# Sessions to be deduced yet from the names etc TODO
"session": session,
"subject": subject,
}
def sanitize_str(value: str) -> str:
"""Remove illegal characters for BIDS from task/acq/etc.."""
return _delete_chars(value, "#!@$%^&.,:;_-")
def parse_series_spec(series_spec: str) -> dict[str, str]:
"""Parse protocol name according to our convention with minimal set of fixups"""
# Since Yarik didn't know a better place to put these fixups, they live here
# for now but could migrate outside at some point. TODO
series_spec = series_spec.replace("anat_T1w", "anat-T1w")
series_spec = series_spec.replace("hardi_64", "dwi_acq-hardi64")
series_spec = series_spec.replace("AAHead_Scout", "anat-scout")
# Parse the name according to our convention/specification
# leading or trailing spaces do not matter
series_spec = series_spec.strip(" ")
# Strip off leading CAPITALS: prefix to accommodate some reported use cases:
# https://github.com/ReproNim/reproin/issues/14
# where PU: prefix is added by the scanner
series_spec = re.sub("^[A-Z]*:", "", series_spec)
series_spec = re.sub("^WIP ", "", series_spec) # remove Philips WIP prefix
# Remove possible suffix we don't care about after __
series_spec = series_spec.split("__", 1)[0]
bids = False # we don't know yet for sure
# We need to figure out if it is a valid bids
split = series_spec.split("_")
prefix = split[0]
# Fixups
if prefix == "scout":
prefix = split[0] = "anat-scout"
if prefix != "bids" and "-" in prefix:
prefix, _ = prefix.split("-", 1)
if prefix == "bids":
bids = True # for sure
split = split[1:]
def split2(s: str) -> tuple[str, Optional[str]]:
# split on - if present, if not -- 2nd one returned None
if "-" in s:
a, _, b = s.partition("-")
return a, b
return s, None
# Let's analyze first element which should tell us sequence type
datatype, datatype_suffix = split2(split[0])
if datatype not in KNOWN_DATATYPES:
# It is not something we consume
if bids:
lgr.warning(
"It was instructed to be BIDS datatype but unknown "
"%s found. Known are: %s",
datatype,
", ".join(KNOWN_DATATYPES),
)
return {}
regd = dict(datatype=datatype)
if datatype_suffix:
regd["datatype_suffix"] = datatype_suffix
# now go through each to see if one which we care
bids_leftovers = []
for s in split[1:]:
key, value = split2(s)
if value is None and key[-1] in "+=":
value = key[-1]
key = key[:-1]
# sanitize values, which must not have _ and - is undesirable ATM as well
# TODO: BIDSv2.0 -- allows "-" so replace with it instead
value = (
str(value)
.replace("_", "X")
.replace("-", "X")
.replace("(", "{")
.replace(")", "}")
) # for Philips
if key in ["ses", "run", "task", "acq", "dir"]:
# those we care about explicitly
regd[{"ses": "session"}.get(key, key)] = sanitize_str(value)
else:
bids_leftovers.append(s)
if bids_leftovers:
regd["bids"] = "_".join(bids_leftovers)
# TODO: might want to check for all known "standard" BIDS suffixes here
# among bids_leftovers, thus serve some kind of BIDS validator
# if not regd.get('datatype_suffix', None):
# # might need to assign a default label for each datatype if was not
# # given
# regd['datatype_suffix'] = {
# 'func': 'bold'
# }.get(regd['datatype'], None)
return regd
def fixup_subjectid(subjectid: str) -> str:
"""Just in case someone managed to miss a zero or added an extra one"""
# make it lowercase
subjectid = subjectid.lower()
reg = re.match(r"sid0*(\d+)$", subjectid)
if not reg:
# some completely other pattern
# just filter out possible _- in it
return re.sub("[-_]", "", subjectid)
return "sid%06d" % int(reg.groups()[0])
heudiconv-1.3.2/heudiconv/heuristics/studyforrest_phase2.py 0000664 0000000 0000000 00000003757 14715167373 0024272 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
scaninfo_suffix = ".json"
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
label_map = {
"movie": "movielocalizer",
"retmap": "retmap",
"visloc": "objectcategories",
}
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {}
for s in seqinfo:
if "EPI_3mm" not in s.protocol_name:
continue
label = s.protocol_name.split("_")[2].split()[0].strip("1234567890").lower()
if label in ("movie", "retmap", "visloc"):
key = create_key(
"ses-localizer/func/{subject}_ses-localizer_task-%s_run-{item:01d}_bold"
% label_map[label]
)
elif label == "sense":
# pilot retmap had different description
key = create_key(
"ses-localizer/func/{subject}_ses-localizer_task-retmap_run-{item:01d}_bold"
)
elif label == "r":
key = create_key(
"ses-movie/func/{subject}_ses-movie_task-movie_run-%i_bold"
% int(s.protocol_name.split("_")[2].split()[0][-1])
)
else:
raise RuntimeError("YOU SHALL NOT PASS!")
if key not in info:
info[key] = []
info[key].append(s.series_id)
return info
heudiconv-1.3.2/heudiconv/heuristics/test_b0dwi_for_fmap.py 0000664 0000000 0000000 00000002676 14715167373 0024167 0 ustar 00root root 0000000 0000000 """Heuristic to extract a b-value=0 DWI image (basically, a SE-EPI)
both as a fmap and as dwi
It is used just to test that a 'DIFFUSION' image that the user
chooses to extract as fmap (pepolar case) doesn't produce _bvecs/
_bvals json files, while it does for dwi images
"""
from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list[str]]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
fmap = create_key("sub-{subject}/fmap/sub-{subject}_acq-b0dwi_epi")
dwi = create_key("sub-{subject}/dwi/sub-{subject}_acq-b0dwi_dwi")
info: dict[tuple[str, tuple[str, ...], None], list[str]] = {fmap: [], dwi: []}
for s in seqinfo:
if "DIFFUSION" in s.image_type:
info[fmap].append(s.series_id)
info[dwi].append(s.series_id)
return info
heudiconv-1.3.2/heudiconv/heuristics/test_reproin.py 0000664 0000000 0000000 00000017632 14715167373 0022765 0 ustar 00root root 0000000 0000000 #
# Tests for reproin.py
#
from __future__ import annotations
import re
from typing import NamedTuple
from unittest.mock import patch
import pytest
from . import reproin
from .reproin import (
filter_files,
fix_canceled_runs,
fix_dbic_protocol,
fixup_subjectid,
get_dups_marked,
get_unique,
md5sum,
parse_series_spec,
sanitize_str,
)
class FakeSeqInfo(NamedTuple):
accession_number: str
study_description: str
field1: str
field2: str
def test_get_dups_marked() -> None:
no_dups: dict[tuple[str, tuple[str, ...], None], list[int]] = {
("some", ("foo",), None): [1]
}
assert get_dups_marked(no_dups) == no_dups
info: dict[tuple[str, tuple[str, ...], None], list[int | str]] = {
("bu", ("du",), None): [1, 2],
("smth", (), None): [3],
("smth2", ("apple", "banana"), None): ["a", "b", "c"],
}
assert (
get_dups_marked(info)
== get_dups_marked(info, True)
== {
("bu__dup-01", ("du",), None): [1],
("bu", ("du",), None): [2],
("smth", (), None): [3],
("smth2__dup-01", ("apple", "banana"), None): ["a"],
("smth2__dup-02", ("apple", "banana"), None): ["b"],
("smth2", ("apple", "banana"), None): ["c"],
}
)
assert get_dups_marked(info, per_series=False) == {
("bu__dup-01", ("du",), None): [1],
("bu", ("du",), None): [2],
("smth", (), None): [3],
("smth2__dup-02", ("apple", "banana"), None): ["a"],
("smth2__dup-03", ("apple", "banana"), None): ["b"],
("smth2", ("apple", "banana"), None): ["c"],
}
def test_filter_files() -> None:
# Filtering is currently disabled -- any sequence directory is Ok
assert filter_files("/home/mvdoc/dbic/09-run_func_meh/0123432432.dcm")
assert filter_files("/home/mvdoc/dbic/run_func_meh/012343143.dcm")
def test_md5sum() -> None:
assert md5sum("cryptonomicon") == "1cd52edfa41af887e14ae71d1db96ad1"
assert md5sum("mysecretmessage") == "07989808231a0c6f522f9d8e34695794"
def test_fix_canceled_runs() -> None:
class FakeSeqInfo(NamedTuple):
accession_number: str
series_id: str
protocol_name: str
series_description: str
seqinfo: list[FakeSeqInfo] = []
runname = "func_run+"
for i in range(1, 6):
seqinfo.append(
FakeSeqInfo("accession1", "{0:02d}-".format(i) + runname, runname, runname)
)
fake_accession2run = {"accession1": ["^01-", "^03-"]}
with patch.object(reproin, "fix_accession2run", fake_accession2run):
seqinfo_ = fix_canceled_runs(seqinfo) # type: ignore[arg-type]
for i, s in enumerate(seqinfo_, 1):
output = runname
if i == 1 or i == 3:
output = "cancelme_" + output
for key in ["series_description", "protocol_name"]:
value = getattr(s, key)
assert value == output
# check we didn't touch series_id
assert s.series_id == "{0:02d}-".format(i) + runname
def test_fix_dbic_protocol() -> None:
accession_number = "A003"
seq1 = FakeSeqInfo(
accession_number,
"mystudy",
"02-anat-scout_run+_MPR_sag",
"11-func_run-life2_acq-2mm692",
)
seq2 = FakeSeqInfo(accession_number, "mystudy", "nochangeplease", "nochangeeither")
seqinfos = [seq1, seq2]
protocols2fix = {
md5sum("mystudy"): [
(r"scout_run\+", "THESCOUT-runX"),
("run-life[0-9]", "run+_task-life"),
],
re.compile("^my.*"): [("THESCOUT-runX", "THESCOUT")],
# rely on 'catch-all' to fix up above scout
"": [("THESCOUT", "scout")],
}
with patch.object(reproin, "protocols2fix", protocols2fix), patch.object(
reproin, "series_spec_fields", ["field1"]
):
seqinfos_ = fix_dbic_protocol(seqinfos) # type: ignore[arg-type]
assert seqinfos[1] == seqinfos_[1] # type: ignore[comparison-overlap]
# field2 shouldn't have changed since I didn't pass it
assert seqinfos_[0] == FakeSeqInfo( # type: ignore[comparison-overlap]
accession_number, "mystudy", "02-anat-scout_MPR_sag", seq1.field2
)
# change also field2 please
with patch.object(reproin, "protocols2fix", protocols2fix), patch.object(
reproin, "series_spec_fields", ["field1", "field2"]
):
seqinfos_ = fix_dbic_protocol(seqinfos) # type: ignore[arg-type]
assert seqinfos[1] == seqinfos_[1] # type: ignore[comparison-overlap]
# now everything should have changed
assert seqinfos_[0] == FakeSeqInfo( # type: ignore[comparison-overlap]
accession_number,
"mystudy",
"02-anat-scout_MPR_sag",
"11-func_run+_task-life_acq-2mm692",
)
def test_sanitize_str() -> None:
assert sanitize_str("super@duper.faster") == "superduperfaster"
assert sanitize_str("perfect") == "perfect"
assert sanitize_str("never:use:colon:!") == "neverusecolon"
def test_fixupsubjectid() -> None:
assert fixup_subjectid("abra") == "abra"
assert fixup_subjectid("sub") == "sub"
assert fixup_subjectid("sid") == "sid"
assert fixup_subjectid("sid000030") == "sid000030"
assert fixup_subjectid("sid0000030") == "sid000030"
assert fixup_subjectid("sid00030") == "sid000030"
assert fixup_subjectid("sid30") == "sid000030"
assert fixup_subjectid("SID30") == "sid000030"
def test_parse_series_spec() -> None:
pdpn = parse_series_spec
assert pdpn("nondbic_func-bold") == {}
assert pdpn("cancelme_func-bold") == {}
assert (
pdpn("bids_func-bold")
== pdpn("func-bold")
== {"datatype": "func", "datatype_suffix": "bold"}
)
# pdpn("bids_func_ses+_task-boo_run+") == \
# order and PREFIX: should not matter, as well as trailing spaces
assert (
pdpn(" PREFIX:bids_func_ses+_task-boo_run+ ")
== pdpn("PREFIX:bids_func_ses+_task-boo_run+")
== pdpn("WIP func_ses+_task-boo_run+")
== pdpn("bids_func_ses+_run+_task-boo")
== {
"datatype": "func",
# 'datatype_suffix': 'bold',
"session": "+",
"run": "+",
"task": "boo",
}
)
# TODO: fix for that
assert (
pdpn("bids_func-pace_ses-1_task-boo_acq-bu_bids-please_run-2__therest")
== pdpn("bids_func-pace_ses-1_run-2_task-boo_acq-bu_bids-please__therest")
== pdpn("func-pace_ses-1_task-boo_acq-bu_bids-please_run-2")
== {
"datatype": "func",
"datatype_suffix": "pace",
"session": "1",
"run": "2",
"task": "boo",
"acq": "bu",
"bids": "bids-please",
}
)
assert pdpn("bids_anat-scout_ses+") == {
"datatype": "anat",
"datatype_suffix": "scout",
"session": "+",
}
assert pdpn("anat_T1w_acq-MPRAGE_run+") == {
"datatype": "anat",
"run": "+",
"acq": "MPRAGE",
"datatype_suffix": "T1w",
}
# Check for currently used {date}, which should also should get adjusted
# from (date) since Philips does not allow for {}
assert (
pdpn("func_ses-{date}")
== pdpn("func_ses-(date)")
== {"datatype": "func", "session": "{date}"}
)
assert pdpn("fmap_dir-AP_ses-01") == {
"datatype": "fmap",
"session": "01",
"dir": "AP",
}
def test_get_unique() -> None:
accession_number = "A003"
acqs = [
FakeSeqInfo(accession_number, "mystudy", "nochangeplease", "nochangeeither"),
FakeSeqInfo(accession_number, "mystudy2", "nochangeplease", "nochangeeither"),
]
assert get_unique(acqs, "accession_number") == accession_number # type: ignore[arg-type]
with pytest.raises(AssertionError) as ce:
get_unique(acqs, "study_description") # type: ignore[arg-type]
assert (
str(ce.value)
== "Was expecting a single value for attribute 'study_description' but got: mystudy, mystudy2"
)
heudiconv-1.3.2/heudiconv/heuristics/uc_bids.py 0000664 0000000 0000000 00000006644 14715167373 0021661 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from typing import Optional
from heudiconv.utils import SeqInfo
def create_key(
template: Optional[str],
outtype: tuple[str, ...] = ("nii.gz",),
annotation_classes: None = None,
) -> tuple[str, tuple[str, ...], None]:
if template is None or not template:
raise ValueError("Template must be a valid format string")
return (template, outtype, annotation_classes)
def infotodict(
seqinfo: list[SeqInfo],
) -> dict[tuple[str, tuple[str, ...], None], list]:
"""Heuristic evaluator for determining which runs belong where
allowed template fields - follow python string module:
item: index within category
subject: participant id
seqitem: run number during scanning
subindex: sub index within group
"""
t1w = create_key("anat/sub-{subject}_T1w")
t2w = create_key("anat/sub-{subject}_acq-{acq}_T2w")
flair = create_key("anat/sub-{subject}_acq-{acq}_FLAIR")
rest = create_key("func/sub-{subject}_task-rest_acq-{acq}_run-{item:02d}_bold")
info: dict[tuple[str, tuple[str, ...], None], list] = {
t1w: [],
t2w: [],
flair: [],
rest: [],
}
for seq in seqinfo:
x, _, z, n_vol, protocol, dcm_dir = (
seq.dim1,
seq.dim2,
seq.dim3,
seq.dim4,
seq.protocol_name,
seq.dcm_dir_name,
)
# t1_mprage --> T1w
if (
(z == 160)
and (n_vol == 1)
and ("t1_mprage" in protocol)
and ("XX" not in dcm_dir)
):
info[t1w] = [seq.series_id]
# t2_tse --> T2w
if (
(z == 35)
and (n_vol == 1)
and ("t2_tse" in protocol)
and ("XX" not in dcm_dir)
):
info[t2w].append({"item": seq.series_id, "acq": "TSE"})
# T2W --> T2w
if (
(z == 192)
and (n_vol == 1)
and ("T2W" in protocol)
and ("XX" not in dcm_dir)
):
info[t2w].append({"item": seq.series_id, "acq": "highres"})
# t2_tirm --> FLAIR
if (
(z == 35)
and (n_vol == 1)
and ("t2_tirm" in protocol)
and ("XX" not in dcm_dir)
):
info[flair].append({"item": seq.series_id, "acq": "TIRM"})
# t2_flair --> FLAIR
if (
(z == 160)
and (n_vol == 1)
and ("t2_flair" in protocol)
and ("XX" not in dcm_dir)
):
info[flair].append({"item": seq.series_id, "acq": "highres"})
# T2FLAIR --> FLAIR
if (
(z == 192)
and (n_vol == 1)
and ("T2-FLAIR" in protocol)
and ("XX" not in dcm_dir)
):
info[flair].append({"item": seq.series_id, "acq": "highres"})
# EPI (physio-matched) --> bold
if (
(x == 128)
and (z == 28)
and (n_vol == 300)
and ("EPI" in protocol)
and ("XX" not in dcm_dir)
):
info[rest].append({"item": seq.series_id, "acq": "128px"})
# EPI (physio-matched_NEW) --> bold
if (
(x == 64)
and (z == 34)
and (n_vol == 300)
and ("EPI" in protocol)
and ("XX" not in dcm_dir)
):
info[rest].append({"item": seq.series_id, "acq": "64px"})
return info
heudiconv-1.3.2/heudiconv/info.py 0000664 0000000 0000000 00000002710 14715167373 0017010 0 ustar 00root root 0000000 0000000 __author__ = "HeuDiConv team and contributors"
__url__ = "https://github.com/nipy/heudiconv"
__packagename__ = "heudiconv"
__description__ = "Heuristic DICOM Converter"
__license__ = "Apache 2.0"
__longdesc__ = """Convert DICOM dirs based on heuristic info - HeuDiConv
uses the dcmstack package and dcm2niix tool to convert DICOM directories or
tarballs into collections of NIfTI files following pre-defined heuristic(s)."""
CLASSIFIERS = [
"Environment :: Console",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Topic :: Scientific/Engineering",
"Typing :: Typed",
]
PYTHON_REQUIRES = ">=3.9"
REQUIRES = [
# not usable in some use cases since might be just a downloader, not binary
# 'dcm2niix',
"dcmstack>=0.8",
"etelemetry",
"filelock>=3.0.12",
"nibabel",
"nipype >=1.2.3",
"pydicom >= 1.0.0",
]
TESTS_REQUIRES = [
"pytest",
"tinydb",
"inotify",
]
MIN_DATALAD_VERSION = "0.13.0"
EXTRA_REQUIRES = {
"tests": TESTS_REQUIRES,
"extras": [
"duecredit", # optional dependency
], # Requires patched version ATM ['dcmstack'],
"datalad": ["datalad >=%s" % MIN_DATALAD_VERSION],
}
# Flatten the lists
EXTRA_REQUIRES["all"] = sum(EXTRA_REQUIRES.values(), [])
heudiconv-1.3.2/heudiconv/main.py 0000664 0000000 0000000 00000045630 14715167373 0017011 0 ustar 00root root 0000000 0000000 from __future__ import annotations
from glob import glob
import logging
import os.path as op
import sys
from types import TracebackType
from typing import Any, Optional
from . import __packagename__, __version__
from .bids import populate_bids_templates, populate_intended_for, tuneup_bids_json_files
from .convert import prep_conversion
from .due import Doi, due
from .parser import get_study_sessions
from .queue import queue_conversion
from .utils import SeqInfo, anonymize_sid, load_heuristic, treat_infofile
lgr = logging.getLogger(__name__)
INIT_MSG = "Running {packname} version {version} latest {latest}".format
def is_interactive() -> bool:
"""Return True if all in/outs are tty"""
# TODO: check on windows if hasattr check would work correctly and add value:
return sys.stdin.isatty() and sys.stdout.isatty() and sys.stderr.isatty()
def setup_exceptionhook() -> None:
"""
Overloads default sys.excepthook with our exceptionhook handler.
If interactive, our exceptionhook handler will invoke pdb.post_mortem;
if not interactive, then invokes default handler.
"""
def _pdb_excepthook(
exc_type: type[BaseException],
exc_value: BaseException,
tb: Optional[TracebackType],
) -> None:
if is_interactive():
import pdb
import traceback
traceback.print_exception(exc_type, exc_value, tb)
# print()
pdb.post_mortem(tb)
else:
lgr.warning("We cannot setup exception hook since not in interactive mode")
sys.excepthook = _pdb_excepthook
def process_extra_commands(
outdir: str,
command: str,
files: Optional[list[str]],
heuristic: Optional[str],
session: Optional[str],
subjs: Optional[list[str]],
grouping: str,
) -> None:
"""
Perform custom command instead of regular operations. Supported commands:
['treat-jsons', 'ls', 'populate-templates', 'sanitize-jsons', 'heuristics', 'heuristic-info', 'populate-intended-for']
Parameters
----------
outdir : str
Output directory
command : {'treat-jsons', 'ls', 'populate-templates', 'sanitize-jsons', 'heuristics', 'heuristic-info', 'populate-intended-for'}
Heudiconv command to run
files : list of str or None
List of files if command needs/expects it
heuristic : str or None
Path to heuristic file or name of builtin heuristic.
session : str or None
Session identifier
subjs : None or list of str
List of subject identifiers
grouping : {'studyUID', 'accession_number', 'all', 'custom'}
How to group dicoms.
"""
def ensure_has_files() -> None:
if files is None:
raise ValueError(f"command {command} expects --files being provided")
if command == "treat-jsons":
ensure_has_files()
assert files is not None # for mypy now
for fname in files:
treat_infofile(fname)
elif command == "ls":
ensure_heuristic_arg(heuristic)
assert heuristic is not None
ensure_has_files()
assert files is not None # for mypy now
heuristic_mod = load_heuristic(heuristic)
heuristic_ls = getattr(heuristic_mod, "ls", None)
for fname in files:
study_sessions = get_study_sessions(
None,
[fname],
heuristic_mod,
outdir,
session,
subjs,
grouping=grouping,
)
print(fname)
for study_session, sequences in study_sessions.items():
assert isinstance(sequences, dict)
suf = ""
if heuristic_ls:
suf += heuristic_ls(study_session, list(sequences.keys()))
print("\t%s %d sequences%s" % (str(study_session), len(sequences), suf))
elif command == "populate-templates":
ensure_heuristic_arg(heuristic)
assert heuristic is not None
ensure_has_files()
assert files is not None # for mypy now
heuristic_mod = load_heuristic(heuristic)
for fname in files:
populate_bids_templates(fname, getattr(heuristic_mod, "DEFAULT_FIELDS", {}))
elif command == "sanitize-jsons":
ensure_has_files()
assert files is not None # for mypy now
tuneup_bids_json_files(files)
elif command == "heuristics":
from .utils import get_known_heuristics_with_descriptions
for name_desc in get_known_heuristics_with_descriptions().items():
print("- %s: %s" % name_desc)
elif command == "heuristic-info":
ensure_heuristic_arg(heuristic)
assert heuristic is not None
from .utils import get_heuristic_description
print(get_heuristic_description(heuristic, full=True))
elif command == "populate-intended-for":
kwargs: dict[str, Any] = {}
if heuristic:
heuristic_mod = load_heuristic(heuristic)
kwargs = getattr(heuristic_mod, "POPULATE_INTENDED_FOR_OPTS", {})
if not subjs:
subjs = [
# search outdir for 'sub-*'; if it is a directory (not a regular file), remove
# the initial 'sub-':
op.basename(s)[len("sub-") :]
for s in glob(op.join(outdir, "sub-*"))
if op.isdir(s)
]
# read the subjects from the participants.tsv file to compare:
participants_tsv = op.join(outdir, "participants.tsv")
if op.lexists(participants_tsv):
with open(participants_tsv, "r") as f:
# read header line and find index for 'participant_id':
participant_id_index = (
f.readline().split("\t").index("participant_id")
)
# read all participants, removing the initial 'sub-':
known_subjects = [
ln.split("\t")[participant_id_index][len("sub-") :]
for ln in f.readlines()
]
if not set(subjs) == set(known_subjects):
# issue a warning, but continue with the 'subjs' list (the subjects for
# which there is data):
lgr.warning(
"'participants.tsv' contents are not identical to subjects found "
"in the BIDS dataset %s",
outdir,
)
for subj in subjs:
subject_path = op.join(outdir, "sub-" + subj)
if session:
session_paths = [op.join(subject_path, "ses-" + session)]
else:
# check to see if the data for this subject is organized by sessions; if not
# just use the subject_path
session_paths = [
s for s in glob(op.join(subject_path, "ses-*")) if op.isdir(s)
] or [subject_path]
for session_path in session_paths:
populate_intended_for(session_path, **kwargs)
else:
raise ValueError("Unknown command %s" % command)
def ensure_heuristic_arg(heuristic: Optional[str] = None) -> None:
"""
Check that the heuristic argument was provided.
"""
from .utils import get_known_heuristic_names
if not heuristic:
raise ValueError(
"Specify heuristic using -f. Known are: %s"
% ", ".join(get_known_heuristic_names())
)
@due.dcite(
Doi("10.5281/zenodo.1012598"),
path="heudiconv",
description="Flexible DICOM converter for organizing brain imaging data",
version=__version__,
cite_module=True,
)
def workflow(
*,
dicom_dir_template: Optional[str] = None,
files: Optional[list[str]] = None,
subjs: Optional[list[str]] = None,
converter: str = "dcm2niix",
outdir: str = ".",
locator: Optional[str] = None,
conv_outdir: Optional[str] = None,
anon_cmd: Optional[str] = None,
heuristic: Optional[str] = None,
with_prov: bool = False,
session: Optional[str] = None,
bids_options: Optional[str] = None,
overwrite: bool = False,
datalad: bool = False,
debug: bool = False,
command: Optional[str] = None,
grouping: str = "studyUID",
minmeta: bool = False,
random_seed: Optional[int] = None,
dcmconfig: Optional[str] = None,
queue: Optional[str] = None,
queue_args: Optional[str] = None,
) -> None:
"""Run the HeuDiConv conversion workflow.
Parameters
----------
dicom_dir_template : str or None, optional
Location of dicomdir that can be indexed with subject id
{subject} and session {session}. Tarballs (can be compressed)
are supported in addition to directory. All matching tarballs
for a subject are extracted and their content processed in a
single pass. If multiple tarballs are found, each is assumed to
be a separate session and the 'session' argument is ignored.
Mutually exclusive with 'files'. Default is None.
files : list or None, optional
Files (tarballs, dicoms) or directories containing files to
process. Mutually exclusive with 'dicom_dir_template'. Default is None.
subjs : list or None, optional
List of subjects - required for dicom template. If not
provided, DICOMS would first be "sorted" and subject IDs
deduced by the heuristic. Default is None.
converter : {'dcm2niix', 'none'}, optional
Tool to use for DICOM conversion. Setting to 'none' disables
the actual conversion step -- useful for testing heuristics.
Default is 'dcm2niix'.
outdir : str, optional
Output directory for conversion setup (for further
customization and future reference). This directory will refer
to non-anonymized subject IDs.
Default is '.' (current working directory).
locator : str or 'unknown' or None, optional
Study path under outdir. If provided, it overloads the value
provided by the heuristic. If 'datalad=True', every
directory within locator becomes a super-dataset thus
establishing a hierarchy. Setting to "unknown" will skip that
dataset. Default is None.
conv_outdir : str or None, optional
Output directory for converted files. By default this is
identical to --outdir. This option is most useful in
combination with 'anon_cmd'. Default is None.
anon_cmd : str or None, optional
Command to run to convert subject IDs used for DICOMs to
anonymized IDs. Such command must take a single argument and
return a single anonymized ID. Also see 'conv_outdir'. Default is None.
heuristic : str or None, optional
Name of a known heuristic or path to the Python script containing
heuristic. Default is None.
with_prov : bool, optional
Store additional provenance information. Requires python-rdflib.
Default is False.
session : str or None, optional
Session for longitudinal study_sessions. Default is None.
bids_options : str or None, optional
Flag for output into BIDS structure. Can also take BIDS-
specific options, e.g., --bids notop. The only currently
supported option is "notop", which skips creation of
top-level BIDS files. This is useful when running in batch
mode to prevent possible race conditions. Default is None.
overwrite : bool, optional
Overwrite existing converted files. Default is False.
datalad : bool, optional
Store the entire collection as DataLad dataset(s). Small files
will be committed directly to git, while large to annex. New
version (6) of annex repositories will be used in a "thin"
mode so it would look to mortals as just any other regular
directory (i.e. no symlinks to under .git/annex). For now just
for BIDS mode. Default is False.
debug : bool, optional
Do not catch exceptions and show exception traceback. Default is False.
command : {'heuristics', 'heuristic-info', 'ls', 'populate-templates',
'sanitize-jsons', 'treat-jsons', 'populate-intended-for', None}, optional
Custom action to be performed on provided files instead of regular
operation. Default is None.
grouping : {'studyUID', 'accession_number', 'all', 'custom'}, optional
How to group dicoms. Default is 'studyUID'.
minmeta : bool, optional
Exclude dcmstack meta information in sidecar jsons. Default is False.
random_seed : int or None, optional
Random seed to initialize RNG. Default is None.
dcmconfig : str or None, optional
JSON file for additional dcm2niix configuration. Default is None.
queue : {'SLURM', None}, optional
Batch system to submit jobs in parallel. Default is None.
If set, will cause scheduling of conversion and return without performing
any further action.
queue_args : str or None, optional
Additional queue arguments passed as single string of space-separated
Argument=Value pairs. Default is None.
Notes
-----
All parameters in this function must be called as keyword arguments.
"""
# To be done asap so anything random is deterministic
if random_seed is not None:
import random
random.seed(random_seed)
import numpy
numpy.random.seed(random_seed)
# Ensure only supported bids options are passed
if debug:
lgr.setLevel(logging.DEBUG)
# Should be possible but only with a single subject -- will be used to
# override subject deduced from the DICOMs
if files and subjs and len(subjs) > 1:
raise ValueError("Unable to process multiple `--subjects` with files")
if debug:
setup_exceptionhook()
# Deal with provided files or templates
# pre-process provided list of files and possibly sort into groups/sessions
# Group files per each study/sid/session
outdir = op.abspath(outdir)
latest = None
try:
import etelemetry
latest = etelemetry.get_project("nipy/heudiconv")
except Exception as e:
lgr.warning("Could not check for version updates: %s", str(e))
lgr.info(
INIT_MSG(
packname=__packagename__,
version=__version__,
latest=(latest or {}).get("version", "Unknown"),
)
)
if command:
if dicom_dir_template:
lgr.warning(
f"DICOM directory template {dicom_dir_template!r} was provided but will be ignored since "
f"commands do not care about it ATM"
)
process_extra_commands(
outdir,
command,
files,
heuristic,
session,
subjs,
grouping,
)
return
#
# Load heuristic -- better do it asap to make sure it loads correctly
#
if not heuristic:
raise RuntimeError("No heuristic specified - add to arguments and rerun")
if queue:
lgr.info("Queuing %s conversion", queue)
if files:
iterarg = "files"
iterables = len(files)
elif subjs:
iterarg = "subjects"
iterables = len(subjs)
else:
raise ValueError("'queue' given but both 'files' and 'subjects' are false")
queue_conversion(queue, iterarg, iterables, queue_args)
return
heuristic_mod = load_heuristic(heuristic)
study_sessions = get_study_sessions(
dicom_dir_template,
files,
heuristic_mod,
outdir,
session,
subjs,
grouping=grouping,
)
# extract tarballs, and replace their entries with expanded lists of files
# TODO: we might need to sort so sessions are ordered???
lgr.info("Need to process %d study sessions", len(study_sessions))
# processed_studydirs = set()
locator_manual, session_manual = locator, session
for (locator, session_, sid), files_or_seqinfo in study_sessions.items():
# Allow for session to be overloaded from command line
if session_manual is not None:
session_ = session_manual
if locator_manual is not None:
locator = locator_manual
if not len(files_or_seqinfo):
raise ValueError("nothing to process?")
# that is how life is ATM :-/ since we don't do sorting if subj
# template is provided
if isinstance(files_or_seqinfo, dict):
assert isinstance(list(files_or_seqinfo.keys())[0], SeqInfo)
dicoms = None
seqinfo = files_or_seqinfo
else:
dicoms = files_or_seqinfo
seqinfo = None
if locator == "unknown":
lgr.warning("Skipping unknown locator dataset")
continue
if anon_cmd and sid is not None:
anon_sid = anonymize_sid(sid, anon_cmd)
lgr.info("Anonymized {} to {}".format(sid, anon_sid))
else:
anon_sid = None
study_outdir = op.join(outdir, locator or "")
anon_outdir = conv_outdir or outdir
anon_study_outdir = op.join(anon_outdir, locator or "")
if datalad:
from .external.dlad import prepare_datalad
dlad_sid = sid if not anon_sid else anon_sid
dl_msg = prepare_datalad(
anon_study_outdir,
anon_outdir,
dlad_sid,
session_,
seqinfo,
dicoms,
bids_options,
)
lgr.info(
"PROCESSING STARTS: {0}".format(
str(dict(subject=sid, outdir=study_outdir, session=session_))
)
)
prep_conversion(
sid,
dicoms,
study_outdir,
heuristic_mod,
converter=converter,
anon_sid=anon_sid,
anon_outdir=anon_study_outdir,
with_prov=with_prov,
ses=session_,
bids_options=bids_options,
seqinfo=seqinfo,
min_meta=minmeta,
overwrite=overwrite,
dcmconfig=dcmconfig,
grouping=grouping,
)
lgr.info(
"PROCESSING DONE: {0}".format(
str(dict(subject=sid, outdir=study_outdir, session=session_))
)
)
if datalad:
from .external.dlad import add_to_datalad
msg = "Converted subject %s" % dl_msg
# TODO: whenever propagate to supers work -- do just
# ds.save(msg=msg)
# also in batch mode might fail since we have no locking ATM
# and theoretically no need actually to save entire study
# we just need that
add_to_datalad(outdir, study_outdir, msg, bids_options)
# if bids:
# # Let's populate BIDS templates for folks to take care about
# for study_outdir in processed_studydirs:
# populate_bids_templates(study_outdir)
#
# TODO: record_collection of the sid/session although that information
# is pretty much present in .heudiconv/SUBJECT/info so we could just poke there
heudiconv-1.3.2/heudiconv/parser.py 0000664 0000000 0000000 00000027152 14715167373 0017360 0 ustar 00root root 0000000 0000000 from __future__ import annotations
import atexit
from collections import defaultdict
from collections.abc import ItemsView, Iterable, Iterator
from glob import glob
import logging
import os
import os.path as op
import re
import shutil
import sys
from types import ModuleType
from typing import Optional
from .dicoms import group_dicoms_into_seqinfos
from .utils import SeqInfo, StudySessionInfo, TempDirs, docstring_parameter
lgr = logging.getLogger(__name__)
tempdirs = TempDirs()
# Ensure they are cleaned up upon exit
atexit.register(tempdirs.cleanup)
_VCS_REGEX = r"%s\.(?:git|gitattributes|svn|bzr|hg)(?:%s|$)" % (op.sep, op.sep)
def _get_unpack_formats() -> dict[str, bool]:
"""For each extension return if it is a tar"""
out = {}
for _, exts, d in shutil.get_unpack_formats():
for e in exts:
out[e] = bool(re.search(r"\btar\b", d.lower()))
return out
_UNPACK_FORMATS = _get_unpack_formats()
_TAR_UNPACK_FORMATS = tuple(k for k, is_tar in _UNPACK_FORMATS.items() if is_tar)
@docstring_parameter(_VCS_REGEX)
def find_files(
regex: str,
topdir: list[str] | tuple[str, ...] | str = op.curdir,
exclude: Optional[str] = None,
exclude_vcs: bool = True,
dirs: bool = False,
) -> Iterator[str]:
"""Generator to find files matching regex
Parameters
----------
regex: string
exclude: string, optional
Matches to exclude
exclude_vcs:
If True, excludes commonly known VCS subdirectories, matched via the
regex `{}`
topdir: string or list, optional
Directory where to search
dirs: bool, optional
Whether to match directories as well as files
"""
if isinstance(topdir, (list, tuple)):
for topdir_ in topdir:
yield from find_files(
regex,
topdir=topdir_,
exclude=exclude,
exclude_vcs=exclude_vcs,
dirs=dirs,
)
return
for dirpath, dirnames, filenames in os.walk(topdir):
names = (dirnames + filenames) if dirs else filenames
paths = (op.join(dirpath, name) for name in names)
for path in filter(re.compile(regex).search, paths):
path = path.rstrip(op.sep)
if exclude and re.search(exclude, path):
continue
if exclude_vcs and re.search(_VCS_REGEX, path):
continue
yield path
def get_extracted_dicoms(fl: Iterable[str]) -> ItemsView[Optional[str], list[str]]:
"""Given a collection of files and/or directories, list out and possibly
extract the contents from archives.
Parameters
----------
fl
Files (possibly archived) to process.
Returns
-------
ItemsView[str | None, list[str]]
The absolute paths of (possibly newly extracted) files.
Notes
-----
For 'classical' heudiconv, if multiple archives are provided, they
correspond to different sessions, so here we would group into sessions
and return pairs `sessionid`, `files` with `sessionid` being None if no
"sessions" detected for that file or there was just a single tarball in the
list.
When contents of fl appear to be an unpackable archive, the contents are
extracted into temporary directories obtained from utils.TempDirs (prefix
'heudiconvDCM') and the mode of all extracted files is set to 0o700.
When contents of fl are a list of unarchived files, they are treated as
a single session.
When contents of fl are a list of unarchived and archived files, the
unarchived files are grouped into a single session (key: None). If there is
only one archived file, the contents of that file are grouped with
the unarchived file. If there are multiple archived files, they are grouped
into separate sessions.
"""
sessions: dict[Optional[str], list[str]] = defaultdict(list)
# keep track of session manually to ensure that the variable is bound
# when it is used after the loop (e.g., consider situation with
# fl being empty)
session = 0
# needs sorting to keep the generated "session" label deterministic
for _, t in enumerate(sorted(fl)):
if not t.endswith(tuple(_UNPACK_FORMATS)):
sessions[None].append(t)
continue
# Each file extracted must be associated with the proper session,
# but the high-level shutil does not have a way to list the files
# contained within each archive. So, files are temporarily
# extracted into unique tempdirs
# cannot use TempDirs since will trigger cleanup with __del__
tmpdir = tempdirs(prefix="heudiconvDCM")
# check content and sanitize permission bits before extraction
os.chmod(tmpdir, mode=0o700)
# For tar archives (only!), starting with Python 3.12 we should provide a
# `filter` (enforced in 3.14) controlling how member names are sanitized.
kws: dict[str, str] = {}
if sys.version_info >= (3, 12) and t.endswith(_TAR_UNPACK_FORMATS):
# Allow for a user workaround should one be desired
# see e.g. https://docs.python.org/3.12/library/tarfile.html#extraction-filters
kws["filter"] = os.environ.get("HEUDICONV_TAR_FILTER", "tar")
shutil.unpack_archive(t, extract_dir=tmpdir, **kws) # type: ignore[arg-type]
archive_content = list(find_files(regex=".*", topdir=tmpdir))
# may be too cautious (tmpdir is already 700).
for f in archive_content:
os.chmod(f, mode=0o700)
# store full paths to each file, so we don't need to drag along
# tmpdir as some basedir
sessions[str(session)] = archive_content
session += 1
if session == 1:
# we had only 1 session (and at least 1), so not really multiple
# sessions according to classical 'heudiconv' assumptions, thus
# just move them all into None
sessions[None] += sessions.pop("0")
return sessions.items()
def get_study_sessions(
dicom_dir_template: Optional[str],
files_opt: Optional[list[str]],
heuristic: ModuleType,
outdir: str,
session: Optional[str],
sids: Optional[list[str]],
grouping: str = "studyUID",
) -> dict[StudySessionInfo, list[str] | dict[SeqInfo, list[str]]]:
"""Sort files or dicom seqinfos into study_sessions.
study_sessions put together files for a single session of a subject
in a study. Two major possible workflows:
- if dicom_dir_template provided -- doesn't pre-load DICOMs and just
loads files pointed by each subject and possibly sessions as corresponding
to different tarballs.
- if files_opt is provided, sorts all DICOMs it can find under those paths
"""
study_sessions: dict[StudySessionInfo, list[str] | dict[SeqInfo, list[str]]] = {}
if dicom_dir_template:
dicom_dir_template = op.abspath(dicom_dir_template)
# MG - should be caught by earlier checks
# assert not files_opt # see above TODO
assert sids
# expand the input template
if "{subject}" not in dicom_dir_template:
raise ValueError(
"dicom dir template must have {subject} as a placeholder for a "
"subject id. Got %r" % dicom_dir_template
)
for sid in sids:
sdir = dicom_dir_template.format(subject=sid, session=session)
for session_, files_ in get_extracted_dicoms(sorted(glob(sdir))):
if session_ is not None and session:
lgr.warning(
"We had session specified (%s) but while analyzing "
"files got a new value %r (using it instead)"
% (session, session_)
)
# in this setup we do not care about tracking "studies" so
# locator would be the same None
study_sessions[
StudySessionInfo(
None, session_ if session_ is not None else session, sid
)
] = files_
else:
# MG - should be caught on initial run
# YOH - what if it is the initial run?
# prep files
assert files_opt
files: list[str] = []
for f in files_opt:
if op.isdir(f):
files += sorted(
find_files(".*", topdir=f, exclude_vcs=True, exclude=r"/\.datalad/")
)
else:
files.append(f)
# in this scenario we don't care about sessions obtained this way
extracted_files: list[str] = []
for _, files_ex in get_extracted_dicoms(files):
extracted_files += files_ex
# sort all DICOMS using heuristic
seqinfo_dict = group_dicoms_into_seqinfos(
extracted_files,
grouping,
file_filter=getattr(heuristic, "filter_files", None),
dcmfilter=getattr(heuristic, "filter_dicom", None),
custom_grouping=getattr(heuristic, "grouping", None),
custom_seqinfo=getattr(heuristic, "custom_seqinfo", None),
)
if sids:
if len(sids) != 1:
raise RuntimeError(
"We were provided some subjects (%s) but "
"we can only deal with overriding "
"a single subject id. Got %d subjects and "
"found %d sequences" % (sids, len(sids), len(seqinfo_dict))
)
sid = sids[0]
else:
sid = None
if not getattr(heuristic, "infotoids", None):
# allow bypass with subject override
if not sid:
raise NotImplementedError(
"Cannot guarantee subject id - add "
"`infotoids` to heuristic file or "
"provide `--subjects` option"
)
lgr.info(
"Heuristic is missing an `infotoids` method, assigning "
"empty method and using provided subject id %s. "
"Provide `session` and `locator` fields for best results.",
sid,
)
def infotoids(
seqinfos: Iterable[SeqInfo], outdir: str # noqa: U100
) -> dict[str, Optional[str]]:
return {"locator": None, "session": None, "subject": None}
heuristic.infotoids = infotoids # type: ignore[attr-defined]
for _studyUID, seqinfo in seqinfo_dict.items():
# so we have a single study, we need to figure out its
# locator, session, subject
# TODO: Try except to ignore those we can't handle?
# actually probably there should be a dedicated exception for
# heuristics to throw if they detect that the study they are given
# is not the one they would be willing to work on
ids = heuristic.infotoids(seqinfo.keys(), outdir=outdir)
# TODO: probably infotoids is doomed to do more and possibly
# split into multiple sessions!!!! but then it should be provided
# full seqinfo with files which it would place into multiple groups
study_session_info = StudySessionInfo(
ids.get("locator"),
ids.get("session", session) or session,
sid or ids.get("subject", None),
)
lgr.info("Study session for %r", study_session_info)
if grouping != "all":
assert study_session_info not in study_sessions, (
f"Existing study session {study_session_info} "
f"already in analyzed sessions {study_sessions.keys()}"
)
study_sessions[study_session_info] = seqinfo
return study_sessions
heudiconv-1.3.2/heudiconv/py.typed 0000664 0000000 0000000 00000000000 14715167373 0017170 0 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/queue.py 0000664 0000000 0000000 00000006410 14715167373 0017202 0 ustar 00root root 0000000 0000000 from __future__ import annotations
import logging
import os
import subprocess
import sys
from typing import Optional
from nipype.utils.filemanip import which
lgr = logging.getLogger(__name__)
def queue_conversion(
queue: str, iterarg: str, iterables: int, queue_args: Optional[str] = None
) -> None:
"""
Write out conversion arguments to file and submit to a job scheduler.
Parses `sys.argv` for heudiconv arguments.
Parameters
----------
queue: string
Batch scheduler to use
iterarg: str
Multi-argument to index (`subjects` OR `files`)
iterables: int
Number of `iterarg` arguments
queue_args: string (optional)
Additional queue arguments for job submission
"""
SUPPORTED_QUEUES = {"SLURM": "sbatch"}
if queue not in SUPPORTED_QUEUES:
raise NotImplementedError("Queuing with %s is not supported" % queue)
for i in range(iterables):
args = clean_args(sys.argv[1:], iterarg, i)
# make arguments executable
heudiconv_exec = which("heudiconv") or "heudiconv"
args.insert(0, heudiconv_exec)
convertcmd = " ".join(args)
# will overwrite across subjects
queue_file = os.path.abspath("heudiconv-%s.sh" % queue)
with open(queue_file, "wt") as fp:
fp.write("#!/bin/bash\n")
if queue_args:
for qarg in queue_args.split():
fp.write("#SBATCH %s\n" % qarg)
fp.write(convertcmd + "\n")
cmd = [SUPPORTED_QUEUES[queue], queue_file]
subprocess.call(cmd)
lgr.info("Submitted %d jobs", iterables)
def clean_args(hargs: list[str], iterarg: str, iteridx: int) -> list[str]:
"""
Filters arguments for batch submission.
Parameters
----------
hargs: list
Command-line arguments
iterarg: str
Multi-argument to index (`subjects` OR `files`)
iteridx: int
`iterarg` index to submit
Returns
-------
cmdargs : list
Filtered arguments for batch submission
Example
--------
>>> from heudiconv.queue import clean_args
>>> cmd = ['heudiconv', '-d', '/some/{subject}/path',
... '-q', 'SLURM',
... '-s', 'sub-1', 'sub-2', 'sub-3', 'sub-4']
>>> clean_args(cmd, 'subjects', 0)
['heudiconv', '-d', '/some/{subject}/path', '-s', 'sub-1']
"""
if iterarg == "subjects":
iterargs = ["-s", "--subjects"]
elif iterarg == "files":
iterargs = ["--files"]
else:
raise ValueError("Cannot index %s" % iterarg)
# remove these or cause an infinite loop
queue_args = ["-q", "--queue", "--queue-args"]
# control variables for multi-argument parsing
is_iterarg = False
itercount = 0
indices = []
cmdargs = hargs[:]
for i, arg in enumerate(hargs):
if arg.startswith("-") and is_iterarg:
# moving on to another argument
is_iterarg = False
if is_iterarg:
if iteridx != itercount:
indices.append(i)
itercount += 1
if arg in iterargs:
is_iterarg = True
if arg in queue_args:
indices.extend([i, i + 1])
for j in sorted(indices, reverse=True):
del cmdargs[j]
return cmdargs
heudiconv-1.3.2/heudiconv/tests/ 0000775 0000000 0000000 00000000000 14715167373 0016645 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/tests/__init__.py 0000664 0000000 0000000 00000000000 14715167373 0020744 0 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/tests/anonymize_script.py 0000775 0000000 0000000 00000000636 14715167373 0022624 0 ustar 00root root 0000000 0000000 #! /usr/bin/env python3
import hashlib
import re
import sys
def bids_id_(sid: str) -> str:
m = re.compile(r"^(?:sub-|)(.+)$").search(sid)
if m:
parsed_id = m.group(1)
return hashlib.md5(parsed_id.encode()).hexdigest()[:8]
else:
raise ValueError("invalid sid")
def main() -> str:
sid = sys.argv[1]
return bids_id_(sid)
if __name__ == "__main__":
print(main())
heudiconv-1.3.2/heudiconv/tests/conftest.py 0000664 0000000 0000000 00000000501 14715167373 0021040 0 ustar 00root root 0000000 0000000 import os
import pytest
@pytest.fixture(autouse=True, scope="session")
def git_env() -> None:
os.environ["GIT_AUTHOR_EMAIL"] = "maxm@example.com"
os.environ["GIT_AUTHOR_NAME"] = "Max Mustermann"
os.environ["GIT_COMMITTER_EMAIL"] = "maxm@example.com"
os.environ["GIT_COMMITTER_NAME"] = "Max Mustermann"
heudiconv-1.3.2/heudiconv/tests/data/ 0000775 0000000 0000000 00000000000 14715167373 0017556 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/tests/data/01-anat-scout/ 0000775 0000000 0000000 00000000000 14715167373 0022052 5 ustar 00root root 0000000 0000000 heudiconv-1.3.2/heudiconv/tests/data/01-anat-scout/0001.dcm 0000664 0000000 0000000 00000523766 14715167373 0023142 0 ustar 00root root 0000000 0000000 DICM UL ´ OB UI 1.2.840.10008.5.1.4.1.1.4 UI4 1.3.12.2.1107.5.2.43.66112.2016101409252673867900728 UI 1.2.840.10008.1.2.1 UI 1.2.40.0.13.1.1 SH dcm4che-2.0 CS
[binary DICOM payload of heudiconv-1.3.2/heudiconv/tests/data/01-anat-scout/0001.dcm omitted]