heudiconv-1.3.2/LICENSE

Copyright [2014-2024] [HeuDiConv developers]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Some parts of the codebase/documentation are borrowed from other sources:

- HeuDiConv tutorial from https://bitbucket.org/dpat/neuroimaging_core_docs/src
  Copyright 2023 Dianne Patterson


heudiconv-1.3.2/PKG-INFO

Metadata-Version: 2.1
Name: heudiconv
Version: 1.3.2
Summary: Heuristic DICOM Converter
Author: HeuDiConv team and contributors
License: Apache 2.0
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.9
License-File: LICENSE
Requires-Dist: dcmstack>=0.8
Requires-Dist: etelemetry
Requires-Dist: filelock>=3.0.12
Requires-Dist: nibabel
Requires-Dist: nipype>=1.2.3
Requires-Dist: pydicom>=1.0.0
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: tinydb; extra == "tests"
Requires-Dist: inotify; extra == "tests"
Provides-Extra: extras
Requires-Dist: duecredit; extra == "extras"
Provides-Extra: datalad
Requires-Dist: datalad>=0.13.0; extra == "datalad"
Provides-Extra: all
Requires-Dist: pytest; extra == "all"
Requires-Dist: tinydb; extra == "all"
Requires-Dist: inotify; extra == "all"
Requires-Dist: duecredit; extra == "all"
Requires-Dist: datalad>=0.13.0; extra == "all"

Convert DICOM dirs based on heuristic info - HeuDiConv uses the dcmstack
package and dcm2niix tool to convert DICOM directories or tarballs into
collections of NIfTI files following pre-defined heuristic(s).


heudiconv-1.3.2/README.rst

=============
**HeuDiConv**
=============

`a heuristic-centric DICOM converter`

.. image:: https://joss.theoj.org/papers/10.21105/joss.05839/status.svg
   :target: https://doi.org/10.21105/joss.05839
   :alt: JOSS Paper

.. image:: https://img.shields.io/badge/docker-nipy/heudiconv:latest-brightgreen.svg?logo=docker&style=flat
   :target: https://hub.docker.com/r/nipy/heudiconv/tags/
   :alt: Our Docker image

.. image:: https://github.com/nipy/heudiconv/actions/workflows/test.yml/badge.svg?event=push
   :target: https://github.com/nipy/heudiconv/actions/workflows/test.yml
   :alt: GitHub Actions (test)

.. image:: https://codecov.io/gh/nipy/heudiconv/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/nipy/heudiconv
   :alt: CodeCoverage
.. image:: https://readthedocs.org/projects/heudiconv/badge/?version=latest
   :target: http://heudiconv.readthedocs.io/en/latest/?badge=latest
   :alt: Readthedocs

.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1012598.svg
   :target: https://doi.org/10.5281/zenodo.1012598
   :alt: Zenodo (latest)

.. image:: https://repology.org/badge/version-for-repo/debian_unstable/heudiconv.svg?header=Debian%20Unstable
   :target: https://repology.org/project/heudiconv/versions
   :alt: Debian Unstable

.. image:: https://repology.org/badge/version-for-repo/gentoo_ovl_science/python:heudiconv.svg?header=Gentoo%20%28%3A%3Ascience%29
   :target: https://repology.org/project/python:heudiconv/versions
   :alt: Gentoo (::science)

.. image:: https://repology.org/badge/version-for-repo/pypi/python:heudiconv.svg?header=PyPI
   :target: https://repology.org/project/python:heudiconv/versions
   :alt: PyPI

.. image:: https://img.shields.io/badge/RRID-SCR__017427-blue
   :target: https://identifiers.org/RRID:SCR_017427
   :alt: RRID

About
-----

``heudiconv`` is a flexible DICOM converter for organizing brain imaging data into structured directory layouts.

- It allows flexible directory layouts and naming schemes through customizable heuristics implementations.
- It only converts the necessary DICOMs and ignores everything else in a directory.
- You can keep links to DICOM files in the participant layout.
- Using `dcm2niix `_ under the hood, it's fast.
- It can track the provenance of the conversion from DICOM to NIfTI in W3C PROV format.
- It provides assistance in converting to `BIDS `_.
- It integrates with `DataLad `_ to place converted and original data under git/git-annex version control
  while automatically annotating files with sensitive information (e.g., non-defaced anatomicals, etc).

Heudiconv can be inserted into your workflow to provide automatic conversion as part of a data acquisition pipeline, as seen in the figure below:

.. image:: figs/environment.png

Installation
------------

See our `installation page `_ on heudiconv.readthedocs.io .

HOWTO 101
---------

In a nutshell -- ``heudiconv`` is given a file tree of DICOMs, and it produces a restructured file tree of NIfTI files (conversion handled by `dcm2niix`_) with accompanying metadata files.
The input and output structure is as flexible as your data, which is accomplished by using a Python file called a ``heuristic`` that knows how to read your input structure and decides how to name the resultant files.
You can run your conversion automatically (which will produce a ``.heudiconv`` directory storing the used parameters), or generate the default parameters, edit them to customize file naming, and continue conversion via an additional invocation of `heudiconv`:

.. image:: figs/workflow.png

``heudiconv`` comes with `existing heuristics `_ which can be used as is, or as examples.
For instance, the heuristic `convertall `_ extracts standard metadata from all matching DICOMs.
``heudiconv`` creates mapping files, ``.edit.text``, which let researchers simply establish their own conversion mapping.

In most use-cases of retrospective study data conversion, you would need to create your custom heuristic following the examples and the `"Heuristic" section `_ in the documentation.
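At its core, a heuristic is a Python module that maps each DICOM series (a ``seqinfo`` record) to an output filename template.
As a rough, minimal sketch (the ``create_key``/``infotodict`` pattern follows the bundled ``convertall.py`` heuristic; the matching rules on ``protocol_name`` are made up and must be adapted to your own sequence names)::

    def create_key(template, outtype=("nii.gz",), annotation_classes=None):
        if not template:
            raise ValueError("Template must be a valid format string")
        return (template, outtype, annotation_classes)

    def infotodict(seqinfo):
        """Map each scanned series onto a BIDS-style output name."""
        t1w = create_key("sub-{subject}/anat/sub-{subject}_T1w")
        rest = create_key("sub-{subject}/func/sub-{subject}_task-rest_run-{item:02d}_bold")
        info = {t1w: [], rest: []}
        for s in seqinfo:
            if "MPRAGE" in s.protocol_name:
                info[t1w].append(s.series_id)
            elif "rest" in s.protocol_name.lower():
                info[rest].append(s.series_id)
        return info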
**Note** that the `ReproIn heuristic `_ is generic and powerful enough to be adopted for virtually *any* study:
For prospective studies, you would just need to name your sequences following the `ReproIn convention `_, and for retrospective conversions, you often would be able to create a new versatile heuristic by simply providing remappings into ReproIn as shown in `this issue (documentation is coming) `_.

Having decided on a heuristic, you could use the command line::

    heudiconv -f HEURISTIC-FILE-OR-NAME -o OUTPUT-PATH --files INPUT-PATHs

with various additional options (see ``heudiconv --help`` or `"Usage" in documentation `__) to tune its behavior to convert your data.

For detailed examples and guides, please check out `ReproIn conversion invocation examples `_ and the `user tutorials `_ in the documentation.

How to cite
-----------

Please use the `Zenodo record `_ for your specific version of HeuDiConv.  We also support gathering all relevant citations via `DueCredit `_.

How to contribute
-----------------

For a detailed intro, see our `contributing guide `_.

Our releases are packaged using Intuit auto, with the corresponding workflow, including Docker image preparation, found in ``.github/workflows/release.yml``.

3rd-party heuristics
--------------------

- https://github.com/courtois-neuromod/ds_prep/blob/main/mri/convert/heuristics_unf.py

Support
-------

All bugs, concerns and enhancement requests for this software can be submitted here: https://github.com/nipy/heudiconv/issues.

If you have a problem or would like to ask a question about how to use ``heudiconv``, please submit a question to `NeuroStars.org `_ with a ``heudiconv`` tag.
NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics.

All previous ``heudiconv`` questions are available here: http://neurostars.org/tags/heudiconv/


heudiconv-1.3.2/heudiconv.egg-info/PKG-INFO
(content identical to heudiconv-1.3.2/PKG-INFO above)
heudiconv-1.3.2/heudiconv.egg-info/SOURCES.txt

LICENSE
README.rst
pyproject.toml
setup.py
heudiconv/__init__.py
heudiconv/_version.py
heudiconv/bids.py
heudiconv/convert.py
heudiconv/dicoms.py
heudiconv/due.py
heudiconv/info.py
heudiconv/main.py
heudiconv/parser.py
heudiconv/py.typed
heudiconv/queue.py
heudiconv/utils.py
heudiconv.egg-info/PKG-INFO
heudiconv.egg-info/SOURCES.txt
heudiconv.egg-info/dependency_links.txt
heudiconv.egg-info/entry_points.txt
heudiconv.egg-info/requires.txt
heudiconv.egg-info/top_level.txt
heudiconv/cli/__init__.py
heudiconv/cli/monitor.py
heudiconv/cli/run.py
heudiconv/external/__init__.py
heudiconv/external/dlad.py
heudiconv/external/tests/__init__.py
heudiconv/external/tests/test_dlad.py
heudiconv/heuristics/__init__.py
heudiconv/heuristics/banda-bids.py
heudiconv/heuristics/bids_ME.py
heudiconv/heuristics/bids_PhoenixReport.py
heudiconv/heuristics/bids_with_ses.py
heudiconv/heuristics/cmrr_heuristic.py
heudiconv/heuristics/convertall.py
heudiconv/heuristics/convertall_custom.py
heudiconv/heuristics/example.py
heudiconv/heuristics/multires_7Tbold.py
heudiconv/heuristics/reproin.py
heudiconv/heuristics/studyforrest_phase2.py
heudiconv/heuristics/test_b0dwi_for_fmap.py
heudiconv/heuristics/test_reproin.py
heudiconv/heuristics/uc_bids.py
heudiconv/tests/__init__.py
heudiconv/tests/anonymize_script.py
heudiconv/tests/conftest.py
heudiconv/tests/test_archives.py
heudiconv/tests/test_bids.py
heudiconv/tests/test_convert.py
heudiconv/tests/test_dicoms.py
heudiconv/tests/test_heuristics.py
heudiconv/tests/test_main.py
heudiconv/tests/test_monitor.py
heudiconv/tests/test_queue.py
heudiconv/tests/test_regression.py
heudiconv/tests/test_tarballs.py
heudiconv/tests/test_utils.py
heudiconv/tests/utils.py
heudiconv/tests/data/MRI_102TD_PHA_S.MR.Chen_Matthews_1.3.1.2022.11.16.15.50.20.357.31204541.dcm
heudiconv/tests/data/axasc35.dcm
heudiconv/tests/data/phantom.dcm
heudiconv/tests/data/sample_nifti.nii.gz
heudiconv/tests/data/sample_nifti_params.txt
heudiconv/tests/data/01-anat-scout/0001.dcm
heudiconv/tests/data/01-fmap_acq-3mm/1.3.12.2.1107.5.2.43.66112.2016101409263663466202201.dcm
heudiconv/tests/data/Phoenix/01+AA/01+AA+00001.dcm
heudiconv/tests/data/Phoenix/99+PhoenixDocument/99+PhoenixDocument+00001.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00001.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00002.dcm
heudiconv/tests/data/b0dwiForFmap/b0dwi_for_fmap+00003.dcm


heudiconv-1.3.2/heudiconv.egg-info/dependency_links.txt
(empty)


heudiconv-1.3.2/heudiconv.egg-info/entry_points.txt

[console_scripts]
heudiconv = heudiconv.cli.run:main
heudiconv_monitor = heudiconv.cli.monitor:main


heudiconv-1.3.2/heudiconv.egg-info/requires.txt

dcmstack>=0.8
etelemetry
filelock>=3.0.12
nibabel
nipype>=1.2.3
pydicom>=1.0.0

[all]
pytest
tinydb
inotify
duecredit
datalad>=0.13.0

[datalad]
datalad>=0.13.0

[extras]
duecredit

[tests]
pytest
tinydb
inotify


heudiconv-1.3.2/heudiconv.egg-info/top_level.txt

heudiconv
heudiconv-1.3.2/heudiconv/__init__.py

import logging

from ._version import __version__
from .info import __packagename__

__all__ = ["__packagename__", "__version__"]

lgr = logging.getLogger(__name__)
lgr.debug("Starting the abomination")  # just to "run-test" logging


heudiconv-1.3.2/heudiconv/_version.py

__version__ = "1.3.2"


heudiconv-1.3.2/heudiconv/bids.py

"""Handle BIDS specific operations"""

from __future__ import annotations

__docformat__ = "numpy"

from collections import OrderedDict
import csv
import errno
from glob import glob
import hashlib
import logging
import os
import os.path as op
from pathlib import Path
import re
from typing import Any, Optional
import warnings

import numpy as np
import pydicom as dcm

from . import __version__, dicoms
from .parser import find_files
from .utils import (
    create_file_if_missing,
    is_readonly,
    json_dumps,
    load_json,
    remove_prefix,
    remove_suffix,
    sanitize_label,
    save_json,
    set_readonly,
    strptime_bids,
    update_json,
)

lgr = logging.getLogger(__name__)

# Fields to be populated in _scans files. Order matters
SCANS_FILE_FIELDS = OrderedDict(
    [
        ("filename", OrderedDict([("Description", "Name of the nifti file")])),
        (
            "acq_time",
            OrderedDict(
                [
                    ("LongName", "Acquisition time"),
                    ("Description", "Acquisition time of the particular scan"),
                ]
            ),
        ),
        ("operator", OrderedDict([("Description", "Name of the operator")])),
        (
            "randstr",
            OrderedDict(
                [("LongName", "Random string"), ("Description", "md5 hash of UIDs")]
            ),
        ),
    ]
)

#: JSON Key where we will embed our version in the newly produced .json files
HEUDICONV_VERSION_JSON_KEY = "HeudiconvVersion"


class BIDSError(Exception):
    pass


BIDS_VERSION = "1.8.0"

# List defining allowed parameter matching for fmap assignment:
SHIM_KEY = "ShimSetting"
AllowedFmapParameterMatching = [
    "Shims",
    "ImagingVolume",
    "ModalityAcquisitionLabel",
    "CustomAcquisitionLabel",
    "PlainAcquisitionLabel",
    "Force",
]
# Key info returned by get_key_info_for_fmap_assignment when
# matching_parameter = "Force"
KeyInfoForForce = "Forced"
# List defining allowed criteria to assign a given fmap to a non-fmap run
# among the different fmaps with matching parameters:
AllowedCriteriaForFmapAssignment = [
    "First",
    "Closest",
]


def maybe_na(val: Any) -> str:
    """Return 'n/a' if non-None value represented as str is not empty

    Primarily for the consistent use of lower case 'n/a' so 'N/A' and 'NA'
    are also treated as 'n/a'
    """
    if val is not None:
        valstr = str(val).strip()
        return "n/a" if (not valstr or valstr in ("N/A", "NA")) else valstr
    else:
        return "n/a"


def treat_age(age: str | float | None) -> str | None:
    """Age might encounter 'Y' suffix or be a float"""
    if age is None:
        return None  # might be converted to N/A by maybe_na
    agestr = str(age)
    if agestr.endswith("M"):
        agestr = agestr.rstrip("M")
        ageflt = float(agestr) / 12
        agestr = ("%.2f" if ageflt != int(ageflt) else "%d") % ageflt
    else:
        agestr = agestr.rstrip("Y")
    if agestr:
        # strip all leading 0s but allow to scan a newborn (age 0Y)
        agestr = "0" if not agestr.lstrip("0") else agestr.lstrip("0")
        if agestr.startswith("."):
            # we had float point value, let's prepend 0
            agestr = "0" + agestr
    return agestr
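
# Illustrative examples for maybe_na() and treat_age() above (values follow
# directly from their logic; not an exhaustive specification):
#   maybe_na(None) -> "n/a";  maybe_na(" NA ") -> "n/a";  maybe_na(25) -> "25"
#   treat_age("02Y") -> "2";  treat_age("6M") -> "0.50";  treat_age(None) -> None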

def populate_bids_templates(
    path: str, defaults: Optional[dict[str, Any]] = None
) -> None:
    """Premake BIDS text files with templates"""
    lgr.info("Populating template files under %s", path)
    descriptor = op.join(path, "dataset_description.json")
    if defaults is None:
        defaults = {}
    if not op.lexists(descriptor):
        save_json(
            descriptor,
            OrderedDict(
                [
                    ("Name", "TODO: name of the dataset"),
                    ("BIDSVersion", BIDS_VERSION),
                    (
                        "License",
                        defaults.get(
                            "License",
                            "TODO: choose a license, e.g. PDDL "
                            "(http://opendatacommons.org/licenses/pddl/)",
                        ),
                    ),
                    (
                        "Authors",
                        defaults.get(
                            "Authors", ["TODO:", "First1 Last1", "First2 Last2", "..."]
                        ),
                    ),
                    (
                        "Acknowledgements",
                        defaults.get(
                            "Acknowledgements", "TODO: whom you want to acknowledge"
                        ),
                    ),
                    (
                        "HowToAcknowledge",
                        "TODO: describe how to acknowledge -- either cite a "
                        "corresponding paper, or just in acknowledgement "
                        "section",
                    ),
                    ("Funding", ["TODO", "GRANT #1", "GRANT #2"]),
                    ("ReferencesAndLinks", ["TODO", "List of papers or websites"]),
                    ("DatasetDOI", "TODO: eventually a DOI for the dataset"),
                ]
            ),
        )
    sourcedata_README = op.join(path, "sourcedata", "README")
    if op.exists(op.dirname(sourcedata_README)):
        create_file_if_missing(
            sourcedata_README,
            (
                "TODO: Provide description about source data, e.g. \n"
                "Directory below contains DICOMS compressed into tarballs per "
                "each sequence, replicating directory hierarchy of the BIDS dataset"
                " itself."
            ),
        )
    create_file_if_missing(
        op.join(path, "CHANGES"),
        "0.0.1 Initial data acquired\n"
        "TODOs:\n\t- verify and possibly extend information in participants.tsv"
        " (see for example http://datasets.datalad.org/?dir=/openfmri/ds000208)"
        "\n\t- fill out dataset_description.json, README, sourcedata/README"
        " (if present)\n\t- provide _events.tsv file for each _bold.nii.gz with"
        " onsets of events (see '8.5 Task events' of BIDS specification)",
    )
    create_file_if_missing(
        op.join(path, "README"),
        "TODO: Provide description for the dataset -- basic details about the "
        "study, possibly pointing to pre-registration (if public or embargoed)",
    )
    create_file_if_missing(
        op.join(path, "scans.json"), json_dumps(SCANS_FILE_FIELDS, sort_keys=False)
    )
    create_file_if_missing(op.join(path, ".bidsignore"), ".duecredit.p")
    if op.lexists(op.join(path, ".git")):
        create_file_if_missing(op.join(path, ".gitignore"), ".duecredit.p")

    populate_aggregated_jsons(path)
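
# Hypothetical usage sketch (path and defaults are made up):
#   populate_bids_templates("/data/bids/study1", defaults={"License": "PDDL"})
# creates stub dataset_description.json, CHANGES, README, scans.json and
# .bidsignore files under /data/bids/study1, leaving any existing files alone.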

def populate_aggregated_jsons(path: str) -> None:
    """Aggregate across the entire BIDS dataset ``.json``\\s into top level ``.json``\\s

    Top level .json files would contain only the fields which are common to
    all ``subject[/session]/type/*_modality.json``\\s.

    ATM aggregating only for ``*_task*_bold.json`` files. Only the task- and
    OPTIONAL _acq- field is retained within the aggregated filename. The other
    BIDS _key-value pairs are "aggregated over".

    Parameters
    ----------
    path: str
      Path to the top of the BIDS dataset
    """
    # TODO: collect all task- .json files for func files to
    tasks = {}
    # way too many -- let's just collect all which are the same!
    # FIELDS_TO_TRACK = {'RepetitionTime', 'FlipAngle', 'EchoTime',
    #                    'Manufacturer', 'SliceTiming', ''}
    for fpath in find_files(
        r".*_task-.*\_bold\.json",
        topdir=glob(op.join(path, "sub-*")),
        exclude_vcs=True,
        exclude=r"/\.(datalad|heudiconv)/",
    ):
        #
        # According to BIDS spec I think both _task AND _acq (may be more?
        # _rec, _dir, ...?) should be retained?
        # TODO: if we are to fix it, then old ones (without _acq) should be
        # removed first
        task = re.sub(r".*_(task-[^_\.]*(_acq-[^_\.]*)?)_.*", r"\1", fpath)
        json_ = load_json(fpath, retry=100)
        if task not in tasks:
            tasks[task] = json_
        else:
            rec = tasks[task]
            # let's retain only those fields which have the same value
            for field in sorted(rec):
                if field not in json_ or json_[field] != rec[field]:
                    del rec[field]
        # create a stub onsets file for each one of those
        suf = "_bold.json"
        assert fpath.endswith(suf)
        # specify the name of the '_events.tsv' file:
        if "_echo-" in fpath:
            # multi-echo sequence: bids (1.1.0) specifies just one '_events.tsv'
            # file, common for all echoes. The name will not include _echo-.
            # TODO: RF to use re.match for better readability/robustness
            # So, find out the echo number:
            fpath_split = fpath.split("_echo-", 1)  # split fpath using '_echo-'
            fpath_split_2 = fpath_split[1].split(
                "_", 1
            )  # split the second part of fpath_split using '_'
            echoNo = fpath_split_2[0]  # get echo number
            if echoNo == "1":
                if len(fpath_split_2) != 2:
                    raise ValueError("Found no trailer after _echo-")
                # we modify fpath to exclude '_echo-' + echoNo:
                fpath = fpath_split[0] + "_" + fpath_split_2[1]
            else:
                # for echoNo greater than 1, don't create the events file, so go to
                # the next for loop iteration:
                continue
        events_file = remove_suffix(fpath, suf) + "_events.tsv"
        # do not touch any existing thing, it may be precious
        if not op.lexists(events_file):
            lgr.debug("Generating %s", events_file)
            with open(events_file, "w") as fp:
                fp.write(
                    "onset\tduration\ttrial_type\tresponse_time\tstim_file"
                    "\tTODO -- fill in rows and add more tab-separated "
                    "columns if desired"
                )
    # extract tasks files stubs
    for task_acq, fields in tasks.items():
        task_file = op.join(path, task_acq + "_bold.json")
        # Since we are pulling all unique fields we have to possibly
        # rewrite this file to guarantee consistency.
        # See https://github.com/nipy/heudiconv/issues/277 for a usecase/bug
        # when we didn't touch existing one.
        # But the fields we enter (TaskName and CogAtlasID) might need
        # to be populated from the file if it already exists
        placeholders = {
            "TaskName": (
                "TODO: full task name for %s"
                % task_acq.split("_")[0].split("-")[1]
            ),
            "CogAtlasID": "http://www.cognitiveatlas.org/task/id/TODO",
        }
        if op.lexists(task_file):
            j = load_json(task_file, retry=100)
            # Retain possibly modified placeholder fields
            for f in placeholders:
                if f in j:
                    placeholders[f] = j[f]
            act = "Regenerating"
        else:
            act = "Generating"
        lgr.debug("%s %s", act, task_file)
        fields.update(placeholders)
        save_json(task_file, fields, sort_keys=True, pretty=True)
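
# For example (hypothetical file name), a sidecar such as
#   sub-01/func/sub-01_task-rest_acq-highres_run-1_bold.json
# feeds the aggregated top-level task-rest_acq-highres_bold.json, which ends up
# holding only the fields shared by every run of that task/acquisition.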

def tuneup_bids_json_files(json_files: list[str]) -> None:
    """Given a list of BIDS .json files, e.g."""
    if not json_files:
        return
    # Harmonize generic .json formatting
    for jsonfile in json_files:
        json_ = load_json(jsonfile)
        # sanitize!
        for f1 in ["Acquisition", "Study", "Series"]:
            for f2 in ["DateTime", "Date"]:
                json_.pop(f1 + f2, None)
        # TODO: should actually be placed into series file which must
        # go under annex (not under git) and marked as sensitive
        # MG - Might want to replace with flag for data sensitivity
        # related - https://github.com/nipy/heudiconv/issues/92
        if "Date" in str(json_):
            # Let's hope no word 'Date' comes within a study name or smth like
            # that
            raise ValueError("There must be no dates in .json sidecar")
        # Those files should not have our version field already - should have been
        # freshly produced
        assert HEUDICONV_VERSION_JSON_KEY not in json_
        json_[HEUDICONV_VERSION_JSON_KEY] = str(__version__)
        save_json(jsonfile, json_)

    # Load the beast
    seqtype = op.basename(op.dirname(jsonfile))

    # MG - want to expand this for other _epi
    # possibly add IntendedFor automatically as well?
    if seqtype == "fmap":
        json_basename = "_".join(jsonfile.split("_")[:-1])
        # if we got by now all needed .json files -- we can fix them up
        # unfortunately order of "items" is not guaranteed atm
        json_phasediffname = json_basename + "_phasediff.json"
        json_mag = json_basename + "_magnitude*.json"
        if op.exists(json_phasediffname) and len(glob(json_mag)) >= 1:
            json_ = load_json(json_phasediffname)
            # TODO: we might want to reorder them since ATM
            # the one for shorter TE is the 2nd one!
            # For now just save truthfully by loading magnitude files
            lgr.debug("Placing EchoTime fields into phasediff file")
            for i in 1, 2:
                try:
                    json_["EchoTime%d" % i] = load_json(
                        json_basename + "_magnitude%d.json" % i
                    )["EchoTime"]
                except IOError as exc:
                    lgr.error("Failed to open magnitude file: %s", exc)
            # might have been made R/O already, but if not -- it will be set
            # only later in the pipeline, so we must not make it read-only yet
            was_readonly = is_readonly(json_phasediffname)
            if was_readonly:
                set_readonly(json_phasediffname, False)
            save_json(json_phasediffname, json_)
            if was_readonly:
                set_readonly(json_phasediffname)
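
# For instance (hypothetical basename): given sub-01_phasediff.json together
# with sub-01_magnitude1.json and sub-01_magnitude2.json in an fmap/ folder,
# the EchoTime of each magnitude sidecar is copied into the phasediff sidecar
# as EchoTime1/EchoTime2, as expected for BIDS phase-difference fieldmaps.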

def add_participant_record(
    studydir: str, subject: str, age: str | None, sex: str | None
) -> None:
    participants_tsv = op.join(studydir, "participants.tsv")
    participant_id = "sub-%s" % subject

    if not create_file_if_missing(
        participants_tsv,
        "\t".join(["participant_id", "age", "sex", "group"]) + "\n",
    ):
        # check if may be subject record already exists
        with open(participants_tsv) as f:
            f.readline()
            known_subjects = {ln.split("\t")[0] for ln in f.readlines()}
        if participant_id in known_subjects:
            return
    else:
        # Populate participants.json (an optional file to describe column names in
        # participant.tsv). This auto generation will make BIDS-validator happy.
        participants_json = op.join(studydir, "participants.json")
        if not op.lexists(participants_json):
            save_json(
                participants_json,
                OrderedDict(
                    [
                        (
                            "participant_id",
                            OrderedDict([("Description", "Participant identifier")]),
                        ),
                        (
                            "age",
                            OrderedDict(
                                [
                                    (
                                        "Description",
                                        "Age in years (TODO - verify) as in the initial"
                                        " session, might not be correct for other sessions",
                                    )
                                ]
                            ),
                        ),
                        (
                            "sex",
                            OrderedDict(
                                [
                                    (
                                        "Description",
                                        "self-rated by participant, M for male/F for "
                                        "female (TODO: verify)",
                                    )
                                ]
                            ),
                        ),
                        (
                            "group",
                            OrderedDict(
                                [
                                    (
                                        "Description",
                                        "(TODO: adjust - by default everyone is in "
                                        "control group)",
                                    )
                                ]
                            ),
                        ),
                    ]
                ),
                sort_keys=False,
            )
    # Add a new participant
    with open(participants_tsv, "a") as f:
        f.write(
            "\t".join(
                map(
                    str,
                    [
                        participant_id,
                        maybe_na(treat_age(age)),
                        maybe_na(sex),
                        "control",
                    ],
                )
            )
            + "\n"
        )


def find_subj_ses(f_name: str) -> tuple[Optional[str], Optional[str]]:
    """Given a path to the bids formatted filename parse out subject/session"""
    # we will allow the match at either directories or within filename
    # assuming that bids layout is "correct"
    regex = re.compile("sub-(?P<subj>[a-zA-Z0-9]*)([/_]ses-(?P<ses>[a-zA-Z0-9]*))?")
    regex_res = regex.search(f_name)
    res = regex_res.groupdict() if regex_res else {}
    return res.get("subj", None), res.get("ses", None)


def save_scans_key(
    item: tuple[str, tuple[str, ...], list[str]], bids_files: list[str]
) -> None:
    """
    Parameters
    ----------
    item:
    bids_files: list of str

    Returns
    -------
    """
    rows = {}
    assert bids_files, "we do expect some files since it was called"
    # we will need to deduce subject and session from the bids_filename
    # and if there is a conflict, we would just blow since this function
    # should be invoked only on a result of a single item conversion as far
    # as I see it, so should have the same subject/session
    subj: Optional[str] = None
    ses: Optional[str] = None
    for bids_file in bids_files:
        # get filenames
        f_name = "/".join(bids_file.split("/")[-2:])
        f_name = f_name.replace("json", "nii.gz")
        rows[f_name] = get_formatted_scans_key_row(item[-1][0])
        subj_, ses_ = find_subj_ses(f_name)
        if not subj_:
            lgr.warning(
                "Failed to detect fulfilled BIDS layout. "
                "No scans.tsv file(s) will be produced for %s",
                ", ".join(bids_files),
            )
            return
        if subj and subj_ != subj:
            raise ValueError(
                "We found before subject %s but now deduced %s from %s"
                % (subj, subj_, f_name)
            )
        subj = subj_
        if ses and ses_ != ses:
            raise ValueError(
                "We found before session %s but now deduced %s from %s"
                % (ses, ses_, f_name)
            )
        ses = ses_
    # where should we store it?
    output_dir = op.dirname(op.dirname(bids_file))
    # save
    ses = "_ses-%s" % ses if ses else ""
    add_rows_to_scans_keys_file(
        op.join(output_dir, "sub-{0}{1}_scans.tsv".format(subj, ses)), rows
    )
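
# For example (hypothetical name): for a converted
#   .../sub-01/ses-pre/func/sub-01_ses-pre_task-rest_bold.json
# the recorded filename is func/sub-01_ses-pre_task-rest_bold.nii.gz and its row
# is appended to sub-01_ses-pre_scans.tsv inside the ses-pre directory.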

def add_rows_to_scans_keys_file(fn: str, newrows: dict[str, list[str]]) -> None:
    """Add new rows to the _scans file.

    Parameters
    ----------
    fn: str
      filename
    newrows: dict
      extra rows to add (acquisition time, referring physician, random string)
    """
    if op.lexists(fn):
        with open(fn, "r") as csvfile:
            reader = csv.reader(csvfile, delimiter="\t")
            existing_rows = [row for row in reader]
        # skip header
        fnames2info = {row[0]: row[1:] for row in existing_rows[1:]}

        newrows_key = newrows.keys()
        newrows_toadd = list(set(newrows_key) - set(fnames2info.keys()))
        for key_toadd in newrows_toadd:
            fnames2info[key_toadd] = newrows[key_toadd]
        # remove
        os.unlink(fn)
    else:
        fnames2info = newrows

    header = list(SCANS_FILE_FIELDS.keys())
    # prepare all the data rows
    data_rows = [[k] + v for k, v in fnames2info.items()]
    # sort by the date/filename
    try:
        data_rows_sorted = sorted(data_rows, key=lambda x: (x[1], x[0]))
    except TypeError as exc:
        lgr.warning("Sorting scans by date failed: %s", str(exc))
        data_rows_sorted = sorted(data_rows)
    # save
    with open(fn, "a") as csvfile:
        writer = csv.writer(csvfile, delimiter="\t")
        writer.writerows([header] + data_rows_sorted)


def get_formatted_scans_key_row(dcm_fn: str | Path) -> list[str]:
    """
    Parameters
    ----------
    dcm_fn: str

    Returns
    -------
    row: list
      [ISO acquisition time, performing physician name, random string]
    """
    dcm_data = dcm.dcmread(dcm_fn, stop_before_pixels=True, force=True)
    # we need to store filenames and acquisition datetimes
    acq_datetime = dicoms.get_datetime_from_dcm(dcm_data=dcm_data)
    # add random string
    # But let's make it reproducible by using all UIDs
    # (might change across versions?)
    randcontent = "".join(
        [getattr(dcm_data, f) or "" for f in sorted(dir(dcm_data)) if f.endswith("UID")]
    )
    randstr = hashlib.md5(randcontent.encode()).hexdigest()[:8]
    try:
        perfphys = dcm_data.PerformingPhysicianName
    except AttributeError:
        perfphys = ""
    row = [acq_datetime.isoformat() if acq_datetime else "", perfphys, randstr]
    # empty entries should be 'n/a'
    # https://github.com/dartmouth-pbs/heudiconv/issues/32
    row = ["n/a" if not str(e) else e for e in row]
    return row


def convert_sid_bids(subject_id: str) -> str:
    """Shim for stripping any non-BIDS compliant characters within subject_id

    Parameters
    ----------
    subject_id : string

    Returns
    -------
    sid : string
      New subject ID
    """
    warnings.warn(
        "convert_sid_bids() is deprecated, please use sanitize_label() instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return sanitize_label(subject_id)


def get_shim_setting(json_file: str) -> Any:
    """
    Gets the "ShimSetting" field from a json_file.
    If no "ShimSetting" present, return error

    Parameters
    ----------
    json_file : str

    Returns
    -------
    str with "ShimSetting" value
    """
    data = load_json(json_file)
    try:
        shims = data[SHIM_KEY]
    except KeyError:
        lgr.error(
            'File %s does not have "%s". '
            'Please use a different "matching_parameters" in your heuristic file',
            json_file,
            SHIM_KEY,
        )
        raise
    return shims
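
# Illustration (hypothetical sidecar content): for a JSON containing
#   {"ShimSetting": [1.0, -2.0, 3.0]}
# get_shim_setting() returns that list unchanged; a sidecar lacking the key
# logs an error and re-raises the KeyError.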

def find_fmap_groups(fmap_dir: str) -> dict[str, list[str]]:
    """
    Finds the different fmap groups in a fmap directory.
    By groups here we mean fmaps that are intended to go together
    (with reversed PE polarity, magnitude/phase, etc.)

    Parameters
    ----------
    fmap_dir : str
        path to the session folder (or to the subject folder, if there are no
        sessions).

    Returns
    -------
    fmap_groups : dict
        key: prefix common to the group (e.g. no "dir" entity, "_phase"/"_magnitude", ...)
        value: list of all fmap paths in the group
    """
    if op.basename(fmap_dir) != "fmap":
        lgr.error("%s is not a fieldmap folder", fmap_dir)

    # Get a list of all fmap json files in the session:
    fmap_jsons = sorted(glob(op.join(fmap_dir, "*.json")))

    # RegEx to remove fmap-specific substrings from fmap file names
    # "_phase[1,2]", "_magnitude[1,2]", "_phasediff", "_dir-