tedana: TE Dependent ANAlysis

tedana (TE-dependent analysis) is a Python module for denoising multi-echo functional magnetic resonance imaging (fMRI) data.


About

https://user-images.githubusercontent.com/7406227/40031156-57b7cbb8-57bc-11e8-8c51-5b29f2e86a48.png

tedana originally came about as part of the ME-ICA pipeline, which performed both pre-processing and TE-dependent analysis of multi-echo fMRI data. tedana, however, now assumes that you’re working with data that have already been preprocessed.

Citations

When using tedana, please include the following citations:

1. tedana. Available from: https://doi.org/10.5281/zenodo.1250561

2. Kundu, P., Inati, S. J., Evans, J. W., Luh, W. M. & Bandettini, P. A. (2011). Differentiating BOLD and non-BOLD signals in fMRI time series using multi-echo EPI. NeuroImage, 60, 1759-1770.

3. Kundu, P., Brenowitz, N. D., Voon, V., Worbe, Y., Vértes, P. E., Inati, S. J., Saad, Z. S., Bandettini, P. A., & Bullmore, E. T. (2013). Integrated strategy for improving functional connectivity mapping using multiecho fMRI. Proceedings of the National Academy of Sciences, 110, 16187-16192.

Alternatively, you can automatically compile relevant citations by running your tedana code with duecredit. For example, if you plan to run a script using tedana (in this case, tedana_script.py):

python -m duecredit tedana_script.py

You can also learn more about why citing software is important.

Posters

_images/tedana-poster.png

License Information

tedana is licensed under GNU Lesser General Public License version 2.1.

Installation

To use tedana, you’ll need to set up a working Python environment. Specifically, you will need Python >=3.5 with the following packages installed:

  • nilearn
  • nibabel>=2.1.0
  • numpy
  • scikit-learn
  • scipy

You can then install tedana with:

pip install tedana

In addition to the Python package, installing tedana will add the tedana and t2smap workflow CLIs to your path.
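
You can verify the installation by printing each workflow’s help text:

tedana -h
t2smap -h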

Multi-echo fMRI

In multi-echo (ME) fMRI, data are acquired for multiple echo times, resulting in multiple time series for each voxel.

The physics of multi-echo fMRI

Multi-echo fMRI data are obtained by acquiring multiple TEs (commonly called echo times) for each MRI volume during data collection. While fMRI signal contains important neural information (termed the blood oxygen-level dependent, or BOLD, signal), it also contains “noise” (termed non-BOLD signal) caused by things like participant motion and changes in breathing. Because the BOLD signal is known to decay at a set rate, collecting multiple echoes allows us to assess whether components of the fMRI signal are BOLD or non-BOLD. For a comprehensive review, see Kundu et al. (2017).
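
The signal at each echo time is commonly modeled as a monoexponential decay, the same model tedana fits later in the pipeline:

S(TE) = S_{0} * e^{\frac{-TE}{T_{2}^{*}}}

where S_{0} is the signal at TE = 0 and T_{2}^{*} is the transverse decay constant. BOLD fluctuations primarily modulate T_{2}^{*}, while many non-BOLD noise sources instead modulate S_{0}; this difference in TE-dependence is what allows the two signal types to be distinguished.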

Why use multi-echo?

Compared to single-echo acquisitions, ME-EPI exhibits higher SNR and improves the statistical power of analyses.

Resources

Journal articles
  • A review on multi-echo fMRI and its applications
  • A spreadsheet cataloguing papers using multi-echo fMRI, with information about acquisition parameters.
Videos
Sequences
  • Multi-echo sequences: who has them and how to get them.

Usage

tedana minimally requires:

  1. Acquired echo times (in milliseconds)
  2. Functional datasets equal to the number of acquired echoes

But you can supply many other options, viewable with tedana -h or t2smap -h.
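
For example, a minimal call with three echo-specific files (the filenames here are placeholders) could look like:

tedana -d echo1.nii.gz echo2.nii.gz echo3.nii.gz -e 15.0 39.0 63.0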

For most use cases, we recommend that users call tedana from within existing fMRI preprocessing pipelines such as fMRIPrep or afni_proc.py. fMRIPrep currently supports Optimal combination through tedana, but not the full multi-echo denoising pipeline, although there are plans underway to integrate it. Users can also construct their own preprocessing pipelines from which to call tedana; for recommendations on doing so, see our general guidelines for Constructing ME-EPI pipelines.

Run tedana

This is the full tedana workflow, which runs multi-echo ICA and outputs multi-echo denoised data along with many other derivatives. To see which files are generated by this workflow, check out the outputs page: https://tedana.readthedocs.io/en/latest/outputs.html

usage: tedana [-h] -d FILE [FILE ...] -e TE [TE ...] [--mask FILE]
              [--mix FILE] [--ctab FILE] [--manacc MANACC] [--sourceTEs STE]
              [--combmode {t2s,ste}] [--verbose] [--tedort]
              [--gscontrol {t1c,gsr} [{t1c,gsr} ...]] [--wvpca]
              [--tedpca {mle,kundu,kundu-stabilize}] [--out-dir OUT_DIR]
              [--seed FIXED_SEED]
required arguments
-d Multi-echo dataset for analysis. May be a single file with spatially concatenated data or a set of echo-specific files, in the same order as the TEs are listed in the -e argument.
-e Echo times (in ms). E.g., 15.0 39.0 63.0
Named Arguments
--mask Binary mask of voxels to include in TE Dependent ANAlysis. Must be in the same space as data.
--mix File containing mixing matrix. If not provided, ME-PCA and ME-ICA are performed.
--ctab File containing a component table from which to extract pre-computed classifications.
--manacc Comma-separated list of manually accepted components.
--sourceTEs

Source TEs for models. E.g., 0 for all, -1 for opt. com., and 1,2 for just TEs 1 and 2. Default=-1.

Default: -1

--combmode

Possible choices: t2s, ste

Combination scheme for TEs: t2s (Posse 1999, default), ste (Poser)

Default: “t2s”

--verbose

Generate intermediate and additional files.

Default: False

--tedort

Orthogonalize rejected components w.r.t. accepted components prior to denoising.

Default: False

--gscontrol

Possible choices: t1c, gsr

Perform additional denoising to remove spatially diffuse noise. Default is None. This argument can be a single value or a space-delimited list.

--wvpca

Perform PCA on wavelet-transformed data

Default: False

--tedpca

Possible choices: mle, kundu, kundu-stabilize

Method with which to select components in TEDPCA

Default: “mle”

--out-dir

Output directory.

Default: “.”

--seed

Value passed to repr(mdp.numx_rand.seed()). Set to an integer value for reproducible ICA results; otherwise, set to -1 for varying results across calls.

Default: 42

Note

The --mask argument is not intended for use with very conservative region-of-interest analyses. One of the ways by which components are assessed as BOLD or non-BOLD is their spatial pattern, so overly conservative masks will invalidate several steps in the tedana workflow. To examine regions-of-interest with multi-echo data, apply masks after TE Dependent ANAlysis.

Run t2smap

This workflow uses multi-echo data to optimally combine data across echoes and to estimate T2* and S0 maps or time series. To see which files are generated by this workflow, check out the workflow documentation: tedana.workflows.t2smap_workflow().

usage: t2smap [-h] -d FILE [FILE ...] -e TE [TE ...] [--mask FILE]
              [--fitmode {all,ts}] [--combmode {t2s,ste}] [--label LABEL]
required arguments
-d Multi-echo dataset for analysis. May be a single file with spatially concatenated data or a set of echo-specific files, in the same order as the TEs are listed in the -e argument.
-e Echo times (in ms). E.g., 15.0 39.0 63.0
Named Arguments
--mask Binary mask of voxels to include in TE Dependent ANAlysis. Must be in the same space as data.
--fitmode

Possible choices: all, ts

Monoexponential model fitting scheme. “all” means that the model is fit, per voxel, across all timepoints. “ts” means that the model is fit separately for each voxel and each timepoint.

Default: “all”

--combmode

Possible choices: t2s, ste

Combination scheme for TEs: t2s (Posse 1999, default), ste (Poser)

Default: “t2s”

--label Label for output directory.
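
As with tedana, a minimal t2smap call (again with placeholder filenames) could look like:

t2smap -d echo1.nii.gz echo2.nii.gz echo3.nii.gz -e 15.0 39.0 63.0 --fitmode ts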

Constructing ME-EPI pipelines

tedana must be called in the context of a larger ME-EPI preprocessing pipeline. Two common pipelines which support ME-EPI processing include fMRIPrep and afni_proc.py.

Users can also construct their own preprocessing pipeline for ME-EPI data from which to call tedana. There are several general principles to keep in mind when constructing ME-EPI processing pipelines.

In general, we recommend

1. Perform slice timing correction and motion correction before tedana

Similarly to single-echo EPI data, slice timing correction allows us to assume that voxels across slices represent roughly simultaneous events. If the TR is slow enough to necessitate slice timing correction (i.e., TR >= 1 sec., as a rule of thumb), then it should be done before tedana, because slice timing differences may impact echo-dependent estimates.

The slice time is generally defined as the excitation pulse time for each slice. For single-echo EPI data, that excitation time would be the same regardless of the echo time, and the same is true when one is collecting multiple echoes after a single excitation pulse. Therefore, we suggest using the same slice timing for all echoes in an ME-EPI series.

2. Perform distortion correction, spatial normalization, smoothing, and any rescaling or filtering after tedana

When preparing ME-EPI data for multi-echo denoising in tedana, it is important not to do anything that mean-shifts the data or otherwise separately scales the voxelwise values at each echo.

For example, head-motion correction parameters should not be calculated and applied at an individual echo level. Instead, we recommend that researchers apply the same transforms to all echoes in an ME-EPI series: that is, calculate head motion correction parameters from one echo and apply the resulting transformation to all echoes.

Similarly, any intensity normalization or nuisance regressors should be applied to the data after tedana calculates the BOLD and non-BOLD weighting of components.

If this is not considered, the resulting intensity gradients (e.g., in the case of scaling) or alignment parameters (e.g., in the case of motion correction or normalization) are likely to differ across echoes, and the subsequent calculation of voxelwise T2* values will be distorted. See the description of tedana’s approach for more details on how T2* values are calculated.

Support and communication

All bugs, concerns and enhancement requests for this software can be submitted here: https://github.com/ME-ICA/tedana/issues.

If you would like to ask a question about usage or tedana’s outputs, please submit a question to NeuroStars with the multi-echo tag.

All previous tedana-related questions are available under the multi-echo tag.

We will also attempt to archive certain common questions and associated answers in the Frequently Asked Questions (FAQ) section below.

FAQ

ICA has failed to converge.

The TEDICA step may fail to converge if TEDPCA is either too strict (i.e., there are too few components) or too lenient (there are too many).

In our experience, this may happen when preprocessing has not been applied to the data, or when improper steps have been applied to the data (e.g., distortion correction, rescaling, nuisance regression). If you are confident that your data have been preprocessed correctly prior to applying tedana, and you encounter this problem, please submit a question to NeuroStars.

I think that some BOLD ICA components have been misclassified as noise.

tedana allows users to manually specify accepted components when calling the pipeline. You can use the --manacc argument to specify the indices of components to accept.

Why isn’t v3.2 of the component selection algorithm supported in tedana?

There is a lot of solid logic behind the updated version of the TEDICA component selection algorithm, first added to the original ME-ICA codebase here by Dr. Prantik Kundu. However, we (the tedana developers) have encountered certain difficulties with this method (e.g., misclassified components) and the method itself has yet to be validated in any papers, posters, etc., which is why we have chosen to archive the v3.2 code, with the goal of revisiting it when tedana is more stable.

Anyone interested in using v3.2 may install an earlier release (<=0.0.4) of tedana.

Processing pipeline details

tedana works by decomposing multi-echo BOLD data via PCA and ICA. These components are then analyzed to determine whether they are TE-dependent or -independent. TE-dependent components are classified as BOLD, while TE-independent components are classified as non-BOLD, and are discarded as part of data cleaning.

In tedana, we take the time series from all the collected TEs, combine them, and decompose the resulting data into components that can be classified as BOLD or non-BOLD. This is performed in a series of steps, including:

  • Principal components analysis
  • Independent components analysis
  • Component classification
_images/tedana-workflow.png

Multi-echo data

Here are the echo-specific time series for a single voxel in an example resting-state scan with 5 echoes.

_images/01_echo_timeseries.png

The values across volumes for this voxel scale with echo time in a predictable manner.

_images/02_echo_value_distributions.png

Adaptive mask generation

Longer echo times are more susceptible to signal dropout, which means that certain brain regions (e.g., orbitofrontal cortex, temporal poles) will only have good signal for some echoes. In order to avoid using bad signal from affected echoes in calculating T_{2}^* and S_{0} for a given voxel, tedana generates an adaptive mask, where the value for each voxel is the number of echoes with “good” signal. When T_{2}^* and S_{0} are calculated below, each voxel’s values are only calculated from the first n echoes, where n is the value for that voxel in the adaptive mask.

_images/03_adaptive_mask.png
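
The counting logic can be sketched conceptually in NumPy; note that tedana’s actual make_adaptive_mask applies a more involved, data-driven threshold, and the array and cutoff below are hypothetical:

import numpy as np

# mean_signal: hypothetical (n_voxels, n_echoes) mean intensity per echo;
# threshold: hypothetical noise floor separating "good" from dropped-out signal
mean_signal = np.abs(np.random.randn(1000, 5)) * 100.0
threshold = 20.0

# Because signal decays with TE, supra-threshold echoes are roughly the
# leading ones; their per-voxel count gives the adaptive mask value n
adaptive_mask = (mean_signal > threshold).sum(axis=1)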

Monoexponential decay model fit

The next step is to fit a monoexponential decay model to the data in order to estimate voxel-wise T_{2}^* and S_0.

In order to make it easier to fit the decay model to the data, tedana transforms the data. The BOLD data are transformed as log(|S|+1), where S is the BOLD signal. The echo times are also multiplied by -1.

_images/04_echo_log_value_distributions.png

A simple line can then be fit to the transformed data with linear regression. For the sake of this introduction, we can assume that the example voxel has good signal in all five echoes (i.e., the adaptive mask has a value of 5 at this voxel), so the line is fit to all available data.

Note

tedana actually performs and uses two sets of T_{2}^*/S_0 model fits. In one case, tedana estimates T_{2}^* and S_0 for voxels with good signal in at least two echoes. The resulting “limited” T_{2}^* and S_0 maps are used throughout most of the pipeline. In the other case, tedana estimates T_{2}^* and S_0 for voxels with good data in only one echo as well, but uses the first two echoes for those voxels. The resulting “full” T_{2}^* and S_0 maps are used to generate the optimally combined data.

_images/05_loglinear_regression.png

The values of interest for the decay model, S_0 and T_{2}^*, are then simple transformations of the line’s intercept (B_{0}) and slope (B_{1}), respectively:

S_{0} = e^{B_{0}}

T_{2}^{*} = \frac{1}{B_{1}}
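
A minimal sketch of this log-linear fit for a single simulated voxel (all values are illustrative):

import numpy as np

# Simulated single-voxel data: signal at five echo times (ms), true T2* = 30 ms
tes = np.array([15.0, 39.0, 63.0, 87.0, 111.0])
signal = 500.0 * np.exp(-tes / 30.0)

# Transform as described above: log(|S| + 1) against negated echo times
x = -tes
y = np.log(np.abs(signal) + 1)

# Fit a line; the slope (B_1) and intercept (B_0) recover T2* and S0
b1, b0 = np.polyfit(x, y, deg=1)
s0 = np.exp(b0)    # S_0 = e^{B_0}
t2star = 1.0 / b1  # T_2^* = 1 / B_1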

The resulting values can be used to show the fitted monoexponential decay model on the original data.

_images/06_monoexponential_decay_model.png

We can also see where T_{2}^* lands on this curve.

_images/07_monoexponential_decay_model_with_t2.png

Optimal combination

Using the T_{2}^* estimates, tedana combines signal across echoes using a weighted average. The echoes are weighted according to the formula

w_{TE} = TE * e^{\frac{-TE}{T_{2}^*}}

The weights are then normalized across echoes. For the example voxel, the resulting weights are:

_images/08_optimal_combination_echo_weights.png
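
A minimal NumPy sketch of this weighting and combination, assuming a single voxel’s (hypothetical) time series across three echoes and its T2* estimate:

import numpy as np

# Hypothetical inputs: echo times (ms), a T2* estimate (ms), and the voxel's
# data as an (n_echoes, n_timepoints) array
tes = np.array([15.0, 39.0, 63.0])
t2star = 30.0
data = 100.0 + np.random.randn(3, 200)

# w_TE = TE * exp(-TE / T2*), normalized to sum to 1 across echoes
weights = tes * np.exp(-tes / t2star)
weights /= weights.sum()

# Optimally combined time series: weighted average across echoes
optcom = (weights[:, None] * data).sum(axis=0)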

The distribution of values for the optimally combined data lands somewhere between the distributions for other echoes.

_images/09_optimal_combination_value_distributions.png

The time series for the optimally combined data also looks like a combination of the other echoes (which it is).

_images/10_optimal_combination_timeseries.png

TEDPCA

The next step is to identify and temporarily remove Gaussian (thermal) noise with TE-dependent principal components analysis (PCA). TEDPCA applies PCA to the optimally combined data in order to decompose it into component maps and time series. Here we can see time series for some example components (we don’t really care about the maps):

_images/11_pca_component_timeseries.png

These components are subjected to component selection, the specifics of which vary according to algorithm.

In the simplest approach, tedana uses Minka’s MLE to estimate the dimensionality of the data, which disregards low-variance components.
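
For illustration, Minka’s MLE is available through scikit-learn’s PCA (tedana’s actual TEDPCA wraps additional logic around the decomposition, and the data shape below is hypothetical):

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data matrix (n_samples, n_features); scikit-learn's MLE
# estimator requires n_samples >= n_features
data = np.random.randn(500, 200)

# n_components="mle" invokes Minka's MLE to choose the dimensionality,
# discarding low-variance components
pca = PCA(n_components="mle", svd_solver="full")
component_timeseries = pca.fit_transform(data)
print(pca.n_components_)  # estimated dimensionality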

A more complicated approach involves applying a decision tree to identify and discard PCA components which, in addition to not explaining much variance, are also not significantly TE-dependent (i.e., have low Kappa) or TE-independent (i.e., have low Rho).

After component selection is performed, the retained components and their associated betas are used to reconstruct the optimally combined data, resulting in a dimensionally reduced (i.e., whitened) version of the dataset.

_images/12_pca_whitened_data.png

TEDICA

Next, tedana applies TE-dependent independent components analysis (ICA) in order to identify and remove TE-independent (i.e., non-BOLD noise) components. The dimensionally reduced optimally combined data are first subjected to ICA in order to fit a mixing matrix to the whitened data.

_images/13_ica_component_timeseries.png

Linear regression is used to fit the component time series to each voxel in each echo from the original, echo-specific data. This way, the thermal noise is retained in the data, but is ignored by the TEDICA process. This results in echo- and voxel-specific betas for each of the components.
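
For illustration (tedana’s actual TEDICA step includes restarts and additional handling), the ICA fit and the per-echo regression might be sketched as:

import numpy as np
from sklearn.decomposition import FastICA

# Hypothetical inputs: whitened data and one echo's original data, both
# shaped (n_timepoints, n_voxels)
reduced = np.random.randn(200, 1000)
echo_data = np.random.randn(200, 1000)

# Fit ICA to the dimensionally reduced data; the recovered sources are
# the component time series
ica = FastICA(n_components=30, max_iter=5000, random_state=42)
sources = ica.fit_transform(reduced)  # (n_timepoints, n_components)

# Least-squares fit of the component time series to every voxel of this
# echo yields echo- and voxel-specific betas (n_components, n_voxels)
betas, *_ = np.linalg.lstsq(sources, echo_data, rcond=None)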

TE-dependence (R_2) and TE-independence (S_0) models can then be fit to these betas. These models allow calculation of F-statistics for the R_2 and S_0 models (referred to as \kappa and \rho, respectively).

_images/14_te_dependence_models_component_0.png _images/14_te_dependence_models_component_1.png _images/14_te_dependence_models_component_2.png

A decision tree is applied to \kappa, \rho, and other metrics in order to classify ICA components as TE-dependent (BOLD signal), TE-independent (non-BOLD noise), or neither (to be ignored). The actual decision tree is dependent on the component selection algorithm employed. tedana includes two options: kundu_v2_5 (which uses hardcoded thresholds applied to each of the metrics) and kundu_v3_2 (which trains a classifier to select components).

_images/15_denoised_data_timeseries.png

Removal of spatially diffuse noise (optional)

Due to the constraints of ICA, MEICA is able to identify and remove spatially localized noise components, but it cannot identify components that are spread throughout the whole brain. See Power et al. (2018) for more information about this issue. One of several post-processing strategies may be applied to the ME-DN or ME-HK datasets in order to remove spatially diffuse (ostensibly respiration-related) noise. Methods which have been employed in the past include global signal regression (GSR), T1c-GSR, anatomical CompCor, Go Decomposition (GODEC), and robust PCA.

_images/16_t1c_denoised_data_timeseries.png

Outputs of tedana

tedana derivatives

Filename Content
t2sv.nii Limited estimated T2* 3D map. The difference between the limited and full maps is that, for voxels affected by dropout where only one echo contains good data, the full map uses the single echo’s value while the limited map has a NaN.
s0v.nii Limited S0 3D map. The difference between the limited and full maps is that, for voxels affected by dropout where only one echo contains good data, the full map uses the single echo’s value while the limited map has a NaN.
ts_OC.nii Optimally combined time series.
dn_ts_OC.nii Denoised optimally combined time series. Recommended dataset for analysis.
lowk_ts_OC.nii Combined time series from rejected components.
midk_ts_OC.nii Combined time series from “mid-k” rejected components.
hik_ts_OC.nii High-kappa time series. This dataset does not include thermal noise or low variance components. Not the recommended dataset for analysis.
comp_table_pca.txt TEDPCA component table. A tab-delimited file with summary metrics and inclusion/exclusion information for each component from the PCA decomposition.
mepca_mix.1D Mixing matrix (component time series) from PCA decomposition.
meica_mix.1D Mixing matrix (component time series) from ICA decomposition. The only differences between this mixing matrix and the one above are that components may be sorted differently and signs of time series may be flipped.
betas_OC.nii Full ICA coefficient feature set.
betas_hik_OC.nii High-kappa ICA coefficient feature set
feats_OC2.nii Z-normalized spatial component maps
comp_table_ica.txt TEDICA component table. A tab-delimited file with summary metrics and inclusion/exclusion information for each component from the ICA decomposition.

If verbose is set to True:

Filename Content
t2ss.nii Voxel-wise T2* estimates using ascending numbers of echoes, starting with 2.
s0vs.nii Voxel-wise S0 estimates using ascending numbers of echoes, starting with 2.
t2svG.nii Full T2* map/time series. The difference between the limited and full maps is that, for voxels affected by dropout where only one echo contains good data, the full map uses the single echo’s value while the limited map has a NaN. Only used for optimal combination.
s0vG.nii Full S0 map/time series. Only used for optimal combination.
__meica_mix.1D Mixing matrix (component time series) from ICA decomposition.
hik_ts_e[echo].nii High-Kappa time series for echo number echo
midk_ts_e[echo].nii Mid-Kappa time series for echo number echo
lowk_ts_e[echo].nii Low-Kappa time series for echo number echo
dn_ts_e[echo].nii Denoised time series for echo number echo

If gscontrol includes ‘gsr’:

Filename Content
T1gs.nii Spatial global signal
glsig.1D Time series of global signal from optimally combined data.
tsoc_orig.nii Optimally combined time series with global signal retained.
tsoc_nogs.nii Optimally combined time series with global signal removed.

If gscontrol includes ‘t1c’:

Filename Content
sphis_hik.nii T1-like effect
hik_ts_OC_T1c.nii T1 corrected high-kappa time series by regression
dn_ts_OC_T1c.nii T1 corrected denoised time series
betas_hik_OC_T1c.nii T1-GS corrected high-kappa components
meica_mix_T1c.1D T1-GS corrected mixing matrix

Component tables

TEDPCA and TEDICA use tab-delimited tables to track relevant metrics, component classifications, and rationales behind classifications. TEDPCA rationale codes start with a “P”, while TEDICA codes start with an “I”.

Classification Description
accepted BOLD-like components retained in denoised and high-Kappa data
rejected Non-BOLD components removed from denoised and high-Kappa data
ignored Low-variance components ignored in denoised, but not high-Kappa, data
TEDPCA codes
Code Classification Description
P001 rejected Low Rho, Kappa, and variance explained
P002 rejected Low variance explained
P003 rejected Kappa equals fmax
P004 rejected Rho equals fmax
P005 rejected Cumulative variance explained above 95% (only in stabilized PCA decision tree)
P006 rejected Kappa below fmin (only in stabilized PCA decision tree)
P007 rejected Rho below fmin (only in stabilized PCA decision tree)
TEDICA codes
Code Classification Description
I001 rejected Manual exclusion
I002 rejected Rho greater than Kappa or more significant voxels in S0 model than R2 model
I003 rejected S0 Dice is higher than R2 Dice and high variance explained
I004 rejected Noise F-value is higher than signal F-value and high variance explained
I005 ignored No good components found
I006 rejected Mid-Kappa component
I007 ignored Low variance explained
I008 rejected Artifact candidate type A
I009 rejected Artifact candidate type B
I010 ignored ign_add0
I011 ignored ign_add1

Visual reports

We’re working on it.

Contributing to tedana

This document explains contributing to tedana at a very high level, with a focus on project governance and development philosophy. For a more practical guide to tedana development, please see our contributing guide.

Governance

Governance is a hugely important part of any project. It is especially important to have clear process and communication channels for open source projects that rely on a distributed network of volunteers, such as tedana.

tedana is currently supported by a small group of five core developers. Even with only five members involved in decision making processes, we’ve found that setting expectations and communicating a shared vision has great value.

By starting the governance structure early in our development, we hope to welcome more people into the contributing team. We are committed to continuing to update the governance structures as necessary. Every member of the tedana community is encouraged to comment on these processes and suggest improvements.

As the first interim Benevolent Dictator for Life (BDFL), Elizabeth DuPre is ultimately responsible for any major decisions pertaining to tedana development. However, all potential changes are explicitly and openly discussed in the described channels of communication, and we strive for consensus amongst all community members.

Code of conduct

All tedana community members are expected to follow our code of conduct during any interaction with the project. That includes—but is not limited to—online conversations, in-person workshops or development sprints, and when giving talks about the software.

As stated in the code, severe or repeated violations by community members may result in exclusion from collective decision-making and rejection of future contributions to the tedana project.

tedana’s development philosophy

In contributing to any open source project, we have found that it is hugely valuable to understand the core maintainers’ development philosophy. In order to aid other contributors in on-boarding to tedana development, we have therefore laid out our shared opinion on several major decision points. These are:

  1. Which options are available to users?
  2. Structuring project developments
  3. Is tedana backwards compatible with MEICA?
  4. How does tedana future-proof its development?
  5. When to release a new version
Which options are available to users?

The tedana developers are committed to providing useful and interpretable outputs for a majority of use cases.

In doing so, we have made a decision to embrace defaults which support the broadest base of users. For example, the choice of an independent component analysis (ICA) cost function is a part of the tedana pipeline that can have a significant impact on the results, yet one that individual researchers will find difficult to form an opinion on.

The tedana “opinionated approach” is therefore to provide reasonable defaults and to hide some options from the top level workflows.

This decision has two key benefits:

  1. By default, users should get high quality results from running the pipelines, and
  2. The work required of the tedana developers to maintain the project is more focused and somewhat restricted.

It is important to note that tedana is shipped under an LGPL2 license which means that the code can—at all times—be cloned and re-used by anyone for any purpose.

“Power users” will always be able to access and extend all of the options available. We encourage those users to feed back their work into tedana development, particularly if they have good evidence for updating the default values.

We understand that it is possible to build the software to provide more options within the existing framework, but we have chosen to focus on the 80 percent use cases.

You can provide feedback on this philosophy through any of the channels listed on the tedana support page.

Structuring project developments

The tedana developers have chosen to structure ongoing development around specific goals. When implemented successfully, this focuses the direction of the project and helps new contributors prioritize what work needs to be completed.

We have outlined our goals for tedana in The tedana roadmap, which we encourage all contributors to read and give feedback on. Feedback can be provided through any of the channels listed on our support page.

In order to more directly map between The tedana roadmap and ongoing project issues, we have also created milestones in our GitHub repository.

This allows us to:

  1. Label individual issues as supporting specific aims, and
  2. Measure progress towards each aim’s concrete deliverable(s).
Is tedana backwards compatible with MEICA?

The short answer is No.

There are two main reasons why. The first is that mdp, the Python library used to run the ICA decomposition at the core of the original MEICA method, is no longer supported.

In November 2018, the tedana developers made the decision to switch to scikit-learn to perform these analyses. scikit-learn is well supported and under long term development. tedana will be more stable and have better performance going forwards as a result of this switch, but it also means that exactly reproducing previous MEICA analyses is not possible.

The other reason is that the core developers have chosen to look forwards rather than maintaining an older code base. As described in the Governance section, tedana is maintained by a small team of volunteers with limited development time. If you’d like to use MEICA as previously published, the code is available on Bitbucket and freely available under an LGPL2 license.

How does tedana future-proof its development?

tedana is a reasonably young project that is run by volunteers. No one involved in the development is paid for their time. In order to focus our limited time, we have made the decision to not let future possibilities limit or over-complicate the most immediately required features. That is, to not let the perfect be the enemy of the good.

While this stance will almost certainly yield ongoing refactoring as the scope of the software expands, the team’s commitment to transparency, reproducibility, and extensive testing mean that this work should be relatively manageable.

We hope that the lessons we learn building something useful in the short term will be applicable in the future as other needs arise.

When to release a new version

In the broadest sense, we have adopted a “you know it when you see it” approach to releasing new versions of the software.

To try to be more concrete, if a change to the project substantially changes the user’s experience of working with tedana, we recommend releasing an updated version. Additional functionality and bug fixes are very clear opportunities to release updated versions, but there will be many other reasons to update the software as hosted on PyPI.

To give two concrete examples of slightly less obvious cases:

1. A substantial update to the documentation that makes tedana easier to use would count as a substantial change to tedana and a new release should be considered.

2. In contrast, updating code coverage with additional unit tests does not affect the user’s experience with tedana and therefore does not require a new release.

Any member of the tedana community can propose that a new version is released. They should do so by opening an issue recommending a new release and giving a 1-2 sentence explanation of why the changes are sufficient to update the version. More information about what is required for a release to proceed is available in the Release Checklist.

Release Checklist

This is the checklist of items that must be completed when cutting a new release of tedana. These steps can only be completed by a project maintainer, but they are a good resource for releasing your own Python projects!

  1. All continuous integration must be passing and docs must be building successfully.
  2. Create a new release, following the GitHub guide for creating a release. Release-drafter should have already drafted release notes listing all changes since the last release; check to make sure these are correct.
  3. Pulling from the master branch, locally build a new copy of tedana and upload it to PyPI.

We have set up tedana so that releases automatically mint a new DOI with Zenodo; a guide for doing this integration is available here.

The tedana roadmap

Project vision

ME-EPI processing is not well integrated into major preprocessing packages, yielding duplicated and unmaintained code. tedana has been developed to address this need and will serve as a central repository for standard ME-EPI denoising as well as a testing ground for novel ME-EPI denoising methods. This will jointly reduce the external burden on pipeline maintainers, facilitate increased ME-EPI adoption, and enable future development in ME-EPI denoising.

Metrics of success and corresponding milestones

We will know that we have been successful in creating tedana when we have succeeded in providing several concrete deliverables, which can be broadly categorized into:

  1. Documentation,
  2. Transparent and reproducible processing,
  3. Testing,
  4. Workflow integration: AFNI,
  5. Method extensions & improvements, and
  6. Developing a healthy community

Each deliverable has been synthesized into a milestone that gives the tedana community a link between the issues and the high level vision for the project.

Documentation

Summary: One long-standing concern with ME-EPI denoising has been the availability of documentation for the method outside of published scientific papers. To address this, we have created a ReadTheDocs site; however, there are still several sections either explicitly marked as “#TODO” or otherwise missing crucial information.

We are committed to providing helpful documentation for all users of tedana. One metric of success, then, is to develop documentation that includes:

  1. Motivations for conducting echo time dependent analysis,
  2. A collection of key ME-EPI references and acquisition sequences from the published literature,
  3. Tutorials on how to use tedana,
  4. The different processing steps that are conducted in each workflow,
  5. An up-to-date description of the API,
  6. A transparent explanation of the different decisions that are made through the tedana pipeline, and
  7. Where to seek support

Associated Milestone

This milestone will close when the online documentation contains the minimum necessary information to orient a complete newcomer to ME-EPI, both on the theoretical basis of the method as well as the practical steps used in ME-EPI denoising.

Transparent and reproducible processing

Summary: Alongside the lack of existing documentation, there is a general unfamiliarity with how selection criteria are applied to individual data sets. This lack of transparency, combined with the non-deterministic nature of the decomposition, has generated significant uncertainty when interpreting results.

In order to build and maintain confidence in ME-EPI processing, any analysis software—including tedana—must provide enough information such that the user is empowered to conduct transparent and reproducible analyses. This will permit clear reporting of the ME-EPI results in published studies and facilitate a broader conversation in the scientific community on the nature of ME-EPI processing.

We are therefore committed to making tedana analysis transparent and reproducible such that we report back all processing steps applied to any individual data set, including the specific selection criteria used in making denoising decisions. This, combined with the reproducibility afforded by seeding all non-deterministic steps, will enable both increased confidence and better reporting of ME-EPI results.

A metric of success for tedana then, should be enhancements to the code such that:

  1. Non-deterministic steps are made reproducible by enabling access to a “seed value”, and
  2. The decision process for individual component data is made accessible to the end user.

Associated Milestone

This milestone will close when the internal decision-making process for component selection is made accessible to the end user, and an analysis can be reproduced by an independent researcher who has access to the same data.

Testing

Summary: Historically, the lack of testing for ME-EPI analysis pipelines has prevented new developers from engaging with the code for fear of silently breaking or otherwise degrading the existing implementation. Moving forward, we want to grow an active development community, where developers feel empowered to explore new enhancements to the tedana code base.

One means to ensure that new code does not introduce bugs is through extensive testing. We are therefore committed to implementing high test coverage at both the unit test and integration test levels; that is, both in testing individual functions and broader workflows, respectively.

A metric of success should thus be:

  1. Achieving 90% test coverage for unit tests, as well as
  2. Three distinguishable integration tests over a range of possible acquisition conditions.

Associated Milestone

This milestone will close when we have 90% test coverage for unit tests and three distinguishable integration tests, varying the number of echoes and acquisition type (i.e., task vs. rest).

Workflow integration: AFNI

Summary: Currently, afni_proc.py distributes an older version of tedana, around which they have built a wrapper script, tedana_wrapper.py, to ensure compatibility. AFNI users at this point are therefore not accessing the latest version of tedana. We will grow our user base if tedana can be accessed through AFNI, and we are therefore committed to supporting native integration of tedana in AFNI.

One metric of success, therefore, will be if we can demonstrate sufficient stability and support such that the afni_proc.py maintainers are willing to switch to tedana as the recommended method of accessing ME-EPI denoising in AFNI. We will aim to aid in this process by increasing compatibility between tedana and the afni_proc.py workflow, eliminating the need for an additional wrapper script. For example, tedana could directly accept BRIK/HEAD files, facilitating interoperability with other AFNI pipelines.

Associated Milestone

This milestone will close when tedana is stable enough such that the recommended default in afni_proc.py is to access ME-EPI denoising via pip install tedana, rather than maintaining the alternative version that is currently used.

Workflow integration: BIDS

Summary: Currently, the BIDS ecosystem has limited support for ME-EPI processing. We will grow our user base if tedana is integrated into existing BIDS Apps and therefore accessible to members of the BIDS community. One promising opportunity is if tedana can be used natively in FMRIPrep. Some of this work will not happen in this repository, but other changes will need to happen here; for example, making sure the outputs are BIDS compliant.

A metric of success, then, will be:

  1. Fully integrating tedana into FMRIPrep, and
  2. Making tedana outputs compliant with the BIDS derivatives specification.

Associated Milestone

This milestone will close when the denoising steps of tedana are stable enough to integrate into FMRIPrep and the FMRIPrep project is updated to process ME-EPI scans.

Method extensions & improvements

Summary: Overall, each of the listed deliverables will support a broader goal: to improve on ME-EPI processing itself. This is an important research question and will advance the state-of-the-art in ME-EPI processing.

A metric of success here would be:

  • EITHER integrating a new decomposition method beyond ICA,
  • OR validating new selection criteria.

To achieve either of these metrics, it is likely that we will need to incorporate a quality-assurance module into tedana, possibly as visual reports.

Associated Milestone

This milestone will close when the codebase is stable enough to integrate novel methods into tedana, and that happens!

Developing a healthy community

Summary: In developing tedana, we are committed to fostering a healthy community. A healthy community is one in which the maintainers are happy and not overworked, and which empowers users to contribute back to the project. By making tedana stable and well-documented, with enough modularity to integrate improvements, we will enable new contributors to feel that their work is welcomed.

We therefore have one additional metric of success:

  1. An outside contributor integrates an improvement to ME-EPI denoising.

Associated Milestone

This milestone will probably never close, but will serve to track issues related to building and supporting the tedana community.

API

tedana.workflows: Common workflows

tedana.workflows
tedana.workflows.tedana_workflow(data, tes) Run the “canonical” TE-Dependent ANAlysis workflow.
tedana.workflows.t2smap_workflow(data, tes) Estimate T2 and S0, and optimally combine data across TEs.
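
A hypothetical call of the canonical workflow from Python (the filenames are placeholders; data and tes follow the signatures listed above):

from tedana.workflows import tedana_workflow

# Echo-specific files and echo times in milliseconds
tedana_workflow(
    data=['echo1.nii.gz', 'echo2.nii.gz', 'echo3.nii.gz'],
    tes=[15.0, 39.0, 63.0],
)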

tedana.model: Modeling TE-dependence

tedana.model
tedana.model.fitmodels_direct(catd, mmix, …) Fit TE-dependence and -independence models to components.
tedana.model.fit Fit models.

tedana.decomposition: Data decomposition

tedana.decomposition
tedana.decomposition.tedpca(catd, OCcatd, …) Use principal components analysis (PCA) to identify and remove thermal noise from multi-echo data.
tedana.decomposition.tedica(n_components, …) Performs ICA on dd and returns mixing matrix
tedana.decomposition._utils Utility functions for tedana decomposition

tedana.combine: Combine time series

Functions to optimally combine data across echoes.

tedana.combine Functions to optimally combine data across echoes.
tedana.combine.make_optcom(data, tes, mask) Optimally combine BOLD data across TEs.

tedana.decay: Signal decay

Functions to estimate S0 and T2* from multi-echo data.

tedana.decay Functions to estimate S0 and T2* from multi-echo data.
tedana.decay.fit_decay(data, tes, mask, masksum) Fit voxel-wise monoexponential decay models to data
tedana.decay.fit_decay_ts(data, tes, mask, …) Fit voxel- and timepoint-wise monoexponential decay models to data

tedana.selection: Component selection

tedana.selection
tedana.selection.selcomps(seldict, …) Classify components in seldict as “accepted,” “rejected,” “midk,” or “ignored.”
tedana.selection._utils Utility functions for tedana.selection

tedana.io: Reading and writing data

Functions to handle file input/output

tedana.io Functions to handle file input/output
tedana.io.split_ts(data, mmix, mask, acc) Splits data time series into accepted component time series and remainder
tedana.io.ctabsel(ctabfile) Loads a pre-existing component table file
tedana.io.filewrite(data, filename, ref_img) Writes data to filename in format of ref_img
tedana.io.gscontrol_mmix(optcom_ts, mmix, …) Perform global signal regression.
tedana.io.load_data(data[, n_echos]) Coerces input data files to required 3D array output
tedana.io.new_nii_like(ref_img, data[, …]) Coerces data into NiftiImage format like ref_img
tedana.io.write_split_ts(data, mmix, mask, …) Splits data into denoised / noise / ignored time series and saves to disk
tedana.io.writect
tedana.io.writefeats(data, mmix, mask, ref_img) Converts data to component space with mmix and saves to disk
tedana.io.writeresults(ts, mask, comptable, …) Denoises ts and saves all resulting files to disk
tedana.io.writeresults_echoes(catd, mmix, …) Saves individually denoised echos to disk

tedana.utils: Utility functions

Utilities for tedana package

tedana.utils Utilities for tedana package
tedana.utils.andb(arrs) Sums arrays in arrs
tedana.utils.dice(arr1, arr2) Compute Dice’s similarity index between two numpy arrays.
tedana.utils.fitgaussian(data) Returns estimated gaussian parameters of a 2D distribution found by a fit
tedana.utils.gaussian(height, center_x, …) Returns gaussian function
tedana.utils.get_dtype(data) Determines neuroimaging format of data
tedana.utils.getfbounds(n_echos) Gets F-statistic boundaries based on number of echos
tedana.utils.load_image(data) Takes input data and returns a sample x time array
tedana.utils.make_adaptive_mask(data[, …]) Makes map of data specifying longest echo a voxel can be sampled with
tedana.utils.make_min_mask(data[, roi]) Generates a 3D mask of data
tedana.utils.moments(data) Returns gaussian parameters of a 2D distribution by calculating its moments
tedana.utils.unmask(data, mask) Unmasks data using non-zero entries of mask
