
ParcelExtract

Extract time-series signals from 4D neuroimaging data using brain parcellation schemes.

ParcelExtract is a Python package and command-line tool for extracting regional time-series signals from 4D neuroimaging data (e.g., fMRI) using brain atlases. It supports multiple extraction strategies and provides BIDS-compliant outputs for seamless integration into neuroimaging analysis pipelines.


🚀 Features

  • Multiple Extraction Strategies: Mean, median, PCA, and weighted mean signal extraction
  • TemplateFlow Integration: Automatic downloading of standard brain atlases
  • BIDS-Compliant Outputs: TSV time-series files with JSON sidecar metadata
  • Flexible Atlas Support: Use TemplateFlow atlases or custom parcellation files
  • Command-Line Interface: Easy-to-use CLI for batch processing and scripting
  • Python API: Integrate directly into your analysis workflows
  • Comprehensive Testing: 110 tests with 89% code coverage

📦 Installation

Using uv (Recommended)

# Install from source
git clone <repository-url>
cd parcelextract
uv sync

Using pip

# Install from source
git clone <repository-url>
cd parcelextract
pip install -e .

Dependencies

ParcelExtract requires Python 3.12+ and the following packages:

  • nibabel ≥3.2.0 (neuroimaging file I/O)
  • nilearn ≥0.10.0 (neuroimaging data manipulation)
  • numpy ≥1.20.0 (numerical operations)
  • pandas ≥1.3.0 (data structuring)
  • scipy ≥1.7.0 (scientific computing)
  • templateflow ≥0.8.0 (brain atlas management)

🔧 Quick Start

Command-Line Usage

Extract time-series using a TemplateFlow atlas:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --desc 400Parcels17Networks \
    --output-dir results/ \
    --strategy mean \
    --verbose

Extract using a custom atlas file:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas /path/to/custom_atlas.nii.gz \
    --output-dir results/ \
    --strategy median

Python API Usage

from parcelextract.core.extractor import ParcelExtractor

# Initialize extractor with atlas and strategy
extractor = ParcelExtractor(
    atlas='/path/to/atlas.nii.gz',
    strategy='mean'
)

# Extract time-series from 4D image
timeseries = extractor.fit_transform('/path/to/bold.nii.gz')

# timeseries is a 2D array: (n_parcels, n_timepoints)
print(f"Extracted {timeseries.shape[0]} parcels, {timeseries.shape[1]} timepoints")

With TemplateFlow atlas:

from parcelextract.atlases.templateflow import TemplateFlowManager
from parcelextract.core.extractor import ParcelExtractor

# Download atlas from TemplateFlow
tf_manager = TemplateFlowManager()
atlas_path = tf_manager.get_atlas(
    'Schaefer2018', 
    space='MNI152NLin2009cAsym',
    desc='400Parcels17Networks'
)

# Use with extractor
extractor = ParcelExtractor(atlas=atlas_path, strategy='pca')
timeseries = extractor.fit_transform('sub-01_task-rest_bold.nii.gz')

📖 Usage Guide

Command-Line Interface

The parcelextract command provides a complete extraction pipeline:

parcelextract [OPTIONS]

Required Arguments

  • --input PATH: Path to input 4D NIfTI file (.nii or .nii.gz)
  • --atlas ATLAS: Atlas specification (TemplateFlow name or file path)
  • --output-dir PATH: Output directory for results

Optional Arguments

  • --strategy {mean,median,pca,weighted_mean}: Signal extraction strategy (default: mean)
  • --space SPACE: Template space for TemplateFlow atlases (default: MNI152NLin2009cAsym)
  • --desc DESC: Atlas description/variant (e.g., 400Parcels17Networks)
  • --verbose: Enable verbose output
  • --help: Show help message
  • --version: Show version information

Examples

Basic extraction with TemplateFlow atlas:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --output-dir derivatives/parcelextract/

Specify atlas variant:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --desc 800Parcels7Networks \
    --output-dir derivatives/parcelextract/ \
    --strategy median

Use different template space:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas AAL \
    --space MNI152NLin6Asym \
    --output-dir derivatives/parcelextract/

Custom atlas file:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas /path/to/my_custom_atlas.nii.gz \
    --output-dir derivatives/parcelextract/ \
    --strategy pca

Supported Atlases

ParcelExtract supports atlases from TemplateFlow and custom atlas files:

TemplateFlow Atlases

  • Schaefer2018: Multi-resolution cortical parcellations
  • AAL: Automated Anatomical Labeling atlas
  • HarvardOxford: Harvard-Oxford cortical atlas

Custom Atlases

  • Any 3D NIfTI file (.nii or .nii.gz) with integer labels
  • Labels should start from 1 (background = 0 is ignored)
  • Must be in the same space as your input data
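
A quick sanity check of these conventions can be sketched with NumPy. The label array below is synthetic; in practice you would load your atlas with nibabel (e.g. `nib.load(...).get_fdata()`):

```python
import numpy as np

# Synthetic 3D label array standing in for atlas voxel data.
data = np.zeros((4, 4, 4))
data[0, 0, 0] = 1.0   # parcel 1
data[1, 1, 1] = 2.0   # parcel 2

assert data.ndim == 3                          # atlas must be 3D
labels = np.unique(data)
assert np.allclose(labels, np.round(labels))   # labels must be integers
parcel_labels = labels[labels > 0]             # background (0) is ignored
print(f"{len(parcel_labels)} parcels, labels start at {int(parcel_labels.min())}")
```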

Extraction Strategies

  1. Mean (default): Average signal across all voxels in each parcel
  2. Median: Median signal across voxels (robust to outliers)
  3. PCA: First principal component of voxel signals
  4. Weighted Mean: Probability-weighted average (for probabilistic atlases)
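
A conceptual sketch of the four strategies on a toy voxel array (this illustrates the math, not ParcelExtract's internal implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
voxels = rng.standard_normal((50, 200))   # (n_voxels, n_timepoints) for one parcel

mean_ts = voxels.mean(axis=0)             # 1. mean across voxels
median_ts = np.median(voxels, axis=0)     # 2. median across voxels

# 3. PCA: first right singular vector of the row-centered data
centered = voxels - voxels.mean(axis=1, keepdims=True)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pca_ts = vt[0]

# 4. Weighted mean with hypothetical per-voxel probabilities
weights = rng.uniform(0, 1, size=50)
weighted_ts = (weights[:, None] * voxels).sum(axis=0) / weights.sum()

print(mean_ts.shape, median_ts.shape, pca_ts.shape, weighted_ts.shape)
# (200,) (200,) (200,) (200,)
```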

Output Format

ParcelExtract generates output files that follow BIDS derivative naming conventions:

Time-series File (TSV)

sub-01_task-rest_atlas-Schaefer2018_desc-400Parcels17Networks_timeseries.tsv

Content:

parcel_0    parcel_1    parcel_2    ...
-0.142      0.256       -0.089      ...
0.031       -0.124      0.198       ...
...         ...         ...         ...

Metadata File (JSON)

sub-01_task-rest_atlas-Schaefer2018_desc-400Parcels17Networks_timeseries.json

Content:

{
    "extraction_strategy": "mean",
    "atlas": "Schaefer2018",
    "n_parcels": 400,
    "n_timepoints": 200,
    "input_file": "/path/to/sub-01_task-rest_bold.nii.gz"
}
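
Because the outputs are plain TSV and JSON, they can be read back with pandas and the standard library. The snippet below inlines sample content instead of reading files:

```python
import io
import json

import pandas as pd

tsv_text = "parcel_0\tparcel_1\n-0.142\t0.256\n0.031\t-0.124\n"
ts = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(ts.shape)   # (2, 2): rows are timepoints, columns are parcels

meta = json.loads('{"extraction_strategy": "mean", "n_parcels": 400}')
print(meta["extraction_strategy"])   # mean
```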

🐍 Python API Reference

ParcelExtractor Class

The main class for time-series extraction:

from parcelextract.core.extractor import ParcelExtractor

extractor = ParcelExtractor(atlas, strategy='mean')

Parameters

  • atlas (str or Path): Path to atlas NIfTI file
  • strategy (str): Extraction strategy ('mean', 'median', 'pca', 'weighted_mean')

Methods

fit_transform(img_4d): Extract time-series from a 4D image.

  • Parameters: img_4d (str, Path, or nibabel image): 4D neuroimaging data
  • Returns: numpy.ndarray (n_parcels × n_timepoints): extracted time-series

# From file path
timeseries = extractor.fit_transform('data.nii.gz')

# From nibabel image object
import nibabel as nib
img = nib.load('data.nii.gz')
timeseries = extractor.fit_transform(img)

TemplateFlow Integration

Access TemplateFlow atlases programmatically:

from parcelextract.atlases.templateflow import TemplateFlowManager

tf_manager = TemplateFlowManager()

# Download atlas
atlas_path = tf_manager.get_atlas(
    atlas_name='Schaefer2018',
    space='MNI152NLin2009cAsym',
    desc='400Parcels17Networks'
)

# Use with extractor
extractor = ParcelExtractor(atlas=atlas_path)

I/O Utilities

Save results programmatically:

from parcelextract.io.writers import write_timeseries_tsv, write_json_sidecar

# Save time-series to TSV
write_timeseries_tsv(timeseries, 'output_timeseries.tsv')

# Save metadata to JSON
metadata = {
    'extraction_strategy': 'mean',
    'atlas': 'Schaefer2018',
    'n_parcels': timeseries.shape[0],
    'n_timepoints': timeseries.shape[1]
}
write_json_sidecar(metadata, 'output_timeseries.json')

🔍 Advanced Usage

Batch Processing

Process multiple subjects using shell scripting:

#!/bin/bash

# Process all subjects in BIDS dataset
for subject in sub-*; do
    for session in ${subject}/ses-*; do
        for run in ${session}/func/*task-rest*_bold.nii.gz; do
            parcelextract \
                --input "${run}" \
                --atlas Schaefer2018 \
                --desc 400Parcels17Networks \
                --output-dir derivatives/parcelextract/"${subject}"/
        done
    done
done

Integration with Python Workflows

from pathlib import Path

from parcelextract.core.extractor import ParcelExtractor
from parcelextract.io.writers import write_timeseries_tsv

def process_subject(subject_dir, atlas_path, output_dir):
    """Process all functional runs for a subject."""
    extractor = ParcelExtractor(atlas=atlas_path, strategy='mean')
    
    # Find all BOLD files
    bold_files = subject_dir.glob('**/*_bold.nii.gz')
    
    results = {}
    for bold_file in bold_files:
        print(f"Processing {bold_file.name}...")
        
        # Extract time-series
        timeseries = extractor.fit_transform(bold_file)
        
        # Generate output path
        output_stem = bold_file.stem.replace('.nii', '')
        output_file = output_dir / f"{output_stem}_atlas-custom_timeseries.tsv"
        
        # Save results
        write_timeseries_tsv(timeseries, output_file)
        results[bold_file.name] = timeseries
    
    return results

# Usage
results = process_subject(
    subject_dir=Path('sub-01'),
    atlas_path='custom_atlas.nii.gz',
    output_dir=Path('derivatives/parcelextract/sub-01')
)

🛠️ Development

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=parcelextract

# Run specific test file
uv run pytest tests/test_extractor.py

Code Quality

# Format code
uv run ruff format parcelextract

# Lint code
uv run ruff check parcelextract

# Type checking
uv run mypy parcelextract

Project Structure

parcelextract/
├── src/parcelextract/          # Main package
│   ├── core/                   # Core extraction logic
│   │   ├── extractor.py        # Main ParcelExtractor class
│   │   ├── strategies.py       # Extraction strategies
│   │   └── validators.py       # Input validation
│   ├── io/                     # Input/output operations
│   │   ├── readers.py          # File reading utilities
│   │   └── writers.py          # Output generation
│   ├── atlases/                # Atlas management
│   │   ├── manager.py          # Atlas loading
│   │   └── templateflow.py     # TemplateFlow integration
│   └── cli/                    # Command-line interface
│       └── main.py             # CLI entry point
├── tests/                      # Test suite
├── docs/                       # Documentation
└── pyproject.toml              # Project configuration

❓ FAQ

Q: What input formats are supported?
A: ParcelExtract supports 4D NIfTI files (.nii and .nii.gz) as input. The data should be preprocessed and in standard space if using TemplateFlow atlases.

Q: Can I use my own custom atlas?
A: Yes! Any 3D NIfTI file with integer labels can be used as an atlas. Labels should start from 1 (background = 0 is ignored).

Q: Which extraction strategy should I use?
A: 'mean' is the most common choice. Use 'median' for robustness to outliers, 'pca' for dimensionality reduction, or 'weighted_mean' for probabilistic atlases.

Q: How do I handle missing or corrupted parcels?
A: ParcelExtract automatically handles empty parcels by returning NaN values. Check your extraction results for NaN values and investigate the corresponding parcels in your atlas.

Q: Can I extract signals from only specific parcels?
A: Currently, ParcelExtract extracts signals from all parcels in the atlas. You can post-process the results to select specific parcels of interest.
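
The post-processing described in the two answers above can be sketched with NumPy; timeseries is a synthetic stand-in for an extractor result:

```python
import numpy as np

# Fake (n_parcels, n_timepoints) result with one empty parcel (all NaN).
timeseries = np.random.default_rng(0).standard_normal((5, 100))
timeseries[3] = np.nan

# Find parcels that came back empty
empty = np.where(np.isnan(timeseries).all(axis=1))[0]
print(f"empty parcels: {empty}")   # empty parcels: [3]

# Select only specific parcels of interest
roi_indices = [0, 2, 4]
subset = timeseries[roi_indices]
print(subset.shape)                # (3, 100)
```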

Q: Is ParcelExtract BIDS-compliant?
A: ParcelExtract generates BIDS-inspired output filenames and metadata. While not fully BIDS-compliant, the outputs follow BIDS naming conventions for derivatives.

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details on:

  • Reporting bugs
  • Requesting features
  • Submitting pull requests
  • Code style and testing requirements

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📚 Citation

If you use ParcelExtract in your research, please cite:

@software{parcelextract2025,
  title={ParcelExtract: Time-series extraction from neuroimaging data},
  author={Your Name},
  year={2025},
  url={https://github.com/yourusername/parcelextract}
}

🆘 Support

  • Documentation: [Link to full documentation]
  • Issues: Report bugs and feature requests on GitHub Issues
  • Discussions: Ask questions on GitHub Discussions

ParcelExtract: Making neuroimaging time-series extraction simple, standardized, and reproducible.

Prompts

This project was developed using a prompt-driven approach with Claude. The following prompts were used to generate the project requirements and development guidelines:

PRD Prompt

Help me create a Project Requirement Document (PRD) for a Python module called parcelextract that will take in a 4-dimensional Nifti brain image and extract signal from clusters defined by a specified brain parcellation, saving it to a text file accompanied by a json sidecar file containing relevant metadata. The tool should leverage existing packages such as nibabel, nilearn, and templateflow, and should follow the BIDS standard for file naming as closely as possible. The code should be written in a clean and modular way, using a test-driven development framework.

CLAUDE.md Prompt

Generate a CLAUDE.md file from the attached PRD that will guide Claude Code sessions on this project. Add the following additional guidelines:

## Development strategy

- Use a test-driven development strategy, developing tests prior to generating solutions to the tests.
- Run the tests and ensure that they fail prior to generating any solutions.
- Write code that passes the tests.
- IMPORTANT: Do not modify the tests simply so that the code passes. Only modify the tests if you identify a specific error in the test.

## Notes for Development

- Think about the problem before generating code.
- Always add a smoke test for the main() function.
- Prefer reliance on widely used packages (such as numpy, pandas, and scikit-learn); avoid unknown packages from Github.
- Do not include any code in init.py files.
- Use pytest for testing.
- Write code that is clean and modular. Prefer shorter functions/methods over longer ones.
- Use functions rather than classes for tests. Use pytest fixtures to share resources between tests.

## Session Guidelines

- Always read PLANNING.md at the start of every new conversation
- Check TASKS.md and SCRATCHPAD.md before starting your work
- Mark completed tasks immediately within TASKS.md
- Add newly discovered tasks to TASKS.md
- use SCRATCHPAD.md as a scratchpad to outline plans

PLANNING.md Prompt

Based on the attached CLAUDE.md and PRD.md files, create a PLANNING.md file that includes architecture, technology stack, development processes/workflow, and required tools list for this app.

Followup: use ruff for formatting and style checking instead of flake8/black

TASKS.md Prompt

Based on the attached CLAUDE.md and PRD.md files, create a TASKS.md file with bullet points tasks divided into milestones for building this app.

About

Example of AI-assisted coding workflow
