
ParcelExtract

Extract time-series signals from 4D neuroimaging data using brain parcellation schemes.

ParcelExtract is a Python package and command-line tool for extracting regional time-series signals from 4D neuroimaging data (e.g., fMRI) using brain atlases. It supports multiple extraction strategies and provides BIDS-compliant outputs for seamless integration into neuroimaging analysis pipelines.


🚀 Features

  • Multiple Extraction Strategies: Mean, median, PCA, and weighted mean signal extraction
  • TemplateFlow Integration: Automatic downloading of standard brain atlases
  • BIDS-Compliant Outputs: TSV time-series files with JSON sidecar metadata
  • Flexible Atlas Support: Use TemplateFlow atlases or custom parcellation files
  • Command-Line Interface: Easy-to-use CLI for batch processing and scripting
  • Python API: Integrate directly into your analysis workflows
  • Comprehensive Testing: 110 tests with 89% code coverage

📦 Installation

Using uv (Recommended)

# Install from source
git clone <repository-url>
cd parcelextract
uv sync

Using pip

# Install from source
git clone <repository-url>
cd parcelextract
pip install -e .

Dependencies

ParcelExtract requires Python 3.12+ and the following packages:

  • nibabel ≥3.2.0 (neuroimaging file I/O)
  • nilearn ≥0.10.0 (neuroimaging data manipulation)
  • numpy ≥1.20.0 (numerical operations)
  • pandas ≥1.3.0 (data structuring)
  • scipy ≥1.7.0 (scientific computing)
  • templateflow ≥0.8.0 (brain atlas management)

🔧 Quick Start

Command-Line Usage

Extract time-series using a TemplateFlow atlas:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --desc 400Parcels17Networks \
    --output-dir results/ \
    --strategy mean \
    --verbose

Extract using a custom atlas file:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas /path/to/custom_atlas.nii.gz \
    --output-dir results/ \
    --strategy median

Python API Usage

from parcelextract.core.extractor import ParcelExtractor

# Initialize extractor with atlas and strategy
extractor = ParcelExtractor(
    atlas='/path/to/atlas.nii.gz',
    strategy='mean'
)

# Extract time-series from 4D image
timeseries = extractor.fit_transform('/path/to/bold.nii.gz')

# timeseries is a 2D array: (n_parcels, n_timepoints)
print(f"Extracted {timeseries.shape[0]} parcels, {timeseries.shape[1]} timepoints")

With TemplateFlow atlas:

from parcelextract.atlases.templateflow import TemplateFlowManager
from parcelextract.core.extractor import ParcelExtractor

# Download atlas from TemplateFlow
tf_manager = TemplateFlowManager()
atlas_path = tf_manager.get_atlas(
    'Schaefer2018', 
    space='MNI152NLin2009cAsym',
    desc='400Parcels17Networks'
)

# Use with extractor
extractor = ParcelExtractor(atlas=atlas_path, strategy='pca')
timeseries = extractor.fit_transform('sub-01_task-rest_bold.nii.gz')

📖 Usage Guide

Command-Line Interface

The parcelextract command provides a complete extraction pipeline:

parcelextract [OPTIONS]

Required Arguments

  • --input PATH: Path to input 4D NIfTI file (.nii or .nii.gz)
  • --atlas ATLAS: Atlas specification (TemplateFlow name or file path)
  • --output-dir PATH: Output directory for results

Optional Arguments

  • --strategy {mean,median,pca,weighted_mean}: Signal extraction strategy (default: mean)
  • --space SPACE: Template space for TemplateFlow atlases (default: MNI152NLin2009cAsym)
  • --desc DESC: Atlas description/variant (e.g., 400Parcels17Networks)
  • --verbose: Enable verbose output
  • --help: Show help message
  • --version: Show version information

Examples

Basic extraction with TemplateFlow atlas:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --output-dir derivatives/parcelextract/

Specify atlas variant:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas Schaefer2018 \
    --desc 800Parcels7Networks \
    --output-dir derivatives/parcelextract/ \
    --strategy median

Use different template space:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas AAL \
    --space MNI152NLin6Asym \
    --output-dir derivatives/parcelextract/

Custom atlas file:

parcelextract \
    --input sub-01_task-rest_bold.nii.gz \
    --atlas /path/to/my_custom_atlas.nii.gz \
    --output-dir derivatives/parcelextract/ \
    --strategy pca

Supported Atlases

ParcelExtract supports atlases from TemplateFlow and custom atlas files:

TemplateFlow Atlases

  • Schaefer2018: Multi-resolution cortical parcellations
  • AAL: Automated Anatomical Labeling atlas
  • HarvardOxford: Harvard-Oxford cortical atlas

Custom Atlases

  • Any 3D NIfTI file (.nii or .nii.gz) with integer labels
  • Labels should start from 1 (background = 0 is ignored)
  • Must be in the same space as your input data
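
A quick sanity check of these conventions can be sketched with NumPy. The label array below is synthetic; in practice you would load your atlas with nibabel (e.g. `nib.load(...).get_fdata()`):

```python
import numpy as np

# Synthetic 3D label array standing in for atlas voxel data.
data = np.zeros((4, 4, 4))
data[0, 0, 0] = 1.0   # parcel 1
data[1, 1, 1] = 2.0   # parcel 2

assert data.ndim == 3                          # atlas must be 3D
labels = np.unique(data)
assert np.allclose(labels, np.round(labels))   # labels must be integers
parcel_labels = labels[labels > 0]             # background (0) is ignored
print(f"{len(parcel_labels)} parcels, labels start at {int(parcel_labels.min())}")
```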

Extraction Strategies

  1. Mean (default): Average signal across all voxels in each parcel
  2. Median: Median signal across voxels (robust to outliers)
  3. PCA: First principal component of voxel signals
  4. Weighted Mean: Probability-weighted average (for probabilistic atlases)
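
A conceptual sketch of the four strategies on a toy voxel array (this illustrates the math, not ParcelExtract's internal implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
voxels = rng.standard_normal((50, 200))   # (n_voxels, n_timepoints) for one parcel

mean_ts = voxels.mean(axis=0)             # 1. mean across voxels
median_ts = np.median(voxels, axis=0)     # 2. median across voxels

# 3. PCA: first right singular vector of the row-centered data
centered = voxels - voxels.mean(axis=1, keepdims=True)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pca_ts = vt[0]

# 4. Weighted mean with hypothetical per-voxel probabilities
weights = rng.uniform(0, 1, size=50)
weighted_ts = (weights[:, None] * voxels).sum(axis=0) / weights.sum()

print(mean_ts.shape, median_ts.shape, pca_ts.shape, weighted_ts.shape)
# (200,) (200,) (200,) (200,)
```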

Output Format

ParcelExtract generates output files that follow BIDS derivative naming conventions:

Time-series File (TSV)

sub-01_task-rest_atlas-Schaefer2018_desc-400Parcels17Networks_timeseries.tsv

Content:

parcel_0    parcel_1    parcel_2    ...
-0.142      0.256       -0.089      ...
0.031       -0.124      0.198       ...
...         ...         ...         ...

Metadata File (JSON)

sub-01_task-rest_atlas-Schaefer2018_desc-400Parcels17Networks_timeseries.json

Content:

{
    "extraction_strategy": "mean",
    "atlas": "Schaefer2018",
    "n_parcels": 400,
    "n_timepoints": 200,
    "input_file": "/path/to/sub-01_task-rest_bold.nii.gz"
}
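
Because the outputs are plain TSV and JSON, they can be read back with pandas and the standard library. The snippet below inlines sample content instead of reading files:

```python
import io
import json

import pandas as pd

tsv_text = "parcel_0\tparcel_1\n-0.142\t0.256\n0.031\t-0.124\n"
ts = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(ts.shape)   # (2, 2): rows are timepoints, columns are parcels

meta = json.loads('{"extraction_strategy": "mean", "n_parcels": 400}')
print(meta["extraction_strategy"])   # mean
```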

🐍 Python API Reference

ParcelExtractor Class

The main class for time-series extraction:

from parcelextract.core.extractor import ParcelExtractor

extractor = ParcelExtractor(atlas, strategy='mean')

Parameters

  • atlas (str or Path): Path to atlas NIfTI file
  • strategy (str): Extraction strategy ('mean', 'median', 'pca', 'weighted_mean')

Methods

fit_transform(img_4d): Extract time-series from a 4D image.

  • Parameters: img_4d (str, Path, or nibabel image): 4D neuroimaging data
  • Returns: numpy.ndarray (n_parcels × n_timepoints): extracted time-series

# From file path
timeseries = extractor.fit_transform('data.nii.gz')

# From nibabel image object
import nibabel as nib
img = nib.load('data.nii.gz')
timeseries = extractor.fit_transform(img)

TemplateFlow Integration

Access TemplateFlow atlases programmatically:

from parcelextract.atlases.templateflow import TemplateFlowManager

tf_manager = TemplateFlowManager()

# Download atlas
atlas_path = tf_manager.get_atlas(
    atlas_name='Schaefer2018',
    space='MNI152NLin2009cAsym',
    desc='400Parcels17Networks'
)

# Use with extractor
extractor = ParcelExtractor(atlas=atlas_path)

I/O Utilities

Save results programmatically:

from parcelextract.io.writers import write_timeseries_tsv, write_json_sidecar

# Save time-series to TSV
write_timeseries_tsv(timeseries, 'output_timeseries.tsv')

# Save metadata to JSON
metadata = {
    'extraction_strategy': 'mean',
    'atlas': 'Schaefer2018',
    'n_parcels': timeseries.shape[0],
    'n_timepoints': timeseries.shape[1]
}
write_json_sidecar(metadata, 'output_timeseries.json')

🔍 Advanced Usage

Batch Processing

Process multiple subjects using shell scripting:

#!/bin/bash

# Process all subjects in BIDS dataset
for subject in sub-*; do
    for session in ${subject}/ses-*; do
        for run in ${session}/func/*task-rest*_bold.nii.gz; do
            parcelextract \
                --input "${run}" \
                --atlas Schaefer2018 \
                --desc 400Parcels17Networks \
                --output-dir derivatives/parcelextract/"${subject}"/
        done
    done
done

Integration with Python Workflows

from pathlib import Path

from parcelextract.core.extractor import ParcelExtractor
from parcelextract.io.writers import write_timeseries_tsv

def process_subject(subject_dir, atlas_path, output_dir):
    """Process all functional runs for a subject."""
    extractor = ParcelExtractor(atlas=atlas_path, strategy='mean')
    
    # Find all BOLD files
    bold_files = subject_dir.glob('**/*_bold.nii.gz')
    
    results = {}
    for bold_file in bold_files:
        print(f"Processing {bold_file.name}...")
        
        # Extract time-series
        timeseries = extractor.fit_transform(bold_file)
        
        # Generate output path
        output_stem = bold_file.stem.replace('.nii', '')
        output_file = output_dir / f"{output_stem}_atlas-custom_timeseries.tsv"
        
        # Save results
        write_timeseries_tsv(timeseries, output_file)
        results[bold_file.name] = timeseries
    
    return results

# Usage
results = process_subject(
    subject_dir=Path('sub-01'),
    atlas_path='custom_atlas.nii.gz',
    output_dir=Path('derivatives/parcelextract/sub-01')
)

🛠️ Development

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=parcelextract

# Run specific test file
uv run pytest tests/test_extractor.py

Code Quality

# Format code
uv run ruff format parcelextract

# Lint code
uv run ruff check parcelextract

# Type checking
uv run mypy parcelextract

Project Structure

parcelextract/
├── src/parcelextract/          # Main package
│   ├── core/                   # Core extraction logic
│   │   ├── extractor.py        # Main ParcelExtractor class
│   │   ├── strategies.py       # Extraction strategies
│   │   └── validators.py       # Input validation
│   ├── io/                     # Input/output operations
│   │   ├── readers.py          # File reading utilities
│   │   └── writers.py          # Output generation
│   ├── atlases/                # Atlas management
│   │   ├── manager.py          # Atlas loading
│   │   └── templateflow.py     # TemplateFlow integration
│   └── cli/                    # Command-line interface
│       └── main.py             # CLI entry point
├── tests/                      # Test suite
├── docs/                       # Documentation
└── pyproject.toml              # Project configuration

❓ FAQ

Q: What input formats are supported?
A: ParcelExtract supports 4D NIfTI files (.nii and .nii.gz) as input. The data should be preprocessed and in standard space if using TemplateFlow atlases.

Q: Can I use my own custom atlas?
A: Yes! Any 3D NIfTI file with integer labels can be used as an atlas. Labels should start from 1 (background = 0 is ignored).

Q: Which extraction strategy should I use?
A: 'mean' is the most common choice. Use 'median' for robustness to outliers, 'pca' for dimensionality reduction, or 'weighted_mean' for probabilistic atlases.

Q: How do I handle missing or corrupted parcels?
A: ParcelExtract automatically handles empty parcels by returning NaN values. Check your extraction results for NaN values and investigate the corresponding parcels in your atlas.

Q: Can I extract signals from only specific parcels?
A: Currently, ParcelExtract extracts signals from all parcels in the atlas. You can post-process the results to select specific parcels of interest.
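
The post-processing described in the two answers above can be sketched with NumPy; timeseries is a synthetic stand-in for an extractor result:

```python
import numpy as np

# Fake (n_parcels, n_timepoints) result with one empty parcel (all NaN).
timeseries = np.random.default_rng(0).standard_normal((5, 100))
timeseries[3] = np.nan

# Find parcels that came back empty
empty = np.where(np.isnan(timeseries).all(axis=1))[0]
print(f"empty parcels: {empty}")   # empty parcels: [3]

# Select only specific parcels of interest
roi_indices = [0, 2, 4]
subset = timeseries[roi_indices]
print(subset.shape)                # (3, 100)
```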

Q: Is ParcelExtract BIDS-compliant?
A: ParcelExtract generates BIDS-inspired output filenames and metadata. While not fully BIDS-compliant, the outputs follow BIDS naming conventions for derivatives.

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details on:

  • Reporting bugs
  • Requesting features
  • Submitting pull requests
  • Code style and testing requirements

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📚 Citation

If you use ParcelExtract in your research, please cite:

@software{parcelextract2025,
  title={ParcelExtract: Time-series extraction from neuroimaging data},
  author={Your Name},
  year={2025},
  url={https://github.com/yourusername/parcelextract}
}

🆘 Support

  • Documentation: [Link to full documentation]
  • Issues: Report bugs and feature requests on GitHub Issues
  • Discussions: Ask questions on GitHub Discussions

ParcelExtract: Making neuroimaging time-series extraction simple, standardized, and reproducible.

Prompts

This project was developed using a prompt-driven approach with Claude. The following prompts were used to generate the project requirements and development guidelines:

PRD Prompt

Help me create a Project Requirement Document (PRD) for a Python module called parcelextract that will take in a 4-dimensional Nifti brain image and extract signal from clusters defined by a specified brain parcellation, saving it to a text file accompanied by a json sidecar file containing relevant metadata. The tool should leverage existing packages such as nibabel, nilearn, and templateflow, and should follow the BIDS standard for file naming as closely as possible. The code should be written in a clean and modular way, using a test-driven development framework.

CLAUDE.md Prompt

Generate a CLAUDE.md file from the attached PRD that will guide Claude Code sessions on this project. Add the following additional guidelines:

## Development strategy

- Use a test-driven development strategy, developing tests prior to generating solutions to the tests.
- Run the tests and ensure that they fail prior to generating any solutions.
- Write code that passes the tests.
- IMPORTANT: Do not modify the tests simply so that the code passes. Only modify the tests if you identify a specific error in the test.

## Notes for Development

- Think about the problem before generating code.
- Always add a smoke test for the main() function.
- Prefer reliance on widely used packages (such as numpy, pandas, and scikit-learn); avoid unknown packages from Github.
- Do not include any code in init.py files.
- Use pytest for testing.
- Write code that is clean and modular. Prefer shorter functions/methods over longer ones.
- Use functions rather than classes for tests. Use pytest fixtures to share resources between tests.

## Session Guidelines

- Always read PLANNING.md at the start of every new conversation
- Check TASKS.md and SCRATCHPAD.md before starting your work
- Mark completed tasks immediately within TASKS.md
- Add newly discovered tasks to TASKS.md
- use SCRATCHPAD.md as a scratchpad to outline plans

PLANNING.md Prompt

Based on the attached CLAUDE.md and PRD.md files, create a PLANNING.md file that includes architecture, technology stack, development processes/workflow, and required tools list for this app.

Followup: use ruff for formatting and style checking instead of flake8/black

TASKS.md Prompt

Based on the attached CLAUDE.md and PRD.md files, create a TASKS.md file with bullet points tasks divided into milestones for building this app.

About

Example of AI-assisted coding workflow
