ISMRM 2015 Tractography Challenge Scoring Analysis¶

Overview¶

The ISMRM scoring data provides ground truth tractography for validating fiber tracking algorithms. This analysis is based on the contents of `scoring_data_Renauld2023/`.

Data Structure¶

Ground Truth Bundles (`bundles/`)¶

Format: TrackVis .trk files (binary streamline format)

21 Major White Matter Bundles (counting left and right separately, as listed below):

Commissural: CA (anterior commissure), CC (corpus callosum), CP (posterior commissure), MCP (middle cerebellar peduncle)
Association: Cingulum (L/R), ILF (L/R), OR (L/R), SLF (L/R), UF (L/R)
Projection: BPS (L/R), ICP (L/R), SCP (L/R)
Special: Fornix

Bundle Statistics (from file sizes; the streamline counts are rough estimates that imply about 1 KB per streamline, not counts read from the TRK headers):

Largest bundles:
- MCP: 30.3 MB (~31K streamlines)
- Cingulum_right: 29.4 MB (~30K streamlines)
- CC: 24.2 MB (~25K streamlines)

Mid-sized examples:
- OR_right: 9.8 MB (~10K streamlines)
- Fornix: 4.1 MB (~4K streamlines)

Smallest bundles:
- CP: 331 KB (~340 streamlines)
- CA: 526 KB (~540 streamlines)
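
File size only approximates the streamline count, because it also depends on the number of points per streamline. The TRK header stores the count directly, so it can be read without loading the streamlines. The following is a minimal sketch that assumes the standard 1000-byte TrackVis header layout (count at byte offset 988, header-size field at offset 996); the function name is illustrative and is not part of the scoring scripts.

```python
import struct

TRK_HEADER_SIZE = 1000   # bytes, fixed by the TrackVis format
N_COUNT_OFFSET = 988     # int32 field holding the streamline count
HDR_SIZE_OFFSET = 996    # int32 field that should equal 1000


def read_trk_streamline_count(path):
    """Return the streamline count stored in a TRK header (0 means 'not recorded')."""
    with open(path, "rb") as f:
        header = f.read(TRK_HEADER_SIZE)
    if len(header) < TRK_HEADER_SIZE or header[:5] != b"TRACK":
        raise ValueError(f"{path} does not look like a TrackVis file")
    # Try little-endian first, then big-endian, using hdr_size as the check.
    for endian in ("<", ">"):
        (hdr_size,) = struct.unpack_from(endian + "i", header, HDR_SIZE_OFFSET)
        if hdr_size == TRK_HEADER_SIZE:
            (n_count,) = struct.unpack_from(endian + "i", header, N_COUNT_OFFSET)
            return n_count
    raise ValueError(f"{path} has an unexpected header size field")
```

Calling `read_trk_streamline_count('scoring_data_Renauld2023/bundles/CA.trk')` would replace the size-based estimate above with the recorded value.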

ROI Masks (`ROI/`)¶

Purpose: Define anatomical constraints for bundle segmentation

Three types of masks:

all_masks/ (26 files): Every point of a streamline must lie inside the mask
- Example: CA.nii.gz, CC_temporal.nii.gz, Fornix.nii.gz
any_masks/ (4 files): At least one point of a streamline must lie inside the mask
- Used as a looser inclusion criterion
- Example: MCP_any_mask.nii.gz, ICP_left_any_mask.nii.gz
endpoints/ (45 files): Start/end regions for streamlines
- *_head.nii.gz: Starting points
- *_tail.nii.gz: Ending points
- Shared endpoints: brainstem.nii.gz, occipital_left.nii.gz

Configuration Files¶

`config_file_segmentation.json`¶

Purpose: ROI-based bundle segmentation rules

Structure per bundle:

{
  "bundle_name": {
    "all_mask": "path/to/required_mask.nii.gz",
    "any_mask": "path/to/inclusion_mask.nii.gz",  // Optional
    "head": "path/to/start_region.nii.gz",
    "tail": "path/to/end_region.nii.gz",
    "length": [min, max],  // mm, optional
    "length_x": [min, max],  // X-axis constraints, optional
    "length_y": [min, max],  // Y-axis constraints, optional
    "length_x_abs": [min, max]  // |X| constraints, optional
  }
}

Example - Corpus Callosum U-shaped:

{
  "CC_u_shaped": {
    "all_mask": "ROI/all_masks/CC_u_shaped.nii.gz",
    "any_mask": "ROI/any_masks/CC_u_shaped_inclusion_mask.nii.gz",
    "head": "ROI/endpoints/CC_striatal_left.nii.gz",
    "tail": "ROI/endpoints/CC_striatal_right.nii.gz",
    "length": [70, 1000],      // 70-1000mm total length
    "length_y": [0, 32],        // Max 32mm anterior extent
    "length_x_abs": [35, 1000]  // Must span >35mm laterally
  }
}
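
The `//` annotations in the snippets above are explanatory only; Python's `json` module rejects comments, so this sketch assumes the file on disk is plain JSON. It lists which bundles define geometric constraints and checks that each `[min, max]` pair is ordered. The helper name and the inline two-bundle example are illustrative stand-ins for the real config file.

```python
import json

GEOMETRIC_KEYS = {"length", "length_x", "length_y", "length_z", "length_x_abs"}


def bundles_with_geometric_constraints(config):
    """Map each bundle name to the sorted geometric keys it defines."""
    found = {}
    for bundle, rules in config.items():
        keys = sorted(GEOMETRIC_KEYS.intersection(rules))
        for key in keys:
            low, high = rules[key]
            if low > high:
                raise ValueError(f"{bundle}.{key}: min {low} exceeds max {high}")
        if keys:
            found[bundle] = keys
    return found


# Inline stand-in for json.load() on config_file_segmentation.json
example = json.loads("""
{
  "CC_u_shaped": {
    "all_mask": "ROI/all_masks/CC_u_shaped.nii.gz",
    "length": [70, 1000],
    "length_y": [0, 32],
    "length_x_abs": [35, 1000]
  },
  "CA": {"all_mask": "ROI/all_masks/CA.nii.gz"}
}
""")
print(bundles_with_geometric_constraints(example))
```

Running the same function on the full config is a quick way to confirm how many bundles carry geometric constraints.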

Geometric Constraints (7 bundles use these):

length: Total streamline length in mm
length_x, length_y, length_z: Extent in specific axes
length_x_abs: Absolute X-axis span (for bilateral bundles)
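
The constraints above can be sketched for a single streamline as follows. The interpretation is an assumption: `length_x` is taken as the net displacement along X, and `length_x_abs` as the sum of absolute per-step displacements along X (so a path that doubles back keeps accumulating). The scoring scripts may define these differently, and the function names here are hypothetical.

```python
import math


def streamline_extents(points):
    """Return total length plus per-axis net and absolute displacement.

    `points` is a sequence of (x, y, z) tuples in mm.
    """
    steps = [tuple(b[i] - a[i] for i in range(3))
             for a, b in zip(points, points[1:])]
    length = sum(math.sqrt(dx * dx + dy * dy + dz * dz) for dx, dy, dz in steps)
    net = [abs(sum(s[i] for s in steps)) for i in range(3)]        # e.g. length_x
    absolute = [sum(abs(s[i]) for s in steps) for i in range(3)]   # e.g. length_x_abs
    return length, net, absolute


def within(value, bounds):
    """True when value lies in the closed interval [min, max]."""
    low, high = bounds
    return low <= value <= high


# A U-shaped path: out 40 mm along x, up 10 mm along z, back 40 mm along x.
u_shape = [(0, 0, 0), (40, 0, 0), (40, 0, 10), (0, 0, 10)]
length, net, absolute = streamline_extents(u_shape)
print(within(length, (70, 1000)), within(absolute[0], (35, 1000)))
```

Under this reading, the U-shaped path has zero net X displacement but 80 mm of absolute X travel, which is why an absolute variant is useful for bundles that cross the midline and return.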

`config_file_tractometry.json`¶

Purpose: Map bundle names to ground truth files for tractometry analysis

Structure:

{
  "bundle_name": {
    "gt_mask": "bundles/bundle_name.trk"
  }
}

This format is simpler: it only links each bundle to its ground truth .trk file.

Scoring Methodology¶

Two Scoring Methods¶

1. ROI-Based Scoring (Segmentation)¶

Method: Filter user tractography using anatomical ROI constraints

Process:

Load user's whole-brain tractography
Apply ROI constraints from config_file_segmentation.json:
- Keep only streamlines that lie entirely inside the all mask (every point)
- Keep only streamlines with at least one point in any masks
- Keep only streamlines with endpoints in head/tail regions
- Apply geometric length constraints
Compare filtered user tracks to ground truth
Compute metrics (see below)
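
The mask-based part of this process can be sketched for one streamline as below. This is not the scoring scripts' implementation: it assumes streamlines are already in 0-based voxel space, that `all_mask` requires every point to be inside the mask, and it clamps out-of-volume points to the border voxel for brevity. All names are illustrative.

```python
import numpy as np


def voxel_indices(points, shape):
    """Round voxel-space points to integer indices, clamped to the volume."""
    idx = np.rint(points).astype(int)
    return np.clip(idx, 0, np.array(shape) - 1)


def passes_roi_rules(points, all_mask=None, any_mask=None, head=None, tail=None):
    """Apply the mask rules to one streamline given in 0-based voxel space."""
    points = np.asarray(points, dtype=float)
    if all_mask is not None:
        i, j, k = voxel_indices(points, all_mask.shape).T
        if not all_mask[i, j, k].all():      # every point must be inside
            return False
    if any_mask is not None:
        i, j, k = voxel_indices(points, any_mask.shape).T
        if not any_mask[i, j, k].any():      # at least one point inside
            return False
    if head is not None and tail is not None:
        first = tuple(voxel_indices(points[:1], head.shape)[0])
        last = tuple(voxel_indices(points[-1:], head.shape)[0])
        forward = head[first] and tail[last]
        backward = head[last] and tail[first]   # streamlines have no fixed direction
        if not (forward or backward):
            return False
    return True


# Toy 5x5x5 volume with a straight three-voxel "bundle" along the first axis.
mask = np.zeros((5, 5, 5), dtype=bool)
mask[1:4, 2, 2] = True
head = np.zeros_like(mask)
head[1, 2, 2] = True
tail = np.zeros_like(mask)
tail[3, 2, 2] = True
line = [(1, 2, 2), (2, 2, 2), (3, 2, 2)]
print(passes_roi_rules(line, all_mask=mask, head=head, tail=tail))
```

Checking both endpoint orders matters because tractography output does not guarantee which end of a streamline is stored first.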

Advantages:

Tests anatomical accuracy of streamlines
Evaluates tractography algorithm's ability to follow known pathways
Robust to seeding strategy differences

Disadvantages:

Requires precise ROI mask alignment
Sensitive to coordinate system errors (see warning below)

2. RecoBundles Scoring (Bundle Recognition)¶

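Method: Recognize bundles in the user's tractography by comparing streamlines with model bundles (streamline clustering and registration rather than a trained classifier)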

Process:

Load ground truth bundles as model bundles
Use RecoBundles algorithm to find similar streamlines in user tractography
Compare recognized bundles to ground truth
Compute metrics
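
RecoBundles compares streamlines using the minimum average direct-flip (MDF) distance. The sketch below shows only that distance plus a naive nearest-model threshold; the actual algorithm also clusters, registers and prunes, and Dipy's implementation should be used for real scoring. The function names and the threshold parameter are illustrative.

```python
import numpy as np


def mdf_distance(a, b):
    """Minimum average direct-flip distance between two streamlines.

    Both inputs are (N, 3) arrays resampled to the same number of points.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("streamlines must have the same number of points")
    direct = np.linalg.norm(a - b, axis=1).mean()
    flipped = np.linalg.norm(a - b[::-1], axis=1).mean()
    return min(direct, flipped)


def recognize(model_bundle, candidates, threshold_mm):
    """Keep candidates within threshold_mm (MDF) of any model streamline."""
    return [c for c in candidates
            if min(mdf_distance(c, m) for m in model_bundle) <= threshold_mm]
```

Taking the minimum over the direct and flipped orderings makes the distance independent of streamline direction, which is why a reversed copy of a model streamline still matches.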

Advantages:

More forgiving of ROI misalignment
Tests bundle recognition capability
Simulates clinical bundle identification workflow

Disadvantages:

Requires Dipy RecoBundles implementation
Less direct test of anatomical accuracy

Evaluation Metrics (Typical)¶

Based on standard tractometry evaluation:

Valid Bundles (VB): number of ground truth bundles recovered by the submission
Valid Connections (VC): % of streamlines that connect regions joined by a ground truth bundle
Invalid Bundles (IB): number of streamline groups connecting regions that no ground truth bundle joins
Bundle Coverage (BC): % of ground truth covered by submission
Bundle Overreach (BO): % of submission not in ground truth
Weighted Dice: Spatial overlap weighted by streamline density

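Illustrative composite (not an official challenge score; the challenge reports these metrics separately, and VB and IB are counts, so each term must be normalized first):

Score = (VB + VC - IB + BC - BO) / normalization_factor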
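
The voxel-based metrics can be sketched from two binary masks, one for the recovered bundle and one for the ground truth. This follows the definitions listed above (overreach as the share of the submission outside ground truth); some implementations normalize overreach by ground truth volume instead. It computes a plain Dice, not the density-weighted variant, and the function name is illustrative.

```python
import numpy as np


def voxel_overlap_metrics(candidate, truth):
    """Coverage, overreach and Dice between two boolean voxel masks."""
    candidate = np.asarray(candidate, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_truth = int(truth.sum())
    n_candidate = int(candidate.sum())
    if n_truth == 0:
        raise ValueError("ground truth mask is empty")
    shared = int(np.logical_and(candidate, truth).sum())
    outside = n_candidate - shared
    return {
        "coverage": shared / n_truth,                                  # BC
        "overreach": outside / n_candidate if n_candidate else 0.0,    # BO
        "dice": 2 * shared / (n_candidate + n_truth),
    }


# Toy 1-D example: ground truth covers voxels 0-3, submission covers 2-5.
truth = np.zeros(10, dtype=bool)
truth[0:4] = True
candidate = np.zeros(10, dtype=bool)
candidate[2:6] = True
print(voxel_overlap_metrics(candidate, truth))
```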

Critical File Format Warning¶

From the accompanying note:

IMPORTANT: As of 08.2022, tractograms are loaded through Dipy's Stateful Tractogram functions. Previous versions applied automatic 0.5 voxel shifts when loading TRK files. The new Python3 version DOES NOT do this anymore.

What This Means¶

Old behavior (before 08.2022):

# TrackVis TRK files stored in voxel corner coordinates
# Old code automatically shifted by 0.5 to get voxel centers
track_coords_old = raw_trk_coords + 0.5

New behavior (08.2022 onward):

# Dipy StatefulTractogram expects correct space attributes
# NO automatic shifting - relies on header information
track_coords_new = raw_trk_coords  # Uses header space/origin

Impact on HINEC Pipeline¶

HINEC tractography output (nim_tractography_standard.m):

Tracks stored as voxel indices (1-based MATLAB)
Coordinates are in voxel space, not world space
Example: [47.0, 44.5, 27.9] = voxel coordinates

ISMRM ground truth:

TRK files with world space coordinates
Header contains voxel-to-world transform
StatefulTractogram handles coordinate conversion

THE PROBLEM: If you save HINEC tracks as TRK files without proper header setup, the scoring scripts will misinterpret coordinates:

HINEC voxel coords treated as world coords → misalignment
Or: Missing 0.5 shift causes half-voxel offset
Result: Bundle ROI filtering fails → score = 0

How to Use ISMRM Scoring with HINEC¶

Option 1: Convert HINEC Tracks to TRK Format¶

Required: Set up proper TRK header with space attributes

% In nim_tractography_standard.m or save function:

% Load reference NIfTI to get affine transform
ref_nii = niftiinfo('reference.nii.gz');
affine = ref_nii.Transform.T';  % 4x4 voxel-to-world matrix

% Save tracks with proper header
% Use nibabel in Python:
% - Set header['voxel_to_rasmm'] = affine
% - Set space = 'rasmm' (world space)
% - Set origin = 'nifti' or 'trackvis'

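Critical: Ensure voxel coordinates are transformed to world space, or that the header declares voxel space correctly. The NIfTI affine is defined for 0-based voxel indices, so subtract 1 from HINEC's 1-based MATLAB coordinates before applying it.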
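
The voxel-to-world step can be sketched as below. It assumes HINEC coordinates are 1-based with voxel centers at integer positions, and that `affine` is the reference NIfTI's 4x4 voxel-to-world matrix. The affine values in the example are hypothetical and are not taken from the ISMRM data.

```python
import numpy as np


def matlab_voxels_to_rasmm(points_1based, affine):
    """Map 1-based voxel coordinates to world (RAS, mm) coordinates.

    `affine` is the 4x4 voxel-to-world matrix of the reference NIfTI,
    which is defined for 0-based voxel indices.
    """
    points = np.asarray(points_1based, dtype=float) - 1.0   # MATLAB -> 0-based
    homogeneous = np.c_[points, np.ones(len(points))]       # (N, 4)
    return (homogeneous @ np.asarray(affine, dtype=float).T)[:, :3]


# Hypothetical affine: 2 mm isotropic voxels with a translation offset.
affine = np.array([[2.0, 0.0, 0.0, -90.0],
                   [0.0, 2.0, 0.0, -126.0],
                   [0.0, 0.0, 2.0, -72.0],
                   [0.0, 0.0, 0.0, 1.0]])
world = matlab_voxels_to_rasmm([[47.0, 44.5, 27.9]], affine)
print(world)
```

Forgetting the `- 1.0` shifts every streamline by one full voxel (2 mm in this example), which is larger than the half-voxel offset discussed above.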

Option 2: Use Python Scoring Script¶

Create MATLAB-to-Python bridge:

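% Save HINEC tracks in a format the ISMRM scoring scripts can read
function save_for_ismrm(tracks, reference_nii, output_trk)
    % 1. Write the tracks to a temporary .mat file
    %    (-v7 so that scipy.io.loadmat can read it)
    tracks_mat = [tempname '.mat'];
    save(tracks_mat, 'tracks', '-v7');
    % 2. Call the Python converter, which applies the voxel-to-world
    %    transform and writes the TRK file with a proper header
    status = system(sprintf('python hinec_to_trk.py "%s" "%s" "%s"', ...
        tracks_mat, reference_nii, output_trk));
    if status ~= 0
        error('save_for_ismrm:conversionFailed', 'hinec_to_trk.py failed.');
    end
end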

Option 3: Adapt ISMRM Scoring for HINEC Format¶

Modify scoring scripts to accept MATLAB .mat files:

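# In scoring script:
import scipy.io
from dipy.io.stateful_tractogram import Space, StatefulTractogram
from dipy.io.streamline import load_tractogram

if input_file.endswith('.mat'):
    # Load MATLAB tracks (a cell array of Nx3 matrices becomes an object array)
    mat_data = scipy.io.loadmat(input_file)
    # Shift MATLAB's 1-based voxel coordinates to 0-based
    tracks = [t.astype('float32') - 1.0 for t in mat_data['tracks'].ravel()]
    # Convert to StatefulTractogram with reference NIfTI
    sft = StatefulTractogram(tracks, reference_nii, Space.VOX)
else:
    # Standard TRK loading
    sft = load_tractogram(input_file, reference_nii)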

Testing Coordinate Alignment¶

Diagnostic Script (recommended):

#!/usr/bin/env python3
import nibabel as nib
import numpy as np

# Load HINEC tracks (converted to TRK); nibabel returns streamlines in RAS mm
sft_hinec = nib.streamlines.load('hinec_tracks.trk')

# Load ISMRM ground truth
sft_gt = nib.streamlines.load('scoring_data_Renauld2023/bundles/CA.trk')

# Check coordinate ranges
hinec_coords = np.vstack([s for s in sft_hinec.streamlines])
gt_coords = np.vstack([s for s in sft_gt.streamlines])

print("HINEC coord range:")
print(f"  X: [{hinec_coords[:,0].min():.1f}, {hinec_coords[:,0].max():.1f}]")
print(f"  Y: [{hinec_coords[:,1].min():.1f}, {hinec_coords[:,1].max():.1f}]")
print(f"  Z: [{hinec_coords[:,2].min():.1f}, {hinec_coords[:,2].max():.1f}]")

print("\nGround truth coord range:")
print(f"  X: [{gt_coords[:,0].min():.1f}, {gt_coords[:,0].max():.1f}]")
print(f"  Y: [{gt_coords[:,1].min():.1f}, {gt_coords[:,1].max():.1f}]")
print(f"  Z: [{gt_coords[:,2].min():.1f}, {gt_coords[:,2].max():.1f}]")

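# Check header info (nibabel TRK objects expose the affine, not a Dipy space attribute)
print("\nHINEC voxel_to_rasmm:\n", sft_hinec.header['voxel_to_rasmm'])
print("Ground truth voxel_to_rasmm:\n", sft_gt.header['voxel_to_rasmm'])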

# Visual alignment check
print("\nTo verify alignment:")
print("1. Load both tractograms in TrackVis or MI-Brain")
print("2. Overlay with T1.nii.gz from scoring_data")
print("3. Check if streamlines follow anatomical structures")
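
The printed coordinate ranges can also be compared numerically. The sketch below measures how much of one point cloud's bounding box falls inside another's. It is a coarse sanity check only, since overlapping boxes do not prove correct alignment, and the function names are illustrative.

```python
import numpy as np


def bounding_box(points):
    """Return (min_xyz, max_xyz) for an (N, 3) array of points."""
    points = np.asarray(points, dtype=float)
    return points.min(axis=0), points.max(axis=0)


def boxes_overlap_fraction(points_a, points_b):
    """Fraction of A's bounding-box volume that intersects B's bounding box."""
    min_a, max_a = bounding_box(points_a)
    min_b, max_b = bounding_box(points_b)
    inter = np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b), 0, None)
    size_a = max_a - min_a
    if np.any(size_a == 0):
        raise ValueError("bounding box of A is degenerate")
    return float(np.prod(inter) / np.prod(size_a))
```

With the arrays from the diagnostic script, `boxes_overlap_fraction(gt_coords, hinec_coords)` should be close to 1 when a whole-brain HINEC tractogram shares the ground truth's coordinate space; a value near 0 suggests voxel coordinates were written where world coordinates were expected.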

Recommendations for HINEC Pipeline¶

Immediate Actions¶

Add TRK export function with proper header setup
Implement coordinate space conversion (voxel → world if needed)
Create alignment test script (Python-based)
Document coordinate system in HINEC pipeline

Medium-term Improvements¶

Bundle segmentation module using ISMRM ROI configs
Automated scoring integration with ISMRM ground truth
Coordinate system validator (detect and fix misalignments)

Long-term Goals¶

ISMRM benchmark suite for HINEC validation
Performance comparison against 2015 challenge submissions
Bundle-specific optimization based on ISMRM metrics

Key Differences: HINEC vs ISMRM Format¶

| Aspect | HINEC | ISMRM Ground Truth |
|---|---|---|
| File format | `.mat` (MATLAB) | `.trk` (TrackVis) |
| Coordinate system | Voxel indices (1-based) | World coordinates (RAS+) |
| Space attribute | Implicit (voxel) | Explicit (header) |
| Track storage | Cell array of Nx3 matrices | Binary streamline format |
| Metadata | nim structure | TRK header (affine, dims, etc) |
| Compatibility | MATLAB-specific | Universal (Dipy, TrackVis, DSI Studio) |

Scoring Accuracy Assessment¶

How accurate is ISMRM scoring?

Strengths¶

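Anatomical Ground Truth: Bundles were segmented from real human tractography, so the geometry is anatomically realistic even though the diffusion signal was simulated from them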
Multiple Metrics: Tests different aspects (coverage, validity, connections)
Standardized: Used in major challenge, results published
ROI-based: Tests anatomical knowledge, not just visual similarity

Limitations¶

ROI dependency: Requires precise anatomical masks (manual work)
Single dataset: Based on one brain (may not generalize)
Coordinate sensitivity: Alignment errors cause disproportionate penalty
No ground truth certainty: Even "gold standard" has anatomical uncertainty

Validation Against Other Methods¶

Compare with:

IronTract: Anatomical tracing (more accurate but lower throughput)
Tractometer: Synthetic phantoms (perfect ground truth but unrealistic)
Clinical consensus: Expert manual segmentation (subjective but relevant)

Recommendation: Use ISMRM as one of multiple validation methods:

ISMRM: Anatomical accuracy
IronTract: Biological validation
Synthetic: Algorithm verification
Clinical: Real-world applicability

Next Steps¶

✅ Analyzed ISMRM data structure
✅ Documented scoring methodology
⏳ Create HINEC-to-TRK converter
⏳ Test coordinate alignment
⏳ Run HINEC tracks through ISMRM scoring
⏳ Compare results with challenge submissions