Mathematical Foundations in HINEC¶
This document consolidates all mathematical formulas, derivations, and numerical methods used throughout the HINEC pipeline. It serves as a self-contained reference for researchers who want to understand the mathematics without reading source code.
1. Diffusion Physics and Signal Model¶
The Bloch-Torrey Equation¶
The physical foundation of diffusion MRI is the Bloch-Torrey equation, which extends the Bloch equation to include the effects of molecular diffusion on the transverse magnetization \(M_{xy}\):
where:
- \(\gamma\) is the gyromagnetic ratio (\(\gamma / 2\pi \approx 42.58\) MHz/T for hydrogen)
- \(\mathbf{G}(t)\) is the time-varying magnetic field gradient
- \(\mathbf{r}\) is the spatial position vector
- \(\mathbf{D}\) is the local diffusion tensor
The first term encodes spatial position via phase accumulation from the applied gradient. The second term models signal attenuation from molecular diffusion. In regions where diffusion is anisotropic (e.g., white matter), \(\mathbf{D}\) is a full \(3 \times 3\) tensor rather than a scalar.
Stejskal-Tanner Equation¶
For a pulsed gradient spin-echo (PGSE) sequence, integrating the Bloch-Torrey equation under the narrow-pulse approximation yields the Stejskal-Tanner equation:
where:
- \(S_0\) is the non-diffusion-weighted signal (\(b = 0\) image)
- \(\mathbf{g} = [g_x, g_y, g_z]^\top\) is the unit gradient direction vector
- \(\mathbf{D}\) is the \(3 \times 3\) symmetric positive-definite diffusion tensor
- \(b\) is the b-value (s/mm\(^2\)), encoding gradient strength and timing
The B-value¶
The b-value is not a single parameter but a composite that encodes the gradient waveform:
where:
- \(G\) is the gradient amplitude (T/m)
- \(\delta\) is the gradient pulse duration (s)
- \(\Delta\) is the time between the leading edges of the two gradient pulses (s)
- The factor \((\Delta - \delta/3)\) is the effective diffusion time, accounting for the finite pulse width
Typical clinical DTI uses \(b \approx 1000\) s/mm\(^2\), corresponding to \(G \approx 40\) mT/m, \(\delta \approx 30\) ms, \(\Delta \approx 40\) ms.
Apparent Diffusion Coefficient¶
The quadratic form \(\mathbf{g}^\top \mathbf{D} \mathbf{g}\) represents the apparent diffusion coefficient (ADC) along direction \(\mathbf{g}\):
The ADC is the directional projection of the diffusion tensor --- it measures how fast water diffuses along a specific direction. In white matter, the ADC is large along the fiber direction and small perpendicular to it.
Log-Linear Formulation¶
Taking the natural logarithm of the Stejskal-Tanner equation and rearranging:
Expanding the quadratic form with the symmetric tensor \(\mathbf{D}\):
This expression is linear in the 6 unique tensor elements. Defining:
- \(\mathbf{Y}\): the vector of observed ADC values, \(Y_i = \ln(S_0 / S_i) / b_i\)
- \(\mathbf{d} = [D_{xx}, D_{yy}, D_{zz}, D_{xy}, D_{yz}, D_{xz}]^\top\): the 6 independent tensor elements
We obtain the linear system:
Design Matrix Construction¶
The design matrix \(\mathbf{H} \in \mathbb{R}^{N \times 6}\) (where \(N\) is the number of gradient directions) is:
Implementation (nim_dt.m:27-31, nim_dt_spd.m:27): Both tensor calculation functions construct \(\mathbf{H}\) identically:
The signal \(\mathbf{Y}\) is computed from the log-ratio (nim_dt_spd.m:22-24):
2. Tensor Estimation¶
Ordinary Least Squares Solution¶
Given the overdetermined system \(\mathbf{Y} = \mathbf{H}\mathbf{d}\) (typically \(N = 30\)--\(60\) equations, 6 unknowns), the ordinary least squares (OLS) solution minimizes the sum of squared residuals:
Setting the gradient to zero yields the normal equations:
with solution:
In MATLAB, this is computed via the backslash operator (nim_dt.m:48, nim_dt_spd.m:73):
Weighted Least Squares¶
In the log-linear model, the noise in \(\mathbf{Y}\) is heteroscedastic --- the variance depends on the signal magnitude. A weighted least squares (WLS) formulation accounts for this:
where \(\mathbf{W} = \text{diag}(S_1^2, S_2^2, \ldots, S_N^2)\) weights each observation by its signal intensity squared. HINEC currently uses OLS, which is a reasonable approximation at typical clinical SNR levels.
The Non-Positive-Definite Problem¶
The diffusion tensor must be symmetric positive-definite (SPD) --- all eigenvalues must be strictly positive --- because diffusion coefficients are physical quantities that cannot be negative. However, noise in the measured signal can produce tensors with negative eigenvalues via least squares fitting.
A tensor \(\mathbf{D}\) is SPD if and only if:
Equivalently, \(\mathbf{D}\) is SPD if all its eigenvalues \(\lambda_1, \lambda_2, \lambda_3 > 0\).
In nim_dt.m, non-SPD tensors are noted but not corrected. In nim_dt_spd.m, they are corrected using BFGS optimization.
SPD Constraint via Cholesky Parameterization¶
The key insight for enforcing the SPD constraint is the Cholesky decomposition: any SPD matrix can be uniquely written as:
where \(\mathbf{L}\) is a lower triangular matrix with positive diagonal entries:
This gives the tensor elements:
By optimizing over the 6 free parameters of \(\mathbf{L}\) rather than directly over \(\mathbf{D}\), positive-definiteness is guaranteed for any value of the parameters.
BFGS Optimization¶
When least squares produces a non-SPD tensor, nim_dt_spd.m invokes BFGS optimization (vox_dt_bfgs.m) to find the nearest SPD tensor that fits the data. The objective function minimizes the fitting error:
where \(\mathbf{d}(\mathbf{l})\) maps the Cholesky parameters to the 6-element tensor vector.
BFGS (Broyden-Fletcher-Goldfarb-Shanno) is a quasi-Newton method that maintains an approximate inverse Hessian \(\mathbf{B}_k\), updated at each iteration by:
where:
- \(\mathbf{s}_k = \mathbf{l}_{k+1} - \mathbf{l}_k\) is the parameter step
- \(\mathbf{y}_k = \nabla f(\mathbf{l}_{k+1}) - \nabla f(\mathbf{l}_k)\) is the gradient change
The search direction is \(\mathbf{p}_k = -\mathbf{B}_k \nabla f(\mathbf{l}_k)\), followed by a line search to determine the step length.
Implementation flow (nim_dt_spd.m:76-98):
- Compute OLS solution: \(\mathbf{D} = \mathbf{H} \backslash \mathbf{Y}\)
- Check eigenvalues: \([\mathbf{Q}, \mathbf{\lambda}] = \text{eig}(\text{reshape}(\mathbf{D}))\)
- If \(\exists \, \lambda_i < 0\): run BFGS with Cholesky parameterization
- Validate: reject if NaN, Inf, or eigenvalues outside \([0, \, 0.01]\)
- Sort eigenvalues descending: \(\lambda_1 \geq \lambda_2 \geq \lambda_3\)
- Store sorted eigenvectors and eigenvalues
6-Element Tensor Representation¶
The symmetric \(3 \times 3\) tensor is stored as a 6-element vector:
The reconstruction to the full matrix (nim_reshape_d.m) is:
3. Eigendecomposition¶
Symmetric Matrix Eigenvalue Problem¶
For a real symmetric \(3 \times 3\) matrix \(\mathbf{D}\), the eigenvalue decomposition solves:
The eigenvalues are roots of the characteristic polynomial:
where the tensor invariants are:
These invariants are rotationally invariant --- they do not change when the coordinate system is rotated. For a real symmetric matrix, all three eigenvalues are guaranteed to be real.
This yields three real eigenvalue-eigenvector pairs: \((\lambda_1, \mathbf{e}_1)\), \((\lambda_2, \mathbf{e}_2)\), \((\lambda_3, \mathbf{e}_3)\).
Sorting convention: Eigenvalues are sorted in descending order (nim_eig.m, nim_dt_spd.m:104):
using MATLAB's maxk():
Physical Interpretation¶
| Component | Symbol | Physical Meaning |
|---|---|---|
| Primary eigenvalue | \(\lambda_1\) | Axial diffusivity (along fiber) |
| Primary eigenvector | \(\mathbf{e}_1\) | Estimated fiber direction |
| Secondary eigenvalue | \(\lambda_2\) | Intermediate diffusivity |
| Secondary eigenvector | \(\mathbf{e}_2\) | Secondary diffusion axis |
| Tertiary eigenvalue | \(\lambda_3\) | Minimum diffusivity (perpendicular to fiber) |
| Tertiary eigenvector | \(\mathbf{e}_3\) | Direction of maximum restriction |
Tensor Reconstruction from Eigendecomposition¶
The tensor can be reconstructed from its spectral decomposition:
where \(\mathbf{Q} = [\mathbf{e}_1 \; \mathbf{e}_2 \; \mathbf{e}_3]\) and \(\mathbf{\Lambda} = \text{diag}(\lambda_1, \lambda_2, \lambda_3)\).
Tensor Geometry¶
The eigenvalues define the shape of the diffusion ellipsoid --- the surface of constant mean squared displacement:
where \(t\) is the diffusion time. The semi-axes of this ellipsoid have lengths \(\sqrt{6t\lambda_i}\) along directions \(\mathbf{e}_i\).
- Prolate (\(\lambda_1 \gg \lambda_2 \approx \lambda_3\)): Cigar-shaped, coherent single-fiber bundle
- Oblate (\(\lambda_1 \approx \lambda_2 \gg \lambda_3\)): Pancake-shaped, crossing fibers or planar structure
- Spherical (\(\lambda_1 \approx \lambda_2 \approx \lambda_3\)): Isotropic diffusion, CSF or gray matter
Westin Shape Measures¶
Westin et al. (1997) defined three shape measures that sum to 1, providing a geometric decomposition of the tensor:
with \(C_L + C_P + C_S = 1\). In a coherent white matter voxel, \(C_L \gg C_P, C_S\). At a fiber crossing, \(C_P\) becomes significant.
4. Diffusion Scalar Metrics¶
Fractional Anisotropy (FA)¶
FA quantifies the degree of anisotropy on a normalized scale \([0, 1]\):
Equivalence with the variance formulation: Defining the mean eigenvalue \(\bar{\lambda} = (\lambda_1 + \lambda_2 + \lambda_3)/3\), an equivalent expression is:
Proof of equivalence: Expanding both forms shows they differ only by algebraic rearrangement. Note that:
and:
which confirms \(\frac{1}{2} \times (\text{pairwise form}) = \frac{3}{2} \times (\text{variance form})\).
In terms of tensor invariants: FA can also be written using the tensor invariants, making it manifestly rotation-invariant:
where \(I_1 = \text{tr}(\mathbf{D})\) and \(I_2\) is as defined in Section 3.
Implementation (nim_fa.m:23-24):
denom = sqrt(l(1)^2 + l(2)^2 + l(3)^2);
if denom > 0
nim.FA(x,y,z) = sqrt(1/2) * sqrt((l(1)-l(2))^2 + (l(2)-l(3))^2 + (l(3)-l(1))^2) / denom;
end
Division by zero is protected: \(\text{FA} = 0\) when all eigenvalues are zero.
Mean Diffusivity (MD)¶
MD equals one-third of the trace, which is rotationally invariant. Typical values: \(\sim 0.7\)--\(1.0 \times 10^{-3}\) mm\(^2\)/s in healthy white matter, \(\sim 3.0 \times 10^{-3}\) mm\(^2\)/s in CSF.
Axial Diffusivity (AD) and Radial Diffusivity (RD)¶
FA can be expressed in terms of AD and RD in the special case \(\lambda_2 = \lambda_3 = \text{RD}\):
5. FACT Tractography Algorithm¶
The FACT (Fiber Assignment by Continuous Tracking) algorithm, implemented in nim_tractography_standard.m, uses discrete voxel directions with exact boundary intersection.
Streamline ODE Formulation¶
Deterministic tractography formulates streamline tracing as an initial value problem (IVP) for an ordinary differential equation:
where:
- \(\mathbf{r}(s) = [x(s), y(s), z(s)]\) is the streamline position parameterized by arc length \(s\)
- \(\mathbf{e}_1(\mathbf{r})\) is the primary eigenvector field (a unit vector field defined at every voxel)
- \(\mathbf{r}_0\) is the seed point
The FACT algorithm solves this ODE using a piecewise-constant approximation of \(\mathbf{e}_1\) within each voxel.
Algorithm Overview¶
Input: Seed point r_0, eigenvector field e_1(x,y,z), FA field
Output: Track T = [r_0, r_1, r_2, ...]
1. Initialize: r = r_0, d_prev = e_1(voxel(r_0))
2. While termination criteria not met:
a. Get current voxel indices: (i, j, k) = floor(r)
b. Get direction: d = e_1(i, j, k) [discrete, no interpolation]
c. Ensure consistency: if dot(d, d_prev) < 0, flip d = -d
d. Compute boundary exit: r_exit = find_boundary_exit(r, d)
e. Step to boundary: r = r_exit + epsilon * d [enter next voxel]
f. Check termination: FA(new voxel) < threshold? angle > limit?
g. Store point: append r to T
h. Update: d_prev = d
3. Repeat tracking in opposite direction from r_0
4. Concatenate: T = [reverse(T_backward), T_forward]
Voxel Boundary Intersection¶
Given position \(\mathbf{p} = (p_x, p_y, p_z)\) and direction \(\mathbf{d} = (d_x, d_y, d_z)\) within a voxel with boundaries \([x_\min, x_\max] \times [y_\min, y_\max] \times [z_\min, z_\max]\), the ray-box intersection is computed parametrically.
For each axis \(\alpha \in \{x, y, z\}\), compute the parametric distances to the boundaries:
(swapping near/far if \(d_\alpha < 0\)). The exit parameter is:
The exit point is:
Direction Consistency¶
Eigenvectors are bidirectional (\(\mathbf{e}_1\) and \(-\mathbf{e}_1\) are both valid). The dot product check ensures consistent propagation:
Angular Stopping Criterion¶
The angle \(\theta\) between consecutive directions is checked against a threshold \(\theta_\max\):
If \(\theta > \theta_\max\) (typically 25--60 degrees), the streamline is terminated. Pre-computing \(\cos\theta_\max\) avoids the expensive \(\arccos\) operation at every step.
Bidirectional Tracking¶
Each seed point generates two half-tracks. The backward track starts by negating the initial direction. The final track is:
Track Length¶
The total arc length of a track with \(N\) points is:
Tracks shorter than a minimum length (typically 10--30 mm) are filtered out as likely spurious.
6. Numerical Integration Methods¶
HINEC's high-order tractography (nim_tractography_hinec.m) replaces the piecewise-constant FACT approach with numerical ODE integration on a continuous interpolated direction field:
Euler Method (Order 1, integration_order = 1)¶
The simplest explicit method:
- Local truncation error: \(O(h^2)\)
- Global error: \(O(h)\) --- first-order accuracy
- One function evaluation per step
- Fast but least accurate; equivalent to standard step-based tracking
RK2 Midpoint Method (Order 2, integration_order = 2)¶
Uses a trial step to the midpoint for better accuracy:
- Local truncation error: \(O(h^3)\)
- Global error: \(O(h^2)\) --- second-order accuracy
- Two function evaluations per step
RK4 Classical Method (Order 4, integration_order = 4)¶
The standard fourth-order Runge-Kutta method:
- Local truncation error: \(O(h^5)\)
- Global error: \(O(h^4)\) --- fourth-order accuracy
- Four function evaluations per step
- Excellent accuracy-to-cost ratio; the default HINEC method
RKF45 Adaptive Method (Order 4-5, integration_order = 5)¶
The Dormand-Prince embedded pair provides automatic step size control by computing both a 4th-order and 5th-order solution using shared function evaluations.
Stage evaluations (7 stages, \(i = 1, \ldots, 7\)):
where the coefficients \(a_{ij}\) are given by the Dormand-Prince Butcher tableau.
Two solutions from shared stages:
Error estimate:
Step size control:
If \(\text{err} \leq \text{tol}\): accept step, use \(\hat{\mathbf{y}}_{n+1}\), adjust \(h\) for next step. If \(\text{err} > \text{tol}\): reject step, reduce \(h\), retry.
HINEC defaults: \(\text{tol} = 0.01\) voxels, \(\text{safety} = 0.9\), \(h_\min = 0.01\), \(h_\max = 1.0\).
For the complete Dormand-Prince coefficients and derivation, see RKF.md. For usage examples and parameter guidelines, see RKF_Usage.md.
Error Analysis Summary¶
The local truncation error (LTE) for an order-\(p\) method scales as:
where \(C\) depends on the \((p+1)\)-th derivative of the solution. The global error after \(N = L/h\) steps over total arc length \(L\) accumulates to:
This explains why the global accuracy order equals \(p\), one less than the local order.
Comparison of Methods¶
| Method | Order | Evaluations/Step | Relative Cost | Relative Accuracy | Best For |
|---|---|---|---|---|---|
| Euler | 1 | 1 | \(1\times\) | Low | Quick exploration |
| RK2 | 2 | 2 | \(2\times\) | Moderate | Fast, reasonable accuracy |
| RK4 | 4 | 4 | \(4\times\) | High | Default, good cost/accuracy |
| RKF45 | 4-5 | 6-7 | \(6\)--\(7\times\) | Highest (adaptive) | Publication quality |
Higher-order methods are especially valuable in regions of high fiber curvature, where lower-order methods accumulate significant tracking error.
7. Interpolation Methods¶
Trilinear Interpolation¶
The default interpolation method in HINEC. For a query point \(\mathbf{p} = (x, y, z)\) inside a voxel, the interpolated value is a weighted average of the 8 surrounding voxel centers.
Given voxel centers at integer coordinates with values \(V(i,j,k)\), and fractional coordinates:
The trilinear interpolation can be written as a tensor product of 1D linear interpolations:
where \(w_0(t) = 1 - t\) and \(w_1(t) = t\). Expanding:
Implementation: HINEC uses MATLAB's griddedInterpolant with 'linear' method for 2--5\(\times\) faster interpolation compared to interp3 (nim_tractography_hinec.m:239-242):
nim.FA_interp = griddedInterpolant(grid_vectors, nim.FA, 'linear', 'none');
nim.v1_x_interp = griddedInterpolant(grid_vectors, nim.v1_x, 'linear', 'none');
nim.v1_y_interp = griddedInterpolant(grid_vectors, nim.v1_y, 'linear', 'none');
nim.v1_z_interp = griddedInterpolant(grid_vectors, nim.v1_z, 'linear', 'none');
Each eigenvector component is interpolated independently, and the result is renormalized to unit length.
Cubic Spline Interpolation¶
Available via interp_method = 'cubic'. Uses cubic B-splines for \(C^2\)-continuous interpolation, providing smoother direction fields at the cost of additional computation. The cubic B-spline basis function is:
Uses the same griddedInterpolant framework with 'cubic' method.
High-Order Spectral Interpolation¶
The nim_interp.m utility implements spectral element interpolation using Gauss-Lobatto-Legendre (GLL) quadrature nodes:
- Expand eigenvectors from cell centers to vertices
- Place \((p+1)^3\) interpolation nodes per voxel using GLL quadrature
- Perform 3D spline interpolation within each voxel
GLL nodes are the roots of:
where \(P_N'\) is the derivative of the \(N\)-th Legendre polynomial. These nodes include the endpoints \(\xi = \pm 1\) and cluster near element boundaries, providing optimal interpolation accuracy with minimal Runge phenomenon.
The Lagrange interpolant through GLL nodes \(\xi_0, \ldots, \xi_N\) is:
Uniform nodes from zwuni.m provide an alternative with equally-spaced points.
Direction Field Normalization¶
After interpolation, the direction vector must be renormalized because linear interpolation of unit vectors does not produce unit vectors:
The interpolation error in the direction (angular deviation) is bounded by:
where \(\Delta x\) is the voxel size. This is why trilinear interpolation works well: for smoothly varying fiber directions, the angular error per voxel is small.
8. Registration Mathematics¶
Affine Transformation¶
An affine transformation maps coordinates between spaces using a \(4 \times 4\) matrix (12 degrees of freedom):
The \(3 \times 3\) submatrix decomposes into rotation (3 DOF), scaling (3 DOF), and shear (3 DOF). The translation vector \(\mathbf{t} = [t_1, t_2, t_3]^\top\) adds 3 DOF.
A rigid-body transformation (6 DOF) restricts to rotation + translation only:
Registration Cost Functions¶
HINEC supports two registration backends:
Correlation Ratio (FSL FLIRT, same-modality):
where \(Y\) is the target image, \(X\) is the source partitioned into \(k\) bins, \(n_k\) is the count in bin \(k\), \(\sigma_k^2\) is the variance of \(Y\) within bin \(k\), and \(\sigma_Y^2\) is the total variance.
Normalized Mutual Information (quality assessment in registration_utils.m):
where the entropies are:
\(p(a)\) and \(p(a,b)\) are estimated from 64-bin histograms. NMI \(= 1\) indicates no shared information; NMI \(> 1\) indicates increasing alignment.
Multi-Modal Registration Chain¶
HINEC constructs a chain of transforms between three coordinate spaces:
Forward chain (DWI \(\to\) MNI): compose transforms
Inverse chain (MNI \(\to\) DWI): apply inverse transforms in reverse order
This chain is used by nim_apply_transforms.m to bring atlas labels from MNI space to DWI space.
Nearest-Neighbor Resampling for Labels¶
When transforming parcellation labels, nearest-neighbor interpolation is required to preserve discrete integer values:
Linear or spline interpolation would produce non-integer values that corrupt label identity.
9. Tissue Segmentation Thresholds¶
FA-based tissue classification (preproc_tissue_segmentation.m) uses empirical thresholds:
The WM mask is eroded (morphological erosion with spherical kernel, radius=1) to remove boundary voxels where partial volume effects contaminate the mask:
where \(\mathcal{B}_1\) is a unit ball structuring element and \(\ominus\) denotes morphological erosion.
Masks are mutually exclusive and constrained by the brain mask:
Expected tissue proportions: WM \(\sim\)40--45%, GM \(\sim\)40--45%, CSF \(\sim\)10--20% of brain volume.
10. Connectivity Matrix Construction¶
Given a set of tracks \(\mathcal{T} = \{t_1, t_2, \ldots, t_M\}\) and a parcellation with \(N\) regions (nim_plot_connectivity_matrix.m):
where \(R_i\) denotes parcellation region \(i\), and \(\text{start}(t_k)\), \(\text{end}(t_k)\) are the first and last points of track \(t_k\).
For a symmetric (undirected) connectivity matrix:
Optional normalization:
The resulting matrix \(\mathbf{C} \in \mathbb{R}^{N \times N}\) represents the structural connectivity between brain regions, where \(C_{ij}\) is the number of streamlines connecting region \(i\) to region \(j\).
Cross-References¶
- RKF45 complete derivation and coefficients: RKF.md
- RKF45 usage guide and parameter selection: RKF_Usage.md
- High-order methods implementation details: High_Order.md
- Scientific context for these methods: SCIENCE.md
- Architecture and data flow: ARCHITECTURE.md
- Complete function reference: API_REFERENCE.md