Introduction

Fatigue is among the most disabling symptoms in multiple sclerosis (MS)1, and is associated with disease progression2. Neural, immune, endocrine and metabolic mechanisms have all been proposed to play a role in the development of fatigue1. Neuroimaging studies have associated fatigue with brain damage in MS patients1,3,4,5,6,7,8,9,10,11,12,13, but the anatomical patterns were not consistent between studies, and a few studies could not show any structural association14,15,16,17,18.

A previous study suggested that fatigue is highly variable over time: 54% of MS patients fluctuated between “fatigued” and “non-fatigued” states, 27% were persistently “fatigued” and 19% were persistently “non-fatigued” over a course of 2 years, during which fatigue was assessed every 6 months19. Therefore, a single assessment might not be sufficiently representative and robust to categorize a patient into a “fatigued” or “non-fatigued” group. A limitation of previous MRI studies has been that they did not account for fluctuations of fatigue over time, which may explain discrepancies between their results.

We hypothesize that the pathogenesis of persistent fatigue differs from that of fluctuating fatigue: persistent fatigue over years is more likely to be caused by irreversible neurodegeneration, whereas fluctuating fatigue may reflect reversible pathobiological changes (e.g. inflammatory cytokine and hormone levels). We therefore defined the following three patient groups considering longitudinal fatigue assessments over the course of up to 14 years: never fatigued (NF), sustained fatigue (SF) over the most recent two years, and reversible fatigue (RF) (presently not reporting fatigue, but did in the past). We anticipated that SF patients would show more pronounced gray matter (GM) damage than RF and NF patients. Since depression is a common comorbidity1, we also investigated how depression, and medications that may influence the perceived level of fatigue and/or depression, affect the relationship between fatigue and GM damage.

Results

There was no significant difference between the SF, RF and NF groups in age, sex, disease duration, EDSS, and time between MFIS assessment and MRI scan (Table 1). At the most recent measurement, SF patients showed significantly higher total and subscale MFIS scores (p < 0.001 vs RF, p < 0.001 vs NF); total CES-D score (p < 0.001 vs RF, p < 0.001 vs NF); as well as CES-D subscale scores, i.e. somatic symptoms (p < 0.001 vs RF, p < 0.001 vs NF), depressed affect (p = 0.032 vs RF, p < 0.001 vs NF), anhedonia (p = 0.020 vs RF, p < 0.001 vs NF) and interpersonal concerns score (p = 0.020 vs RF, p < 0.001 vs NF) compared to the other two groups (Table 1). 20 out of the 98 patients (14 SF, 5 RF, 1 NF) had clinically significant (CES-D ≥16) depression. These variables were not significantly different between RF and NF, but there was a trend showing higher scores in RF patients (Table 1).

Table 1 Comparison of demographic and clinical variables of the CLIMB cohort as well as MS patients with sustained, reversible or no fatigue selected from the CLIMB cohort.

Disease duration (p < 0.0001) and female to male ratio (p = 0.005) were significantly higher in the pooled SF + RF + NF cohort compared to the CLIMB cohort, while no significant difference was present for age and EDSS (Table 1). The selected dataset was well matched to the QOL subset of the CLIMB for total score and physical and psychosocial subscale scores of MFIS, but had higher cognitive subscale scores (p = 0.035) (Table 1).

In the pooled SF + RF + NF cohort, total MFIS showed significant correlation with CES-D (p < 0.0001, rho = 0.51) and EDSS (p = 0.044, rho = 0.20), but CES-D and EDSS were not significantly inter-correlated (p = 0.64, rho = −0.05). In the QOL subset, total MFIS score was significantly correlated with CES-D (p < 0.0001, rho = 0.90) and with EDSS (p = 0.0001, rho = 0.14), but CES-D and EDSS (p < 0.0001, rho = 0.16) also showed significant, albeit weak correlation.

Total brain WMLL was significantly higher in SF versus RF and NF patients (p = 0.018), but there was no difference between RF and NF patients (Table 1).

VBM analysis adjusted for age, sex, disease duration, and EDSS showed significantly lower volumes in several cortical regions encompassing all four brain lobes and the insula, along with subcortical structures (caudate, putamen, thalamus, amygdala and hippocampus) on both sides in SF compared to NF patients (Fig. 1 and Table 2). Comparison between RF and NF patients showed signal (ie, lower GM volume) only in bilateral frontal cortical areas (Fig. 2 and Table 2). We found no significant differences between SF and RF patients. SF patients showed atrophy in 33 GM areas (29 bilateral), whereas RF patients showed atrophy in 4 GM areas (4 bilateral) (Table 2). The total number of significantly different GM voxels was 20-times larger in the SF versus NF contrast compared to the RF versus NF contrast (Table 2).

Figure 1

Spatial distribution of clusters with significant atrophy overlaid on the ICBM 152 template in MS patients with sustained fatigue (SF) compared to never fatigued (NF) MS patients. Correction was made for age, sex, disease duration and Expanded Disability Status Scale (EDSS) score (top), and for Center for Epidemiological Studies - Depression score (CES-D) (middle), as well as for medication (bottom). (red labels = family-wise error + Bonferroni-corrected p value < 0.017).

Table 2 Brain GM areas with significant volume loss in SF versus NF patients as well as in RF versus NF patients when controlling for age, sex, disease duration, EDSS ± CESD ± medication (FWE + Bonferroni-corrected p < 0.017).
Figure 2

Spatial distribution of clusters with significant atrophy overlaid on the ICBM 152 template in MS patients with reversible fatigue (RF) compared to never fatigued (NF) MS patients. Correction was made for age, sex, disease duration and Expanded Disability Status Scale (EDSS) score (top), and for Center for Epidemiological Studies - Depression score (CES-D) (middle), as well as for medication (bottom). (red labels = family-wise error + Bonferroni-corrected p value < 0.017).

When controlling for CES-D (in addition to age, sex, disease duration, and EDSS), SF versus NF patients showed significant atrophy in 23 GM areas (19 bilateral), whereas RF versus NF patients showed significant atrophy in 7 GM areas (5 bilateral) (Table 2). The total number of significantly different GM voxels in the SF versus NF contrast was 4-times larger compared to the RF versus NF contrast (Table 2). Of note, one patient’s CES-D was not measured at the time of the MFIS assessment. Instead, we used a CES-D score obtained 34 months earlier.

When controlling for medication (in addition to age, sex, disease duration, EDSS and CES-D), SF versus NF patients showed significant atrophy in 21 GM areas (14 bilateral), whereas RF versus NF patients showed significant atrophy in 7 GM areas (5 bilateral) (Table 2). The total number of significantly different GM voxels in the SF versus NF contrast was nearly 3-times larger compared to the RF versus NF contrast (Table 2).

Bonferroni correction resulted in a 35% reduction in the total number of significantly different GM voxels in the SF versus NF contrast, as well as in an 89% reduction in the RF versus NF contrast when controlling for age, sex, disease duration and EDSS. A prominent negative confounding effect of depression was observed bilaterally in the cerebellar cortex in the SF versus NF and in the RF versus NF contrasts, which did not survive Bonferroni correction (Supplementary Figs 1 and 2). This is of note, given the extent and anatomical coherence of the result (Supplementary Figs 1 and 2).

Discussion

A neurogenic component of MS-related fatigue has been supported by 10 out of 13 previous studies using unbiased image analyses, such as voxel-, tensor-, or automated segmentation-based techniques4,5,6,7,8,9,10,11,12,13. Overall, our VBM results support an association between neurodegeneration and fatigue in MS patients.

Previous studies categorized patients into “fatigued” or “non-fatigued” groups based on a single time-point fatigue assessment, and their results showed heterogeneity in regional atrophy patterns potentially relevant to fatigue4,5,6,7,8,9,10,11,12,13.

Our study design used multiple longitudinal assessments of fatigue to improve robustness of group assignment. Stratification of patients according to historical fatigue scores may inform on the different mechanisms involved in the pathophysiology of fatigue. Since inflammatory cytokines, hormones or metabolic factors are likely to induce RF, whereas neurodegeneration may cause SF, we expected that SF patients would show more pronounced GM damage than RF and NF patients.

Our results showed that both SF and RF were associated with neurodegeneration in all GM regions known to be associated with fatigue from previous studies, independently from age, sex, disease duration, EDSS, CES-D and medication. Compared to NF patients, the total number of significantly different GM voxels was more than twenty-times larger in SF than in RF patients, but direct comparison of SF with RF patients showed no significant voxel-wise differences. WM changes (measured by brain WMLL) were significantly more pronounced in SF compared to RF and NF patients, and there was a trend for higher WMLL in RF versus NF patients. These findings suggest that the same neuronal circuitries are affected in RF as in SF patients, albeit to a lesser extent in the former.

Several previous structural MRI studies of fatigue investigated the association of fatigue with WM lesions in MS. However, only a few of these studies found significant association between fatigue and total brain WMLL4,20,21, or regional brain WMLL in frontal8,22, temporal8, parietal, internal capsular and periventricular areas20, while other studies using a similar approach failed to do so16,17,18. Our results support the notion that both GM and WM damage play a role in the development of fatigue in MS.

Chaudhuri and Behan associated “central fatigue” with the failure of the non-motor function of the cortico-striato-thalamic loop23. This hypothesis has been supported by several neuroimaging studies in MS1,3. Other networks, including temporal, parietal and occipital connections may also play a role in the development of fatigue, according to diffusion tensor MRI studies9,24 and other MRI studies investigating the localization of MS lesions4. We robustly replicated the anatomical patterns of GM atrophy described in previous work. Our findings support the hypothesis that all of the above-mentioned networks, including limbic (frontal-orbital and cingulate cortices), primary sensory-motor (pre- and postcentral gyri), associative (frontal, temporal, parietal, insular and occipital) cortical and subcortical regions (striatum, thalamus, amygdala) might have a role in the development of fatigue in MS. In addition, our study is the first to report association between fatigue and hippocampal atrophy in MS. The prefrontal cortex-hippocampus circuit plays a role not only in memory, attention and decision-making25, which are major components of cognitive fatigue, but is involved also in reward mechanisms25,26 providing support for the effort-reward imbalance theory of fatigue3.

Fatigue and depression scores were highly correlated in our cohort, consistent with previous findings1,27. It is worth noting that all subscale scores of the CES-D were higher in SF compared to the RF and NF groups. Therefore, we do not ascribe the correlation between CES-D and MFIS to the overlap of questions related to somatic symptoms in the CES-D and MFIS questionnaires. Our results support the view that depression is a significant co-morbidity in fatigued patients and suggest that fatigue and depression may be mediated by damage to shared pathways. The above-mentioned studies assessed depression using various questionnaires. Nine studies excluded patients based on high depression scores4,6,7,8,9,15,28 or concomitant therapy with anti-depressants13 or history of psychiatric disorders29. One study compared patients with only fatigue, only depression and both10 and one made no correction for depression12. The potential confounding effect of depression in the context of the association between brain damage and fatigue was investigated by two previous studies. One showed significant association of fatigue with caudate and accumbens atrophy when controlling for depression and EDSS11, while the other study found no significant GM atrophy related to fatigue when accounting for depression14. Our voxel-based MRI analyses showed that depression has a significant confounding effect in several frontal, temporal, parietal, occipital and deep GM areas. In the SF versus NF contrast, a significant positive confounding effect of depression was observed in the bilateral superior, middle, inferior frontal gyri, the pre- and postcentral gyri, accumbens, and right lateral occipital cortex, supramarginal and angular gyri (Table 2 and Fig. 1). In the RF versus NF contrast, a negative confounding effect of depression was observed in the thalamus (Table 2 and Fig. 2).
Our results suggest that damage to these areas may play a role in the co-morbid development of fatigue and depression in MS patients. We noted a negative confounding effect on cerebellar cortex. This finding did not survive Bonferroni correction, but, given the extent of the resulting cluster of voxels and its striking anatomical coherence outlining a large part of the cerebellar cortex, further attention is warranted in future studies.

The presence/absence of anti-fatigue, anti-depressant and/or anxiolytic treatments also showed a significant positive confounding effect in SF, and to a lesser extent, in RF patients. The most prominent positive confounding effect was observed in the caudate and putamen in the SF versus NF contrast. Our results suggest that pharmacological treatment is a significant confounder of MS-related fatigue. This observation may pave the way for future studies which aim to investigate the association of global or local (ie, GM region or WM tract-specific) brain damage with anti-fatigue treatment response in MS.

It has been hypothesized that lateralization may exist in fatigued MS patients30 based on the findings of Riccitelli et al.7 who demonstrated correlation between fatigue and atrophy in left precentral gyrus and central sulcus. However, most of our findings were bilateral, including the pre- and postcentral gyri (Table 2). Several previous studies showed both bilateral and unilateral findings4,5,6,7,8,9,10,11,12,13, but the unilateral ones were not consistently reproduced, providing no clear evidence regarding the lateralization of fatigue in MS. In fact, the imbalance we observed between SF and RF patients concerned the extent and distribution of GM atrophy rather than its laterality (i.e., 29 out of 33 GM areas showed a bilateral spatial pattern in SF, while in RF all 4 GM areas were involved bilaterally) (Table 2).

In previous studies, both RF and NF patients would have been stratified as “non-fatigued” MS patients. Our results suggest that previous existence of clinically significant fatigue in currently “non-fatigued” patients is associated with GM atrophy, potentially explaining inconsistent findings of previous studies that stratified MS patients using a single fatigue assessment.

Our groups were selected from the CLIMB cohort based on longitudinal MFIS scores and matched based on age, gender, disease duration and EDSS. Due to this selection and matching process, the pooled SF + RF + NF cohort showed significantly higher disease duration, female-to-male ratio and cognitive fatigue compared to the CLIMB cohort.

Our study has limitations, including: (1) Time between MRI and fatigue assessment varied across participants, with MRI scans within a month of MFIS assessment in only 57 out of 98 patients. (2) In our statistical analyses, correction was made only for age, sex, disease duration, EDSS and depression, but not for other potential confounders of fatigue, such as anxiety, physical activity and sleep problems. (3) Treatment with anti-fatigue drugs, anxiolytics, anti-depressants, disease-modifying drugs, monthly iv steroids and/or immunosuppressants was not an exclusion criterion. (4) While this may be the first time that patients were classified according to fatigue patterns derived from repeated measures, the frequency of these measures was constrained by the retrospective nature of this work, and requires further consideration. (5) We used the Bonferroni method to correct for multiple comparisons (in addition to FWE correction), which is applicable when the number of tests is less than 5. However, this method is not as powerful as the Tukey method31.

Future prospective studies of MS-related fatigue should take into consideration temporal patterns of fatigue. However, the most adequate frequency of fatigue assessments for optimal, pathologically relevant patient stratification remains to be determined. Novel approaches to fatigue assessment, including the use of mobile technologies for frequent, real-time assessments, are likely to lead to better understanding of the pathophysiology of fatigue and other interrelated symptoms.

Materials and Methods

Participants

MS patients were selected from the Quality of Life (QOL) subset of our longitudinal cohort study of over 2000 MS patients, named Comprehensive Longitudinal Investigations of MS at the Brigham and Women’s Hospital (CLIMB) (http://partnersmscenter.org/clinical-programs/climb-study/) (Table 1). The QOL subset (n > 800) undergoes annual MRI and neurological examination and biennial QOL assessments, including fatigue and depression measurements, using the Modified Fatigue Impact Scale (MFIS)32,33 and the Center for Epidemiological Studies Depression Scale (CES-D)34, respectively. The MFIS has three domains (i.e., cognitive, physical and psychosocial) and its cut-off for clinically relevant fatigue is 38 (total score including all domains)33, whereas CES-D has four domains (i.e., somatic symptoms, depressed affect, anhedonia, interpersonal concerns)35 and scores ≥16 (including all domains) are in the depressed range36. The QOL assessments were performed on the day of the neurological visit.

Definition of fatigue subgroups

In the current study, the following patient groups were defined based on retrospective longitudinal MFIS scores: (i) SF: last two consecutive MFIS ≥38; (ii) RF: most recent MFIS <38 and at least one prior MFIS ≥38; (iii) NF: no MFIS ≥38 (minimum of 5 assessments required). We queried the CLIMB database on 02/19/16 and found 123 SF, 98 RF and 238 NF patients out of the QOL subgroup of 859 MS patients. Patients without 3T MRI were excluded and the closest 3T MRI scan to the latest MFIS measurement was selected for image analysis in the remaining patients. Further restriction criteria were applied: no clinically isolated syndrome; no history of psychotic disorder, major neurologic disorder (other than MS) or malignancies; EDSS ≤6; no clinical relapse/acute intravenous steroid treatment within 90 days before the MFIS assessment or the MRI scan or between the MFIS and MRI assessments. To maximize the number of SF patients, the maximum difference in time between the last MFIS assessment and MRI scan was set at 15 months and the upper limit of age at 66 years. Then, we matched the SF group with the other 2 groups based on age, sex, disease duration and EDSS. We identified 30 SF, 31 RF and 37 NF patients (Table 1). Several patients were treated with anti-fatigue medications (modafinil, armodafinil, amphetamine, amantadine or methylphenidate) (47% of SF, 23% of RF, 30% of NF patients); anxiolytics (27% of SF, 16% of RF, 8% of NF patients); or anti-depressants (47% of SF, 26% of RF, 22% of NF patients). Overall, 67% of SF, 68% of RF and 57% of NF patients received at least one of these drugs. Most of the patients were on disease-modifying treatment (87% of SF, 80% of RF, 92% of NF patients), 2 SF and 2 RF patients received monthly intravenous steroids, and 1 SF and 1 NF patient were on immunosuppressant (mycophenolate mofetil) treatment.

This study was approved by the Institutional Review Board of our Institution (Partners Healthcare) and all research was performed in accordance with relevant guidelines and regulations. The CLIMB Study group obtained informed consent from all patients whose clinical and MRI data were analyzed in this study.
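
The group definitions given in this subsection (SF, RF, NF, with an MFIS cut-off of 38) can be illustrated with a minimal sketch. The function name `classify_fatigue` and the constants are hypothetical and not part of the study's code. The sketch assumes that scores are ordered from oldest to most recent and that the five-assessment minimum applies to the NF group only, which is one reading of the definition.

```python
from typing import List, Optional

MFIS_CUTOFF = 38         # total MFIS cut-off for clinically relevant fatigue (from the text)
MIN_ASSESSMENTS_NF = 5   # assumption: the 5-assessment minimum applies to NF only


def classify_fatigue(mfis_scores: List[int]) -> Optional[str]:
    """Classify a chronologically ordered series of total MFIS scores.

    Returns 'SF', 'RF', 'NF', or None when no group definition is met.
    """
    if not mfis_scores:
        return None
    fatigued = [score >= MFIS_CUTOFF for score in mfis_scores]
    # SF: the last two consecutive assessments are both at or above the cut-off.
    if len(fatigued) >= 2 and fatigued[-1] and fatigued[-2]:
        return "SF"
    # RF: most recent assessment below the cut-off, at least one earlier one at or above it.
    if not fatigued[-1] and any(fatigued[:-1]):
        return "RF"
    # NF: no assessment at or above the cut-off, with enough assessments on record.
    if not any(fatigued) and len(fatigued) >= MIN_ASSESSMENTS_NF:
        return "NF"
    return None
```

A patient whose only elevated score is the most recent one (for example `[20, 45]`) falls into none of the three groups under this reading and returns `None`.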

Magnetic resonance imaging

Brain images were acquired using a 3 Tesla Siemens Skyra scanner as follows: (1) Sagittal 3D T1-weighted MPRAGE: TR/TE/TI = 2300/2.96/900 ms, voxel size = 1 × 1 × 1 mm³, FOV = 256 mm, flip angle = 9 deg, matrix size = 256 × 240 and (2) Sagittal 3D T2-weighted FLAIR: TR/TE/TI = 5000/389/1800 ms, voxel size = 1 × 1 × 1 mm³, FOV = 256 mm, flip angle = 120 deg, matrix size = 256 × 240.

Lesion segmentation

White matter (WM) lesions were segmented using the lesion growth algorithm module of the lesion segmentation toolbox (LST v2.0.15) in Statistical Parametric Mapping (SPM12)37. This algorithm first segments the T1-weighted images into three main tissue classes (GM, WM, cerebrospinal fluid (CSF)), then this information is combined with the coregistered FLAIR intensities to calculate lesion belief maps. Based on visual evaluation, a kappa value of 0.1 was selected as optimal threshold for the computation of binary lesion maps. The automatically generated lesion maps were inspected and manually edited in 3D-Slicer (https://www.slicer.org) to erase false positives only in GM areas. Total brain WM lesion load (WMLL) was calculated using 3D-Slicer. Finally, the lesions were filled on T1-weighted images using the lesion filling module of the LST in preparation for voxel-based morphometry (VBM).

Voxel based morphometry

VBM was performed on T1-weighted images to detect differences in brain GM atrophy between the three groups38. Supra- and infratentorial WM, brainstem, and lesions were masked and excluded from analysis. T1-weighted images were preprocessed using the VBM toolbox in SPM1239. We used diffeomorphic anatomical registration through exponentiated Lie algebra (DARTEL)40 to create a study-specific template and register the images to the ICBM 152 template (MNI space). The images were then merged using fslmerge, and smoothed with a Gaussian kernel (σ = 4 mm) using fslmaths (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Fslutils).
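
Because smoothing kernels are often reported as full width at half maximum (FWHM) rather than as a standard deviation, the following sketch converts the σ = 4 mm stated above into its FWHM equivalent using the standard Gaussian relation FWHM = 2·sqrt(2·ln 2)·σ. The function name is hypothetical and the snippet is only an illustration of the conversion, not part of the study's pipeline.

```python
import math


def sigma_to_fwhm(sigma_mm: float) -> float:
    """Convert a Gaussian standard deviation (mm) to full width at half maximum (mm)."""
    return sigma_mm * 2.0 * math.sqrt(2.0 * math.log(2.0))


# Kernel width used for smoothing in the text: sigma = 4 mm.
fwhm_mm = sigma_to_fwhm(4.0)
print(f"sigma = 4 mm corresponds to FWHM of about {fwhm_mm:.2f} mm")
```

Under this relation, the 4 mm σ kernel corresponds to an FWHM of roughly 9.4 mm, which is useful when comparing against studies that report smoothing in FWHM.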

Cerebellum segmentation

T1-weighted images were processed with CERES41, an automated atlas-based cerebellum segmentation tool to calculate total cerebellum volume, cerebellar GM volume and cerebellar cortical thickness.

Statistical analysis

Comparison of continuous clinical variables (1) between SF, RF and NF groups was performed using one-way ANOVA and (2) between the pooled SF + RF + NF cohort and the CLIMB study cohort was performed using Wilcoxon rank sum test. Differences in male/female ratio were assessed by Pearson’s chi-square test. We assessed associations among MFIS, CES-D and EDSS scores using Spearman’s rank correlation. WMLL was compared between the groups using one-way ANOVA.

For voxel-based analysis, we used nonparametric permutations (n = 5000) implemented in FSL-Randomise (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Randomise). Threshold-free cluster enhancement was used, with family-wise error (FWE) correction for multiple comparisons42,43. In addition, Bonferroni correction was used to correct for the number of pairwise comparisons (ie, SF versus NF, SF versus RF, RF versus NF). Accordingly, voxels with a FWE + Bonferroni-corrected p < 0.017 were considered significant.
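
The 0.017 threshold follows from dividing a nominal significance level by the three pairwise contrasts. The sketch below reproduces that arithmetic; the nominal level of 0.05 is an assumption (it is not stated explicitly in the text but is consistent with the reported threshold), and the function name is hypothetical.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Return the per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# The three pairwise group contrasts named in the text.
contrasts = ["SF vs NF", "SF vs RF", "RF vs NF"]

# Assumption: nominal family-wise alpha of 0.05.
threshold = bonferroni_alpha(0.05, len(contrasts))
print(round(threshold, 3))  # 0.017
```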

To investigate the effect of depression on the relationship between fatigue and GM atrophy, we performed secondary analysis controlling for CES-D (in addition to age, sex, disease duration and EDSS). We also assessed the voxel-wise association between depression severity (continuous CES-D score) and brain GM atrophy in the pooled patient cohort, controlling for age, sex, disease duration and EDSS. To account for the effects of medications that may lower fatigue and/or depression levels, in a separate model, we added medication, as a dichotomous variable: 1 = received anti-fatigue and/or anti-depressant and/or anxiolytic treatment, 0 = received none of these medications. Stata13 (StataCorp, College Station, Texas, USA) was used for all statistical analyses except for VBM.
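
The dichotomous medication covariate described above can be sketched as a simple coding rule. The function and parameter names are hypothetical; only the 1/0 coding itself is taken from the text.

```python
def medication_covariate(anti_fatigue: bool, anti_depressant: bool, anxiolytic: bool) -> int:
    """Code the medication covariate: 1 if any of the three drug classes is received, else 0."""
    return int(anti_fatigue or anti_depressant or anxiolytic)


# Example: a patient on an anxiolytic only is coded 1; a patient on none is coded 0.
print(medication_covariate(False, False, True))
print(medication_covariate(False, False, False))
```

Note that this coding does not distinguish between drug classes or count the number of drugs received, so a patient on all three classes is coded the same as a patient on one.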