Learning leads to rapid microstructural changes in grey (GM) and white (WM) matter. Do these changes continue to accumulate if task training continues, and can they be reverted by sleep? We addressed these questions by combining structural and diffusion weighted MRI and high-density EEG in 16 subjects studied during the physiological sleep/wake cycle, after 12h and 24h of intense practice in two different tasks, and after post-training sleep. Compared to baseline wake, 12h of training led to a decline in cortical mean diffusivity. The decrease became even more significant after 24h of task practice combined with sleep deprivation. Prolonged practice also resulted in decreased ventricular volume and increased GM and WM subcortical volumes. All changes reverted after recovery sleep. Moreover, these structural alterations predicted cognitive performance at the individual level, suggesting that sleep's ability to counteract performance deficits is linked to its effects on the brain microstructure. The cellular mechanisms that account for the structural effects of sleep are unknown, but they may be linked to its role in promoting the production of cerebrospinal fluid and the decrease in synapse size and strength, as well as to its recently discovered ability to enhance the extracellular space and the clearance of brain metabolites.
Sleep deprivation has long been known to result in longer and/or deeper sleep. Recent studies, however, show that sleep need increases not only with the duration of wake, but also with its “intensity”, and specifically with the amount of experience-dependent plasticity and learning, a finding confirmed in insects, rodents, and humans (Tononi and Cirelli, 2014). Fruit flies, for instance, sleep longer after being awake in an enriched environment than in isolation (Bushey et al., 2011; Donlea et al., 2009; Ganguly-Fitzgerald et al., 2006). In mammals, slow wave activity (SWA), the EEG power between 0.5 and 4.5 Hz during non-rapid eye movement (NREM) sleep, is an established marker of sleep need and intensity, since it increases with wake duration, declines in the course of sleep, and is positively correlated with arousal threshold during sleep (Vyazovskiy et al., 2011). Rats that spent time exploring new objects later show higher NREM SWA relative to rats that ignored the objects, even though wake duration was the same in all animals (Huber et al., 2007). In humans, high-density EEG (hd-EEG) experiments also show that SWA can be regulated locally, depending on the specific wake experience. For example, SWA peaks in left frontal cortex after training in a language task, and in parietal regions after learning a visuo-motor task (Huber et al., 2004; Hung et al., 2013). Thus, there is strong electrophysiological evidence that wake-related learning and sleep need are linked.
Long or enriched wake also leads to structural changes in neurons. In the fly brain, dendritic branches and synaptic puncta increase with wake and decrease with sleep (Bushey et al., 2011; Donlea et al., 2009; Donlea et al., 2011). In the adolescent mouse cortex, wake leads to net spine formation while sleep results in net spine elimination (Maret et al., 2011; Yang and Gan, 2012). Electron microscopy studies show that wake/sleep-dependent structural changes also occur in astrocytes. Specifically, astrocytic processes move closer to the synaptic cleft after short sleep deprivation, and astrocytic coverage of cortical spines increases after chronic sleep loss (Bellesi et al., 2015). While direct evidence for similar changes in the human brain cannot be easily obtained, diffusion weighted imaging (DWI) is widely used to derive several indices reflecting the micron-scale density and organization of brain tissues (Sagi et al., 2012). In particular, mean diffusivity (MD), a measure of tissue density based on the rate of water diffusion, has been proposed as a potential marker for the detection of relatively rapid changes in the microstructure of grey matter (GM) and white matter (WM). Indeed, careful investigations have shown that MD decreases in hippocampus, parahippocampus and fornix after just a few hours of visuo-spatial training (Hofstetter et al., 2013; Sagi et al., 2012; Tavor et al., 2013). Similar changes occur in the rat hippocampus, presumably due to an increase in glial cell volume and/or a decrease in extracellular space (Hofstetter et al., 2013; Sagi et al., 2012; Tavor et al., 2013). These findings, however, raise some fundamental questions. If the brain's ultrastructure can be altered so quickly after a few hours of training, what happens if subjects continue to practice? Do these structural changes continue to accumulate if subjects are kept awake and continue to practice a task at night, when they would normally be asleep? 
And if so, can sleep revert them? To address these questions we performed structural Magnetic Resonance Imaging (MRI) and DWI during the physiological sleep/wake cycle, after 12-24h of intense task training, and after post-training sleep.
Sixteen healthy volunteers (age 24.0 ± 3.4 years, 8 females; 13 right-handed) were recruited from the University of Wisconsin-Madison campus. All participants had sleep duration of ~7h/night, consistent bed/rise times, no daytime nap habit, no excessive daytime sleepiness (total scores in the Epworth Sleepiness Scale ≤ 10) and no history of sleep, medical, or psychiatric disorders as assessed by a clinical interview and by one 8h night sleep recording with hd-EEG. Polysomnographic parameters (see Table 4), including total sleep time and the percentage of different sleep stages, were comparable to those of healthy individuals of similar age (Ohayon et al., 2004). Sleep scoring was performed over 30 sec epochs according to standard criteria by a physician board-certified in sleep medicine (Silber et al., 2007). Subjects were asked to maintain a regular sleep-wake schedule for at least one week before each experiment, and compliance was verified with sleep diaries and wrist-worn actimeters (Actiwatch 64, MiniMitter). Use of alcohol and caffeine-containing beverages was prohibited starting the day of the first MRI scan and throughout each experiment. The study was approved by the local IRB. Each participant signed an IRB-approved informed consent form before enrollment into the study.
This study was part of a project assessing the effects of extended wake with training and post-training sleep on EEG, behavioral, and structural measures (Bernardi et al., 2015). Each subject participated in two experiments (DS and EF, see below), spaced at least 2 weeks apart (Fig. 1). Each experiment included 5 consecutive MRI sessions (every ~12h) with both functional and structural scans, all occurring in quiet wake: 1) WB (wake baseline) at ~7pm, after a wake day spent outside the lab without any specific training; 2) SB (sleep baseline) the next morning at ~8am, after subjects slept at home as usual; 3) WT12 (wake with training) ~8pm, after 12h of wake with extensive training in the lab; 4) WT24 (extended wake with training) ~8am, after 24h of continuous wake with extensive training in the lab; 5) SR (sleep recovery) ~8pm, after ~8h of recovery sleep with hd-EEG recording in the lab (256 channels; Electrical Geodesics Inc.; recovery sleep onset ~10am). During the 24h of continuous wake all subjects completed six 2h-training sessions (12h total of training) of either a mouse-controlled driving simulation game (DS experiment), or a battery of tasks based on impulse control, decision-making and conflict resolution (executive functions, EF experiment). As discussed in previous work (Bernardi et al., 2015), the two tasks were selected to involve cognitive domains and brain cortical networks that were as distinct as possible, namely a bilateral occipito-parietal and motor network for DS and a network that includes inferior frontal gyrus, medial prefrontal cortex, cingulate cortex and pre-supplementary motor area for EF. The order of the two experiments was randomly assigned and counterbalanced across subjects. During the 24h of wake subjects alternated between 2h-training sessions of task practice (DS or EF) and ~1h blocks of behavioral tests and hd-EEG recordings. 
Each test-block included two 4min eyes-open and eyes-closed recordings, a 5min psychomotor vigilance test (PVT), 3 trials of a response inhibition test, 3 trials of a visuo-motor coordination test, and self-rating questionnaires that were used to assess subjective sleepiness (Bernardi et al., 2015). Two experimenters took turns attending to the participants to prevent them from falling asleep and to ensure adherence to the protocol throughout the experiment.
During each MRI session (3T scanner, Discovery MR750, GE Healthcare) subjects underwent a 5min eyes-closed EPI resting-state scan (as reported in previous work; Bernardi et al., 2015) and a high-resolution 3D inversion-prepared fast spoiled gradient echo (IR-fSPGR) T1-weighted (T1w) anatomical scan (inversion time: 450 ms, repetition time: 8.2 ms, echo time: 3.2 ms, flip angle: 12°, voxel size: 1×1×1 mm, in-plane matrix: 256×256, number of slices: 156). DWI data were acquired using repetition time = 7000 ms, echo time = 66.3 ms, flip angle = 90°, acquisition matrix = 96×96, field of view = 230 mm, in-plane resolution = 2.396 × 2.396 mm (resolution after on-scanner interpolation = 0.898 × 0.898 mm), and slice thickness = 2.3 mm with no gap. Diffusion-sensitizing gradient encoding was applied in 52 directions with three different diffusion-weighted factors, corresponding to b = 400, 800 and 1200 s/mm². Six images (b0 images) were acquired without use of a diffusion gradient. For each encoding direction, 61 axial images were acquired to cover the entire brain. Due to technical problems, MRI data were not obtained in subjects S05 and S14 (males, one left-handed) during experiment EF. In addition, structural data were not obtained in session SB of subject S07 and session SR of subject S15 during experiment EF (females, right-handed), and in session WB of subject S08 (female, right-handed) during experiment DS.
High-resolution T1w images were automatically processed using the Freesurfer longitudinal pipeline (Reuter et al., 2012). Unlike common procedures for structural analysis, the longitudinal approach yields more reliable cortical and subcortical morphological measurements by incorporating temporal information. Specifically, an unbiased within-subject template was created using robust, inverse consistent registration (Reuter et al., 2010). Then, subsequent preprocessing steps, including skull stripping, standard-space transformation, atlas registration, generation of spherical surface maps, and parcellation, were performed using common information from the within-subject template (Reuter et al., 2012). Thus, for each subject and time-point, the software automatically assigned neuroanatomical labels to each brain location using probabilistic information estimated from both geometric data derived from the cortical model and neuroanatomical convention obtained from a pre-labeled training set (Desikan et al., 2006; Fischl et al., 2004). Importantly, this parcellation strategy minimized potential biases related to inter-subject anatomical differences and to alignment issues caused by data transformation in a common reference system, and took into account relative within-subject structural variations.
Analysis of diffusion images was performed using the FSL software package (Smith et al., 2004). For each session and subject, all diffusion weighted and b0 images were affinely coregistered to the b0 image of the first repetition using FLIRT (FMRIB's Linear Image Registration Tool; Jenkinson and Smith, 2001), to correct both for eddy current induced distortion (eddy_correct tool) and subject's motion effects. Moreover, computed motion parameters were used to adjust the direction of gradient vectors. A brain mask was created from the first b0 image using BET (Brain extraction Tool; Smith, 2002) and used to constrain the tensor fitting within voxels of interest. A linear least squares (LLS) approach was used to fit the tensor models at each voxel (FDT, FMRIB's Diffusion Toolbox; Behrens et al., 2003) and compute the MD maps (of note, partially different absolute MD values, but analogous statistical results were obtained if the weighted linear least squares approach was used instead). Finally, DWI data were aligned to the Freesurfer within-subject template using a two-step procedure. First, using an affine linear registration (FLIRT, 12 degrees of freedom), the average b0 image of each DWI scan was coregistered to the b0 image obtained in the same MRI session of the T1w image representing the reference of the anatomical template. Then, 3dQwarp (Cox, 2012) was used to perform a constrained non-linear registration to the obtained b0 images and match the within-subject anatomical template. Importantly, a cost function that is insensitive to contrast differences (mutual information) was adopted to optimize correction of geometrical distortions in DWI data (Gholipour et al., 2006). Resulting transformation matrices and deformation fields were then concatenated and applied to individual MD maps (Fig. S1).
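As a worked illustration of the quantity mapped here, MD is the mean of the three eigenvalues of the fitted diffusion tensor, which equals one third of its trace. The sketch below is not the FDT implementation; the function name and the example tensor are hypothetical.

```python
def mean_diffusivity(tensor):
    """Return MD = trace(D) / 3 for a 3x3 symmetric diffusion tensor (mm^2/s)."""
    if len(tensor) != 3 or any(len(row) != 3 for row in tensor):
        raise ValueError("expected a 3x3 tensor")
    # The trace equals the sum of the eigenvalues, so no decomposition is needed
    return (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0


# Hypothetical single-voxel tensor, in mm^2/s
D = [
    [1.0e-3, 0.1e-3, 0.0],
    [0.1e-3, 0.8e-3, 0.0],
    [0.0, 0.0, 0.6e-3],
]
md = mean_diffusivity(D)
```

Because the trace is rotation invariant, the off-diagonal terms do not affect MD, which is why MD does not depend on the orientation of the tissue within the voxel.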
MD was measured in four large regions of interest (ROIs; Fig. 2): cortical grey matter (GM), subcortical GM, white matter (WM) and ventricles. In the same ROIs we also measured volume (cortical and subcortical GM, WM and ventricles). Cortical GM thickness was measured instead of cortical volume because the latter is affected both by cortical thickness and by area, and thus its variations may be more difficult to interpret (Winkler et al., 2010). These measures were extracted for each subject, experimental condition (DS, EF) and available time point (1-5). To improve the accuracy of MD calculation for each ROI, structure-specific masks were created for each time-point and subject, using the segmentation information previously obtained in Freesurfer. In addition, to minimize biases related to potential changes in the number of voxels included in each time point, we created a unique conjunction mask (logical AND) for each subject and structure of interest (Table S1). Thus, only voxels for which the ‘structural labeling’ did not change across scans were included in the MD analysis. Mean MD values were subsequently extracted from the obtained masks. For measures of MD, volume, and cortical thickness, we performed a three-way repeated measures (rm)-ANOVA in SPSS Statistics 21 (IBM Corporation), including training condition (WB, SB vs. WT12, WT24), time-of-day (8am vs. 8pm), and task (DS vs. EF) as within-subjects factors. To ensure a balanced design with no missing data, only the 12 subjects having all time points were included in these analyses. However, to better characterize reliability of potential findings, analyses were also repeated using analytical models allowing for the inclusion of all subjects (N=16) and time-points (Linear Mixed Effect Analysis, LME). Results of these auxiliary analyses are reported in Table S2. Planned post-hoc comparisons tested the effects of normal sleep (WB vs. SB), 12h of wake with training (WB vs. WT12), and sleep deprivation with training (WT12 vs. WT24), as well as differences between morning and evening time points (morning: SB vs. WT24; evening: WB vs. WT12). Statistical significance accounting for multiple comparisons was assessed by applying the Bonferroni-Holm adjustment. Finally, the effect of recovery sleep was investigated through paired t-tests comparing WT24 and SR.
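The conjunction-mask step can be sketched as follows. This is a minimal illustration on flattened toy arrays, not the pipeline used in the study; the function names and the example values are hypothetical.

```python
def conjunction_mask(masks):
    """Voxel-wise logical AND of per-session binary masks (flat lists of 0/1)."""
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks[0])
    if any(len(m) != n for m in masks):
        raise ValueError("masks must have the same length")
    # A voxel is kept only if it carries the structure label in every session
    return [all(m[i] for m in masks) for i in range(n)]


def mean_in_mask(values, mask):
    """Average the values of the voxels where the mask is True."""
    selected = [v for v, keep in zip(values, mask) if keep]
    if not selected:
        raise ValueError("mask is empty")
    return sum(selected) / len(selected)


# Hypothetical cortical GM labels for 6 voxels across 3 sessions
session_masks = [
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 0],
]
# Hypothetical MD map for one session, in mm^2/s
md_map = [0.80e-3, 0.82e-3, 0.95e-3, 2.90e-3, 0.78e-3, 1.10e-3]

mask = conjunction_mask(session_masks)
roi_md = mean_in_mask(md_map, mask)
```

Using the same voxel set at every time point means that a change in the ROI mean cannot arise simply because the segmentation included different voxels in different sessions.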
To investigate smaller scale structural modifications and evaluate local effects related to the practiced task we also performed additional ROI-based analyses using the 40 cortical and subcortical regions of the Desikan-Killiany Atlas (Fischl et al., 2004). Given that no lateralization of the effects of interest was expected, extracted values of MD, cortical thickness, or volume were averaged across homologue areas of the two hemispheres. Then, independent rmANOVAs were performed at each ROI for either thickness/volume or MD and the obtained p-value of each tested effect was subsequently adjusted to account for multiple comparisons. Specifically, a Bonferroni-Holm correction was applied across the 40 ROIs and significance threshold was set to corrected p < 0.05.
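The Bonferroni-Holm step-down adjustment applied across ROIs can be sketched as below. The function name and the example p-values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical uncorrected p-values for four ROIs
p_raw = [0.001, 0.04, 0.03, 0.2]
p_adj = holm_adjust(p_raw)
significant = [p < 0.05 for p in p_adj]
```

Compared with a plain Bonferroni correction, which multiplies every p-value by the number of tests, the step-down procedure is uniformly less conservative while still controlling the family-wise error rate.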
MD measurements are known to be potentially affected by the so-called “partial volume effect”, that is, the presence of multiple tissue types within the same voxels (Alexander et al., 2001). Specifically, CSF contamination in GM or WM may be responsible for erroneous estimations of diffusivity measures. We addressed this issue using a combination of different approaches. First, we performed a serial visual inspection of MD distributions in each ROI to exclude the possible influence of partial volume effects. Specifically, for each subject and time point we divided MD values into bins of 0.1 × 10⁻³ mm²/s (from 0 to 3 × 10⁻³ mm²/s) and determined the percentage of voxels included in each bin. The distributions of the group-averaged values in the baseline condition without training (SB) and in the experimental condition with training (WT12, WT24) were plotted and qualitatively compared. Second, in order to further evaluate the possible influence of partial volume artifacts on our results, we repeated the cortical GM analysis using an approach based on a “skeletonized” cortical mid-GM mask excluding voxels with a high probability of containing multiple tissues (Ball et al., 2013). Specifically, segmentation maps obtained in Freesurfer were initially used to identify the WM-GM and the GM-CSF boundaries. Then, a mid-GM mask was defined as the line passing at equal distance from the two boundaries (i.e., 50% of cortical thickness) in each point. Finally, voxels lying along this line, and included in the previously defined conjunction cortical mask, were used to create a final “mid-GM mask” (Fig. S2 and S3). This procedure retained only voxels relatively distant from both the WM-GM and the GM-CSF boundaries, and thus characterized by a minimal probability of containing mixed tissues. The obtained mid-GM mask included 60.4 ± 1.4 % fewer voxels than the conjunction cortical mask. Cortical MD values were extracted from this new mask and analyzed as previously described. 
Moreover, given that a few voxels affected by CSF contamination may have remained included also in the mid-GM mask, the above calculations were repeated after further discarding voxels containing MD values greater than 1.0 × 10⁻³ mm²/s.
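The binning and thresholding steps can be illustrated with the following sketch. Values are expressed in units of 10⁻³ mm²/s for readability; the function names and the example voxel values are hypothetical.

```python
def md_histogram(md_values, bin_width=0.1, upper=3.0):
    """Percentage of voxels per MD bin; values in units of 1e-3 mm^2/s.

    Percentages are relative to all voxels passed in, so they sum to less
    than 100 if some values fall outside [0, upper).
    """
    if not md_values:
        raise ValueError("no voxels")
    n_bins = int(round(upper / bin_width))
    counts = [0] * n_bins
    for v in md_values:
        if 0.0 <= v < upper:
            counts[min(int(v / bin_width), n_bins - 1)] += 1
    return [100.0 * c / len(md_values) for c in counts]


def drop_high_md(md_values, threshold=1.0):
    """Discard voxels whose MD exceeds the threshold (possible CSF contamination)."""
    return [v for v in md_values if v <= threshold]


# Hypothetical MD values for eight cortical voxels
voxels = [0.65, 0.72, 0.78, 0.85, 0.93, 1.45, 2.95, 0.75]
hist = md_histogram(voxels)
kept = drop_high_md(voxels)
```

In this toy example the 0.7-0.8 bin holds three of the eight voxels, and the two voxels above the threshold, whose values approach that of CSF, are removed before averaging.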
Recent findings suggest that the human brain may undergo morning-to-evening size variations and it has been suggested that these changes may depend on a redistribution of body fluids (Nakamura et al., 2015). However, global or structure-specific volumetric variations may also reflect other underlying phenomena potentially associated with changes in water diffusivity (e.g., compression). In these conditions MD and volumetric measures are expected to covary. These observations highlighted the need to evaluate global brain volumetric changes in our samples and to explore their possible relationship with MD measures. Importantly, however, recent studies also suggested that head movement could lead to erroneous estimations of diffusivity-based measures and recommended to include head motion as a nuisance variable in statistical analyses to reduce the risk of biases related to this potential confounding factor (Yendiki et al., 2014). Therefore, statistical LME models were specifically computed with the inclusion of covariates represented by the total in-scanner head movement (estimated from the coregistration of DWI volumes using the RMS deviation measure; Jenkinson, 1999) and either the total brain volume (calculated here as the sum of the volumes of cortical GM, subcortical GM and WM) or the total volume of each examined structure (e.g., cortical GM). The models included the same within subject factors introduced in the rmANOVAs (task, time-of-day, training condition). Finally, relative global volumetric changes were also investigated using a rmANOVA, as previously described.
We tested whether global changes in MD, volume or cortical thickness that occurred after recovery sleep (SR vs. WT24) correlated with different sleep parameters, including total SWA and changes in SWA and slow wave amplitude. EEG recordings were first-order high-pass filtered (0.1 Hz) and band-pass filtered between 0.5 and 58 Hz. For scoring purposes, four of the 256 electrodes placed at the outer canthi of the eyes were used to monitor eye movements (electro-oculography), while electrodes located in the chin-cheek region were used to evaluate muscular activity (electromyography). Due to technical problems during the recordings, EEG data were not available in subjects S03 and S07 (experiments EF and DS, respectively). Bad channels were visually identified, rejected, and replaced with data interpolated from nearby channels using spherical splines (NetStation, Electrical Geodesics Inc.). SWA was calculated for each NREM epoch as the spectral power in the range between 0.5 and 4.5 Hz. Specifically, after excluding electrodes located on the neck/face region, the signal of each channel was re-referenced to the average of the remaining 185 electrodes, and the power spectral density estimates were computed using Welch's method (pwelch function, MATLAB signal processing toolbox) in 2-s data segments (Hamming windows, 8 sections, 50% overlap). The resulting power spectral densities in the SWA range were then averaged across the 185 electrodes and within each epoch.
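For readers unfamiliar with Welch's method, the sketch below shows the idea in plain Python: the segment is cut into half-overlapping sections, each section is Hamming-windowed, and the resulting periodograms are averaged. It is a didactic stand-in for MATLAB's pwelch, not a reproduction of it. The sampling rate, the section length and the naive DFT are assumptions made for the example.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def welch_psd(signal, fs, n_per_section):
    """One-sided Welch PSD with Hamming windows and 50% overlap."""
    step = n_per_section // 2
    window = hamming(n_per_section)
    scale = fs * sum(w * w for w in window)
    n_freqs = n_per_section // 2 + 1
    psd = [0.0] * n_freqs
    n_sections = 0
    for start in range(0, len(signal) - n_per_section + 1, step):
        section = [signal[start + i] * window[i] for i in range(n_per_section)]
        for k in range(n_freqs):
            # Naive DFT: adequate for short sections; use an FFT in practice
            coef = sum(x * cmath.exp(-2j * math.pi * k * i / n_per_section)
                       for i, x in enumerate(section))
            power = abs(coef) ** 2 / scale
            # Double every bin except DC and Nyquist for a one-sided estimate
            if 0 < k < n_per_section / 2:
                power *= 2
            psd[k] += power
        n_sections += 1
    if n_sections == 0:
        raise ValueError("signal shorter than one section")
    freqs = [k * fs / n_per_section for k in range(n_freqs)]
    return freqs, [p / n_sections for p in psd]


def band_power(freqs, psd, low=0.5, high=4.5):
    """Mean PSD over the frequency bins that fall within [low, high] Hz."""
    in_band = [p for f, p in zip(freqs, psd) if low <= f <= high]
    if not in_band:
        raise ValueError("no frequency bin falls in the band")
    return sum(in_band) / len(in_band)


fs = 128                      # hypothetical sampling rate (Hz)
n = 2 * fs                    # one 2-s data segment
slow = [math.sin(2 * math.pi * 2.0 * t / fs) for t in range(n)]
fast = [math.sin(2 * math.pi * 10.0 * t / fs) for t in range(n)]
section_len = int(n / 4.5)    # yields 8 half-overlapping sections (assumed sectioning)
freqs, psd_slow = welch_psd(slow, fs, section_len)
_, psd_fast = welch_psd(fast, fs, section_len)
swa_slow = band_power(freqs, psd_slow)
swa_fast = band_power(freqs, psd_fast)
```

With these toy inputs, a 2 Hz oscillation yields far more power in the 0.5-4.5 Hz band than a 10 Hz oscillation of the same amplitude. Note that short sections give coarse frequency resolution, so only a few bins fall in the SWA range.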
For the slow wave detection procedure, preprocessed EEG signals from each NREM epoch and channel were initially referenced to the average of the two mastoid electrodes. Then, an automatic detection algorithm adapted from a previous study (Siclari et al., 2014) was applied. Specifically, we first created a single timing reference by calculating the negative-going signal envelope, defined as the 0.025 quantile of the signal values detected across all channels for each point in time. The resulting signal was broadband filtered (0.5-40 Hz, stop-band at 0.1 and 60 Hz) prior to the application of the wave detection. Only slow waves with a duration of 0.25-1.25 sec between consecutive zero crossings were further evaluated. Additional criteria were applied to exclude negative signal deflections of potential artifactual origin. Specifically, for each slow wave we first calculated the scalp involvement as the mean signal achieved in the 20ms around the wave peak in each channel. Then, the top 5% electrodes showing the lowest EEG signal values (maximal involvement) were identified. A slow wave was discarded if at least half of these electrodes were located in the neck/face regions (possible muscular artifact) or around both eyes but not in the medial and lateral frontal areas (possible ocular artifact).
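The core of the detection, building a negative-going envelope and then selecting half-waves by duration, can be sketched as follows. This is an illustration and not the algorithm of Siclari et al. (2014): the nearest-rank quantile, the treatment of the duration criterion as the span of a negative half-wave, and all names and test signals are assumptions. Filtering and artifact rejection are omitted.

```python
import math


def negative_envelope(channels, q=0.025):
    """Per-sample lower quantile across channels (nearest-rank definition, assumed)."""
    n_ch = len(channels)
    rank = max(0, math.ceil(q * n_ch) - 1)
    n_samples = len(channels[0])
    return [sorted(ch[t] for ch in channels)[rank] for t in range(n_samples)]


def detect_slow_waves(signal, fs, min_dur=0.25, max_dur=1.25):
    """Return (start, end, trough) sample indices of negative half-waves whose
    duration between consecutive zero crossings lies in [min_dur, max_dur] s."""
    waves = []
    start = None
    for i in range(1, len(signal)):
        if signal[i - 1] >= 0 and signal[i] < 0:
            start = i  # downward zero crossing opens a candidate wave
        elif signal[i - 1] < 0 and signal[i] >= 0 and start is not None:
            duration = (i - start) / fs  # upward zero crossing closes it
            if min_dur <= duration <= max_dur:
                trough = min(range(start, i), key=lambda j: signal[j])
                waves.append((start, i, trough))
            start = None
    return waves


fs = 100  # hypothetical sampling rate (Hz)
slow = [math.sin(2 * math.pi * 1.0 * t / fs + 0.1) for t in range(3 * fs)]
fast = [math.sin(2 * math.pi * 10.0 * t / fs + 0.1) for t in range(3 * fs)]
slow_waves = detect_slow_waves(slow, fs)
fast_waves = detect_slow_waves(fast, fs)
envelope = negative_envelope([[1, -2, 3], [0, -5, 1], [2, -1, 0]])
```

A 1 Hz test oscillation produces negative half-waves lasting 0.5 s, which pass the duration criterion, whereas the 0.05 s half-waves of a 10 Hz oscillation are rejected.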
Finally, for each subject and experimental condition, we calculated the total number of detected slow waves, the total SWA (defined as the mean SWA computed across all NREM epochs), and the variation in slow wave amplitude and SWA from the first to the last NREM sleep cycle (difference last-first). Pearson's correlation coefficient was used to test for correlations between structural changes following recovery sleep and the examined sleep parameters (p < 0.05, Bonferroni-Holm correction). The correlations with total sleep time and N2, N3 and REM time were also examined.
Vigilance levels were measured by calculating the mean reaction time during PVT (Bernardi et al., 2015) for test blocks completed in temporal proximity with each MRI scan except the first one (PVT was not performed before the first scan). A machine learning procedure was developed using MATLAB (The MathWorks, Inc.) and LibSVM (Wang et al., 2011) to assess the potential relationship between structural changes and variations in vigilance levels. Specifically, using support vector regression (SVR) machines (Drucker et al., 1997), PVT reaction times were predicted from the 8 measures derived from the global structural analyses (MD, volumes and cortical thickness for the 4 ROIs) in both DS and EF experiments. Reaction times and MRI measures were normalized (by subtracting the mean and dividing by the standard deviation) across sessions and within each subject. A leave-one-subject-out cross-validation procedure (based on the removal of all time-points of the test subject) was then used to train and test linear SVR machines, which resulted in predicted reaction times for the left-out subject across MRI sessions. The mean squared error (MSE) was estimated by comparing real and predicted data. Moreover, to assess the goodness of the prediction, a procedure based on permutation tests was developed (1,000 repetitions). Specifically, an MSE null distribution was obtained by training and testing SVR machines with data shuffled across subjects and within each time point. Collected MSE values from the permutation procedure were compared to the MSE estimated from real data with a one-tailed rank test (p < 0.05). Importantly, the leave-one-subject-out procedure excluded a possible bias of subject-specific values on the prediction accuracy. Moreover, the null distribution obtained from the permutation test ruled out a possible generalized mean effect related to the examined time points (i.e., an identical effect of the time-of-day in all subjects). 
The R² coefficient between predicted and original data was calculated for each subject.
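The logic of the cross-validation and of the permutation test can be sketched in a few lines. To keep the example free of dependencies, the linear SVR trained with LibSVM is replaced here by a single-predictor least-squares fit through the origin; this is a deliberate simplification, not the model used in the study. All names and data are hypothetical.

```python
import random
import statistics


def zscore(values):
    """Normalize within one subject: subtract the mean, divide by the SD."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]


def loso_mse(x, y):
    """Leave-one-subject-out MSE; x and y map subject -> list over sessions.

    Stand-in model: least-squares slope through the origin (NOT linear SVR).
    """
    sq_errors = []
    for test_subj in x:
        train = [(a, b) for s in x if s != test_subj for a, b in zip(x[s], y[s])]
        slope = sum(a * b for a, b in train) / sum(a * a for a, _ in train)
        sq_errors += [(slope * a - b) ** 2 for a, b in zip(x[test_subj], y[test_subj])]
    return statistics.mean(sq_errors)


def permutation_p(x, y, n_perm=1000, seed=0):
    """One-tailed rank test against a null built by shuffling y across
    subjects within each time point."""
    rng = random.Random(seed)
    observed = loso_mse(x, y)
    subjects = list(y)
    n_sessions = len(y[subjects[0]])
    n_as_good = 0
    for _ in range(n_perm):
        shuffled = {s: [0.0] * n_sessions for s in subjects}
        for t in range(n_sessions):
            column = [y[s][t] for s in subjects]
            rng.shuffle(column)
            for s, v in zip(subjects, column):
                shuffled[s][t] = v
        if loso_mse(x, shuffled) <= observed:
            n_as_good += 1
    return observed, (n_as_good + 1) / (n_perm + 1)


# Hypothetical structural measure (arbitrary units) over 4 sessions per subject
raw_x = {
    "S01": [1.0, 2.0, 3.0, 4.0], "S02": [4.0, 3.0, 2.0, 1.0],
    "S03": [2.0, 4.0, 1.0, 3.0], "S04": [3.0, 1.0, 4.0, 2.0],
    "S05": [1.0, 3.0, 2.0, 4.0], "S06": [4.0, 1.0, 3.0, 2.0],
}
# Hypothetical reaction times (ms) that track the measure, with a small offset
raw_y = {s: [300 + 20 * v + (1 if t % 2 == 0 else -1) for t, v in enumerate(vals)]
         for s, vals in raw_x.items()}

x = {s: zscore(v) for s, v in raw_x.items()}
y = {s: zscore(v) for s, v in raw_y.items()}
observed_mse, p_value = permutation_p(x, y)
```

Because the shuffle keeps each value at its own time point, an effect shared by all subjects at a given time of day is preserved in the null distribution. A low p-value therefore indicates that the prediction exploits subject-specific covariation rather than a common time-of-day effect.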
A machine learning procedure similar to the one applied to explore the relationship between structural changes and vigilance level was used on behavioral parameters reflecting the individual performance in the response inhibition test and in the visuo-motor test (Bernardi et al., 2015). As described in previous work, the response inhibition test consisted of a classical Go/NoGo test during which a stream of visual stimuli (the capital letters X or Y) was presented in alternating order (1 Hz). Subjects were requested to press a button for every stimulus that followed a different stimulus (Go), and to withhold their response each time two identical stimuli followed each other (NoGo). The proportion of commission errors (i.e., cases in which the subject responded although a NoGo stimulus was presented) and the intraindividual coefficient of variation (ICV, defined as the standard deviation of reaction time divided by the individual mean) were used as indices of inhibitory efficiency. During the visuo-motor test, by contrast, participants were required to perform straight, out-and-back movements of a tracker, held with the dominant hand, from a central starting area to one of 8 radial targets (time interval = 1.5 sec). The movement time (time from movement onset to reversal) and the linear error (distance of the reversal point from the center of the target) of each movement were collected and used as measures of visuo-motor control efficiency. As described above, impulse control and visuo-motor performance levels were calculated in test blocks completed in temporal proximity with each MRI scan except the first one.
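The two indices of inhibitory efficiency reduce to simple ratios, as the sketch below shows. The trial encoding, the function names and the example data are hypothetical, and the use of the sample (n-1) standard deviation is an assumption because the text does not specify it.

```python
import statistics


def commission_error_rate(trials):
    """trials: list of (is_nogo, responded) pairs.

    Returns the proportion of NoGo trials on which a response was given.
    """
    nogo_responses = [responded for is_nogo, responded in trials if is_nogo]
    if not nogo_responses:
        raise ValueError("no NoGo trials")
    return sum(nogo_responses) / len(nogo_responses)


def icv(reaction_times):
    """Intraindividual coefficient of variation: SD of RT divided by mean RT."""
    return statistics.stdev(reaction_times) / statistics.mean(reaction_times)


# Hypothetical block: four NoGo trials, one of them answered in error
trials = [
    (False, True), (True, False), (False, True), (True, True),
    (True, False), (False, True), (True, False),
]
go_reaction_times_ms = [400, 420, 380, 440, 360]

error_rate = commission_error_rate(trials)
variability = icv(go_reaction_times_ms)
```

Dividing the standard deviation by the mean makes the ICV comparable between subjects whose overall response speed differs.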
Given that the present study design does not include independent datasets for training and testing the SVR model, the adopted leave-one-out procedure carries some risk of “overfitting”. Therefore, in order to evaluate the stability and reliability of the obtained results, all analyses were repeated while replacing the leave-one-out method with a recursive half-split of the examined sample. For each of 1000 iterations, available subjects were equally divided into a training sample, used to train the model, and a test sample, used to evaluate the accuracy of the prediction of behavioral performance in the remaining participants. As described for the leave-one-out procedure, the goodness of the prediction was assessed using permutation tests through a shuffling of available data across subjects and within time-points. Results of this auxiliary analysis are reported in Table S9.
All subjects except one participated in two experiments (S14 completed only DS; see below). Each experiment included 5 MRI scans acquired following baseline wake without training (WB), baseline sleep (SB), 12h and 24h of wake with training (WT12, WT24), and post-training recovery sleep (SR) (Fig. 1). The only difference between the two experiments was in the practiced task, either a driving simulation task (DS) involving visuo-motor areas, or a battery of executive function tasks (EF) mainly relying on prefrontal cortex.
MD was measured in four large regions of interest (ROIs; Fig. 2, Table S1): cortical grey matter (GM), subcortical GM, white matter (WM) and ventricles. As a first step, a three-way repeated measures (rm)-ANOVA - including training condition (with, without), time-of-day (am, pm), and task (DS, EF) as within-subjects factors - was run using the 12 subjects for which time points from WB to WT24 (4 in DS, 4 in EF) were available (Table 1; also see Table S2). Neither the main effect of task, nor its interactions with other factors, reached statistical significance in any of the examined ROIs. By contrast, a significant main effect of training was identified in cortical GM (corrected p < 0.05). Importantly, this result was confirmed by statistical models that included total in-scanner head movement and either total brain volume or cortical volume as covariates (Table S3). No significant effects were seen in WM, subcortical GM and ventricles (Fig. S4B-D; Table 1). Next, planned post-hoc comparisons tested the effects of sleep and wake in the absence of training (SB vs. WB), training during the first 12h of wake, i.e. without sleep deprivation (WT12 vs. WB), and sleep deprivation with training (WT24 vs. WT12). Finally, the effect of recovery sleep was investigated through paired t-tests by comparing SR and WT24. Relative to baseline sleep (SB), a decrease in MD was already evident in cortical GM after the first 12h of wake with training (mean variation ± SE, computed across subjects after averaging across ROI voxels, SB-WT12 = -0.52 ± 0.17%), but not after 12h of baseline wake (Fig. 3A, Tables S4-S5). After 24h of training, cortical MD continued to decrease (SB-WT24 = -0.94 ± 0.15%), and this trend was reverted by subsequent recovery sleep.
Of note, typical MD values are substantially different in GM (~0.8 × 10⁻³ mm²/s), WM (~0.7 × 10⁻³ mm²/s), and most notably, cerebrospinal fluid (CSF, ~3.0 × 10⁻³ mm²/s). Thus, DWI analyses can be affected by confounds related to the partial volume effect, that is, the inclusion of voxels representing a mixture of multiple tissue types (Alexander et al., 2001), with erroneous estimations of diffusivity arising especially because of CSF contamination. However, a serial visual inspection of the distributions of MD values in the cortical ROI revealed a clear shift of the histograms' peak after prolonged training: WT24 had a relatively higher percentage of voxels in the 0.6-0.8 × 10⁻³ mm²/s range, while SB had relatively more voxels in the 0.8-1.0 × 10⁻³ mm²/s range. No clear differences were observed in the tails of the distributions, which contain voxels with higher probability of being affected by partial volume artifacts. Moreover, we repeated the rmANOVA using MD values extracted from a cortical mid-GM mask that excluded voxels along the WM-GM and the GM-CSF interfaces (Ball et al., 2013). This approach confirmed the existence of a strong and significant effect of training condition in cortical GM, which survived the further removal of voxels with MD values greater than 1.0 × 10⁻³ mm²/s (Table S6, Fig. 3B). Of note, the independence of detected MD variations from potential partial volume artifacts was also supported by the additional ROI analysis described below, in which we found that MD changes are relatively widespread (i.e., do not depend on a few outlier regions). Finally, we found that MD changes are relatively independent from variations in brain or cortical volumes, and from relative changes in the extent of head movements (Table S3).
To test for regional effects, a three-way rmANOVA (training condition, time-of-day, task) was run using the 40 cortical and subcortical regions of the Desikan-Killiany Atlas (Fischl et al., 2004). Significant effects of training condition were found in several brain areas, including superior temporal sulcus, inferior temporal cortex, middle temporal cortex, lateral and medial orbitofrontal cortex (corrected p < 0.05; Table 2; Fig. S5). Moreover, the rmANOVA revealed a significant time by condition interaction in the superior temporal cortex and in the pars triangularis of the inferior frontal gyrus.
Analyses based on LME models taking into account changes in total head movement and in brain volume were also used to evaluate the potential influence of these possible confounding factors. Importantly, results of this additional analysis confirmed the effects detected in the inferior frontal gyrus and in the middle, inferior and superior temporal cortices. By contrast, the main effects of experimental condition identified by the rmANOVA in superior temporal sulcus and medial/lateral orbitofrontal cortex did not reach the set threshold for statistical significance, although a clear trend was observed (uncorrected p < 0.008; Table S7). On the other hand, the LME analysis identified an additional condition effect in the fusiform gyrus, and time by condition interactions in the rostral anterior cingulate cortex and in the pars orbitalis of the inferior frontal gyrus. Overall, the results point to a distributed effect of the experimental condition on cortical MD, with the strongest variations in temporal and prefrontal brain areas. Finally, the LME models also detected significant main effects of the task (DS, EF) in mid/posterior cingulate cortex and in superior temporal cortex, although no interactions with other examined factors emerged.
Post-hoc tests confirmed the existence of a significant MD decrease after prolonged task practice in ROIs characterized by a significant condition effect or time by condition interaction. Recovery sleep was associated with a significant MD increase relative to the end of the sleep deprivation period.
We then focused on other structural parameters, namely cortical thickness and volume of subcortical GM, WM, and ventricles. The rmANOVA identified significant main effects of training on all three volumetric measures, but not on cortical thickness (corrected p < 0.05; Table 3; also see Table S2). A time of day effect was present in the ventricles, while neither the main effect of task, nor its interaction with other factors, was significant. Significant time by training interactions were present for cortical thickness and ventricular volume. Planned post-hoc comparisons identified no significant change in cortical thickness between baseline sleep and the first 12h of training (Fig. 4A, Tables S4-S5), although a relative increase was observed in most subjects (SB-WT12 = +0.26 ± 0.19). Similarly, there was no significant difference between 12h and 24h of training, although a trend towards a decrease was present (WT12-WT24 = -0.88 ± 0.31%; p < 0.05, uncorrected). By contrast, a significant increase in cortical thickness was found after recovery sleep relative to a night of sleep deprivation (Fig. 4A).
Planned post-hoc comparisons for other structural parameters also found that relative to baseline sleep, volumetric changes were present after 12 and/or 24h of training in all three ROIs, but in different directions, with increases in subcortical GM and WM and decreases in the ventricles (Fig. 4B-D). All these changes were reverted by recovery sleep. No regional changes were found using the ROI-based analysis of thickness and volume in cortical and subcortical structures, with the exception of the thalamus, where a significant increase in volume was found after 24h of training (SB-WT24 = +0.64 ± 0.13%).
None of the described structural changes was accompanied by significant variations in total brain volume (calculated as the sum of cortical GM, subcortical GM and WM). In fact, the rmANOVA (task, time-of-day, training condition) failed to identify any significant effects (Table S8). Moreover, evening-to-morning changes - specifically explored through paired t-tests in light of recent findings suggesting the existence of diurnal brain volumetric variations (Nakamura et al., 2015) - were not found in either the “baseline” condition (WB-SB: p > 0.18) or the “experimental” condition (WT12-WT24: p > 0.31).
All subjects had the opportunity to sleep for ~8h during the day after the end of 24h of training (Table 4). Although recovery sleep occurred at the wrong circadian time (sleep onset ~10am), most of its features, including the proportion of NREM (N2+N3) and REM stages, were not different from those of baseline sleep during the night. However, in all subjects of both DS and EF the latency of the first REM sleep episode was significantly reduced (p < 0.05, Bonferroni-Holm correction), a sign of high REM sleep pressure likely due to the combined effect of sleep deprivation and sleeping during the day (Dijk and Czeisler, 1995). Moreover, in both experiments the proportion of time spent in deep sleep (N3) increased in recovery sleep as compared to baseline sleep, mainly at the expense of the proportion of light sleep (N2). Actual sleep time, including all non-wake epochs, corresponded to 6.8 ± 1.3 h in DS, and to 7.1 ± 0.9 h in EF. Overall, no significant differences were observed between recovery sleep of DS and that of EF (all p > 0.05, uncorrected). In both experiments no correlation was found between total sleep time, or time spent in REM, N2 and N3, and any of the structural parameters that showed a significant change between 24h of training and recovery sleep (GM thickness and MD, subcortical GM volume and MD, WM volume, ventricular volume and MD). Total SWA, total number of detected slow waves, as well as changes in SWA and slow wave amplitude between the first and the last NREM sleep cycle also did not correlate with any structural measure of recovery (p < 0.05, corrected).
Next, we examined whether global structural variations were reflected in vigilance changes, measured using the mean reaction time during the PVT (Bernardi et al., 2015), a sustained vigilance task known to be highly sensitive to sleep loss (Basner and Dinges, 2011). In the previous study that used the same subjects we confirmed that PVT performance declines in the course of the 24h of practice and renormalizes after recovery sleep (Bernardi et al., 2015). Here mean PVT reaction times for test blocks completed in temporal proximity with each MRI scan were predicted from the 8 measures derived from the global structural analyses (MD, volumes and cortical thickness for the 4 large ROIs). The regression procedure identified a significant relationship between structural changes and variations in vigilance level in both DS (average R² across subjects ± SD = 0.46 ± 0.30; p < 0.001) and EF (R² = 0.47 ± 0.31; p < 0.001). Thus, variations in global structural measures were able to predict to some extent changes in sustained attention (Fig. 5; also see Table S9). Based on a qualitative evaluation across the two experimental conditions, we found that the most relevant predictors were represented by ventricular volume, MD of cortical GM and WM volume.
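For readers less familiar with the statistic, the coefficient of determination and its summary across subjects (mean ± SD) can be computed as in the sketch below. The regression model that produced the predictions is not reproduced; the function name and the observed and predicted reaction times are hypothetical, so the printed values do not correspond to the results reported above.

```python
import statistics

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical mean PVT reaction times (ms) at successive scans, paired
# with predictions from some model based on structural measures.
subjects = {
    "s01": ([310.0, 325.0, 360.0, 315.0], [312.0, 330.0, 350.0, 320.0]),
    "s02": ([295.0, 300.0, 340.0, 298.0], [300.0, 310.0, 325.0, 305.0]),
    "s03": ([330.0, 335.0, 390.0, 333.0], [328.0, 345.0, 380.0, 340.0]),
}

# One R^2 per subject, then summarized across subjects.
per_subject = [r_squared(obs, pred) for obs, pred in subjects.values()]
mean_r2 = statistics.fmean(per_subject)
sd_r2 = statistics.stdev(per_subject)
print(f"R2 across subjects = {mean_r2:.2f} ± {sd_r2:.2f}")
```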
Finally, by applying the same procedure described above, we tested whether global structural changes were reflected in performance changes in two tests used to track variations in impulse control and visuo-motor coordination. In the previous study (Bernardi et al., 2015), we demonstrated that performance in the response inhibition test, as measured using the number of commission errors or the intraindividual coefficient of variation in reaction time (ICV), is characterized by a time-course similar to the one described for the PVT test, suggesting a relevant influence of the vigilance level. Indeed, ICV values were successfully predicted by global structural changes in both DS (R² = 0.51 ± 0.35; p < 0.001) and EF (R² = 0.32 ± 0.33; p = 0.017) experiments (Figure S6). By contrast, prediction accuracy for commission errors reached significance in DS (R² = 0.64 ± 0.30; p < 0.001) but not in EF (R² = 0.26 ± 0.27; p = 0.053), perhaps due to lower statistical power in this latter condition (N=13 in EF; N=15 in DS). Of note, however, the quality of the prediction in EF may also have been negatively affected by local, experience-dependent changes in the impulse control network, which seem to be independent of changes in the general vigilance level (Hung et al., 2013). Indeed, in line with a possible dissociation between EF performance and vigilance levels, we previously showed that prolonged practice with tasks based on executive functions was associated with a relative performance impairment during the test block corresponding to WT12, although no differences in PVT performance were present between DS and EF at the same time point (Bernardi et al., 2015).
With regard to the visuo-motor test (Figure S7), both linear error (LE, a measure of movement accuracy) and movement time (MT, an index of eye-hand coordination efficiency) were not predicted by global structural changes in either DS (LE: R² = 0.40 ± 0.32; p = 0.288 - MT: R² = 0.16 ± 0.16; p = 0.233) or EF (LE: R² = 0.33 ± 0.39; p = 0.218 - MT: R² = 0.28 ± 0.28; p = 0.207). These findings are consistent with previous observations suggesting a lower vulnerability of the visuo-motor function to the effects of sleep deprivation (Bernardi et al., 2015), and potentially reflect a lower dependence of these parameters on the global vigilance state.
We found that MD declined in cortical GM after 12h of wake with intensive task practice, both compared to wake after a night of sleep and relative to 12h of wake without task practice. Moreover, we observed that cortical MD declined even further after 24h relative to 12h of practice, and that all changes were reverted by ~7h of sleep.
Recent studies in humans and rats had shown a link between short-term learning, such as practicing a visuo-spatial task for less than 2h, and decreases of MD (Hofstetter et al., 2013; Sagi et al., 2012). These studies also demonstrated that MD changes are due to learning rather than movement and general activity (Hofstetter et al., 2013; Sagi et al., 2012). Other studies had examined MD changes in the context of chronic sleep disorders and found heterogeneous local changes, including an increase in the upper brainstem with REM sleep behavior disorder (Scherfler et al., 2011), an increase in hypothalamus and frontal cortex with narcolepsy-cataplexy (Scherfler et al., 2012), and a decrease in some cortical and subcortical areas with obstructive sleep apnea, an effect that reverts after treatment (refs in Castronovo et al., 2014).
MD reflects tissue density, and thus its decrease in GM may result from several not mutually exclusive factors including increase in synapse size, cell swelling, changes in extracellular space, and/or increase in glial cell volume. Although synaptic strength, which is correlated with size (Meyer et al., 2014), is known to increase with wake and decrease with sleep (Tononi and Cirelli, 2014), synapses contribute little to the overall GM volume, making it unlikely that rapid changes in MD after training and extended wake can be accounted for primarily by synaptic changes per se. More likely candidates are microstructural changes that are triggered by and/or are associated with synaptic activity and plasticity, such as variations in the ratio between intra- and extracellular volumes and astrocytic changes (Assaf and Pasternak, 2008). For instance, neuronal activity leads to a decrease in extracellular space (Ransom et al., 1985), and learning, sustained neuronal activity, or induction of long-term potentiation result in astrocytic hypertrophy and increased astrocytic coverage of synaptic processes (Anderson et al., 1994; Bernardinelli et al., 2014; Genoud et al., 2006; Jones and Greenough, 1996; Wenzel et al., 1991). Using serial-block-face electron microscopy, we recently found that after extended wake peripheral astrocytic processes in mouse frontal cortex move closer to the synaptic cleft, expand, and increase their surface to volume ratio (Bellesi et al., 2015). These changes likely enhance the housekeeping functions of astrocytes and promote glutamate clearance from the cleft. At the same time, however, since the neuropil is filled with astrocytic processes, their wake-related “expansion” may impair the diffusion of water and other small molecules, potentially accounting for changes in MD.
In our paradigm, we did not detect a significant global change in MD after sleep following baseline wake. A previous study using voxel-based analyses found large local increases in brain diffusivity in the morning relative to the evening in the absence of training (4.4-5.6% increase in apparent diffusion coefficient) (Jiang et al., 2014). Voxel-based analysis may permit a more powerful detection of localized MD changes, but it can also lead to spurious results if partial volume effects or within-subject and across-subjects alignment issues are not adequately controlled for. Our analytical approaches were selected to minimize these problems.
While MD changes occur early during wake with practice and are sensitive to training per se, no significant global or regional changes in cortical thickness were found after 12h of training relative to baseline (after wake and/or after sleep), although there was a tendency towards an increase. The subsequent change from 12h to 24h of practice was significantly different from the variation in the normal sleep night (WB-SB), although we detected no significant differences between individual time points. Moreover, recovery sleep was associated with a significant cortical thickness increase. Changes in cortical thickness have been observed mainly during development, aging, or in response to chronic manipulations. A case in point is adolescence, during which cortical GM shows a progressive thinning, a phenomenon that has been linked to improvement in cognitive functions and may result from selective pruning of inefficient synaptic connections and increases in myelination (Schnack et al., 2014). Changes in cortical thickness have also been reported in chronic primary insomnia, but results are inconsistent (Dang-Vu, 2013). Finally, increases in GM thickness have been described after weeks or months of training (May, 2011), but not after short training.
In our case, cortical thickness trended in opposite directions after 12h (increase) and 24h (decrease) of practice. This result suggests that sleep loss per se may have a more prominent influence on this parameter than synaptic plasticity, but more experiments are needed to test this hypothesis.
Training and extended wake also led to increased volume in subcortical GM and WM and decreased volume in the ventricles, and all changes were reverted by recovery sleep. These findings may be related to the recently discovered role of sleep in modulating circulation of interstitial fluids (ISF) (Xie et al., 2013). In fact, CSF is known to interchange with the brain ISF, and their combined movement allows the clearance of solutes from the brain (Brinker et al., 2014; Iliff et al., 2012). CSF production as measured by phase-contrast MRI also shows a strong circadian pattern in humans, being lower during the day and peaking at night (Nilsson et al., 1992). Importantly, the CSF-ISF movement is favored by sleep and impaired by wake (Xie et al., 2013) and sleep deprivation (Plog et al., 2015). Given these premises, sleep deprivation at night may have significantly reduced CSF-ISF movement and CSF production, impairing the clearance of brain metabolites and causing a reduction in ventricular volume and a compensatory expansion, or a swelling, of nearby structures such as subcortical GM and WM. Recovery sleep would revert described changes by promoting CSF-ISF movement and CSF production.
Importantly, the described structural modifications do not simply reflect alterations determining global, uniform variations in brain size, as indicated by the absence of significant changes in total brain volume throughout the experiments. The apparent contrast with recent work suggesting the existence of brain diurnal volumetric fluctuations (Nakamura et al., 2015) may depend on several differences in experimental design, since we only included healthy young adult volunteers who were not taking any medications, and who were kept in homogeneous and controlled (although “extreme”) experimental conditions. On the other hand, we cannot exclude that modest global volumetric variations could have remained undetected in our dataset due to our smaller sample size.
Structural changes associated with recovery sleep did not correlate with any single sleep parameter that we tested. This may suggest that sleep as a whole, or at least several sleep features together, contributed to the effect. Other factors, however, cannot be ruled out, including a “ceiling effect” caused by the high efficiency of sleep in all subjects, most likely due to the combination of 24h of continuous wake with the intense task practice.
Previous studies suggested that specific structural measures taken at rest may reflect the degree of individual cognitive vulnerability to sleep deprivation (Cui et al., 2015; Rocklage et al., 2009). Here, we found that structural changes occurring during extended wake can be used to predict individual vigilance levels, and may partially contribute to predicting variation in specific behavioral measures related to performance in an impulse control test. Thus, our results support the existence of a link between microstructural alterations and cognitive performance, and suggest that some individuals may be more resilient to sleep loss because of an optimal initial “structural reserve” that makes them less vulnerable to the microstructural alterations occurring during extended wake. Overall, they suggest that sleep's ability to counteract performance deficits is linked to its effects on the brain microstructure.
Our study has several limitations. First, MRI-based approaches can only provide indirect measures of microstructural changes occurring within the human brain. Consequently, hypotheses regarding the link between variations in MRI-based parameters and the underlying biological mechanisms will require verification in different experimental models. On the other hand, several previous studies support the reliability of MRI-derived indices for studying a variety of physiological processes, including experience-dependent learning (Johansen-Berg et al., 2012; Zatorre et al., 2012), thus providing a solid foundation for the interpretation of present findings.
MRI-based parameters, and in particular measures related to water diffusivity, can be influenced by many confounds, including the partial volume effect (Alexander et al., 2001). While there are currently no commonly accepted gold-standard methods to address this problem, here we applied several strategies to minimize the potential impact of artifacts related to the partial volume effect, including the exclusion of voxels with a high probability of CSF contamination, and the adjustment of statistical models based on volumetric changes (associated with relative variations of the GM-CSF boundary). The obtained results suggest that changes in CSF contamination cannot account, alone, for the observed variations in cortical MD. This conclusion is further supported by the observation, during prolonged wakefulness, of parallel changes in cortical MD and thickness, and by the detected relationship between changes in cortical MD and in vigilance level. We acknowledge, however, that future studies using additional methods to limit the partial volume effect are required to provide an independent validation of the present results.
Another potential issue is the occurrence of wake-sleep transitions during the acquisition of DWI scans, especially because deep sleep is known to be associated with relative changes in participants' movements and in brain temperature (Franken et al., 1992; Ogilvie, 2001), which, in turn, may affect diffusivity measures. The most accurate approach to control for this confound is MRI-EEG co-registration. However, this solution was not appropriate in the context of the present study for at least two reasons: i) the complexity of the experimental setup, together with technical and temporal constraints, limited the possibility of setting up an MRI-compatible EEG recording in our samples; ii) the EEG net is known to affect the MRI signal, potentially leading to alterations of the estimation of structural parameters (Klein et al., 2015; Luo and Glover, 2012). On the other hand, several observations suggest that changes in behavioral state (wake/sleep) cannot account for our findings. First, the inclusion of head motion as a nuisance variable in statistical analyses had no relevant impact on our results. Moreover, the influence of a general decrease in brain temperature appears implausible, because relative changes in MD occurred with different timings in different structures. Of note, the initial MD decrease in cortical GM was also observed at a time-of-day typically associated with the circadian minimum in sleep pressure (Lavie, 1986). In summary, several empirical and theoretical considerations indicate that the possible occurrence of a transition to sleep in some of the subjects would not be sufficient to account for our results, leaving the change in tissue microstructure as the most plausible explanation.
Supported by NIH (grant R01MH099231 to CC and GT), McDonnell Foundation (to MFG and GT), the Waisman Brain Imaging Core (supported by NIH P30 HD003352), the Swiss National Foundation and the Swiss Foundation for Medical-Biological Grants (Grants 139778 and 145763 to FS), and Fondazione Cassa di Risparmio di Lucca (Lucca, Italy). The authors thank Juan Benzo, Ching-Sui Hung, Jeffrey Guokas and Corinna Zennig for help with data collection and technical assistance.
Conflicts of interest. G. Tononi is involved in a research study in humans supported by Philips Respironics. This study is not related to the work presented in the current manuscript. R. Benca has served as a consultant to Merck and Jazz, and receives research funding from Merck. A. Alexander is part owner of inseRT MRI, Inc. This company is not related to the work and did not sponsor this research. The other authors have indicated no financial conflicts of interest.