It is becoming increasingly clear that combining multi-modal brain imaging data is able to provide more information for individual subjects by exploiting the rich multimodal information that exists. However, the number of studies that do true multimodal fusion (i.e. capitalizing on joint information among modalities) is still remarkably small given the known benefits. In part, this is because multi-modal studies require broader expertise in collecting, analyzing, and interpreting the results than do unimodal studies. In this paper, we start by introducing the basic reasons why multimodal data fusion is important and what it can do, and importantly how it can help us avoid wrong conclusions and help compensate for imperfect brain imaging studies. We also discuss the challenges that need to be confronted for such approaches to be more widely applied by the community. We then provide a review of the diverse studies that have used multimodal data fusion (primarily focused on psychosis) as well as provide an introduction to some of the existing analytic approaches. Finally, we discuss some up-and-coming approaches to multi-modal fusion including deep learning and multimodal classification which show considerable promise. Our conclusion is that multimodal data fusion is rapidly growing, but it is still underutilized. The complexity of the human brain coupled with the incomplete measurement provided by existing imaging technology makes multimodal fusion essential in order to mitigate against misdirection and hopefully provide a key to finding the missing link(s) in complex mental illness.
“Multimodal” is a widely used term in the context of brain imaging studies, and collecting multiple modalities of magnetic resonance imaging (MRI) data from the same individual has become common. There is increasing evidence that multimodal brain imaging studies can help provide a more complete understanding of the brain and its disorders. For example, they can inform us about how brain structure shapes brain function, how both are impacted by psychopathology, and which functional or structural aspects of physiology could drive human behavior and cognition.
In this paper we first provide some basic motivation regarding the benefits of multimodal imaging and also introduce some basic terminology for characterizing multimodal data analysis. Next, we review a large class of multivariate approaches for performing multimodal data fusion, the most powerful type of multimodal analysis. Following this, we survey some of the existing articles that have applied multimodal imaging to study psychopathology. Finally, we discuss some exciting emerging trends and approaches and conclude.
We now present some basic terminology with which to describe existing multimodal imaging work. On one end of the spectrum is visual inspection, which amounts to inferring the multimodal information by separately visualizing results from essentially unimodal analyses. This is the least informative approach, but it is used quite extensively and can highlight, in a qualitative manner, the different results provided by each modality. An alternative approach, which we call data integration(1-3), is to analyze each data type separately and overlay the results, which does not allow for an examination of interactions among data types. For example, a data integration approach would not detect a change in gray matter concentration between patients and controls that is related to fMRI activation maps, as shown in the example. A third approach, called one-sided or asymmetric data fusion, is the use of one data set to constrain another, as in diffusion MRI (dMRI)(4-6) or magneto/electroencephalography (M/EEG)(7-9) being constrained by structural MRI (sMRI) or fMRI data. While these techniques are powerful, a limitation is that they impose potentially unrealistic assumptions upon the dMRI or EEG data, which are of an essentially different nature than fMRI data. Finally, symmetric data fusion treats multiple image types equally in order to take full advantage of the joint information in multiple data sets. The approaches just described are shown in Figure 1. Joint information is used only qualitatively on the far left, and maximally on the far right.
Surprisingly, the maximal use of multimodal data via data fusion is still not a universally accepted way to study brain function. This is true, despite the common understanding that any brain-imaging modality alone provides only a limited view into brain function; despite growing availability of multimodal datasets; and despite general acceptance of the complementary nature of information hidden in various modalities.
Currently, a large number of studies are collecting multimodal brain imaging data and information from the same participants. These imaging data types should be leveraged to extract the complementary information. For example, fMRI measures the hemodynamic response related to neural activity in the brain dynamically; sMRI enables us to estimate the tissue type for each voxel in the brain [gray matter (GM), white matter (WM), cerebrospinal fluid (CSF)]. Diffusion MRI can additionally provide information on the integrity of white matter tracts and structural connectivity. A key motivation for jointly analyzing multimodal data is to leverage the cross-information in the existing data, thereby revealing important relationships that cannot be detected by using a single modality.
But, practically speaking, why should we analyze multimodal imaging data jointly, instead of just analyzing each domain separately? Consider identifying a single relevant feature from one modality (say volume of the hippocampus) and correlating it with all brain voxels across subjects in the other modality (say a default mode connectivity map), then testing for group differences in this correlation. This is not the same as separately evaluating which brain volumes show group-related activity changes and which regions in the default mode network show group-related differences. The former analysis can be considered a type of data fusion, because both data sets are used to estimate a joint result. Such approaches have in many cases enhanced our ability to distinguish patients versus controls (see Figure 2)(10).
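The univariate fusion analysis described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the analysis behind Figure 2: the function names, the simulated effect sizes, and the choice of a Fisher r-to-z test for the group difference in correlation are all assumptions made for the example.

```python
import math
import numpy as np

def voxelwise_correlation(feature, maps):
    """Pearson r between one scalar feature and every voxel, across subjects."""
    f = (feature - feature.mean()) / feature.std()
    m = (maps - maps.mean(axis=0)) / maps.std(axis=0)
    return (f[:, None] * m).mean(axis=0)

def correlation_group_difference(feature, maps, is_patient):
    """Per-voxel Fisher r-to-z statistic for controls minus patients."""
    r_hc = voxelwise_correlation(feature[~is_patient], maps[~is_patient])
    r_sz = voxelwise_correlation(feature[is_patient], maps[is_patient])
    n_hc, n_sz = int((~is_patient).sum()), int(is_patient.sum())
    standard_error = math.sqrt(1.0 / (n_hc - 3) + 1.0 / (n_sz - 3))
    return (np.arctanh(r_hc) - np.arctanh(r_sz)) / standard_error

# Simulated (hypothetical) data: 60 controls followed by 60 patients, 50 voxels.
rng = np.random.default_rng(0)
is_patient = np.arange(120) >= 60
volume = rng.normal(size=120)                # stands in for hippocampal volume
connectivity = rng.normal(size=(120, 50))    # stands in for a default mode map
# Only in controls do the first 10 voxels track the structural feature.
connectivity[:60, :10] += 1.5 * volume[:60, None]

z = correlation_group_difference(volume, connectivity, is_patient)
```

In this simulation each modality on its own shows no group difference in mean value; the group effect exists only in the structure-function coupling, which is what the joint statistic `z` picks up in the first 10 voxels.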
Capturing joint information from multiple data sets can be done using univariate approaches (such as correlation) or multivariate approaches (such as independent component analysis (ICA)). Multivariate approaches(11-16) have a unique advantage in that their focus is on inter-related patterns rather than unrelated points (see Figure 3). This makes them ideally suited for detecting complicated, and potentially weak, effects hidden in high-dimensional data sets. In contrast to univariate analyses, multivariate approaches estimate all the variables jointly. This provides two major benefits. First, it helps with interpretation, as one can assume that the regions in a given component covary. Second, it can provide robustness to noise. For example, correlation-based approaches can be ‘tricked’ by phenomena such as phase-randomized noise, which can appear to represent real signal(17), whereas approaches that focus on identifying ‘patterns’ are better able to distinguish randomized noise from real signal. This does not mean that multivariate methods are impervious to noise, but they do tend to be more robust than univariate correlation in many cases, because they work with patterns instead of just paired relationships.
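To make the notion of phase-randomized noise concrete, the sketch below builds a surrogate time series with the same amplitude spectrum as a smooth input but with random Fourier phases. It is an illustration only: the function name and the simulated signal are assumptions, and the link to spurious correlations (the surrogate keeps the autocorrelation of the original, so it can look signal-like) is offered as one plausible reading of the point made above, not as the analysis in (17).

```python
import numpy as np

def phase_randomize(x, rng):
    """Surrogate series with the amplitude spectrum of x and random phases."""
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    surrogate = np.abs(spectrum) * np.exp(1j * phases)
    surrogate[0] = spectrum[0]            # keep the mean (DC term) unchanged
    if x.size % 2 == 0:
        surrogate[-1] = spectrum[-1]      # Nyquist term must stay real
    return np.fft.irfft(surrogate, n=x.size)

rng = np.random.default_rng(0)
# A smooth, autocorrelated series standing in for a real time course.
signal = np.convolve(rng.normal(size=256), np.ones(8) / 8, mode="same")
noise = phase_randomize(signal, rng)

# The two series differ point by point but share one amplitude spectrum.
same_spectrum = np.allclose(np.abs(np.fft.rfft(noise)),
                            np.abs(np.fft.rfft(signal)))
```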
Among the multitude of reasons preventing researchers from practicing multimodal analysis, two arguably contribute the most: 1) a doubt that modalities that are very different contain any common information to reinforce their signal, and 2) an assumption that what is learned from one modality is, at worst, incomplete but not incorrect. It is quite easy to show specific examples where both assumptions are incorrect. For example, Figure 4 (left) shows a case where a combined EEG (high temporal information) and fMRI (localized spatial information) analysis results in overlapping intrinsic brain activity(18). We can also easily demonstrate a case in which unimodal analysis can be misleading without any way for one to know it. Figure 4 (right) shows how separate analysis of MEG and fMRI collected from the same subjects in identical experiments leads to strictly opposite conclusions about effective connectivity and thus brain function. Both of these examples motivate the importance of analysis frameworks that can correctly account for the properties of each of the data sources and that take multiple data sources into account whenever possible.
One of the issues to consider is whether there is a downside to a focus on multimodal data fusion. We focus on two areas here, the first being data collection and the second being data analysis. For the first, one obviously needs to collect multimodal data, which can add time and effort to the data collection process. However, this particular downside is in many cases not a barrier, as virtually all imaging studies collect at least two modalities, and often many more than two (e.g. sMRI, rest fMRI, task fMRI, dMRI). Regarding the analytic aspect, there are real barriers. First, understanding the various assumptions of a given algorithm (among many choices) is important but not trivial. Fortunately, the requirement for number of subjects is similar to that for a unimodal approach, as most approaches first reduce the data down into a smaller number of joint components; however, the investigator also needs to be familiar with how to preprocess multiple modalities (or just rest fMRI and task fMRI, which are less different). This may require learning different types of software tools and expanding the overall workload on the team. Second, once the data are preprocessed and analyzed, interpretation involves a learning curve. For example, one might need to consider multiple regions in two or more data sets, the relationship between the covariation of the data within these regions, and ultimately, what this can tell us about the particular problem being studied. This can take time, and unfortunately, there are not many multimodal training courses available; thus most of the knowledge is gained by interacting with labs that have expertise or by learning from the beginning within a given lab. However, as we hope to show in this paper, we believe the advantages to be gained by taking a multimodal approach to one's data are many and worth the initial investment to learn how to perform such analyses.
Approaches to data fusion can be conceptualized as having a place on an analytic spectrum with detailed large-scale computational modeling at one end and highly distilled data at the other end(19,20). In between are methods that attempt to perform direct data fusion on high-dimensional summary measures(3,21-23). In this intermediate approach one extracts features by preprocessing the data, and employs these to examine the inter-modality relationships at the group level (i.e., variations among individuals). Here a “feature” is a distilled dataset representing the interesting part of each modality(3,24) and it is used as the input to the fusion analysis for each modality and each subject. Examples of features include a component image such as the default mode network resulting from a group ICA(25), a fractional amplitude of low frequency fluctuations (fALFF) map calculated from resting-state fMRI, a fractional anisotropy (FA) from dMRI measures, or segmented gray matter (GM) from sMRI data. The main reason to use features is to provide a more concise/focused, but still informative, space in which to link the data. This approach has several advantages in that it 1) allows us to take advantage of the ‘cross’-information among data types(1,2) and 2) enables indirect or direct associations to be inferred on putative structure-function relationships(26) in a way that does not require these modalities to have been measured simultaneously. 
The trade-off is that some information may be lost, e.g., GM does not directly measure cortical thickness or volume, and FA does not provide directional information; however, one key advantage is with features we can directly leverage the extensive amount of work focused on unimodal analysis(3,27,28), for example widely replicated patterns of gray matter differences in schizophrenia including temporal lobe and medial frontal cortex(29) and consistent replication of the widely studied default mode network including demonstration of heritability(30). Figure 5 provides a direct view of several multivariate voxel-wise data fusion approaches.
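The feature-based setup described above can be sketched as follows: each modality contributes one subjects-by-voxels feature matrix, the matrices are scaled and concatenated, and a single decomposition yields subject loadings that are shared across modalities. This is a simplified stand-in that uses a PCA-style SVD; joint ICA would additionally apply ICA to the reduced data. The function names, matrix sizes, and simulated data are assumptions for illustration.

```python
import numpy as np

def stack_features(*modalities):
    """Center each subjects x voxels matrix, scale it to unit Frobenius norm
    so that no modality dominates, and concatenate along the voxel axis."""
    scaled = []
    for X in modalities:
        centered = X - X.mean(axis=0)
        scaled.append(centered / np.linalg.norm(centered))
    return np.hstack(scaled)

def joint_decomposition(stacked, n_components):
    """Return shared subject loadings and the concatenated component maps."""
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :n_components] * s[:n_components], Vt[:n_components]

# Simulated (hypothetical) features: one subject-wise factor drives both a
# fALFF-like map (300 voxels) and a GM-like map (200 voxels).
rng = np.random.default_rng(1)
n_subjects, n_falff, n_gm = 40, 300, 200
shared = rng.normal(size=n_subjects)
falff = (np.outer(shared, rng.normal(size=n_falff))
         + 0.5 * rng.normal(size=(n_subjects, n_falff)))
gm = (np.outer(shared, rng.normal(size=n_gm))
      + 0.5 * rng.normal(size=(n_subjects, n_gm)))

loadings, maps = joint_decomposition(stack_features(falff, gm), n_components=2)
falff_maps, gm_maps = maps[:, :n_falff], maps[:, n_falff:]   # split per modality
recovery = abs(np.corrcoef(loadings[:, 0], shared)[0, 1])
```

Because the loadings are estimated once for the stacked matrix, a group test on `loadings[:, 0]` is a test on a linked fALFF and GM pattern, which is the sense in which features from unsynchronized acquisitions can still be fused at the group level.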
We start with the division of multivariate fusion approaches into two main classes: model-driven and data-driven. Model-driven approaches are those that incorporate specific knowledge about the problem, for example specific influence of one brain region upon another. They include approaches based on the general linear model(31,32), dynamic causal modeling (DCM), and confirmatory structural equation modeling (SEM)(33). Model-driven approaches have the advantage of: 1) enabling testing of particular hypotheses about interaction among the identified networks/regions; 2) simultaneous assessment of multiple connectivity links, moving beyond a one-by-one assessment of covariance(34). However, such approaches may miss important relationships that are not included in the prior hypotheses, and typically do not enable examination of full inter-voxel relationships(35,36).
Data-driven approaches include, but are not limited to, principal component analysis (PCA), ICA, canonical correlation analysis (CCA), and partial least squares (PLS). These methods do not require a priori hypotheses about the inter-relationships and thus are useful for exploring the entire data set. They typically work with the entire set of voxels rather than a selected set of regions. It is also possible to use a data-driven approach to test hypotheses if one can design a hypothesis about the output that is independent of the data. For example, one can certainly hypothesize that patients will show a breakdown in the link between function and structure estimated from a joint ICA algorithm.
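As one concrete instance of the data-driven methods listed above, the sketch below computes canonical correlations between two modalities using only NumPy. It assumes each modality is first reduced to a few principal components (a common step when voxels far outnumber subjects); the function names, the number of retained components, and the simulated data are illustrative assumptions, and this is not the MCCA variant used in the studies reviewed later.

```python
import numpy as np

def principal_scores(X, n_keep):
    """Orthonormal subject scores for the top principal components of X."""
    centered = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_keep]

def cca(X, Y, n_keep=5):
    """Canonical correlations and variates for two subjects x voxels matrices.

    With orthonormal, zero-mean score matrices, the canonical correlations
    are the singular values of the cross-product of the two bases."""
    Ux, Uy = principal_scores(X, n_keep), principal_scores(Y, n_keep)
    A, correlations, Bt = np.linalg.svd(Ux.T @ Uy)
    return correlations, Ux @ A, Uy @ Bt.T

# Simulated (hypothetical) data: one latent subject factor shared by an
# fMRI-like feature (400 voxels) and a dMRI-like feature (300 voxels).
rng = np.random.default_rng(2)
latent = rng.normal(size=50)
fmri = np.outer(latent, rng.normal(size=400)) + rng.normal(size=(50, 400))
dmri = np.outer(latent, rng.normal(size=300)) + rng.normal(size=(50, 300))

correlations, fmri_variates, dmri_variates = cca(fmri, dmri)
```

The first pair of canonical variates captures the inter-subject covariation shared by the two modalities; no hypothesis about which voxels are involved is needed beforehand.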
A model-based approach is most useful if you know enough about the problem being studied to incorporate this information as assumptions in the algorithm. To the degree the assumptions made are correct, model-based approaches typically perform better and to the degree the assumptions are incorrect, the model-based method will perform worse. Data-driven approaches make fewer assumptions about the structure of the problem, and thus are most useful when one does not want to commit to strong assumptions about the data. Given our knowledge about the human brain and complex mental illness is so incomplete, there is great benefit in making fewer strong assumptions up front, but the two approaches are of course quite complementary to one another.
Given the wide array of approaches that have been applied, any subdivision is limited; however, we can divide multivariate approaches based on the incorporation of a priori knowledge and the dimension of the MRI data used:
The optimization strategies of several of the above mentioned multivariate models are displayed in Figure 6. We provide a slightly more detailed summary of several blind fusion models in the supplemental. We also compare several multivariate multimodal fusion methods including their statistical assumptions, strengths and limitations, and multi-modal neuroimaging applications.
In the next sections we review some psychosis-related data fusion studies, starting with those that focus more on spatial overlap and then moving to those that perform symmetric data fusion. A brief comment on our review methodology: we searched PubMed for the terms ‘multimodal fusion’, ‘multimodal’, and ‘multimodal modalities’, and then narrowed the results to studies that actually used one of the aforementioned fusion-based approaches. We also narrowed the topics to studies that focused on specific disorders.
Though a data-driven symmetric approach is the most informative and makes fewer assumptions about the specific relationships among data sets, the spatial overlap approach has been one of the most widely used to date. As such, we review some of the relevant work in the context of spatial overlap between function and structure including cross-modal connectivity.
A central assumption of systems neuroscience is that the structure of the brain can predict and/or is related to brain function. The findings of (52) support this hypothesis, generally showing that each single structural component derived from ICA usually corresponds to several resting-state functional components. On the other hand, functional information can help improve the correspondence of functional boundaries across subjects compared to standard structural normalization, as reported in (53). Many psychopathological studies have already indicated the spatial overlap between brain structure and function in mental disorders. Specifically, Salgado-Pineda P. et al.(54) found three regions including the thalamus, the anterior cingulate and the inferior parietal that showed both structural and functional impairments associated with attentional processing in schizophrenia. A follow-up study by the same group(55) also found both functional alterations in a facial emotion task and GM volume reductions in the DMN in schizophrenia. Phillips ML et al.(56) described a novel noninvasive approach for relating brain structure and function with diffusion spectrum imaging (DSI) and fMRI, and revealed that co-active areas of task-relevant functional brain activity are anatomically connected by WM tractography, creating a “circuit diagram” in different cognitive tasks. Sasamoto A. et al.(57) postulated a global association between pathologies of GM cortical thickness and FA in schizophrenia and found that the means of both measures were significantly lower in patients with schizophrenia (SZ). Moreover, only in the patient group did the mean cortical thickness and mean FA show significant positive correlations in both hemispheres, suggesting that GM and WM pathologies of schizophrenia are intertwined at the global level. In a combined dMRI-GM study in medication-naïve chronic schizophrenia(58), Liu X. et al. observed that patients possessed lower FA values in the left inferior fronto-occipital fasciculus and left inferior longitudinal fasciculus, along with smaller GM volume and cortical thinning in the temporal lobe compared with healthy controls, which reflected the interdependent WM and GM disruption that contributed to the disease. In sum, while it is too early to identify a systematic and replicable link between brain function and structure, it is clear that there is a strong inter-relationship which merits additional study.
Inter-hemispheric disconnection or misconnection has been implicated in the disconnection hypothesis in psychosis(59-61). Recent advances in dMRI allow subtle white matter abnormalities in schizophrenia to be captured, which cannot be detected by structural MRI alone. Combining dMRI tractography with sMRI, Miyata et al(62) tested the inter-hemispheric disconnection hypothesis by parcellating the corpus callosum into functionally-anatomically relevant sub-regions and discovered that inter-frontal commissural fibers are specifically reduced in SZ. This is not only consistent with the disconnection hypothesis, but also specifies the locus of disconnection in a functionally-anatomically relevant way. A recent study of Koch et al(63) showed that increased radial diffusivity of the left superior temporal gyrus is associated with reduced neuronal activation in lateral frontal and cingulate cortex, suggesting a key role for white matter connectivity in determining the pattern and intensity of functional activation related to decision making.
FMRI has also been used to examine the functional disconnection hypothesis of schizophrenia(64). However, the number of studies that have combined fMRI with other modalities is limited. One study using rsfMRI and dMRI found decreased functional connectivity in regions including the right thalamus, which was associated with decreased structural connectivity in the left superior cerebellar peduncle(65). Another study found concordant reductions in functional and anatomical connectivity in the medial frontal and anterior cingulate regions in SZ compared with healthy controls (HC)(66). Schlosser et al. observed a direct correlation in schizophrenia between frontal FA reduction and fMRI activation in regions of the prefrontal and occipital cortices(67). This finding highlights a potential relationship between anatomical changes in a frontal-temporal anatomical circuit and functional alterations in the prefrontal cortex. Staempfli et al. discovered that dMRI- and fMRI-derived topologies are similar, and the combination of fMRI and dMRI can provide extra information to help select appropriate seed regions for identifying functionally relevant networks, and validate reconstructed WM fibers(68). Such studies combining functional and structural connectivity provide clear evidence of a complex interaction between modalities, though in most cases the directionality is similar in that decreased structural connectivity is associated with decreased functional connectivity.
We now review a number of studies, mostly using data-driven approaches to study associations among modalities in the context of psychopathology. The multimodal studies reviewed are summarized and classified into different categories in Table 1. Generally speaking, most of the studies we reviewed demonstrate congruent effects across modalities and multimodal fusion almost always provided more power to differentiate disease than unimodal approaches.
There has been rapid growth in the use of multimodal fusion approaches. Figure 7 shows a summary of multiple PubMed searches on various terms including 2-way and N-way multimodal fusion (2-way means two modalities or tasks are analyzed; N-way means more than two modalities or tasks are analyzed). In all cases there has been a rapid increase in the past few years, though, as we argue, more of these types of approaches are needed.
Multimodal MRI can reveal insightful information about key clinical aspects of schizophrenia(69,70). As the most widely studied psychosis, schizophrenia has served as the test bed for almost all above mentioned fusion approaches(71). Specifically, Sui et al examined the linked cognitive biomarkers of schizophrenia by combining fALFF, GM and FA measures from 3 MRI modalities via MCCA(72), suggesting linked functional and structural deficits in distributed cortico-striato-thalamic circuits may be closely related to cognitive impairments measured via the MATRICS battery(73). Using a similar method, Correa et al.(74) identified differences in the co-variation of fMRI and EEG data in SZ versus HC during an auditory oddball task(75). Significant group differences were found in the bilateral temporal lobe/middle anterior cingulate region in fMRI, associated with the N2 and P3 peak in EEG.
Sugranyes et al.(76) examined interactions between fMRI contrast maps from a working memory task and dMRI data by joint ICA, and characterized linked functional and WM changes related to working memory dysfunction, including fMRI hypoactivation in SZ in the anterior cingulate and ventrolateral prefrontal cortex and reduced FA localized in the splenium and posterior cingulum. Similarly, Stephen et al.(77) employed joint ICA to investigate the link between magnetoencephalography (MEG) and dMRI data, pointing to dysfunction in a posterior visual processing network in schizophrenia, with reduced MEG amplitude, reduced FA and poorer overall cognitive performance. Using joint ICA, Xu et al.(78) also identified four linked gray–white matter networks that were significantly associated with SZ: 1) temporal-corpus callosum; 2) occipital/frontal-inferior fronto-occipital fasciculus; 3) frontal/parietal/occipital/temporal-superior longitudinal fasciculus; and 4) parietal/frontal-thalamus, reflecting the widespread nature of the disease.
N-way aberrant brain alterations in SZ(79,80) were also investigated by MCCA+jICA, a tool optimized for identifying correspondence across modalities/tasks, which has been applied to multitask fMRI, fMRI-sMRI-dMRI or fMRI-sMRI-methylation combinations. Given the complexity of mental illness, such combinations are essential evidence to help us better understand psychopathology, but at this point investigations are limited to a relatively sparse set of combinations. This is very much a ‘big data’ problem as well, as we need approaches that utilize all available data while also helping us to summarize the complexity in order to find the key relationships that are impacted by mental illness. Another method to identify inter-correlations among gray matter and fMRI voxels within the whole brain was introduced by Michael et al.(81) for schizophrenia by reducing the cross-correlation matrix to histograms. Results showed that the linkage between gray matter and task-related functional activation in both an auditory sensorimotor task(82) and a working memory task(83) was weaker in SZ than HC. A multimodal voxel-based meta-analysis of structural and functional MRI studies of individuals at high genetic risk of developing schizophrenia(84) further showed that SZ relatives had decreased GM with functional hyper-activation in the left inferior frontal gyrus/amygdala, and decreased GM with hypo-activation in the thalamus.
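The idea of reducing a whole-brain cross-correlation matrix to a histogram, mentioned above, can be illustrated with a short sketch. This is a loose interpretation on simulated data and is not a reimplementation of the method in (81): the function names, bin count, voxel counts, and coupling strengths are all assumptions.

```python
import numpy as np

def cross_correlation_histogram(gm, fmri, n_bins=20):
    """Correlate every GM voxel with every fMRI voxel across subjects and
    summarize the resulting matrix as a normalized histogram."""
    gz = (gm - gm.mean(axis=0)) / gm.std(axis=0)
    fz = (fmri - fmri.mean(axis=0)) / fmri.std(axis=0)
    r = np.clip(gz.T @ fz / gm.shape[0], -1.0, 1.0)   # GM voxels x fMRI voxels
    counts, edges = np.histogram(r, bins=n_bins, range=(-1.0, 1.0))
    return counts / counts.sum(), edges, r

def simulate_group(rng, coupling, n_subjects=80, n_gm=30, n_fmri=40):
    """Hypothetical group in which one subject factor links both modalities."""
    shared = rng.normal(size=(n_subjects, 1))
    gm = coupling * shared + rng.normal(size=(n_subjects, n_gm))
    fmri = coupling * shared + rng.normal(size=(n_subjects, n_fmri))
    return gm, fmri

rng = np.random.default_rng(3)
hist_hc, edges, r_hc = cross_correlation_histogram(*simulate_group(rng, coupling=1.0))
hist_sz, _, r_sz = cross_correlation_histogram(*simulate_group(rng, coupling=0.3))
```

Comparing `hist_hc` with `hist_sz` summarizes a 30 x 40 matrix per group in 20 numbers; in this simulation the more weakly coupled group has its histogram concentrated nearer zero.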
In order to probe abnormalities in brain circuits underpinning episodic memory deficits in bipolar disorder (BD), one recent study jointly analyzed fMRI, sMRI and dMRI(85). Multimodal changes in frontal and parietal areas were revealed and associated with poorer episodic memory. This group also conducted a multi-modal assessment using resting state fMRI, GM volume, WM fiber integrity, and neurobehavioral measures to examine possible shared alterations in BD and SZ patients in the hippocampus(86). Results imply that the two disorders may share common alterations in all 3 modalities, but SZ patients showed more severe structural alterations within the hippocampus than BD patients. Similarly, an fMRI-dMRI fusion study comparing SZ and BD(87) showed distinct brain patterns for both clinical groups but, interestingly, also revealed shared abnormalities in prefrontal thalamic WM integrity and in frontal brain mechanisms.
In major depressive disorder (MDD), Vasic N. et al.(88) suggest that while changes of cerebral blood flow (brain perfusion) and GM volume co-occur in MDD patients, they appear to reflect distinct levels of neuropathology. Dutt A et al.(89) reported that P300 latency, a measure of the speed of neural transmission, appeared to relate to the size of the left hippocampus in schizophrenia, but not in psychotic bipolar disorder. The specificity of this brain structure-function association for schizophrenia opens the scope for further research using integration of multimodal biological data for objective categorization of psychosis. De Kwaasteniet B. et al.(90) concluded that structural abnormalities in MDD are associated with increased functional connectivity between subgenual anterior cingulate cortex (ACC) and medial temporal lobe. In addition, a negative structure-function relation in MDD was positively associated with depression severity. Han KM et al.(91) demonstrated structural alterations in both gray and white matter in medication-naïve first-episode MDD patients, such as reduced cortical volume of the caudal ACC and decreased WM integrity in the body of the corpus callosum. One of the most interesting aspects of these studies is that they show that the directionality across modalities is not always the same (e.g. decreased structure can lead to increased function and vice versa).
One recent study(92) combined GM and dMRI to investigate obsessive-compulsive disorder (OCD) and discovered significant alterations of interrelated gray and white matter networks over occipital and parietal cortices, frontal inter-hemispheric connections and cerebellum. Additionally, white matter networks adjacent to the basal ganglia correlated with obsessive-compulsive symptoms. Another study(93) used linked ICA and reported linked alterations in GM and WM morphology in adults with high-functioning autism spectrum disorder (ASD). ASD patients showed decreased GM volumes in bilateral fusiform gyri, orbitofrontal cortices, and pre-/post-central gyri, which were linked with a pattern of decreased FA in tracts of inferior longitudinal fasciculi, inferior fronto-occipital fasciculi, and corticospinal tracts, all bilaterally.
Although recent multimodal imaging results are promising(21,34), much work remains to be done. As the field of multimodal data fusion is still relatively new, many of the studies represent novel findings by using a variety of data combinations; however, replication is needed to draw general conclusions about structure-function relationships. Secondly, despite the many successes of multimodal fusion, fusing as many modalities/features as possible in the training sample does not guarantee the optimal discrimination or classification between groups, as reported in (3,94); thus it can be useful to incorporate a mixture of uni-modal and multimodal results, as done in (95). This work can be pursued in future by making use of larger data sets and various modalities. More complex models, such as those that can handle N-way multimodal fusion, are being introduced and may become one of the leading directions in future neuroimaging research given the predominance of multi-modal data acquisition(22).
Regarding classification, there are multiple studies demonstrating that the combination of structural and functional data can improve brain disease classification. A strong demonstration of this is a recent multimodal classification challenge for schizophrenia versus controls using GM and rest fMRI connectivity, which received over 2000 submissions, most of which were able to achieve greater than 80% accuracy(96). Dai et al.(97) proposed an automatic classification framework that integrated multimodal image features using multi-kernel learning (MKL) for predicting attention deficit/hyperactivity disorder versus controls, finding a very high classification performance. Using an ensemble feature selection strategy and an advanced support vector machine approach, Sui et al.(98) combined resting-state fMRI, EEG and sMRI data to classify schizophrenia from healthy controls and achieved the best performance, 91% accuracy, when combining modalities rather than using a single modality. By adopting Gaussian process classifiers to evaluate the prognostic value of neuroimaging data and clinical characteristics, Schmaal et al.(99) discovered that prediction of the naturalistic course of depression over 2 years is improved by considering different task contrasts or data sources, especially those derived from neural responses to emotional facial expressions. Finally, Pettersson-Yeo et al.(100) used a multimodal SVM approach to examine the ability of sMRI, fMRI, dMRI and cognitive data to differentiate between ultra-high-risk (UHR) and first-episode (FEP) psychosis at the single-subject level, supporting clinical development of SVM to help inform identification of FEP and UHR. These findings strongly suggest that multi-modal classification facilitated by advanced modeling techniques can provide more accurate and earlier detection of brain abnormalities than approaches that use only a single modality.
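A simple way to see how kernel-based methods combine modalities is to build one kernel per modality and average them before training a classifier. The sketch below does this with fixed equal weights and a kernel ridge classifier on simulated data. It is a deliberately reduced stand-in: full multi-kernel learning estimates the kernel weights from the data, and none of the cited studies is reproduced here. All names, sizes, and effect sizes are assumptions.

```python
import numpy as np

def linear_kernel(A, B):
    """Linear kernel scaled by feature count so modalities are comparable."""
    return A @ B.T / A.shape[1]

def kernel_ridge_classify(K_train, y_train, K_test, ridge=1.0):
    """Kernel ridge regression onto +/-1 labels; the sign is the prediction."""
    alpha = np.linalg.solve(K_train + ridge * np.eye(len(y_train)), y_train)
    return np.sign(K_test @ alpha)

# Simulated (hypothetical) subjects: a GM-like feature (50 values) and a
# functional-connectivity-like feature (30 values), each weakly informative.
rng = np.random.default_rng(4)
n, n_train = 200, 100
labels = rng.choice([-1.0, 1.0], size=n)
gm = rng.normal(size=(n, 50))
gm[:, :10] += 0.8 * labels[:, None]
fnc = rng.normal(size=(n, 30))
fnc[:, :10] += 0.8 * labels[:, None]

def split_kernels(X):
    """Center with the training mean, then build train and test kernels."""
    X = X - X[:n_train].mean(axis=0)
    return (linear_kernel(X[:n_train], X[:n_train]),
            linear_kernel(X[n_train:], X[:n_train]))

(K_gm, K_gm_test), (K_fnc, K_fnc_test) = split_kernels(gm), split_kernels(fnc)
K_train = 0.5 * (K_gm + K_fnc)             # equal-weight kernel combination
K_test = 0.5 * (K_gm_test + K_fnc_test)
predicted = kernel_ridge_classify(K_train, labels[:n_train], K_test)
accuracy = float((predicted == labels[n_train:]).mean())
```

Working at the kernel level means each modality can keep its own dimensionality and, in more realistic settings, its own kernel type, while the classifier sees a single subjects-by-subjects similarity matrix.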
Major advances in performance have been obtained in multiple domains, including brain imaging, via deep (multilayered) learning algorithms that capture nonlinear/higher-order relationships. Recent work has shown the potential of such models for neuroimaging data(101-106) and provides a framework to extend promising approaches such as linear ICA(107-109). One potential issue is that training deep models requires extensive amounts of data. However, this issue can in part be overcome by training the models with realistic simulated data(29,103). Deep models have recently made significant advances, outperforming shallow models in multiple problem domains such as image classification(110). Our work shows that class separation improves with deep belief network (DBN) depth while DBNs uncover hidden relations within data and thus facilitate discovery(102,111). Specifically, we investigated whether classification rates improve with depth by sequentially investigating DBNs of 3 depths. Figure 8a displays 2D maps of the raw data, as well as the depth 1, 2, and 3 activations: the deeper networks place schizophrenia patients and healthy control groups further apart for both training and validation data (for more details see (102)). Another benefit of deep learning models is their ability to automatically discover high-level representations(102), which is especially important for multimodal analysis incorporating different data types (e.g., fMRI, sMRI, and EEG) that are unlikely to have a simple linear correspondence. An example of a multimodal deep learning architecture is shown in Figure 8b. There are many interesting emerging models. For example, motivated by the concept of brain structure as a static image (sMRI) annotated by sequential captions of brain function (fMRI), one can build a model to translate the relationship between brain structure and brain function using a recurrent neural network(112). Many more fusion approaches based on deep learning are likely to emerge in the near future.
In sum, we are just beginning to unlock the potential of multimodal imaging, which offers unprecedented opportunities to further deepen our understanding of brain disorders(113) based on various brain imaging measures. The most promising avenues for the future may lie in developing better models that can complement and exploit the richness of our data(114). We are able to image the living human brain with multiple modalities, each providing a unique perspective. In order to minimize incorrect conclusions about mental illness, and perhaps even to identify the missing links between the brain and mental illness, multimodal data fusion is not only important, it is necessary.
The authors thank Sergey Plis for helpful input on the manuscript. The work was in part funded by NIH via a COBRE grant P20GM103472 and grants R01EB005846 and 1R01EB006841; the “100 Talents Plan” of the Chinese Academy of Sciences, the Chinese National Science Foundation grant No. 81471367, the State High-Tech Development Plan (863) No. 2015AA020513 and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02060005).
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
7. Financial disclosures
The authors report no biomedical financial interests or potential conflicts of interest.