Accelerating insight into the relation between brain and behavior entails conducting small and large-scale research endeavors that lead to reproducible results. Consensus is emerging between funding agencies, publishers, and the research community that data sharing is a fundamental requirement to ensure all such endeavors foster data reuse and fuel reproducible discoveries. Funding agency and publisher mandates to share data are bolstered by a growing number of data sharing efforts that demonstrate how information technologies can enable meaningful data reuse. Neuroinformatics evaluates scientific needs and develops solutions to facilitate the use of data across the cognitive and neurosciences. For example, electronic data capture and management tools designed to facilitate human neurocognitive research can decrease the setup time of studies, improve quality control, and streamline the process of harmonizing, curating, and sharing data across data repositories. In this article we outline the advantages and disadvantages of adopting software applications that support these features by reviewing the tools available and then presenting two contrasting neuroimaging study scenarios in the context of conducting a cross-sectional and a multisite longitudinal study.
Making data from biomedical studies freely available to the research community is an increasingly prevalent mandate of funding agencies (Collins & Tabak, 2014) and publishers (Bloom, Ganley, & Winker, 2014). For example, the National Institutes of Health (NIH)1 stated in 2003 that “all investigator-initiated applications with direct costs greater than $500,000 in any single year will be expected to address data sharing in their application” and that “the timely release and sharing to be no later than the acceptance for publication of the main findings from the final data set.” More recently the NIH Director Dr. Collins commented on “…the failure of funding agencies to establish or enforce policies that insist on data access” (Collins & Tabak, 2014) and called to “embrace an era in which transparency and responsible data sharing are common value” (Hudson & Collins, 2015). These sharing directives are supported by recent neuroimaging studies demonstrating that the reusability of data is a key scientific resource (Breeze, Poline, & Kennedy, 2012; Mennes, Biswal, Castellanos, & Milham, 2013; Poline et al., 2012). The cultural shift from data ownership by a closed group toward data sharing within an open community is particularly relevant for recent “Big Data” studies (Fjell et al., 2012; D. S. Marcus et al., 2011; Mennes et al., 2013; Toga, Crawford, Alzheimer’s Disease Neuroimaging Initiative, 2010), neuroimaging data repository efforts (Gorgolewski et al., 2015; Hall, Huerta, McAuliffe, & Farber, 2012; Poldrack et al., 2013), and authors who publish their work at journals such as Proceedings of the National Academy of Sciences (Cozzarelli, 2004), Journal of Neuroscience (Shepherd, 2002), Journal of Cognitive Neuroscience (D’Esposito, 2000), and Public Library of Science (Bloom et al., 2014). In summary, a key element for high impact research in neuroimaging is becoming the integration of data sharing into the study design. 
This article supports this task by reviewing software tools, including data repositories, aiding in electronic data capture, management, and sharing within the neuroimaging community.
The motivation behind data sharing requirements is likely driven by the promise to maximize the knowledge gleaned from neuroimaging studies through exploration of ‘reusable data’ (Breeze et al., 2012; Kennedy, Haselgrove, Riehl, Preuss, & Buccigrossi, 2015; Poldrack & Gorgolewski, 2014; Poline et al., 2012). Reusable data includes primary data (i.e., raw observations such as brain images or neuropsychological measures) and secondary data (i.e., derived measurements such as image segmentations or composite scores) that are curated in a format easily and freely accessible to the research community. Reusable data can augment the information available in databases, such as BrainMap (Fox & Lancaster, 2002; Laird, Lancaster, & Fox, 2005) and NeuroSynth (Yarkoni, Poldrack, Nichols, Van Essen, & Wager, 2011), allow researchers to explore alternative hypotheses, and reduce concerns about the reproducibility of discoveries by enabling independent replication studies (Ioannidis, 2005). To date, however, neuroimaging studies generally limit access to their data to summary statistics published in journal papers, such as p-values associated with brain atlas coordinates (Lancaster et al., 2000; Tzourio-Mazoyer et al., 2002). Without access to neurocognitive data (i.e., primary and secondary data), the potential to aggregate datasets, boost the statistical power of findings (Button et al., 2013), or inspect the dataset for non-significant findings omitted from the original manuscript (David et al., 2013; Ioannidis, 2011) is mostly limited to meta-analysis, whereby the results of comparable studies are examined collectively to corroborate findings (Caspers, Zilles, Laird, & Eickhoff, 2010; Salimi-Khorshidi, Smith, Keltner, Wager, & Nichols, 2009).
Sharing neurocognitive data is generally a resource intensive activity as it requires curating data so that it is meaningful to the research community (Howe et al., 2008). This situation poses a quandary for principal investigators of neuroimaging studies needing to choose between using their assets to deliver reusable data or to pursue new scientific questions and hypotheses, particularly when only the latter may lead to new funding opportunities. To reduce barriers associated with creating reusable data, the Neuroinformatics community provides software for electronic data capture, management, and sharing (Poline et al., 2012). The software packages are generally based on best practices developed by that community, which include standardized data formats and metadata representations (Bjaalie & Grillner, 2007).
Neuroinformatics started in the 1990s, when scientists applied the principles of Biomedical Informatics (Kulikowski et al., 2012) to develop reusable tools supporting the data analysis needs of the Human Brain Project (HBP) (JE Brinkley & Rosse, 2002; Huerta & Koslow, 1996; Shepherd et al., 1998) and individual labs investigating relationships between brain and behavior (Young & Scannell, 2000). In the neuroimaging domain, those data analysis tools lowered computational barriers to advances in brain science by enabling investigators without training in software development to harness increasingly complex analysis methodologies (Cox, 1996; Fischl et al., 2004; Smith et al., 2004). Since the 1990s, the scale and scope of neuroimaging studies have dramatically increased, as have the number and complexity of the tools needed to process those data (Ferguson, Nielson, Cragin, Bandrowski, & Martone, 2014; Gomez-Marin, Paton, Kampff, Costa, & Mainen, 2014; Van Horn & Toga, 2014). In addition to data analysis, early Neuroinformatics efforts focused on experiment management systems that helped to address the challenges of complex studies for which the state-of-the-art at the time (i.e., spreadsheets and images stored in directory structures on a file system) had become insufficient for effectively fulfilling study objectives (JE Brinkley & Rosse, 2002). Today, Neuroinformatics continues to adapt to the changing requirements of neuroimaging studies by providing software tools that simplify the capture, management, and sharing of neurocognitive data.
This article reviews the state-of-the-art for capturing, managing, and sharing data of neuroimaging studies. Specifically, Section 2 provides an overview of the informatics approaches designed to facilitate electronic data capture and management (Section 2.1), compliance with data sharing policies (Section 2.2), and repositories specializing in distributing neurocognitive data (Section 2.3). We then embed these tools in a practical setting by reviewing two study scenarios (Section 3) that contrast a cross-sectional study (Section 3.1) with a multisite longitudinal study (Section 3.2). In each scenario, we present a study description, requirements, and neuroinformatics approaches available to assist researchers in developing best practices in their own lab and in adhering to more stringent data sharing policies. We complete this review with a discussion of the relative advantages and disadvantages of deploying the systems detailed herein, with the goal of helping neuroimaging labs make informed decisions when choosing neuroinformatics tools for electronic data capture, management, and sharing.
Studies that examine brain and behavior relations are increasing in complexity, with agencies funding projects that involve larger numbers of subjects, cognitive tests, and imaging modalities (Jack et al., 2008; Thompson et al., 2014; Van Essen et al., 2012). As mentioned earlier, neuroimaging labs traditionally rely on spreadsheets and a file system to capture, manage, and share data (Poline et al., 2012). As the scale of research projects increases, many research labs are confronted with new technical (e.g., data size, quality control, analysis complexity) and social (e.g., employee turnover, data sharing requirements) challenges (Buckow, Quade, Rienhoff, & Nussbeck, 2014). To address these challenges, many labs develop homegrown data management systems that provide immediate solutions but may not address long-term socio-technical issues related to scalability (Franklin, Guidry, & Brinkley, 2011). For example, a lab may develop a system for data entry where information from paper forms is typed into a single “data entry computer.” As the lab grows or the complexity of a given study increases, so will the need for multi-user access to the database, paperless electronic data capture, and automated uploading of computerized neuropsychological assessments (see, for example, Gur et al., 2010; Kane & Kay, 1992) to central data repositories (Hall et al., 2012). Rather than investing time in the development of new software, a more practical approach is to use existing solutions (Franklin et al., 2011). We now review these technologies by describing electronic data capture and management systems (Section 2.1), data management plans complying with NIH policies regarding the protection, management, and sharing of data (Section 2.2), and data repositories (Section 2.3) that can be used to maintain and distribute the data.
Choosing the right electronic data capture and management systems (EDCMS) for a specific lab environment requires carefully evaluating current research workflow, the type of data to be captured and managed by the system (e.g., clinical/neuropsychological forms or medical imaging), and available information technology (IT) resources (e.g., networking and data management personnel). For example, studies with a small neuroimaging component and an extensive neuropsychological test battery administered using paper and pencil, such as in (Meier et al., 2012), may be best served by an EDCMS with excellent double-data entry support to reduce errors from manual data entry. Alternatively, multisite studies focusing on brain imaging, such as in (Fjell et al., 2012; Jack et al., 2008), may select a research Picture Archiving and Communications System (PACS) (Greenes & Brinkley, 2006) with enhanced support for the Digital Imaging and Communications in Medicine (DICOM) standard (Hussein, Engelmann, Schroeter, & Meinzer, 2004) to help automate, for example, the archival and de-identification of images (Haak, Page, Reinartz, Krüger, & Deserno, 2015). To gain access to such systems, labs with adequate informatics expertise and IT resources may choose to install and maintain an EDCMS on their own computer environment so that they can fine-tune and customize the deployed system. Alternatively, research labs might want to access an EDCMS hosted by another institution on the Web (e.g., access is provided by contract or fee) (Book et al., 2013; Scott et al., 2011; Van Horn & Toga, 2009) or provided by a service center within their own institution (Bernstam et al., 2009) to avoid the operating cost of maintaining an EDCMS. 
The remainder of this section reviews a subset of the most widely used tools and services targeted towards electronically capturing and managing neurocognitive research data (summarized in Table 1), which were selected from the Neuroimaging Informatics Tools and Resources Clearinghouse (Kennedy et al., 2015) and the Neuroscience Information Framework (Gardner et al., 2008) resource registries:
To summarize, we reviewed seven data capture and management systems that aid neuroimaging studies in electronically capturing and managing data. Each tool affords a Web-based interface enabling researchers to upload, manage, and share data from any computer connected to the Internet. Each also provides a mailing list, user documentation, online demos, and an issue tracker, with links available from its website. Although no longer under active development, HID provided a proof-of-concept for many other systems. COINS and IDA are unique in that they offer a purely centralized solution without the need to install and maintain the system locally. NiDB is the only system that can either be installed by an individual lab or hosted online for a fee. REDCap is the only system we reviewed that focuses primarily on non-imaging data, and it has been widely deployed for electronically capturing phenotypic data. XNAT is the most broadly deployed open source system for medical imaging data, with strong community-based user support. REDCap and XNAT are the only systems providing an API for automating data management tasks. Because each system has different strengths, deploying one within a research setting requires careful evaluation of the studies it should serve.
As stated by the NIH in 2003 (see Introduction), applications with direct costs greater than $500,000 in any single year must include a data management plan. The data management plan specifies the protection, management, and sharing of data collected by the proposed study in compliance with the requirements of the funding agency. For example, the National Institute of Mental Health (NIMH) requires14 that clinical research with human subjects submit data to one of the NIMH Data Archive systems. To identify the correct repository, a flow chart is provided15. Another important part of the data management plan is specifying the de-identification of patient data so that the privacy and confidentiality of study participants are maintained. Human subjects’ data containing Protected Health Information (PHI) must first be sanitized of specific information before it can be made public according to the Health Insurance Portability and Accountability Act16. Ignoring specific data types (e.g., genomic data and protected populations) for a moment, NIMH requires that the following information be removed before sharing the data: 17
Point 16 is particularly relevant for neuroimaging studies as the reconstructed 3D images of an anatomical scan can be used to render the facial features of a participant. One approach to resolving this issue is to apply a defacing algorithm to the imaging data to obscure any recognizable facial features (Milchenko & Marcus, 2012). To unlink the original from the de-identified data, the name of each participant in a study is replaced with a unique identification number that does not reveal any PHI. NIMH provides a tool for generating such a Globally Unique Identifier (GUID)18. Finally, NIMH requires that documents submitted to Institutional Review Boards (IRB) contain specific language regarding informed consent for sharing participants’ data in a data repository, for which NIMH has created a template19. Templates are also provided by the Data Management Planning Tool (DMPTool)20, which guides researchers through the process of creating a plan that will comply with their funding institution’s requirements.
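The replacement of participant names with identifiers that reveal no PHI can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the NIMH GUID tool, and the function names (`assign_study_ids`, `deidentify`), the `SUBJ` prefix, and the record fields are hypothetical. The identifiers are drawn at random rather than derived from the name, so they cannot be reversed without the linking key.

```python
import secrets


def assign_study_ids(names, prefix="SUBJ"):
    """Map each participant name to a random ID that carries no PHI.

    Hypothetical helper for illustration; not the NIMH GUID tool.
    """
    key = {}
    used = set()
    for name in names:
        if name in key:
            continue  # a participant seen before keeps the same ID
        new_id = f"{prefix}-{secrets.token_hex(4).upper()}"
        while new_id in used:  # regenerate on the (unlikely) collision
            new_id = f"{prefix}-{secrets.token_hex(4).upper()}"
        used.add(new_id)
        key[name] = new_id
    return key


def deidentify(records, key):
    """Return copies of the records with 'name' replaced by 'study_id'."""
    cleaned = []
    for record in records:
        clean = {k: v for k, v in record.items() if k != "name"}
        clean["study_id"] = key[record["name"]]
        cleaned.append(clean)
    return cleaned
```

In such a scheme the linking key would have to be stored separately from the shared data and under access control, since anyone holding it can re-identify participants.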
The growing availability and accessibility of shared neuropsychological and neuroimaging data is due, in part, to centralized data repositories (Gorgolewski et al., 2015; Hall et al., 2012; D. S. Marcus et al., 2011; Poldrack et al., 2013; Toga et al., 2010). Neurocognitive data repositories not only provide a simple mechanism for archiving and distributing data but also can empower researchers to reproduce findings in the primary literature (Pernet & Poline, 2015) by providing access to neurocognitive data that can be reanalyzed, used for teaching data analysis methodologies, or used to advance hypothesis- and data-driven discovery (Biswal et al., 2010). The primary function of a data repository is to provide users with a mechanism to retrieve datasets. From a neuroinformatics perspective, the simplest form of data repository is a collection of datasets that can be downloaded in bulk; however, information systems can improve the experience for users searching for specific datasets and contributors wanting to upload data. For example, a query interface that enables a user to browse for and download neurocognitive data from subjects that completed an anatomical MRI between the ages of 18–25 is an improvement over the bulk download scenario. The functionality of these systems is complementary to the data management tools discussed in Section 2.1 in that the focus is to curate large amounts of heterogeneous data (Hall et al., 2012) rather than support day to day research operations; however, EDCMS may also support data repository efforts (e.g., COINS, IDA, and XNAT). This section focuses on population and modality specific neurocognitive data repositories (i.e., databases that distribute medical images and neuropsychological test scores) using a subset of representative data sharing projects selected from the Neuroimaging Informatics Tools and Resources Clearinghouse (Kennedy et al., 2015) and Neuroscience Information Framework (Gardner et al., 2008) resource registries.
These data repositories can roughly be divided into two types:
Each of these repositories constitutes a data sharing initiative. The policies for disseminating data from these repositories generally require accepting a data use agreement that limits who can use a dataset and how it can be used. Examples of data use agreements are Open Access Data, which is typically de-identified data that can be easily downloaded after filling out a form online (Di Martino et al., 2014), and Restricted Data, which may require institutional review board approval, demonstration of qualified research credentials, or review of an application before gaining access to the data (Hall et al., 2012). Table 2 contains a listing of data repositories and distributors including summary information on the categories listed above.
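The query-interface example given earlier in this section (subjects who completed an anatomical MRI between the ages of 18 and 25) can be illustrated with a minimal Python sketch. It filters an in-memory catalog rather than contacting any real repository; the function name `find_sessions`, the field names, and the catalog entries are all made up for illustration.

```python
def find_sessions(sessions, modality, min_age, max_age):
    """Return sessions of one modality acquired within an inclusive age range."""
    return [
        s for s in sessions
        if s["modality"] == modality and min_age <= s["age_at_scan"] <= max_age
    ]


# Hypothetical catalog entries; a real repository would expose far richer metadata.
catalog = [
    {"subject": "S01", "modality": "anat", "age_at_scan": 19.5},
    {"subject": "S02", "modality": "anat", "age_at_scan": 31.0},
    {"subject": "S03", "modality": "rsfMRI", "age_at_scan": 22.0},
    {"subject": "S04", "modality": "anat", "age_at_scan": 25.0},
]

hits = find_sessions(catalog, "anat", 18, 25)
print([s["subject"] for s in hits])  # prints ['S01', 'S04']
```

Compared with a bulk download, the user retrieves only the sessions matching the criteria, which is the improvement the query interface is meant to provide.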
This section presented Neuroinformatics resources that ranged from tools designed to manage neuropsychology data in a single lab to data repositories that are used to distribute data from thousands of research participants throughout the scientific community. In the context of managing a study, these resources can be leveraged to streamline the collection of high quality data and enhance research productivity by minimizing data management activities. Executing a data management plan is needed to ensure participants’ privacy, and neuroinformatics data repositories can simplify the submission process and shoulder the burden of data storage and longevity.
We now present two scenarios for applying the EDCMS described in Section 2.1 to neuroimaging studies. For each scenario, we specify the study design, the resulting requirements for data capture and management, and a neuroinformatics approach that meets those requirements. Specifically, Scenario A (Section 3.1) describes a cross-sectional brain-behavior study for which an EDCMS is optional, and Scenario B (Section 3.2) presents a case study on a multisite, longitudinal study with complex requirements that necessitate an EDCMS. These scenarios are meant to highlight the challenges encountered when executing studies of different scales and to gauge when an EDCMS may or may not help to overcome these challenges.
We now present a hypothetical scenario of a typical brain-behavior study to highlight basic neuroinformatics requirements and approaches used to address issues in data capture, management, and sharing. In this scenario, a lab is conducting a cross-sectional study that is funded by the NIMH to examine the relationship between neuroanatomical volumes extracted from MRIs and measures from the Autism Diagnostic Observation Schedule (ADOS) (Lord et al., 1989). The data are collected from a population of participants with Autism Spectrum Disorder (n=40) and healthy controls (n=40). These two samples are age and sex matched. The study takes place in the medical school that the lab is part of. The school has technical personnel on staff for implementing scanner sequences and acquiring imaging data from the participants. The lab itself consists of a principal investigator, two graduate students, and an undergraduate research assistant. Graduate student A performs the basic image processing of the anatomical MRI, graduate student B is trained in administering the ADOS, and the research assistant performs the ADOS scoring and data entry. Specifically, student B administers the ADOS modules and records the behavioral observations on a paper and pencil form. The research assistant uses these records to manually score and enter the ADOS data into a spreadsheet, whose variable names were defined by the principal investigator. After a participant is scanned, graduate student A obtains a USB thumb drive with the anatomical MRI data. She transfers the data from the USB thumb drive to the file system of a lab workstation and processes the imaging data via FreeSurfer (Fischl et al., 2002) to extract regional brain volumes. Once data acquisition and processing for the study are completed, the principal investigator performs the hypothesis driven analyses and drafts a manuscript for publication. Upon publication, the data from this NIMH funded study must be uploaded and shared through NDAR.
To successfully complete this study, the electronic data capture and management requirements are fairly minimal. Given the relatively small sample size, it is quite reasonable to first record observations via paper and pencil and then capture those observations electronically via data entry into a spreadsheet. However, the lack of double data entry can introduce errors and may impact data analysis (Day, Fayers, & Harvey, 1998). For the imaging data, a directory structure will need to be defined to store the subject data, and neuroinformatics tools will need to be installed for converting and processing the data. Finally, the data (e.g., the spreadsheet) will need to be prepared according to the NDAR guidelines so that it can be uploaded to the corresponding repository.
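Double data entry, whose absence is noted above as a source of error, amounts to having the same forms keyed in twice and reconciling the differences. A minimal Python sketch of the comparison step follows; the function name `compare_entries`, the record IDs, and the field names are hypothetical and do not correspond to any specific EDCMS.

```python
def compare_entries(first_pass, second_pass):
    """List (record_id, field, value_1, value_2) for every mismatch.

    Each argument maps a record ID to a dict of field values as typed in
    during one independent data entry pass.
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical entries for two participants; the second pass contains one typo.
entry_1 = {"P01": {"ados_comm": 4, "ados_social": 7}, "P02": {"ados_comm": 2, "ados_social": 5}}
entry_2 = {"P01": {"ados_comm": 4, "ados_social": 1}, "P02": {"ados_comm": 2, "ados_social": 5}}
print(compare_entries(entry_1, entry_2))  # prints [('P01', 'ados_social', 7, 1)]
```

Each reported mismatch would then be resolved by consulting the original paper form.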
Given the simple design of the study in this scenario, an informatics evaluation of the requirements for this study would not warrant manually installing an EDCMS; however, there are improvements that can be explored without heavy overhead. First, the investigator could explore the resources available at their institution to identify if an EDCMS is already hosted. Today, many medical schools provide EDCMS as a service that is funded by Clinical and Translational Science Awards (Bernstam et al., 2009). If the EDCMS does not support imaging data, a directory structure (such as the Brain Imaging Data Structure (BIDS)28) on a file system will be adequate. If the lab does not want to use an EDCMS for the ADOS data, spreadsheet software will suffice but the investigator may want to consider using the data collection forms provided by the PhenX Toolkit (Stover, Harlan, Hammond, Hendershot, & Hamilton, 2010). The lab should also adopt the standard data dictionaries for variables that are provided by NDAR. Once the results are published, this will ease uploading the data to a repository as mandated by the NIMH.
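To make the directory-structure suggestion concrete, the following Python sketch builds the location of a T1-weighted anatomical image using the `sub-<label>/anat/` naming pattern of BIDS. It is a simplified illustration that covers only this one file type; the function name `bids_t1w_path` and the root folder name are assumptions, and the BIDS specification should be consulted for the full set of rules.

```python
from pathlib import PurePosixPath


def bids_t1w_path(root, subject, session=None):
    """Build the path of a T1-weighted image in a BIDS-style layout.

    Hypothetical helper; handles only the anatomical T1w case.
    """
    for label in (subject, session):
        if label is not None and not label.isalnum():
            raise ValueError(f"label must be alphanumeric: {label!r}")
    parts = [f"sub-{subject}"]
    if session is not None:
        parts.append(f"ses-{session}")
    stem = "_".join(parts)
    return PurePosixPath(root, *parts, "anat", f"{stem}_T1w.nii.gz")


print(bids_t1w_path("study", "01"))
# prints study/sub-01/anat/sub-01_T1w.nii.gz
```

Generating paths from a single function, rather than typing them by hand, keeps the layout consistent across the 80 participants in this scenario.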
The second case study is based on our own work, in which data management tools are deployed to conduct research on neurodevelopment in adolescence. This study is motivated by the observation that alcohol and marijuana remain the most commonly used central nervous system-active substances in the teen years (Johnston, O’Malley, Miech, Bachman, & Schulenberg, 2015). To study the influence of adolescent alcohol and marijuana abuse on neurodevelopment, the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) is conducting a multisite, longitudinal study that recruited 831 participants (ranging from 12–22 years old) across five data collection sites nationwide (Brown et al., in press).
Each of the five data collection sites carried out the same core assessment and worked in pairs to conduct additional studies (e.g., overnight sleep evaluation and recovery during monitored abstinence). The 831 study participants completed a core data acquisition protocol at baseline and will complete three annual follow-ups, each of which includes a neuropsychological (NP) test battery, a neuroimaging session (MRI, DTI, and rsfMRI), bio-samples for genetic analysis, and a comprehensive assessment of substance use, psychiatric symptoms and diagnoses, and functioning in major life domains; in addition, one parent of each youth completes an interview on the youth and family environment. The NP test battery assesses seven major functional domains: general intelligence; executive functions; emotion regulation; multimodal and multiple component mnemonic processes; visuospatial abilities; basic visual acuity and color perception; and motor skills of eye-hand coordination, speed, and postural stability. In addition, a mid-year phone interview is conducted between each visit to track substance use. Upon completing data collection, the dataset is expected to reach approximately 6TB of primary data and nearly 20TB of derived data from neuroimaging analyses. In the sections below, we present an overview of the study requirements that needed to be addressed and the neuroinformatics approaches we used to implement a framework that enabled us to collect data rapidly, maintain quality control, and streamline data processing (Rohlfing, Cummins, Henthorn, Chu, & Nichols, 2013).
To realize the longitudinal experimental design of NCANDA, it was necessary to establish a framework capable of meeting the requirements to capture, integrate, and process multimodal data from five data collection sites. To be economical, we wanted to design a framework consisting of freely available data management tools. The guiding principles in the evaluation of those tools were 1) an active and supportive mailing list, 2) an intuitive GUI with training materials for research staff, 3) support for customization for longitudinal data acquisition, and 4) the ability to automate tasks programmatically (e.g., quality control checks, test scoring) using an API. After evaluating available medical imaging data management systems and electronic data capture systems (see Section 2.1), we chose a solution coupling XNAT, which is targeted towards imaging studies, with REDCap, a data management system addressing the needs of the study with respect to NP test data. Both systems met the evaluation criteria and tested well with research staff during an initial evaluation. At the time this framework was developed in 2012, no single-system solution existed that fulfilled our evaluation criteria.
Building upon XNAT and REDCap, we designed a framework (Figure 1) that automated electronic data capture, management, harmonization, quality control, analysis, and distribution across the five data collection sites of the NCANDA consortium (Rohlfing et al., 2013). Specifically, the NCANDA sites collected the non-imaging data via the University of Pennsylvania Web-based Computerized Neurocognitive Battery (WebCNP) (Gur et al., 2010), LimeSurvey29, Blaise30, ePrime31, and REDCap. Test scores not collected directly through entry forms in REDCap were automatically transformed into a REDCap compliant format and uploaded from the laptops used for data capture at the collection sites to the REDCap server hosted by the NCANDA Data Analysis Component at SRI International via encrypted connections to Subversion32, a secure and persistent data uploading system. Imaging data was first uploaded from the site-specific PACS to the XNAT server hosted at SRI International. All imaging data underwent quality control that included automatic test scoring, range validation, and a neuroradiologist report for incidental imaging findings. Finally, the outcome of the quality control was uploaded into REDCap and merged with the corresponding non-imaging data for each session. Any update to information in the REDCap database automatically triggered the generation of reports regarding data integrity. Identified issues, such as scoring irregularities, incorrectly entered IDs or visit dates, and data that were not uploaded properly, were resolved in consultation with the sites. Once the data passed the initial quality control, it was processed in further analyses and backed up via Amazon Web Services (AWS).
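The range validation step mentioned above can be illustrated with a short Python sketch. It is not the NCANDA implementation; the function name `check_ranges`, the field names, and the limits are assumptions chosen for illustration (the age limits mirror the 12–22 year recruitment range stated earlier).

```python
def check_ranges(record, limits):
    """Return human-readable issues for missing or out-of-range values.

    `limits` maps a field name to an inclusive (low, high) pair.
    """
    issues = []
    for field, (low, high) in limits.items():
        value = record.get(field)
        if value is None:
            issues.append(f"{field}: missing")
        elif not low <= value <= high:
            issues.append(f"{field}: {value} outside [{low}, {high}]")
    return issues


# Hypothetical limits and record, for illustration only.
limits = {"age": (12, 22), "reaction_time_ms": (100, 2000)}
print(check_ranges({"age": 25}, limits))
# prints ['age: 25 outside [12, 22]', 'reaction_time_ms: missing']
```

A report built from such issue lists is the kind of output that would be sent back to a collection site for resolution.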
To distribute the collected data within the NCANDA consortium, the platform has an integrated data release mechanism. To create the release, all entries in REDCap are manually checked one more time for entry errors. Entries passing this quality control are immediately locked in the database (i.e., changes to these records require prior approval by the investigators of the NCANDA Data Analysis Component at SRI International). With respect to incorrect or questionable entries, a data manager at SRI International resolves the issues by contacting the collection sites and locks the record once the error is resolved. After all entries requested for the data release are locked, the data is provided to the members of the NCANDA consortium via a set of comma-separated-value (CSV) files exported from REDCap with corresponding data dictionaries for each data element. Plans for sharing the data with the broader research community include technology to facilitate interoperability with neuroinformatics resources, such as the Neuroimaging Data Model (NIDM) standard for data exchange (Keator et al., 2013), the Cognitive Atlas ontology (Poldrack et al., 2011) for data annotation, and the Neuroimaging Informatics Tools and Resources Clearinghouse Image Repository (NITRC-IR) (Kennedy et al., 2015) and OpenfMRI (Poldrack et al., 2013) data repositories. The resulting organization of this data set would then align with the approach proposed by the Research Domain Criteria (RDoC) Initiative (Insel et al., 2010), where data can be explored at different levels of analysis (e.g., from circuit-level to family environment) and by broad domains of function (e.g., cognitive systems or working memory).
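The release rule described above (export only after every requested entry is locked) can be sketched with Python's standard `csv` module. This is an illustration of the logic only and does not use the REDCap export facilities; the function name `export_release` and the `id`, `locked`, and `score` fields are hypothetical.

```python
import csv
import io


def export_release(records, fields):
    """Return CSV text for the requested fields, refusing unlocked records."""
    unlocked = [r["id"] for r in records if not r.get("locked", False)]
    if unlocked:
        raise ValueError(f"records not locked: {unlocked}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id"] + list(fields),
        extrasaction="ignore",  # drop bookkeeping keys such as 'locked'
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record, for illustration only.
release = export_release([{"id": "A1", "score": 10, "locked": True}], ["score"])
print(release, end="")
# prints:
# id,score
# A1,10
```

Raising an error when any record is unlocked, rather than silently skipping it, mirrors the requirement that all requested entries be locked before a release is created.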
This manuscript provided an introduction to electronic data capture and management tools, data management plans, and data repositories that facilitate compliance of neurocognitive studies with the data sharing mandates of funding agencies and publishers, decrease the setup time and improve the quality control of studies, and streamline the process of harmonizing, curating, and sharing data across data repositories. Today, many researchers see freely shared data as a key scientific resource for maximizing the knowledge gleaned from neuroimaging studies. However, sharing neurocognitive data is generally viewed as a resource intensive activity as it requires curating data so that it is meaningful to the research community. The Neuroinformatics tools, data management plans, and repositories reviewed here aim to reduce this burden. Furthermore, they enable large-scale studies, as highlighted by one of the neuroimaging study scenarios. Finally, readers wanting to gain a more complete view of this topic should visit resource registries (Belleau, Nolin, Tourigny, Rigault, & Morissette, 2008; Ferguson et al., 2014; Gardner et al., 2008; Kennedy et al., 2015; Stover et al., 2010), which catalogue shared data repositories (Gardner et al., 2008), data analysis software (Kennedy et al., 2015), ontology resources (Fox et al., 2005; Larson & Martone, 2013; B. N. Nichols et al., 2014; Poldrack et al., 2011), and utilities for simplifying system configuration (Gershon et al., 2010; Stover et al., 2010).
This work was supported by the U.S. National Institute on Alcohol Abuse and Alcoholism (NIAAA) (U01 AA021697, R01 AA005965, R01 AA012388, U01 AA013521, U01 AA017347, U01 AA017923). It was also supported by the Creative and Novel Ideas in HIV Research Program (CNIHR) through a supplement to the University of California at San Francisco (UCSF) Center For AIDS Research funding (P30 AI027763). This funding was made possible by collaborative efforts of the Office of AIDS Research, the National Institute of Allergy and Infectious Diseases, and the International AIDS Society.
2Collaborative Neuroinformatics Suite: http://coins.mrn.org
3Human Imaging Database: http://www.nitrc.org/projects/hid
4Image Data Archive: http://ida.loni.usc.edu
5Longitudinal Online Research and Imaging System: http://mcin-cnim.ca/neuroimagingtechnologies/loris
6Neuroinformatics Database: http://nidb.sourceforge.net
7Research Electronic Data Capture: http://www.project-redcap.org
8Extensible Neuroimaging Archive Toolkit: http://www.xnat.org/
21Autism Brain Imaging Data Exchange: http://fcon_1000.projects.nitrc.org/indi/abide
22Alzheimer’s Disease Neuroimaging Initiative: http://adni.loni.usc.edu
23Consortium for Reliability and Reproducibility: http://fcon_1000.projects.nitrc.org/indi/CoRR/html/index.html
24Human Connectome Project: https://humanconnectome.org
25National Alzheimer’s Coordinating Center: https://www.alz.washington.edu/WEB/researcher_home.html
26Neuroimaging Informatics Tools and Resources Clearinghouse Image Repository: http://www.nitrc.org/ir
27Pediatric Imaging Neurocognition and Genetics: https://pingstudy.ucsd.edu
Conflict of interest: Neither author has conflicts of interest with the information presented herein.