JAMIA - The Journal of the American Medical Informatics Association
J Am Med Inform Assoc. 2013 Jan-Feb; 20(1): 1.
PMCID: PMC3555342

Sharing data for the public good and protecting individual privacy: informatics solutions to combine different goals

Lucila Ohno-Machado, Editor in chief

The bioethics advisory committee to the President has recently issued a report emphasizing the importance of protecting health information, particularly the data about an individual's genome.1 The report does not prescribe how to balance the need for sharing information to accelerate discoveries with the potential risk of privacy breach. However, it does mention the potential benefits of data sharing and calls for the development of solutions that minimize the risk of privacy breach. Similarly, a recent report from the NIH's ‘Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects’2 recommends that ‘sequence/phenotype/exposure data sets (be) deposited in one or several central databases.’ Studies on human genomes require good characterization of individual phenotypes, and some of these data may be retrieved from electronic health records. Lessons learned from over a decade of research in privacy technology can help guide solutions to the problem of combining phenotype and genome data in a way that preserves confidentiality. This issue of JAMIA, in addition to several articles we have been publishing in the past few years on technology and policy,3–12 explains regulatory constraints, presents a collection of the latest research results on privacy technology, and displays diverse perspectives on the topic of reusing clinical data for research, healthcare quality improvement, and public health.

I am grateful to guest associate editors Malin, O'Keefe, and El Emam for organizing a call for papers and the subsequent reviews of submissions on privacy technology and policy for this special focus issue. These articles represent a variety of subtopics and approaches, ranging from differential privacy (a technical framework that guides safe data disclosure by quantifying the risk of privacy breach to an individual) to discussions on legal aspects of protecting privacy. Malin and his co-guest editors discuss particular articles in their comprehensive editorial.13

This issue of the journal also contains reports on how different institutions have been approaching the use of clinical and genomic data for research and public health. Hripcsak (see page 117) discusses how EHRs need to evolve to fulfill the current need for comprehensively phenotyping individuals, and Marsolo (see page 122) describes lessons learned when integrating a commercial EHR with research systems. In a provocative article, Witten and Tibshirani (see page 125) express concern about traditional publishing models, particularly with regard to their inability to ensure that experiments are reproducible. The authors also discuss how the pre-publication review model may be antiquated in an era in which readers have an opportunity to comment on any published article. Farley (see page 128) proposes a platform for biomedical knowledge computing, and Cusack (see page 134) reports on AMIA recommendations for data capture and documentation.

Data sharing requires an environment in which the professionals who handle the data adhere to the highest ethical standards and implement systematic processes that (a) measure data quality, (b) respect consumer preferences, (c) successfully identify research cohorts, and (d) are scalable. The articles by Weiskopf and Weng (see page 144), Ancker (see page 152), Ge (see page 157), Hurdle (see page 164), and Natter (see page 172) address each of these issues, respectively. Cumin (see page 180) provides an excellent example of data sharing, describing two available datasets of anesthetic records. Avillach (see page 184) describes a European experience in harmonizing the process of identifying medical events in healthcare databases. Jones (see page 193) reuses administrative claims data to supplement a state disease registry. Sharing the knowledge obtained from analyses of the shared data is an equally important endeavor: Kawamoto (see page 199) discusses a potential framework for knowledge sharing in the context of clinical decision support.

It is exciting to start the new year with an issue of JAMIA that will certainly generate a lot of discussion and lead to a potential re-examination of several current practices and paradigms. My goal is to continue to promote this discussion throughout the year and to keep an open mind to adapting the journal to new trends in scholarly publishing. New publishing models may serve not only our diverse informatics community, but also the scientific and lay communities at large, extending our journal beyond its current boundaries.

Finally, as I complete 2 years of service to JAMIA, I remain indebted to an outstanding editorial team, authors, reviewers, and readers who continue to provide valuable feedback so we can further improve the journal.


Funding: The author is partially funded by NIH grant U54HL108460.

Competing interests: None.

Provenance and peer review: Commissioned; not peer reviewed.


1. Presidential Commission for the Study of Bioethical Issues. Privacy and Progress in Whole Genome Sequencing. (accessed 27 Nov 2012)
2. Report on the Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects. June 5–6, Rockville, MD
3. Malin B, Benitez K, Masys D. Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule. J Am Med Inform Assoc 2011;18:3–10
4. Weitzman ER, Cole E, Kaci L, et al. Social but safe? Quality and safety of diabetes-related online social networks. J Am Med Inform Assoc 2011;18:292–7
5. El Emam K, Hu J, Mercer J, et al. A secure protocol for protecting the identity of providers when disclosing data for disease surveillance. J Am Med Inform Assoc 2011;18:212–7
6. Boxwala AA, Kim J, Grillo JM, et al. Using statistical and machine learning to help institutions detect suspicious access to electronic health records. J Am Med Inform Assoc 2011;18:498–505
7. Schweitzer EJ. Reconciliation of the cloud computing model with US federal electronic health record regulations. J Am Med Inform Assoc 2012;19:161–5
8. Murphy SN, Gainer V, Mendis M, et al. Strategies for maintaining patient privacy in i2b2. J Am Med Inform Assoc 2011;18(Suppl 1):i103–8
9. Ohno-Machado L, Bafna V, Boxwala AA, et al. iDASH team. iDASH: integrating data for analysis, anonymization, and sharing. J Am Med Inform Assoc 2012;19:196–201
10. Weber SC, Lowe H, Das A, et al. A simple heuristic for blindfolded record linkage. J Am Med Inform Assoc 2012;19(e1):e157–61
11. Wu Y, Jiang X, Kim J, et al. Grid Binary LOgistic REgression (GLORE): building shared models without sharing data. J Am Med Inform Assoc 2012;19:758–64
12. Vinterbo SA, Sarwate AD, Boxwala AA. Protecting count queries in study design. J Am Med Inform Assoc 2012;19:750–7
13. Malin BA, El Emam KE, O'Keefe CM. Biomedical data privacy: problems, perspectives, and recent advances. J Am Med Inform Assoc 2013;20:2–6
