Methods Inf Med. Author manuscript; available in PMC 2011 March 19.
Published in final edited form as:
PMCID: PMC3034110
NIHMSID: NIHMS244589

A Characterization of Local LOINC Mapping for Laboratory Tests in Three Large Institutions

M.C. Lin, MD,1 D.J. Vreeman, PT, DPT,2,3 C.J. McDonald, MD,4 and S.M. Huff, MD1,5

Summary

Objectives

We characterized the use of laboratory LOINC® codes in three large institutions, focusing on the following questions: 1) How many local codes had been voluntarily mapped to LOINC codes by each institution? 2) Could additional mappings be found by expert manual review for any local codes that were not initially mapped to LOINC codes by the local institution? and 3) Are there any common characteristics of unmapped local codes that might explain why some local codes were not mapped to LOINC codes by the local institution?

Methods

With Institutional Review Board (IRB) approval, we obtained deidentified data from three large institutions. We calculated the percentage of local codes that have been mapped to LOINC by personnel at each of the institutions. We also analyzed a sample of unmapped local codes to determine whether any additional LOINC mappings could be made and identify common characteristics that might explain why some local codes did not have mappings.

Results

Concept type coverage and concept token coverage (volume of instance data covered) of local codes mapped to LOINC codes were 0.44/0.59, 0.78/0.78 and 0.79/0.88 for ARUP, Intermountain, and Regenstrief respectively. After additional expert manual mapping the results showed mapping rates of 0.63/0.72, 0.83/0.80 and 0.88/0.90 respectively. After excluding local codes which were not useful for inter-institutional data exchange, the mapping rates became 0.73/0.79, 0.90/0.99 and 0.93/0.997 respectively.

Conclusions

Local codes for two institutions could be mapped to LOINC codes with 99% or better concept token coverage, but mapping for a third institution (a reference laboratory) only achieved 79% concept token coverage. Our research supports the conclusions of others that not all local codes should be assigned LOINC codes. There should also be public discussions to develop more precise rules for when LOINC codes should be assigned.

Keywords: Controlled Vocabulary, LOINC, Evaluation Research, Clinical Laboratory Information Systems

Introduction

With the development of electronic health records, there is a strong need to establish standard vocabularies to record patient related data, especially in reporting laboratory results. Most laboratories use their local codes internally and use LOINC® codes or other standardized codes when there is a need to communicate outside of their own enterprise, e.g. returning results to an ordering physician, submitting laboratory results to an insurance company, sharing data in a regional clinical data exchange network, or reporting required information to a public health department. Huff et al. noted that when LOINC achieved widespread use, it would be important that sufficient LOINC codes existed to cover the needs of reporting patient data (1). Researchers have reported that mapping local codes to LOINC codes can be complex (2-4). Therefore, we were interested in learning:

  1. to what extent local codes have been mapped to LOINC codes
  2. what volume of patient test result instances are covered by the mapped codes
  3. how many more local codes could be mapped by expert manual review
  4. how fast the number of local codes is increasing
  5. how fast the number of LOINC codes is increasing
  6. whether there were any common patterns or characteristics of local codes that were not mapped to LOINC that might identify systematic problems in using LOINC.

We did not evaluate the correctness of the local LOINC code mappings in this part of our research.

Background

1. Development of LOINC

Currently, Health Level Seven (HL7) (5) is the most common electronic message standard used in exchanging clinical data among hospitals, pharmaceutical manufacturers, and public health departments. The observation segment of HL7 messages uses an EAV (entity-attribute-value triplet) (6) strategy to represent clinical data. For example, a serum sodium concentration measurement would be represented conceptually as “Laboratory Test (entity) has Test name = Serum Sodium Concentration (attribute); value = 138 mmol/L (value)”. Here is an example of the actual syntax of an HL7 Version 2 OBX (observation/result) segment:

OBX|1|NM|2951-2^Serum Sodium Concentration^LN|1|138|mmol/L||| (1).

In this example, the LOINC code “2951-2” has been used as a standard code to represent the meaning of the serum sodium concentration measurement. LOINC was created to be a universal terminology for the electronic exchange of clinical observations for any kind of data exchange where the EAV approach is used. The intent was that different enterprises would map their local codes to LOINC, and then the LOINC codes would be used as the standard identifiers in data exchange. Essentially, the LOINC codes become the lingua franca for identifying observations in interoperable data exchange in health care.
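To make the EAV reading of that segment concrete, the following Python sketch splits the OBX example on the HL7 Version 2 field separator (`|`) and component separator (`^`). The function name `parse_obx` and the dictionary keys are hypothetical and chosen only for illustration; a production interface would use a full HL7 parser that also handles escape sequences and repeating fields.

```python
def parse_obx(segment):
    """Split one HL7 v2 OBX segment into the pieces used in the EAV view."""
    fields = segment.split("|")
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    # OBX-3 (observation identifier) has the form code^text^coding system.
    code, text, system = (fields[3].split("^") + ["", "", ""])[:3]
    return {
        "value_type": fields[2],    # OBX-2, e.g. NM = numeric
        "code": code,               # the attribute, identified by a LOINC code
        "name": text,
        "coding_system": system,    # LN = LOINC
        "value": fields[5],         # OBX-5
        "units": fields[6],         # OBX-6
    }

obx = parse_obx("OBX|1|NM|2951-2^Serum Sodium Concentration^LN|1|138|mmol/L|||")
print(obx["code"], obx["coding_system"], obx["value"], obx["units"])
```

A receiver that recognizes `LN` as the coding system can interpret `2951-2` without knowing the sender's local test dictionary, which is the point of mapping local codes to LOINC.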

The LOINC committee began to develop a universal vocabulary for reporting laboratory and clinical observations in February of 1994. It released the first version of LOINC codes in the spring of 1995 with about 6,000 laboratory test result codes (1, 7). The LOINC committee releases an updated version of the terminology twice each year. The current LOINC release (version 2.30, Feb 2010) contains 57,693 active codes, including both laboratory and clinical observation codes.

2. Current use of LOINC codes

Currently, LOINC is widely used in many organizations, including major laboratories (e.g. ARUP, Quest and LabCorp), hospitals, public health departments, health care provider networks (e.g. Indiana Network for Patient Care, INPC) (8), and insurance companies (e.g. United Healthcare) (9). The National Electronic Disease Surveillance System (NEDSS) of Centers for Disease Control and Prevention (CDC) of the United States recommends using HL7 messages with LOINC codes to submit electronic laboratory reporting and surveillance data to federal agencies and departments (10). Many studies have also evaluated how well LOINC has been applied to specific domains, such as nursing documents and standardized assessment measures and clinical data in hospital information systems (HIS) (6-8). Dugas et al. analyzed the coverage of LOINC codes for document types in a German HIS, and reported that more than 93% of the local HIS documents and local document types could be assigned a LOINC code (11).

3. Evaluating Terminological Systems

Terminological systems (TSs) can be evaluated from two main perspectives: 1) the content independent perspective and 2) the content dependent perspective (12, 13). The “content independent” approach mainly discusses the requirements of terminology systems from a functional, structural, and policy perspective. Examples of content independent requirements include James Cimino’s desiderata for controlled medical vocabularies (14), and the technical specification “Health informatics – Controlled health terminology – Structure and high-level indicators” published by the International Organization for Standardization (ISO) (15). The “content dependent” approach mainly evaluates the use of terminology systems in specific domains. Examples of content dependent investigations include the evaluation of the coverage of the Unified Medical Language System (UMLS) for coding of concepts in the Gene Ontology (GO) (16), the evaluation of coding consistency of the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) in reporting rare diseases (17), and the analysis of the coding consistency of LOINC in three hospitals (2).

Using the content dependent approach to analyze the coverage of TSs, Cornet et al. defined two types of coverage: 1) concept type coverage, the number of concepts in a collection of concepts (e.g. result descriptions in a laboratory test catalogue or dictionary) that can be mapped to concepts in a standard terminology; and 2) concept token coverage, the volume of data instances covered by concepts in a standard terminology. For example, if 10 instances (tokens) of hematocrit results are sent on an interface, all 10 instances are covered by the existence of a single hematocrit test code in the standard terminology. Concept token coverage means the percentage of laboratory test instances that have mappings in the standard terminology (Table I) (13). “Concept type coverage” is calculated by dividing the number of local codes that have been mapped to the reference terminology (i.e. concepts mapped to LOINC in the current study) by the total number of unique local codes. “Concept token coverage” is calculated from instances of laboratory results: it is the number of laboratory test instances whose code has been mapped to the reference terminology divided by the total number of test instances. Compared to concept type coverage, concept token coverage better reflects what percentage of the total volume of laboratory tests has a LOINC mapping in daily use.

Table I
The definition of concept type coverage and concept token coverage as used in this article
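The two definitions can be illustrated with a minimal Python sketch. The function name, the local codes, and the toy result stream below are invented for illustration; only the two ratios follow the definitions given in the text.

```python
from collections import Counter

def coverage(result_codes, loinc_map):
    """Return (concept type coverage, concept token coverage).

    result_codes: iterable of local codes, one entry per test result instance.
    loinc_map: dict of local code -> LOINC code, containing mapped codes only.
    """
    counts = Counter(result_codes)
    mapped = [code for code in counts if code in loinc_map]
    type_cov = len(mapped) / len(counts)
    token_cov = sum(counts[code] for code in mapped) / sum(counts.values())
    return type_cov, token_cov

# Toy data: ten sodium results plus one instance each of two unmapped local codes.
results = ["NA"] * 10 + ["BILL_FLAG", "SEE_NOTE"]
mapping = {"NA": "2951-2"}  # LOINC code taken from the OBX example in the text

type_cov, token_cov = coverage(results, mapping)
print(f"type coverage = {type_cov:.2f}, token coverage = {token_cov:.2f}")
```

In this toy stream only one of three local codes is mapped, yet that code accounts for ten of twelve result instances, so token coverage is much higher than type coverage.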

4. Previous reports on LOINC mapping

Two large institutions (3, 4) have reported their LOINC mapping experiences. The common findings from these reports are:

  1. The current LOINC database is not yet comprehensive: The LOINC database is still under active development and the number of LOINC codes increased from about 6,300 to 53,000 between 1996 and 2009. Dugas et al. reported that when using the Regenstrief LOINC Mapping Assistant (RELMA®) the LOINC coverage for their hospital information system concepts increased from 77% to 93% between version 3.23 and 3.24 of RELMA (11). The LOINC committee recommends that any missing concepts be submitted to the LOINC committee for creation of new LOINC codes.
  2. The frequency distribution of mapped local codes is highly skewed: concept type coverage was 46% and concept token coverage was 89.9% in the Department of Defense LOINC mapping project (4). High volume tests are mapped more often than infrequent tests.
  3. It is probably not appropriate to assign LOINC codes to all local codes: Some local codes do not carry any clinical information, e.g. an internal “Billed” flag, and would not normally be exchanged between institutions. Also, local systems sometimes represent their content in ways that do not conform to HL7 best practices or to the LOINC model, e.g. “See Note”, “See Chart” or multiple narrative text results in a field where a single code was expected (3, 4). Local codes that violate the fundamental principles of unambiguous data exchange would also not be assigned LOINC codes.

Methods

1. Data sources

The official LOINC database is stored in Microsoft Access™ 2003 format. We retrieved two fields, “date last changed (Add)” and “class types (laboratory class or clinical class)”, of data from the LOINC database between April 1995 and April 2008. The numbers of laboratory and clinical observation codes were catalogued in order to observe the increase in the number of LOINC codes over time.

After obtaining IRB approval, de-identified patient data were collected from three institutions: 1) Associated Regional and University Pathologists, ARUP Laboratories (Salt Lake City, UT), 2) Intermountain Healthcare (Salt Lake City, UT), and 3) Regenstrief Institute, Inc. (Indianapolis, IN). ARUP Laboratories is a national clinical and anatomic pathology reference laboratory and is owned and operated by the Pathology Department of the University of Utah. Intermountain Healthcare is a not-for-profit health care provider organization, with hospitals located in many major cities in Utah. Regenstrief Institute, Inc., is an informatics and health care research organization located on the campus of the Indiana University School of Medicine in Indianapolis.

These three large institutions were founding members of the LOINC committee and have contributed terms and concepts to the LOINC coding system (7). These institutions represent quite different types of health care organizations. ARUP is a reference laboratory that receives samples from hundreds of clients. Intermountain is a health care provider organization that sends laboratory orders and samples to several different laboratories. Regenstrief is a health care research organization that convened and operates a regional health information exchange called the Indiana Network for Patient Care (INPC). Though ARUP and Intermountain have a similar geographical location, they did not share their resources or dictionaries while performing LOINC mappings. Each of the institutions performed its mappings using internal staff rather than commercial coding service companies. Their experiences provide three independent perspectives of LOINC mapping and usage.

2. Data scope

This research focused on mappings related to laboratory LOINC codes. We chose laboratory test results because laboratory data is one of the most important kinds of data in the medical record and it has been mapped to LOINC codes more frequently than any other kind of data.

At ARUP and Intermountain, the de-identified patient data were collected for the month of April for five consecutive years (each April, from 2003-2007).

The data from Regenstrief came from the INPC, which presently includes data from more than two hundred source systems and eighteen different health systems. Regenstrief maps local system observation codes to terms in the INPC master dictionary, whose terms are also mapped to LOINC (3). De-identified patient data for a 13 month period (August, 2007 – August 2008) and the mappings of local codes to LOINC codes (via the INPC master dictionary terms) were extracted from the five founding INPC health systems.

In these three institutions, the mappings were done incrementally and stored in reference tables, which only contain the mappings between local codes and LOINC codes. The version of the LOINC database used and the timestamps of the mappings were not available in these three institutions.

3. Data collection and processing

The patient data were retrieved by administrative staff at each institution. Each individual test result included the following database elements: 1. Event ID 2. Observation ID (Local code) 3. Observation Description. No identifying information was included. To transform different formats of patient data of each institution to a common format, individual parsing programs were customized for each institution to generate standardized comma separated values (CSV) files (Fig. 1). LOINC mappings for local codes were added as a new column in the CSV files, with the LOINC mappings being provided from the reference file supplied by each institution. The CSV files were then scanned to calculate the following numbers: 1) numbers of unique local codes, 2) numbers of unique local codes having a LOINC code mapping, 3) total numbers of event IDs for each local code, and 4) total numbers of event IDs of each local code that was mapped to a LOINC code. Parsing programs were executed at each institution for processing patient data and only final statistical data was sent to the authors for analysis. After obtaining the primitive data as described above, concept type coverage and concept token coverage were calculated. In order to determine if the locally mapped tests were the most frequently resulted tests, cumulative concept token coverage of mapped and unmapped tests were calculated taking into consideration the frequency of the test.

Fig. 1
The steps in data processing. The patient data as initially stored in the source institutions in various formats, with data being stored in an Enterprise Data Warehouse, comma separated values (CSV) files, or HL7 messages. The data was transformed into ...
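The counting step and the cumulative concept token coverage described above can be sketched in Python as follows. This is not the parsing code used in the study: the column names, the sample rows, and the function names are assumptions made for illustration, since the actual CSV layout is not given beyond the listed elements.

```python
import csv
import io
from collections import Counter
from itertools import accumulate

# Hypothetical header and rows; an empty loinc_code means "not mapped".
SAMPLE_CSV = """event_id,local_code,description,loinc_code
1,NA,Sodium,2951-2
2,NA,Sodium,2951-2
3,NA,Sodium,2951-2
4,XYZ,Local test without mapping,
5,RPT,Report status,
"""

def tally(csv_text):
    """Count result instances per local code and collect the mapped codes."""
    events = Counter()
    mapped = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        events[row["local_code"]] += 1
        if row["loinc_code"]:
            mapped.add(row["local_code"])
    return events, mapped

def cumulative_token_coverage(events, codes):
    """Cumulative share of all result instances, most frequent code first."""
    total = sum(events.values())
    ranked = sorted(codes, key=lambda c: (-events[c], c))
    running = accumulate(events[c] for c in ranked)
    return [(code, n / total) for code, n in zip(ranked, running)]

events, mapped = tally(SAMPLE_CSV)
unmapped = set(events) - mapped
print(len(events), len(mapped))  # unique local codes, unique mapped local codes
print(cumulative_token_coverage(events, mapped))
print(cumulative_token_coverage(events, unmapped))
```

Ranking codes by frequency before accumulating is what lets the curve show whether the most frequently resulted tests are also the ones that were mapped.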

4. Manual review of unmapped codes

We wanted to estimate the number of unmapped local codes that could theoretically be mapped to LOINC codes by expert manual review of a sample of unmapped local codes.

We used Version 2.22 (Released 12/03/2007) of the LOINC database as the target for mapping. To review the unmapped local codes, a ten percent sample (concept type coverage) of all local codes from each institution was generated and the identical sample was given to two reviewers for manual mapping. After manual mapping, reviewers rated results in two categories: 1) “Yes” - locally unmapped codes could be mapped manually, and 2) “No” - locally unmapped codes could not be mapped manually. To evaluate the inter-rater agreement between the two reviewers, the reviewed results were analyzed by using Fleiss’ kappa (18), which can handle fixed numbers of reviewers and categorical ratings. Disagreements between the first two experts were reviewed by a third expert to establish the gold standard. Also, each unmapped code was grouped into one of five categories according to the possible reason that the local code was not mapped: 1) no analyte – no suitable analyte was found in LOINC, 2) ambiguous meaning – the meaning of the local code was not clear and could not be determined from the information available to the reviewer, 3) internal use only – the local code may represent internal laboratory processing status rather than patient data, 4) overly specific methods – the local test name may have an overly specific measurement method and 5) narrative results – the local code may represent a comment that is context specific to a single result. After assigning categories to each code, we calculated concept type coverage and concept token coverage for each category of unmapped codes.
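For readers unfamiliar with the agreement statistic, the following Python sketch computes Fleiss' kappa from per-item ratings using the standard formula (observed agreement minus chance agreement, divided by one minus chance agreement). The ratings shown are invented for illustration and are not the study data; the function name and argument layout are likewise assumptions.

```python
def fleiss_kappa(ratings, categories=("Yes", "No")):
    """Fleiss' kappa for a list of per-item rating tuples.

    Every item must be rated by the same number of raters (two in this study).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of raters who placed item i in category j
    counts = [[item.count(cat) for cat in categories] for item in ratings]
    # Observed agreement for each item, then averaged over items
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_item) / n_items
    # Agreement expected by chance from the overall category proportions
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)

# Invented example: two reviewers agree on 9 of 10 codes.
ratings = [("Yes", "Yes")] * 4 + [("No", "No")] * 5 + [("Yes", "No")]
print(round(fleiss_kappa(ratings), 3))
```

Because kappa corrects for agreement expected by chance, it is lower than the raw 90% agreement in this toy example.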

After manual review, we recalculated concept type coverage and concept token coverage by two approaches: 1) Adding all newly mapped local codes from the manual review sample to the original mapped local codes: This approach addresses the question of the extent to which current local codes can be mapped to LOINC codes by expert manual review. 2) Excluding two types of local codes (“internal use only” and “narrative result”), where assigning LOINC codes is not needed for clinical data exchange. This approach can reveal how well LOINC codes cover just the set of concepts that are useful for clinical data exchange.
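The second approach can be sketched in Python as follows, assuming each local code carries its instance count, a mapped flag, and the reviewer-assigned category. The record layout, the function names, and the numbers are invented for illustration and do not reproduce the study data.

```python
EXCLUDED = {"internal use only", "narrative results"}

def coverage(codes):
    """Return (concept type coverage, concept token coverage) for code records."""
    mapped = [c for c in codes if c["mapped"]]
    type_cov = len(mapped) / len(codes)
    token_cov = sum(c["count"] for c in mapped) / sum(c["count"] for c in codes)
    return type_cov, token_cov

def drop_excluded(codes):
    """Remove unmapped codes whose category is not needed for data exchange."""
    return [c for c in codes if c["mapped"] or c["category"] not in EXCLUDED]

# Invented records: two mapped codes and three unmapped codes.
codes = [
    {"code": "NA",  "count": 80, "mapped": True,  "category": None},
    {"code": "K",   "count": 15, "mapped": True,  "category": None},
    {"code": "RPT", "count": 3,  "mapped": False, "category": "internal use only"},
    {"code": "CMT", "count": 1,  "mapped": False, "category": "narrative results"},
    {"code": "NEW", "count": 1,  "mapped": False, "category": "no analyte"},
]

before = coverage(codes)
after = coverage(drop_excluded(codes))
print(before, after)
```

Dropping the excluded codes shrinks only the denominators, so both coverage figures can rise without any new mapping being made; the code in the "no analyte" category remains in the denominator because it is still relevant for data exchange.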

Results

1. The growth of local codes and LOINC codes

Since May 1998, the number of LOINC codes has grown steadily from 15,464 to 53,345 and the majority of LOINC codes are laboratory terms (Fig. 2). At the same time, the number of local codes has also increased continuously. In 2003, at Intermountain, there were 1,409 local codes which were mapped to 1,092 LOINC codes; in 2007, there were 1,667 local codes mapped to 1,302 LOINC codes (Fig. 3).

Fig. 2
The number of LOINC codes over time (May 1998 – Jan 2009)
Fig. 3
The number of local codes and LOINC codes used at ARUP and Intermountain (every April, 2003 - 2007)

2. The cumulative concept token coverage of mapped and unmapped tests

Fig. 4 shows the percent cumulative concept token coverage of mapped and unmapped tests at each institution in 2007. More than 70% of concept token coverage was accounted for by 200 locally mapped tests at Intermountain and Regenstrief.

Fig. 4
The cumulative percentage of concept token coverage of mapped and unmapped tests at Intermountain, ARUP and Regenstrief (*) in 2007. The three solid lines represent the cumulative concept token coverage of mapped tests and the three dotted lines represent ...

3. The concept type coverage and concept token coverage before and after manual review

Agreement between the two reviewers was calculated by using Fleiss’ kappa. The kappa value was 0.92, which is interpreted as “almost perfect agreement” (19). Disagreements were reviewed by a third expert to generate the gold standard.

The numbers (concept type) of local codes in the data sets from ARUP, Intermountain and Regenstrief were 4,321, 1,667, and 7,387 respectively (Table II). Before sampling for manual review of unmapped codes, the concept type coverage and concept token coverage were 0.44/0.59, 0.78/0.78 and 0.79/0.88 for ARUP, Intermountain, and Regenstrief respectively.

Table II
The level of local mappings from each institution. The data sets of Regenstrief consist of local codes collected from five institutions. The numbers (concept type) from the individual institutions are: 1,311, 1,176, 1,471, 1,187 and 2,242

The one-tenth samples of these data sets contained 432, 167, and 739 codes, respectively (Table III). An attempt was made to manually map all unmapped codes from the samples. After adding the new mappings to the originally mapped codes, concept type coverage and concept token coverage were 0.63/0.72, 0.83/0.80 and 0.88/0.90 respectively (Table IV).

Table III
The results of mappings before and after manual review of unmapped codes at each institution. After review, the number of new mappings found were 91, 8, and 75 respectively
Table IV
The percentage of local codes that had LOINC mappings in the original submissions and after manual mapping and review

4. The analysis of mapped and unmapped codes after review

Fig. 5 shows the frequency of initially unmapped local codes which could be mapped after manual review. The most frequently mapped and unmapped codes were listed and ordered based on their frequency in instance data (Tables V and VI). After categorizing unmapped codes into the five categories of unmapped reasons, concept type coverage and concept token coverage for all unmapped codes in each category were calculated (Table VII). The largest concept token coverage (0.64 and 0.92) of unmapped codes at Intermountain and Regenstrief was due to “narrative result”, e.g. “Comments Result, Qualitative for GFR”, “Interp Gliadin/Gluten IgA”; at ARUP the largest concept token coverage of unmapped codes (0.57) was due to “no analyte”, e.g. “NB C12-OH”. Across the three institutions, “internal use only”, e.g. “Report Status, Qualitative”, is a common reason for unmapped codes. After excluding two types of local codes (“narrative results” and “internal use only”) from the dataset, concept type coverage and concept token coverage were 0.73/0.79, 0.90/0.99 and 0.93/0.997 respectively (Table IV).

Fig. 5
The histogram of concept token coverage of originally unmapped codes which were manually mapped to LOINC at ARUP. The frequency is normalized by the biggest frequency of the test (NB Glycine).
Table V
The top 10 newly mapped local terms after manual review are listed by their ranks (based on use in instances of data) in the three institutions. In the Intermountain sample, the number of mapped codes is less than 10
Table VI
A sample of unmapped concepts showing the categorization of reasons that the codes were not mapped. There are five categories: 1) A- no analyte, 2) M – meaning is not clear, 3) I – internal use, 4) O – overly specific method and ...
Table VII
The concept type coverage and concept token coverage of unmapped codes in each category. A- no analyte, M – meaning is not clear, I – internal use, O –overly specific method and N – narrative result. The bold number indicates ...

Discussion

1. Local mapping is incomplete

After manual review, concept type coverage increased from 0.44 to 0.63, 0.78 to 0.83 and 0.79 to 0.88 at ARUP, Intermountain and Regenstrief respectively, which means the local mappings were incomplete at each institution. Some possible reasons are: 1) Mapping is a labor intensive job, so it is not performed on all local codes. Fig. 4 also shows that frequent tests are more commonly mapped. 2) New local codes and LOINC codes continue to be created and the mapping process does not keep up. It is hard to keep local mappings up to date with the latest LOINC version. 3) Not everyone is using LOINC codes to exchange data yet, therefore there is no urgency to do the LOINC mappings. Although concept type coverage is not 100% yet, these institutions can still report patient data using internal codes.

2. Not all local codes should be assigned a LOINC code

Assigning LOINC codes to local codes like “narrative results” does not help create interoperable data exchange. For example, local observations like “Seq. HLA-B Interp” and “DOCTOR REVIEW - PT PCR” usually have values that are comments or directions to a human reader like “See Note” or “See Chart”. LOINC is designed to carry clinical data using the EAV strategy, but narrative results sometimes contain a mix of different kinds of information: analyte names, actions, people’s names, and date and time information. A real example of a narrative result is, “Colony Bacillus species. Results called to and read back by John 10/02/2008 14:41:56”. This result value does not follow the EAV style, so it is probably not useful to try to assign LOINC codes that could capture the context of this statement. These kinds of local codes carry important information, but it can only be read and understood by human users. A better strategy is to break the information into discrete data elements so it can be used by automated decision support processes. Terminologists and system developers should avoid using narrative text to encode clinical data for medical exchange and follow the style of discrete EAV data (20).

Assigning LOINC codes to “internal use” codes like “RETICRTR BILL”, which has values of “Billed” and “Confirmed”, would not typically be useful for inter-enterprise data exchange because such codes do not carry any clinical data.

At Intermountain and Regenstrief, the main two reasons for unmapped codes are “narrative results” and “internal use only”. Assuming that these local codes are not appropriate for inter-enterprise data exchange, a flag could be added to the lab reference table to indicate a “Do not map” status for those items (3, 4). After excluding “narrative” and “internal use” codes, coverage increased to 0.73/0.79, 0.90/0.99 and 0.93/0.997 respectively. At Intermountain and Regenstrief, the current LOINC database contains codes that could cover about 99% of volume of laboratory tests. New LOINC codes will need to be created for ARUP content if concept token coverage for ARUP is to reach the same level of coverage as currently exists for Regenstrief and Intermountain.

3. Creation of new LOINC codes

The unmapped local codes in the “no analyte” category should be submitted to the LOINC committee for the creation of new LOINC codes. The unmapped tests which are due to “overly specific method”, e.g. “HLA-DR DQ Hi Res Amp2” or “HLA-DR DQ Hi Res Amp1” pose a different problem. These local codes include very specific information about the method. We would propose that if it is desirable to include highly specific method information with the patient result, then the method be sent as coded data in a special “method type” field in the result message, rather than pre-coordinating the method name into the test code. We also noted inconsistency across institutions regarding specificity of mappings as they relate to methods. It appears that sometimes mappers link the method specific codes to a more general LOINC code, and at other times they link to a method specific LOINC code. This causes inconsistency in mappings across institutions. A comprehensive analysis of these inconsistencies is beyond the scope of this paper, but we would like to examine this issue in future work.

The current process of submitting requests for new LOINC codes asks users to provide information for the 5 primary axes of the LOINC code definition (21). However, the creation of local codes is often a separate process from mapping to LOINC codes or submitting requests for new LOINC codes, and different people are usually responsible for these separate activities. Therefore, gathering the information needed to submit new local codes for the assignment of LOINC codes often requires extra effort, and people do not always make that effort. Regenstrief has deployed an Exception Browser (3) to monitor all of the INPC data streams. If there is a new local code which cannot be found in the master dictionary, the Exception Browser generates an exception that requires further action by a human to deal with the new code. Staff can either request new LOINC codes or make a notation in the mapping file that the new local code is to be ignored. This kind of automation can facilitate the appropriate creation of new LOINC codes.
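The general idea of flagging unknown local codes for human review can be sketched in Python as below. This is not Regenstrief's Exception Browser, whose implementation is not described here; the function name, the dictionary layout, and the sample codes are hypothetical.

```python
def find_exceptions(incoming_codes, master_dictionary, ignored):
    """Return new local codes that are neither mapped nor marked as ignored.

    incoming_codes: local codes seen on a data stream, in arrival order.
    master_dictionary: dict of known local code -> dictionary term or LOINC code.
    ignored: set of local codes a reviewer has marked as "do not map".
    """
    seen = set()
    exceptions = []
    for code in incoming_codes:
        if code in master_dictionary or code in ignored or code in seen:
            continue
        seen.add(code)          # report each unknown code only once
        exceptions.append(code)
    return exceptions

stream = ["NA", "NEW1", "BILL", "NEW1", "NEW2"]
queue = find_exceptions(stream, {"NA": "2951-2"}, {"BILL"})
print(queue)
```

Each code in the resulting queue would then be resolved by a person, either by requesting a new LOINC code or by adding the code to the ignored set.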

4. Version control of LOINC mappings

The version of the LOINC database used for mapping was not available from the three institutions. Newer versions of the LOINC database have the possibility of affecting the calculation of concept type coverage and concept token coverage following manual review of initially unmapped local codes. Because the new database has more codes, it could be that an unmapped code can now be mapped whereas at the time of initial mapping no matching concept existed in the older version of the LOINC database. Use of the newer version of the LOINC database could change the number of unmapped local codes in the “no analyte” and of the “overly specific method” categories, but these changes would only make small differences in our overall statistics. Our goal was to estimate the maximal level of LOINC mapping that could reasonably be achieved, and we believe our method leads to a good estimate of the maximum mapping that can be achieved in the current database.

5. The frequency distribution of local codes that are mapped to LOINC is highly skewed

In a previous study of INPC laboratory data, it was concluded that 244 to 517 local codes represented 99% of the volume at each institution and that 97 local codes were common to all five institutions (22). This conclusion coincides with our observation that only a small number of tests account for a large portion of the volume at Intermountain and Regenstrief, and that about 200 locally mapped tests account for more than 70% of test volume. At ARUP, it takes a larger number of tests to account for the same total volume. A possible reason is that Intermountain and Regenstrief, which are general health care provider organizations, use more common tests, e.g. general biochemistry, but ARUP, which is a reference laboratory, has a greater preponderance of rare tests, e.g. allergen tests, as compared to the other two institutions. Based on these observations, we would predict that at general health care organizations, mapping a relatively small number of tests (fewer than 500) will cover a large volume of the common laboratory tests. Since concept token coverage is higher than concept type coverage, we can infer that on average mapped local codes occur more often in instances of patient data than unmapped local codes. To extend this research, we plan to pool all frequent tests and their LOINC mappings from the reference tables of each institution to generate a master index file containing the most frequent local codes and their mappings. This file could then be used by institutions as they begin to map their local codes, and they would initially only need to map the codes which are listed in the master index file. They should be able to reach a high concept token coverage without spending a lot of time mapping all local codes (22).

Limitations

The three organizations examined in this study have been intimately involved in LOINC development, and they may be more likely to have local names that match LOINC content and have a better understanding of how to do LOINC mappings. Thus, the three institutions are not representative of institutions in the US or worldwide. The implication is that the percent of locally mapped local codes and the coverage of local codes in these three institutions is probably higher than would be expected in other institutions. Finally, we did not verify the accuracy and consistency of the mappings of local codes to LOINC codes in this phase of our research, and more work is needed to gain insight into these aspects of mapping across institutions.

Conclusions

The number of local codes and LOINC codes continues to grow, which means that each institution needs a process to maintain its local LOINC mappings. For general health care providers, concept token coverage can reach about 99% once local codes that are not useful for inter-institutional data exchange are excluded. The reference laboratory has a greater number of rare tests, which will require the creation of new LOINC codes to reach the same level of concept token coverage. Our research also supports the conclusion of others that not all local codes should be assigned LOINC codes. There should be public discussions about how laboratory processes could be further standardized so that the results produced are more consistent and interoperable, and about more precise rules for when LOINC codes should be assigned. Extending this research to examine the consistency and accuracy of local mappings across institutions will be an important next step in evaluating whether LOINC is meeting its goal of being a universal coding system for observation identifiers.

Acknowledgements

The authors would like to thank Brian Jackson, MD, and Alan Terry, MS, from ARUP for retrieving and processing the ARUP data set for this research, and Chia-Cheng Lee, MD, from the Biomedical Informatics Department of the University of Utah for reviewing the sample data.

References

1. Huff SM, Rocha RA, McDonald CJ, De Moor GJ, Fiers T, Bidgood WD, Jr., et al. Development of the Logical Observation Identifier Names and Codes (LOINC) vocabulary. J Am Med Inform Assoc. 1998 May-Jun;5(3):276–92.
2. Baorto DM, Cimino JJ, Parvin CA, Kahn MG. Combining laboratory data sets from multiple institutions using the logical observation identifier names and codes (LOINC). Int J Med Inform. 1998 Jul;51(1):29–37.
3. Vreeman DJ, Stark M, Tomashefski GL, Phillips DR, Dexter PR. Embracing change in a health information exchange. AMIA Annu Symp Proc. 2008. pp. 768–72.
4. Lau LM, Banning PD, Monson K, Knight E, Wilson PS, Shakib SC. Mapping Department of Defense laboratory results to Logical Observation Identifiers Names and Codes (LOINC). AMIA Annu Symp Proc. 2005. pp. 430–4.
5. Health Level Seven International. Reference Information Model. Health Level Seven, International; 2009. Available from: http://www.hl7.org.
6. Nadkarni PM, Marenco L, Chen R, Skoufos E, Shepherd G, Miller P. Organization of heterogeneous scientific data using the EAV/CR representation. J Am Med Inform Assoc. 1999 Nov-Dec;6(6):478–93.
7. Forrey AW, McDonald CJ, DeMoor G, Huff SM, Leavelle D, Leland D, et al. Logical observation identifier names and codes (LOINC) database: a public use set of codes and names for electronic reporting of clinical laboratory test results. Clin Chem. 1996 Jan;42(1):81–90.
8. McDonald CJ, Overhage JM, Barnes M, Schadow G, Blevins L, Dexter PR, et al. The Indiana network for patient care: a working local health information infrastructure. An example of a working infrastructure collaboration that links data from five health systems and hundreds of millions of entries. Health Aff (Millwood). 2005 Sep-Oct;24(5):1214–20.
9. McDonald C, Huff S, Suico J, Hill G, Leavelle D, Aller R, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clin Chem. 2003;49(4):624.
10. Potential effects of electronic laboratory reporting on improving timeliness of infectious disease notification--Florida, 2002-2006. MMWR Morb Mortal Wkly Rep. 2008 Dec 12;57(49):1325–8.
11. Dugas M, Thun S, Frankewitsch T, Heitmann KU. LOINC codes for hospital information systems documents: a case study. J Am Med Inform Assoc. 2009 May-Jun;16(3):400–3.
12. Arts DG, Cornet R, de Jonge E, de Keizer NF. Methods for evaluation of medical terminological systems -- a literature review and a case study. Methods Inf Med. 2005;44(5):616–25.
13. Cornet R, de Keizer NF, Abu-Hanna A. A framework for characterizing terminological systems. Methods Inf Med. 2006;45(3):253–66.
14. Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods Inf Med. 1998 Nov;37(4-5):394–403.
15. ISO/TC215. Health informatics -- Controlled health terminology -- Structure and high-level indicators. 2002. Report No.: 17117.
16. Bodenreider O, Mitchell JA, McCray AT. Evaluation of the UMLS as a terminology and knowledge resource for biomedical informatics. Proc AMIA Symp. 2002. pp. 61–5.
17. Andrews JE, Richesson RL, Krischer J. Variation of SNOMED CT coding of clinical research concepts among coding experts. J Am Med Inform Assoc. 2007 Jul-Aug;14(4):497–506.
18. Fleiss J. Measuring nominal scale agreement among many raters. Psychological Bulletin. 1971;76(5):378–82.
19. Landis J, Koch G. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
20. Coyle JF, Mori AR, Huff SM. Standards for detailed clinical models as the basis for medical data exchange and decision support. Int J Med Inform. 2003 Mar;69(2-3):157–74.
21. LOINC Committee. LOINC Submissions. Indianapolis, IN: 2008. Updated 2008 Dec 23. Available from: http://loinc.org/submissions.
22. Vreeman DJ, Finnell JT, Overhage JM. A rationale for parsimonious laboratory term mapping by frequency. AMIA Annu Symp Proc. 2007. pp. 771–5.