SLE is known to occur in families more frequently than in the general population; 5–10% of SLE patients have a second family member with SLE [1] and the sibling recurrence rate (prevalence in siblings of SLE-affected individuals/population prevalence) is estimated to be between 8 and 30% [2]. The monozygous twin concordance rate of SLE is 24%, with a lifetime concordance rate of ~30% [3]. Collectively, these observations strongly suggest that an important understanding of the pathogenesis of SLE could be initiated through the discovery of genes that increase familial susceptibility to the disease.
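The recurrence measure defined in parentheses above is conventionally written as the sibling recurrence risk ratio. As a brief sketch, using K_s and K as assumed symbols (they do not appear in the original text) for the prevalence among siblings of affected individuals and the prevalence in the general population:

```latex
\lambda_s = \frac{K_s}{K}
```

A value of \lambda_s = 1 would indicate no familial aggregation; values well above 1 are the kind of observation that motivates the search for susceptibility genes.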
In the early 1990s, Dr John Harley, at the Oklahoma Medical Research Foundation (OMRF), began assembling multiplex families for SLE, and caught the attention of the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) at the American College of Rheumatology (ACR) annual meeting in the fall of 1993. This was the conceptual birthplace of what became the Lupus Multiplex Registry and Repository (LMRR) and led to contractual funding from NIAMS in the fall of 1995.
Establishing a formal registry led to an increased effort in recruitment of multiplex families with SLE, along with more rigorous and detailed data collection and organization, eventually becoming a centralized repository of data, sera, plasma, DNA, transformed B-cell lines and peripheral blood mononuclear cells. The LMRR offered approved scientists access to more than 5000 data points for each sample, including clinical information, demographics, lupus serology and genotyping data from microsatellites, and later, single-nucleotide polymorphisms (SNPs).
Lupus is a disease of disparities: it is estimated that nine out of ten people diagnosed with lupus are women, and four of ten are African-Americans, who are at least three times more likely to have lupus than European-Americans of the same sex [6]. Recent data from admixture studies show that American-Indian ancestry is almost 8-fold more important in generating lupus than is European ancestry [7]. Finally, Asian ancestry also leads to a higher prevalence of lupus than European ancestry [8]. Knowing these disparities, the staff and leadership of the LFRR have consistently targeted recruitment efforts towards the populations that lupus affects the most, maintaining at least 40% minority enrollment since its inception.
A central focus of the repository since 1999 has been identifying the genes responsible for lupus in African-Americans. This effort was boosted by collaboration, in 2003, with Drs Gary Gilkeson and Diane Kamen at the Medical University of South Carolina (MUSC), who spearheaded the enrollment of SLE patients and controls from the Gullah population. The Gullah are a semi-isolated population residing in the Sea Islands along the South Carolina coast and adjacent inland communities. They have lived in this region since the early 1700s when they were transported from West Africa, and constitute a unique population with greater genetic homogeneity than most other African-American communities in the USA. The Gullah maintain stable family units, making assembly of a collection of pedigrees for the study of multigenic diseases like lupus possible.
The LFRR genetics approach has evolved over time; originally, it was focused on linkage in multiplex pedigrees and microsatellite genotyping. The technical revolution in SNP genotyping changed the focus to case–control studies for association, making genome-wide association studies (GWASs) feasible. The LFRR responded to this challenge by expanding recruitment to include simplex families, which are ideal for population-based case–control and family-based association studies.
By 2006, the expanded future goals of the LMRR led to active recruitment of both sporadic and familial SLE and the LMRR was renamed the Lupus Family Registry and Repository (LFRR). In addition, this change allowed for a greater variety in race, ethnicity, gender and age of participants diagnosed with SLE, leading to a collection of lupus patients that more closely reflects the natural distribution of the disease across those same variables.
Fig. 1 Types of pedigrees enrolled in LFRR. Multiplex: two or more family members affected with SLE participating; Familial: one participant affected with SLE who has or had family members with SLE, but the other SLE affected family members are currently unavailable ...
Fig. 2 Race and ethnicity of families enrolled in LFRR. (A) Number of individuals affected with SLE by race and ethnicity. (B) Number of pedigrees enrolled by race and ethnicity. The racial and ethnicity classification is based on the National Institutes of ...
Fig. 3 Timeline of LFRR. Work on assembling and distributing materials and data from families with one or more living members diagnosed with lupus started in 1991 and has led to a large number of discoveries and scientific publications, as well as the world’s ...
Fifteen years after its modest beginnings, the LFRR has enrolled 2618 individuals affected with SLE from 1954 pedigrees and 8157 controls, and holds information on 82 000 individuals and 35 million data points. The bio-bank contains more than 750 000 aliquots of biological samples and the genotyping core has generated more than 650 million genotypes. All of this has been translated into 104 peer-reviewed publications based on access to LFRR resources (see supplementary data available at Rheumatology Online) and has made a major contribution toward the transformation of our understanding of the genetics of lupus that, thanks partly to the LFRR, is well under way.
Nowadays, the processes of the LFRR are divided into multiple specialized areas as follows.
The LFRR recruiters receive information, with the permission of the prospective participant, concerning whether they are willing to participate and whether they might satisfy entry criteria. Collaborating investigators, the Centers for Medicare and Medicaid Services, physician referrals, recruitment events, and the LFRR Web site (www.lupus.omrf.org) are ascertainment sources. Each SLE patient who agrees to participate completes a screening interview. Recruiters score the interviews based on the scoring system developed to gauge the ACR classification criteria [10], enter the information into the database, and alert record reviewers of an interview awaiting evaluation. Recruiters proceed based on the record reviewer’s decision to enrol the participant, to place her/him into pending, or to remove her/him from further consideration.
Enrolment involves obtaining informed consent from the participant and participating family members and controls, providing paperwork and arranging phlebotomies. Weekly follow-up through phone calls, e-mails or letters is critical to ensure sample and data return. Participants who are hesitant about completing participation after enrolment are removed, but not before substantial recruiter effort to facilitate their initial decision to participate.
Potential participants placed into pending (whose eligibility is determined by reviewing medical records before obtaining a blood sample) are provided with an explanation of why they are not proceeding immediately to enrolment, as well as release of information (ROI) forms and the OMRF Notice of Privacy Practices. Recruiters contact pending participants monthly regarding the status of records acquisition and review. After record review, enrolment or removal procedures follow based on the decision of the reviewer.
All enrolled participants who complete participation receive a small reimbursement, a copy of their serology report (if desired) and a free subscription to the annual LFRR newsletter, the Lupus Linkage Newsletter.
Medical records acquisitions
The clinical data that are collected in the database are based on verifiable medical records. The records pertinent to the diagnosis and treatment of each affected participant are obtained by records acquisition specialists after the participant has sent a signed Health Insurance Portability and Accountability Act (HIPAA)-compliant release of information form along with a list of facilities that will provide the records. Each health care provider is contacted in writing, specifying the patient and time frame for records of interest and a description of the purpose of the request, and provided a copy of the signed ROI form. Collecting all the records often requires repeated contacts with the providers; the database helps track the requests and monitors ROI expiration dates. Minimizing the expenses of medical record retrieval is a continuing goal.
Medical record abstraction
The medical record reviewer (M.D., P.A.-C. or R.N. specifically trained in this area) analyses the interview from the recruiter to determine whether the prospective participant is likely to meet ACR classification criteria for SLE [11]. The medical records are screened by the reviewer and specific information is collected regarding which of the ACR classification criteria have been met and the earliest date that they were observed, as well as more than 200 additional laboratory and clinical data points that characterize the SLE manifestations and treatment of each affected participant.
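The abstraction step above (which criteria were met, and the earliest date each was observed) can be sketched as a small data structure. This is a minimal illustration, not the LFRR's actual scoring system: the class and field names are hypothetical, and it assumes the widely used 11-item ACR list with the conventional threshold of at least 4 of 11 criteria.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional

# Assumption: the 11-item ACR classification list for SLE.
ACR_CRITERIA = (
    "malar_rash", "discoid_rash", "photosensitivity", "oral_ulcers",
    "arthritis", "serositis", "renal_disorder", "neurologic_disorder",
    "hematologic_disorder", "immunologic_disorder", "antinuclear_antibody",
)
MIN_CRITERIA = 4  # Assumption: conventional 4-of-11 threshold.


@dataclass
class CriteriaAbstraction:
    """Hypothetical record of what a reviewer abstracts for one participant."""
    participant_id: str
    earliest_seen: Dict[str, date] = field(default_factory=dict)

    def record(self, criterion: str, observed: date) -> None:
        """Store an observation, keeping only the earliest date per criterion."""
        if criterion not in ACR_CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        previous = self.earliest_seen.get(criterion)
        if previous is None or observed < previous:
            self.earliest_seen[criterion] = observed

    def criteria_met(self) -> int:
        """Number of distinct criteria documented so far."""
        return len(self.earliest_seen)

    def meets_classification(self) -> bool:
        """True once at least MIN_CRITERIA distinct criteria are documented."""
        return self.criteria_met() >= MIN_CRITERIA

    def classification_date(self) -> Optional[date]:
        """Date the MIN_CRITERIA-th criterion was first documented, if any."""
        dates = sorted(self.earliest_seen.values())
        return dates[MIN_CRITERIA - 1] if len(dates) >= MIN_CRITERIA else None
```

Keeping only the earliest date per criterion means that records arriving out of order from different providers still yield the same result.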
Bio-specimen processing and storing
The biological samples of enrolled participants are usually received by overnight courier and immediately assigned a unique lupus genetics study barcode (LGScode) for sample and data-tracking purposes. The most commonly received bio-specimen is whole peripheral blood; occasionally, however, mouthwash samples or preserved tissues are the source of DNA. Up to 66.5 ml of blood are collected from female participants; one additional 9.5 ml blood tube is collected from affected male participants (for karyotyping). An aliquot of blood is sent to the Clinical Laboratory Improvement Amendments (CLIA)-approved Clinical Immunology Laboratory at OMRF, where a standardized set of serological tests is performed. These include an ANA test on HEp-2 cells (an indirect fluorescent antibody test); anti-dsDNA (with titre by immunofluorescence against Crithidia luciliae); antibodies to extractable nuclear antigens (ENA) (by precipitin) with detection of antibodies against Ro/SSA, La/SSB, Smith (Sm), nRNP, ribosomal P, PM-Scl, Jo-1, Mi-2, Scl-70 and unidentified precipitins; aPLs (by ELISA) comprising IgG, IgM and IgA cardiolipin antibodies; and in the case of affected participants, an additional aliquot of 100 µl is used for determination of the total haemolytic complement (CH-50).
The main components isolated from blood for the bio-specimen bank include serum, plasma, peripheral blood mononuclear cells (PBMCs) for EBV-transformed cell lines (EBVTLCs) and frozen specimens, and granulocytes for DNA isolation. Each of the components is stored in colour-coded aliquots of different volumes that allow for both efficient sample retrieval and minimization of freeze/thaw cycles. DNA is stored in 96-well deep-well storage plates at different dilutions to facilitate distribution to approved users (detailed handling protocols are available as supplementary data available at Rheumatology Online).
The physical specimens collected from each participant are divided into two sets and are housed in two independent buildings that are both physically separated by space and electrical control, and monitored by onsite and offsite systems. This arrangement allows for the availability of backup samples for each participant in case of a catastrophic failure at one location.
Data recording and quality control
The data collected from each participant include the interview, questionnaire, serology results from the OMRF Clinical Immunology Laboratory and medical record review. These data are entered into the database; the questionnaires are scanned and the serology is directly imported from the computer system of the OMRF Clinical Immunology Laboratory. The remaining information is manually entered by a specialized member of the LFRR staff and later checked for accuracy by a second member.
The informatics infrastructure is critical to operate, maintain and store data for a successful registry. Attention to day-to-day issues, including database schemas, storage requirements and informatics integration with enrolment workflows, is indispensable. The two central canons of the informatics approach to the challenges posed by the LFRR are that (i) the accurate identification of subjects and samples is paramount and (ii) we must automate every possible procedure and process.
Every participant and enrollee is related to a central database via a single barcode identification scheme, the LGScode, which is a six-digit identifier. The database server is a relational database, wherein different tables of data are interconnected or related to one another. The MySQL relational database management system has more than 200 tables and 1700 fields, which are available across the network to approved users. The main LFRR interface for staff contains a number of custom forms and code currently housed in a Microsoft Access front end.
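To illustrate how a single identifier can tie related tables together, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the MySQL server described above. The table and column names are hypothetical, and only the six-digit form of the LGScode is taken from the text.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical, related tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if asked
    conn.executescript("""
        CREATE TABLE participant (
            -- Database-level constraint: exactly six characters, all digits.
            lgs_code TEXT PRIMARY KEY
                CHECK (length(lgs_code) = 6 AND lgs_code NOT GLOB '*[^0-9]*'),
            affected INTEGER NOT NULL CHECK (affected IN (0, 1))
        );
        CREATE TABLE serology (
            id        INTEGER PRIMARY KEY,
            lgs_code  TEXT NOT NULL REFERENCES participant(lgs_code),
            test_name TEXT NOT NULL,
            result    TEXT NOT NULL
        );
    """)
    return conn


def serology_for(conn: sqlite3.Connection, lgs_code: str) -> list:
    """Return (test_name, result) rows for one participant via a join."""
    cursor = conn.execute(
        "SELECT s.test_name, s.result "
        "FROM serology AS s "
        "JOIN participant AS p ON p.lgs_code = s.lgs_code "
        "WHERE p.lgs_code = ? "
        "ORDER BY s.test_name",
        (lgs_code,),
    )
    return cursor.fetchall()
```

In this sketch the CHECK clause rejects malformed identifiers and the foreign key rejects results that point at a non-existent participant, so both errors are caught by the database itself rather than by application code.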
The LFRR inventory allows the accurate and precise location of the biological samples geographically in freezers by modelling them within the virtual world of the computer database. Each aliquot is linked back by its LGScode to its respective study subject and sample generation event, the precise coordinates of location in storage and any addition, removal or manipulation. The complete inventory of the registry can be appraised and followed in real time, down to the volume of each aliquot.
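The inventory model described above can be sketched as follows. This is an illustration under stated assumptions rather than the LFRR's implementation: the class names, the location fields and the use of microlitres are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Location:
    """Hypothetical storage coordinates for one aliquot."""
    building: str
    freezer: str
    rack: int
    box: int
    position: int


@dataclass
class Aliquot:
    """One stored tube, linked to its study subject by LGScode."""
    lgs_code: str
    component: str          # e.g. "serum", "plasma", "DNA"
    volume_ul: float
    location: Location
    history: List[Tuple[str, float]] = field(default_factory=list)

    def withdraw(self, amount_ul: float, reason: str) -> None:
        """Remove volume and log the event so the inventory stays auditable."""
        if amount_ul <= 0 or amount_ul > self.volume_ul:
            raise ValueError("withdrawal must be positive and not exceed stock")
        self.volume_ul -= amount_ul
        self.history.append((reason, -amount_ul))

    def move(self, new_location: Location) -> None:
        """Record a relocation; volume is unchanged."""
        self.location = new_location
        self.history.append(("moved", 0.0))


def remaining_volume(aliquots: Iterable[Aliquot], lgs_code: str,
                     component: str) -> float:
    """Total volume still in storage for one subject and component."""
    return sum(a.volume_ul for a in aliquots
               if a.lgs_code == lgs_code and a.component == component)
```

Because every change goes through a method that appends to the history, the current volume and location of each aliquot can always be reconciled against the log of events.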
The genotypic data collected by the LFRR are also processed through the informatics system, which has participated in the analysis of more than 650 000 SNP markers. More than 1 billion genotypes are stored and distributed by a Web-based interface for easy export.
Finally, the informatics team is also responsible for quality control and security of the data. The latest technology is applied to encryption systems, anti-virus screening, firewalls, secured data centre, software patches, multiple checks on data entry, software-level data validation and database-level data constraints.
The information technology (IT) team provides custom-made solutions for the needs of the LFRR staff. By utilizing open standards and mostly free or open source software, and by avoiding proprietary systems that hold the danger of locking the LFRR into a single vendor or system, we are able to grow and adapt as needed without unnecessary purchasing expenses (detailed information available as supplementary data, available at Rheumatology Online).
Institutional review board
Participant consent and privacy are the top priorities of the LFRR. The LFRR operates under the guidance of two institutional review boards (IRBs)—OMRF and the University of Oklahoma Health Sciences Center (OUHSC). The scientific activity requires a full-time IRB coordinator within LFRR, who responds to human subject and privacy-related questions and concerns from the LFRR recruiters, referral sites, users and collaborators. The coordinator oversees all communications between the LFRR and the IRBs, enforces compliance with human subject rules and regulations and supervises the successful completion of the Collaborative Institutional Training Initiative (CITI) Course for the Protection of Human Subjects in Research for all new employees.
All of the forms sent to participants, including the informed consent forms, the LFRR Web site text, advertising documents and revisions or modifications of any of these require IRB approval. An average of 45 new and revised proposals requiring IRB approval have been submitted every year in recent years. In addition, both IRBs require annual continuing review progress reports.
The addition of investigators at other institutions as LFRR affiliate sites also requires approval by the IRBs of the affiliate site and of OMRF and OUHSC; each site uses an informed consent document designed to satisfy the regulations of the IRBs of participating institutions.
Distribution of LFRR materials to external scientists
One of the main goals of the LFRR is to make the wealth of clinical and biological resources available to the relevant scientific community. It is important to note that the data and material of the LFRR are suitable for addressing some scientific problems but may not be a good resource for others; one of the major reasons for writing this article was to help interested scientists assess whether their scientific questions of interest can be addressed with the data or materials in the LFRR.
Step-by-step instructions for a serious scientist wanting LFRR data or material.
The application process to become an approved scientific user of the LFRR includes completing the application packet, agreement to the letter of understanding by the investigator and the responsible institutional official, and approval by the IRB of the requesting institution. This information is reviewed by the LFRR Scientific Advisory Committee (SAC), the OMRF Director of Research Administration and the NIAMS LFRR Program Officer. Approval is either granted or denied; often, additional information is required. (The application packet is available in the supplementary materials, available at Rheumatology Online, and at www.lupus.omrf.org/LFRRApp.html.)
The available data include a database release file, genotyping results from about 300 loci from pedigrees informative for linkage, and pedigree structure diagrams for all pedigrees except sporadic cases. The biological samples that can be requested include DNA, serum, plasma and primary cultures of transformed B lymphocytes, which are a renewable source of DNA.