1. Participating Centers
Twenty centers participated in the study. The majority of these centers represent secondary or tertiary referral centers across the USA with an established expertise in the diagnosis and management of pancreatic disorders. The University of Pittsburgh served as the coordinating center. The study was approved by the Institutional Review Boards at each participating center and all enrolled patients and controls signed an informed consent form prior to enrollment.
2. Subject Recruitment and Sample Collection
The study was designed to recruit consecutive patients with RAP or CP using broad but strict entry criteria and detailed phenotyping. Subjects fulfilling the entry criteria for RAP or CP at the participating sites were offered participation in the study by their physician or physician's associate. RAP was defined by the presence of two or more attacks of documented acute pancreatitis (AP) but with no imaging evidence of CP. AP was defined by clinical criteria (typical abdominal pain with elevation of pancreatic enzymes >3 times normal) or imaging evidence of AP. The primary entry criteria for CP were predetermined, definitive evidence of CP on imaging studies – endoscopic retrograde cholangiopancreatography (ERCP) using the Cambridge classification or CT scan [11], which was fulfilled in 447 (83%) patients. Enrollment was based on evidence of CP on histology (alone or in conjunction with magnetic resonance cholangiopancreatography [MRCP] or endoscopic ultrasound [EUS]) in 27 (5%), EUS alone in 37 (7%), MRCP (alone or with EUS or abdominal ultrasound [USG]) in 13 (2%). In nearly 3% (n = 16) of CP cases, details of imaging or histology were not available to the participating centers.
Pancreatitis subjects who agreed to participate in the study completed the patient questionnaire (see below) with the assistance of a trained study coordinator, and provided a blood sample (a minimum of two lavender top tubes for DNA and one serum separator tube, with additional samples collected depending on the local protocols). The study enrolled four types of controls. The primary controls included a spouse, plus any first-degree family members. The spouse controls were selected to broadly control for race, ethnicity, gender and environmental exposure, and the family members were included for potential genetic analysis. If a spouse was not available, an accompanying friend was invited to participate. Finally, centers were encouraged to recruit subjects without pancreatitis as unrelated controls. One of the investigators (M.E.M.) recruited 100 community controls over 50 years of age, with no history of pancreatitis, who were seen at a family practice for routine health maintenance or chronic medical problems. All control subjects completed the patient questionnaire and provided a blood sample.
Document and Specimen Management. The completed forms and supporting documentation (e.g. reports of imaging studies, pathology reports, hospital discharge summaries) were mailed to the central data site in Pittsburgh along with the blood samples by overnight courier. Upon arrival, each blood sample was assigned a unique laboratory number, with the linking codes kept in a secure computer and a paper copy locked in a metal filing cabinet within the coordinating center office which was also locked. Genomic DNA was initially extracted from whole blood using the Puregene System® (Gentra Systems, Minneapolis, Minn., USA) with later samples extracted using the Flexgene DNA® kit (Qiagen, Valencia, Calif., USA) with a modified manufacturer's buffy coat protocol. Serum was isolated from 6 ml of whole blood in serum-separator tubes. Serum and a whole-blood sample were stored in a freezer at –80°C.
Two comprehensive questionnaires per affected subject, one completed by the subject and the other by the subject's physician-investigator, were used for data collection using paper forms with spaces reserved for subject or physician comments (see below). The subject questionnaire was designed to collect information on demographics, family and personal history, environmental history including smoking and alcohol consumption, clinical questions relating to pancreatitis, medication use, disability, and quality of life (SF12). The physician questionnaire consisted of sections on AP, RAP and CP, working etiological diagnosis and documentation of risk factors, medical or surgical therapies the patient had undergone and a list of the patient's current medications. The questionnaire for control subjects was identical to the subject questionnaire. Physician-investigators did not complete a questionnaire for control subjects.
3. Affected Subject Questionnaire
a. Demographics and Family Medical Histories
The affected subject questionnaire was completed by all affected and control subjects. Demographic questions included age, sex, race, Ashkenazi Jewish ancestry, current and maximum height and weight, number of siblings and children. In the section on family and personal history, subjects were asked whether they had a personal or family history of a variety of diseases (acute, chronic or hereditary pancreatitis, cystic fibrosis and features of atypical cystic fibrosis (e.g. male infertility), liver disease including cirrhosis, gallstone disease or cholecystectomy, major systemic diseases (e.g. cardiovascular diseases, kidney disease) and cancers (pancreatic, breast, liver, colon, ovarian, endometrial, and others)). Information regarding family members was queried to allow for assessment of whether a particular condition was present, what relationship the affected person had with the subject (parents, siblings, etc.) and the total number of immediate family members (siblings, children) affected. A positive response was considered a ‘yes’ while non-response was considered a ‘no’. This outline allowed for the construction of a pedigree without collecting personal identifiers on subjects outside of the study without their permission.
b. Tobacco Smoking
Subjects were asked whether they ever smoked (ever smoker defined as a history of smoking ≥100 cigarettes during their lifetime), date of initiating smoking, date of smoking cessation (for former smokers) and the average number of cigarettes smoked in a day. Ever smokers were classified as current or former smokers.
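The smoking classification above can be sketched in a few lines of Python. This is only an illustration: the function name and arguments are hypothetical, and treating a recorded cessation date as the marker of a former smoker is an assumption about how the questionnaire responses were interpreted.

```python
from datetime import date
from typing import Optional

# Lifetime cigarette count that defines an "ever smoker" in the text.
EVER_SMOKER_THRESHOLD = 100


def smoking_status(lifetime_cigarettes: int, quit_date: Optional[date]) -> str:
    """Classify a subject as 'never', 'current' or 'former' smoker.

    Assumption: an ever smoker with a recorded cessation date is a former
    smoker; an ever smoker without one is a current smoker.
    """
    if lifetime_cigarettes < EVER_SMOKER_THRESHOLD:
        return "never"
    return "former" if quit_date is not None else "current"
```

For example, `smoking_status(5000, date(2001, 6, 1))` yields `"former"`, whereas the same subject without a cessation date is classified as `"current"`.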
c. Alcohol Consumption
A comprehensive alcohol intake history was obtained to assess lifetime use in terms of quantity, type, duration, and pattern and the relationship between alcohol use and the onset and progression of pancreatic disease. The presence of alcoholism was assessed using the TWEAK questions, alcohol consumption during the maximum lifetime drinking period, and the lifetime, age-dependent drinking patterns.
Subjects were asked whether they ever drank alcohol (‘no’ was defined as <20 drinks in a lifetime). If they answered ‘yes’, they were asked about drinking in the months before developing pancreatitis using the TWEAK questions (‘Hold’ version – table ), since this questionnaire has been validated in detecting ‘at-risk drinking’ in both men and women [12]. Subjects were further asked the age at which they began drinking at least one drink per month and the age at which they began drinking the most alcohol in their life. They were also asked about the average amount consumed on a drinking day during the maximum lifetime drinking period (which could be non-consecutive), the number of days per month that this amount was consumed, the largest number of drinks consumed on any one day, the type of alcohol (including number of drinks of beer, wine or mixed drinks) and the duration of alcohol consumption at this level.
TWEAK criteria used in the NAPS2 study to determine ‘at-risk’
The responses to these questions were used to determine the drinking categories and the total alcohol exposure during the maximum lifetime drinking period. The average weekly alcohol intake (quantity-frequency criteria, i.e. average amount consumed on a drinking day and the number of days per month that this amount was consumed) was used to stratify subjects into the following five drinking categories (drinks/week): Abstainers: no alcohol use or <20 drinks in lifetime; Light drinkers: ≤3 drinks/week; Moderate drinkers: 4–7 drinks/week for females; 4–14 drinks/week for males; Heavy drinkers: 8–34 drinks/week for females; 15–34 drinks/week for males; Very heavy drinkers: ≥35 drinks/week. These alcohol use categories are similar to the National Health Interview Survey (NHIS) [13] except that we used a different reference period (‘maximum lifetime drinking period’ rather than ‘the past 12 months’) and we subdivided the ‘Heavier’ NHIS category into ‘Heavy’ and ‘Very Heavy’ categories. 13 subjects (7 controls, 2 RAP, 4 CP) did not provide any information on drinking. In 42 (6%) controls, 32 (7%) RAP and 67 (12%) CP subjects who did not provide information on quantity or frequency of alcohol use or both during the maximum lifetime drinking period, drinking categories were assigned after review of the available response on quantity or frequency in conjunction with TWEAK questions and drinking patterns (see below). Thus, drinking category could not be assigned to 32 (4.6%) controls, 11 (2.4%) RAP and 4 (1%) CP subjects.
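The quantity-frequency stratification described above can be illustrated with a short Python sketch. It is not the study's own code. The function names are hypothetical, the conversion from days per month to a weekly figure (a factor of 7/30) is an assumption because the text does not state one, and so is the handling of fractional values between the integer cut-points (each category is bounded by its upper limit).

```python
def weekly_drinks(drinks_per_drinking_day: float,
                  drinking_days_per_month: float) -> float:
    """Average drinks per week from quantity and frequency.

    Assumption: a month is treated as 30 days, so monthly intake is
    scaled by 7/30 to obtain a weekly figure.
    """
    return drinks_per_drinking_day * drinking_days_per_month * 7.0 / 30.0


def drinking_category(ever_drinker: bool, drinks_per_week: float, sex: str) -> str:
    """Assign one of the five drinking categories listed in the text.

    `sex` is "F" or "M". Fractional intakes fall into the first category
    whose upper bound they do not exceed (an assumption).
    """
    if sex not in ("F", "M"):
        raise ValueError("sex must be 'F' or 'M'")
    if not ever_drinker:          # no alcohol use or <20 drinks in lifetime
        return "abstainer"
    if drinks_per_week <= 3:
        return "light"
    moderate_max = 7 if sex == "F" else 14
    if drinks_per_week <= moderate_max:
        return "moderate"
    if drinks_per_week < 35:
        return "heavy"
    return "very heavy"
```

Under these assumptions, a woman reporting three drinks on each of ten days per month averages seven drinks per week and is classified as a moderate drinker; a man with the same intake is also moderate, because the male moderate range extends to 14 drinks per week.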
Drinking patterns and amounts during the subject's lifetime were assessed using a series of questions linked to choices of six drinking patterns (table ). This group of questions began with a background statement, ‘Drinking patterns often change after an event such as college, marriage, loss of a spouse, unemployment, religious reasons, development of pancreatitis or other health problem’. The subjects were first asked about the age at which they first began drinking alcohol and their drinking pattern, about subsequent major life events (with free text allowed) and the age that resulted in change to a new drinking pattern, and the number of drinks consumed on an average drinking day. Among ever drinkers, 83% of subjects provided information on both drinking patterns and age associated with pattern changes. Using this, we estimated the total duration of drinking (lifetime and before diagnosis of pancreatitis) and lifetime drinking patterns, based on the assumption that a previous drinking pattern continued until the onset of the new drinking pattern. The responses to drinking patterns were also used to create variables indicating the pattern of alcohol intake for each individual for every year between ages 15 and 65 years.
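The carry-forward assumption, under which a pattern continues until the next reported change, can be sketched as follows. The code is illustrative only: the function name and the pattern labels are hypothetical (the six patterns themselves are defined in the study's table), and leaving years after the interview age unassigned is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def yearly_patterns(changes: List[Tuple[int, str]],
                    age_at_interview: int,
                    first_age: int = 15,
                    last_age: int = 65) -> Dict[int, Optional[str]]:
    """Map each age from first_age to last_age to the pattern in force.

    `changes` holds (age at which a pattern began, pattern label) pairs.
    A pattern is carried forward until the next change. Ages before the
    first reported pattern, or after the interview, are left as None.
    """
    ordered = sorted(changes)
    result: Dict[int, Optional[str]] = {}
    for age in range(first_age, last_age + 1):
        current: Optional[str] = None
        if age <= age_at_interview:
            for start_age, pattern in ordered:
                if start_age > age:
                    break
                current = pattern  # most recent pattern starting at or before this age
        result[age] = current
    return result
```

For a subject interviewed at age 40 who reported one pattern starting at 18 and another at 30, the ages 18 through 29 carry the first label, 30 through 40 carry the second, and all other ages in the 15 to 65 range remain unassigned.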
Description of drinking patterns during an average month used in the NAPS2 study to assess patterns of drinking in controls and patients with RAP and CP
We calculated the proportion of lifetime drinking accounted for by the maximum lifetime drinking period. We also calculated the proportion of total drinking time within each drinking pattern. Since the information on the amount of alcohol consumption linked to the drinking patterns and ages was limited to quantity and did not include frequency, a precise estimation of alcohol consumption outside the maximum lifetime drinking period was not possible.
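As a rough illustration of these two proportions, the sketch below works from a per-year record of drinking patterns. The function names and data layout are hypothetical, and counting every year with a recorded pattern as drinking time is an assumption, since the text does not say how years of abstinence were treated.

```python
from collections import Counter
from typing import Dict, Mapping, Optional


def pattern_proportions(yearly: Mapping[int, Optional[str]]) -> Dict[str, float]:
    """Share of recorded years spent in each drinking pattern.

    `yearly` maps age to a pattern label, or None where no pattern applies.
    """
    counts = Counter(p for p in yearly.values() if p is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pattern: n / total for pattern, n in counts.items()}


def max_period_fraction(max_period_years: float, total_drinking_years: float) -> float:
    """Fraction of lifetime drinking time taken up by the maximum drinking period."""
    if total_drinking_years <= 0:
        raise ValueError("total_drinking_years must be positive")
    return max_period_years / total_drinking_years
```

A subject with 5 years of maximum drinking out of 20 years of total drinking would have a fraction of 0.25.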
The questions concerning alcohol consumption included several similar questions about ages and amounts of typical and heaviest drinking to test for internal consistency, and the amounts of alcohol consumption were compared with the risk score derived from the TWEAK questions. In addition, the physicians were asked whether the working diagnosis for the etiology of pancreatitis was alcohol, and alcohol as a contributing factor was included as a choice in the TIGAR-O classification system (see physician questionnaire description) [3]. The physician's impression and the subject's responses could then be compared, and the subjects with inconsistent responses could be identified and re-evaluated.
d. Pancreatitis History
Questions on pancreatitis included whether the patient had a history of AP, RAP or CP; date of diagnosis; attacks of and hospitalizations for AP, and abdominal pain (if any, frequency, duration, date for first hospitalization); presence of diabetes or steatorrhea; and medication use (analgesics, all other prescription and non-prescription medications).
e. Pain and Quality of Life
A series of questions was used to help understand the type and severity of pain based on recommendations of the AGA technical review of the topic [14]. Subjects were asked about presence, type of pain and triggers of pain; number of days (work or school) missed in the previous month due to pain, and whether they were on disability or unemployed because of pain. A multiple choice question was used to classify pattern and severity of pain (table ) based on the patterns described by Ammann et al. [15]. For assessing quality of life, a standardized questionnaire (SF12) was used. A subset of patients was also asked general questions relating to bowel habits and abdominal symptoms in order to associate specific symptoms with pancreatic disease and compare them with symptoms among the control population.
Description of pain patterns used in the NAPS2 study to assess patterns of pain in patients with RAP and CP
4. Physician Questionnaire
The physician questionnaire forms for pancreatitis subjects were completed by a recognized expert in pancreatic diseases at participating centers, usually with the assistance of a clinical research coordinator. The information in the questionnaire was supplemented by medical record documentation of key diagnostic, laboratory and hospital reports.
a. Acute Pancreatitis
In the section on AP, physicians were asked whether a subject was ever diagnosed with AP; date of first diagnosis of AP; how AP was documented (clinical, i.e. pain with pancreatic enzyme elevations >3 times normal vs. imaging studies); whether the severity of first episode of AP was mild (edematous, or pancreatic necrosis <30%) or severe (defined as APACHE-II score of >8, Imrie score >3, Ranson's score >3, CRP ≥150 mg/dl, or >30% pancreatic necrosis); number of additional AP attacks, and if any of the recurrent attacks was worse than initial AP attack.
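The severity definition above amounts to an "any criterion met" rule, which can be sketched as follows. This is an illustration rather than the study's procedure (physicians recorded severity directly on the form). The function and parameter names are hypothetical, and treating missing values as not meeting a criterion is an assumption. Necrosis of exactly 30% is not covered by the definition and is not flagged as severe here.

```python
from typing import Optional


def is_severe_ap(apache_ii: Optional[float] = None,
                 imrie: Optional[float] = None,
                 ranson: Optional[float] = None,
                 crp: Optional[float] = None,
                 necrosis_pct: Optional[float] = None) -> bool:
    """Return True if any recorded value meets a 'severe' criterion from the text.

    Missing values (None) are ignored. CRP is compared against 150 in the
    units stated in the questionnaire description.
    """
    criteria = [
        apache_ii is not None and apache_ii > 8,
        imrie is not None and imrie > 3,
        ranson is not None and ranson > 3,
        crp is not None and crp >= 150,
        necrosis_pct is not None and necrosis_pct > 30,
    ]
    return any(criteria)
```

An episode with an APACHE-II score of 9 is flagged as severe regardless of the other values, while an episode with no recorded scores is not.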
b. Chronic Pancreatitis
Physicians were asked whether a subject had CP (yes, no, suspected); date of onset of symptoms suggestive of CP; presence, frequency and type of pain (similar to patient questionnaire); evidence of previous pancreatic disease; first imaging evidence of CP; details of imaging studies (ERCP, CT scan, ultrasound, MRCP, and EUS) used to diagnose CP (whether performed, if normal or abnormal, dates of normal and first abnormal test, and a description of diagnostic findings); details of histology (whether a histological examination was ever performed, method of obtaining tissue, and a description of histological findings); whether a subject had exocrine or endocrine insufficiency. The primary entry criteria for CP were definitive evidence of CP on imaging studies – ERCP using the Cambridge classification or on CT scan which was fulfilled in 447 patients (83%). Enrollment was based on evidence of CP on histology (alone or in conjunction with MRCP or EUS) in 27 (5%), EUS alone in 37 (7%), MRCP (alone or with EUS or USG) in 13 (2%). Of these 50 subjects where the diagnosis was based on EUS or MRCP, 35 (70%) were reported to have an abnormal ERCP or CT scan. In nearly 3% (n = 16) of CP cases, details of imaging or histology were not available to the participating centers.
c. Disease Classification, Severity, Complications and Treatments
The questionnaire included specific questions on the major complications of CP, including exocrine insufficiency, endocrine insufficiency, and pain patterns (table ). The physicians also provided information on working diagnosis and presence of risk factors based on the TIGAR-O classification system (tables , ). A positive response to the categories was considered a ‘yes’ and a non-response was considered a ‘no’. Physicians then indicated which therapies were attempted and whether they were believed to be helpful (table ). A list of the patients’ current medications, dose, and date when initiated was also recorded. Finally, the questionnaire included space for free text to describe any interesting or unique observations in the patient.
List of etiologies used by physicians in the NAPS2 study to identify the working diagnosis or diagnoses in patients with RAP and CP
TIGAR-O classification used in the NAPS2 study to identify the presence of risk factors in patients with RAP and CP
List of medical and surgical therapies used by physicians in the NAPS2 study to identify which therapies were tried in patients with RAP and CP and whether they were helpful
5. Quality Control
a. Data Infrastructure
Patient questionnaires and medical reports were faxed or mailed to the NAPS study coordinator at the University of Pittsburgh. As noted above, files were kept in locked cabinets within a research wing at the university whose entry was by key card access. Files were not allowed to be taken out of the research area. All data entry and recoding of descriptive responses were performed in this research wing. Only approved personnel had access to the paper charts, and they retrieved and re-filed charts with assistance of the study coordinator or their assistant as necessary.
b. Data Entry System
Questionnaire and laboratory data were managed using the Progeny® (Version 5) software product (Progeny Software, LLC, South Bend, Ind., USA). Progeny is a commonly used program for managing clinical data where there is a need to track familial relationships of study subjects. The software has specific features designed to track and graphically display the relationships of study participants. It incorporates a rich set of data types appropriate for the NAPS2 questionnaire data. Based on a Sybase SQL database, Progeny provides a powerful user interface for managing and entering data. This interface makes it relatively easy for data entry personnel to enter all questionnaire data with minimal training. Progeny allowed us to design a data entry display for each page of the paper questionnaires. Text and entry fields on the display were arranged to match the layout of questionnaires. The display method was developed to maximize accurate data entry and simplify training of data entry personnel.
In addition to the graphical pedigree display, Progeny has a convenient interface for selecting specific cases and subsets of data for display and review. Queries of data can be designed by building logical combinations of fields using a ‘drag-and-drop’ interface. Relatively complex logical constructions can be created, including the matching of substrings on text data types. Data selected for review appears in tabular form similar to an ordinary spreadsheet. Field values can be reviewed or modified within this spreadsheet view or the data can be exported as a text file with choice of field delimiter. While Progeny does have facilities for some statistical analyses and the creation of computed fields, we routinely exported subsets of data for analysis with other software.
c. Data Quality
All questionnaire data were entered twice by qualified independent data entry personnel provided with a user ID and password, in order to provide quality control on the data entry process. Because Progeny lacks a feature for comparing duplicate data entry, software was written locally to report differences between the two sets of data. Data were exported from Progeny and a report listing discrepancies for each case was prepared. For each reported discrepancy, data entry personnel reviewed the original paper chart to ensure Progeny had the correct values.
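The locally written comparison software is not described in detail. The following Python sketch shows one way such a double-entry check could work. The data layout (a dictionary of field values per case ID for each entry pass) and all names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]                      # field name -> entered value
Discrepancy = Tuple[str, str, Optional[str], Optional[str]]


def find_discrepancies(first: Dict[str, Record],
                       second: Dict[str, Record]) -> List[Discrepancy]:
    """List (case_id, field, first_value, second_value) wherever entries differ.

    A case or field present in only one pass is reported with None for
    the missing side.
    """
    discrepancies: List[Discrepancy] = []
    for case_id in sorted(set(first) | set(second)):
        rec_a = first.get(case_id, {})
        rec_b = second.get(case_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            a, b = rec_a.get(field), rec_b.get(field)
            if a != b:
                discrepancies.append((case_id, field, a, b))
    return discrepancies
```

Each reported tuple corresponds to one item for data entry personnel to check against the original paper chart.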
Data entry personnel were trained to provide careful and consistent entry of study data. Questionnaire responses which were illegible or confusing were flagged by data entry personnel and reviewed by physicians who were working members of the NAPS2 study. All descriptive or open-ended questions were coded by a group of physicians using printed guidelines. Coding was tracked on paper, including initials of the coder and date, which was then added to the subject's chart. Coded data were double entered to ensure accuracy. A codebook was also developed with details on each original and newly created variable for all subsequent data users.
d. Method for Handling Descriptive Responses
In the patient and physician questionnaires, the responses to several questions were asked in a descriptive, open-ended format. For imaging studies, physicians were asked to provide a description of diagnostic findings. These responses were coded with the purpose of organizing the findings in a standardized fashion to evaluate for presence of CP on each diagnostic test, severity of CP (whenever possible) and presence of individual findings. Cambridge classification was used to code for ERCP findings and a similar classification was used to code MRCP findings. Since there is no standardized and widely used classification system for cross-sectional imaging studies, we used the findings equivalent to Cambridge classification to code for CT scan findings [11] and a similar system was used to code for MRI and abdominal USG findings. EUS findings were coded as ductal (irregular contour of main pancreatic duct, dilatation of main pancreatic duct, dilatation of side branches, increased echogenicity/thickness of main pancreatic duct wall) or parenchymal changes (calculi/calcifications, focal areas of reduced echogenicity in the pancreas, hyperechogenic foci, hyperechogenic strands, cysts in the pancreas, accentuation of lobular pattern). Pancreatic findings on imaging studies that did not fall into the above-mentioned descriptions were coded as ‘other pancreatic findings’ (e.g. pancreatic atrophy, pseudocysts, inflammatory changes in and around pancreas, pancreas divisum), while non-pancreatic findings were captured under biliary and liver findings. When the imaging findings were not recorded in sufficient detail in the descriptive diagnostic findings to allow for a CP diagnosis, the studies, identified by physicians as abnormal and meeting the enrollment criteria for CP, were used in conjunction with the descriptive diagnostic findings, and responses to questions on previous evidence of pancreatic disease and first imaging evidence of CP to determine the presence of CP on diagnostic tests.
Descriptive responses for histologic findings were reviewed and coded as indicative of CP, presence of calcifications, autoimmune pancreatitis, IPMN, adenocarcinoma, or other findings. Descriptive responses for some conditions not covered in the working diagnosis or risk factors were recoded using the TIGAR-O classification system or were classified as miscellaneous if they did not fall into one of the TIGAR-O categories.
Patient and physician responses for medications were stratified into medication groups (e.g. NSAIDs, narcotics, antihypertensives, vitamins). The information on analgesic medication type assisted in pain control assessment and was translated into a morphine-related analgesic potency [16
6. Data Management and Statistics
Much of the data management, coding of new variables, and initial statistical analysis was carried out using the R Project software for statistical computing (www.r-project.org) and SPSS Version 14 (SPSS Inc., Chicago, Ill., USA). R is an open source software system that works on Windows, Macintosh, and Unix/Linux computers. It has both data management capabilities and a wide range of statistical functions. R includes a scripting language that allows for the recording (and playback) of all commands entered during data analysis. This feature was utilized during the creation of new summary variables based on logical combinations of up to ten or more fields. For example, physicians reported a wide variety of findings on imaging tests. In order to report the number of cases where physicians reported a diagnosis of CP, we examined combinations of responses in twelve fields. An R script provides a detailed record of the calculations and allows the process to be manually reviewed for accuracy and shared with other researchers.
SPSS was primarily used for preparation of many of the data tables and basic descriptive statistics. Exchange of data between the three programs, Progeny, R, and SPSS, was accomplished by structuring datasets in standardized formats for export and import.
7. Data Coordination and Access from Participating Sites
An operations manual with explanations of the intention of each question and standard operating procedures was prepared and distributed to all sites at the initiation of the study. During the data entry and data classification process, a data codebook was developed that documented how all of the data were organized and how the data were structured for comparison. This codebook includes a detailed description of all data fields, including the specific logic used to create new variables. The possibility of making the final NAPS2 data available with access controls for the principal investigators via the World Wide Web is also being evaluated. This would allow participating centers to access the final dataset for their site via an SSL secured website. Data subsets will be distributed by consent of the local principal investigator according to IRB guidelines.