The performance of three line blot assays (LBAs), the Linear Array HPV genotyping assay (LA) (Roche Diagnostics), INNO-LiPA HPV Genotyping Extra (LiPA) (Innogenetics), and the reverse hybridization assay (RH) (Qiagen), was evaluated using quantitated whole genomic human papillomavirus (HPV) plasmids (types 6, 11, 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68b) as well as epidemiologic samples. In a plasmid titration series, LiPA and RH did not detect 50 international units (IU) of HPV type 18 (HPV18) in the presence of 5 × 10⁴ IU or more of HPV16. HPV DNA (1 to 6 types) in the plasmid challenges at 50 IU or genome equivalents (GE) were identified with an accuracy of 99.9% by LA, 97.3% by LiPA, and 95.4% by RH, with positive reproducibility of 99.8% (kappa = 0.992), 88.2% (kappa = 0.928), and 88.1% (kappa = 0.926), respectively. Two instances of mistyping occurred with LiPA. Of the 120 epidemiologic samples, 76 were positive for high-risk types by LA, 90 by LiPA, and 69 by RH, with a positive reproducibility of 87.3% (kappa = 0.925), 83.9% (kappa = 0.899), and 90.2% (kappa = 0.942), respectively. Although the assays had good concordance in the clinical samples, the greater accuracy and specificity in the plasmid panel suggest that LA has an advantage for internationally comparable genotyping studies.
The introduction of human papillomavirus (HPV) vaccines has highlighted the need for standardized accurate HPV genotyping assays to ensure comparability of results in laboratories worldwide, both before and after vaccine introduction. A large variety of assays, using both in-house methods and commercial kits, are available for HPV detection and typing (6), and more can be anticipated. International proficiency testing organized by the WHO HPV LabNet has documented significant differences in sensitivity, specificity, and reproducibility between different laboratories using a variety of testing platforms (2). Variation in performance for laboratories using the same assay implicates laboratory practice; however, the results also indicated differences between assays. The WHO HPV LabNet has further recognized the need for HPV typing assays that can easily be standardized, require minimal equipment for assay performance and interpretation, and can be used in a variety of laboratory settings. Following its meeting on the standardization of HPV assays and the role of the WHO HPV LabNet in supporting vaccine introduction (WHO, Geneva, Switzerland, 23 to 25 January 2008 [http://www.who.int/biologicals/publications/meetings/areas/vaccines/human_papillomavirus/HPV%20Jan%20meeting%20report_20080909%20_Clean_.pdf]), it was agreed that commercial assay kits should be evaluated in collaborative studies for proficiency.
Reverse line blot assays (LBAs), based on consensus amplification of conserved regions of HPV followed by hybridization to type-specific probes on line blot strips, have been widely used. Available LBAs differ in the primer set (1, 3, 4) used in the amplification phase and in the number and sequences of the detection probes. Beyond thermocyclers for target amplification, only temperature-controlled water baths and visual inspection are required to carry out these assays. Therefore, the LBA platforms have the potential to meet the needs of the WHO for high-performance genotyping that does not involve expensive equipment.
Studies have been conducted previously to compare and evaluate the performance of these HPV typing assays. Typically, these were restricted to DNA extracts from patient-collected anogenital specimens, which cannot be validated independently and limit the analysis to relative comparisons and assessments of type prevalence found by the individual assays (5, 7, 10). The use of plasmid standards offers clear advantages for more objective evaluations beyond that level. Cloned, full-length HPV genomic DNA (gDNA) provides complete control over genotype identity as well as copy number input and eliminates the need for a gold standard test.
We selected three commercial LBAs using different primer sets and subjected them to use with a number of different test samples. These included plasmid standards as well as patient extracts from different sources. The objective was a comparison of these kits' performance, with the potential needs of the WHO in mind. Specifically, assay sensitivity, specificity, and reproducibility were assessed.
HPV plasmids were used to prepare test samples to assess the assays' performance parameters. Human genomic DNA was added as carrier diluent to a final concentration of 1 ng/μl in each plasmid preparation to simulate the situation in actual samples. For the evaluation of competition between HPV types with unequal copy numbers, a titration series containing a constant 50 international units (IU) of HPV type 18 (HPV18) DNA and different numbers of HPV16 in 10-fold increments from 5 × 10⁰ to 5 × 10⁶ IU was prepared. To evaluate detection of HPV at the desired level of sensitivity, i.e., 50 IU or genome equivalents (GE) per assay, the final concentration of each type was adjusted so that 50 IU (or GE) would be assayed. Samples containing multiple types (from 2 to 6 of the 15 types in different combinations) were prepared in order to evaluate the impact of multiple types on assay performance (Table 1). Two additional control samples, one containing no template DNA and one with only human genomic DNA (Roche Diagnostics), were also prepared. Since sample source and collection medium can influence the performance of molecular assays, particularly those based on PCR technology, we included extracts of cervical cells in specimen transport medium (STM) and archived formalin-fixed paraffin-embedded (FFPE) tissue in the test panel. These samples are described below. In particular, FFPE extracts are known to be challenging for assays with large amplicons, and the LBAs could perform differently.
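The layout of the titration series can be written out explicitly. The following is only an illustrative sketch, assuming the series runs in 10-fold steps from 5 × 10^0 to 5 × 10^6 IU of HPV16 against a constant 50 IU of HPV18; the function and field names are hypothetical and are not taken from the study.

```python
HPV18_IU = 50  # constant HPV18 input per assay, as stated in the text


def titration_series(hpv18_iu=HPV18_IU, exponents=range(7)):
    """Build the HPV16/HPV18 competition series (hypothetical helper).

    HPV16 input is 5 x 10^e IU for e = 0..6, giving seven samples.
    """
    series = []
    for e in exponents:
        hpv16_iu = 5 * 10 ** e
        series.append(
            {
                "hpv16_iu": hpv16_iu,
                "hpv18_iu": hpv18_iu,
                # Fold excess of HPV16 over HPV18 in this sample
                "ratio_16_to_18": hpv16_iu / hpv18_iu,
            }
        )
    return series


for sample in titration_series():
    print(sample)
```

Under these assumptions the series has seven members, and the HPV16:HPV18 ratio spans 0.1 to 100,000.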
The final testing panel included 66 plasmid challenge samples (7 titration, 15 individual, and 18 multiple, each in duplicate), 60 epidemiologic samples, and 2 negative controls. Six replicate aliquots of the testing panel were prepared to provide duplicate testing on each of the three LBA platforms. All DNA samples were randomly coded by a member of the laboratory not involved in testing, disguising the origin of the sample, expected HPV status, and types. Two technologists independently tested a panel on each of the platforms and interpreted the results. Prior to initiating testing, each technologist successfully completed proficiency testing on each assay platform, correctly identifying one to five HPV types in 10 unknown samples with at least 80% accuracy.
Each assay used a 10-μl DNA aliquot per PCR. Results for all possible HPV types were entered into MS Access database tables. A sample was termed HPV positive when at least one type was detected, HPV negative if none of the 15 types but the genomic control was positive, and inadequate if neither HPV nor the control was detected. Testing was completed over a period of 2 weeks. The order in which each technologist used each platform varied.
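The three-way call described above (positive, negative, inadequate) amounts to a short decision rule. The sketch below is an assumption-labeled illustration, not the authors' database logic; the function name and the representation of types as strings are hypothetical.

```python
# The 15 types considered in the analysis (from the text)
PANEL_TYPES = frozenset(
    ["6", "11", "16", "18", "31", "33", "35", "39",
     "45", "51", "52", "56", "58", "59", "68"]
)


def classify_result(types_detected, genomic_control_positive):
    """Classify one test result (hypothetical helper).

    types_detected: iterable of HPV type labels reported by the assay.
    genomic_control_positive: True if the human genomic control was detected.
    """
    panel_hits = set(types_detected) & PANEL_TYPES
    if panel_hits:
        return "positive"      # at least one of the 15 types detected
    if genomic_control_positive:
        return "negative"      # no panel type, but the control worked
    return "inadequate"        # neither HPV nor the control detected
```

Note that a result with HPV detected but a negative genomic control still counts as positive under this rule, which matches how adequacy is reported in the results.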
Full-length clones of HPV types 6, 11, 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68 in plasmid vectors (100 ng in Tris-EDTA [TE] buffer) were provided by the WHO HPV Global Reference Laboratory (Region Skåne, Malmö, Sweden). Each plasmid was diluted to 5 × 10⁴ genome equivalents per ml in 200 μg/ml yeast tRNA solution (Invitrogen, Carlsbad, CA). The number of genomic copies for HPV16 and -18 had been previously standardized by WHO and was measured in international units (IU) accordingly. Copy numbers of other types were calculated from the molecular mass and concentration as genome equivalents (GE). Human genomic DNA used as background for HPV plasmid samples was obtained from Roche Diagnostics (Indianapolis, IN) and diluted to 10 ng/μl in 0.1 mM TE.
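A copy-number calculation from mass and molecular mass can be sketched as follows. This is a generic illustration, not the authors' worksheet: it assumes an average mass of 650 g/mol per base pair of double-stranded DNA (a common approximation), and the plasmid length used in the example is purely hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
AVG_BP_MASS = 650.0        # g/mol per base pair of dsDNA (assumed approximation)


def genome_equivalents(mass_ng, length_bp):
    """Estimate the number of plasmid copies in a given DNA mass.

    mass_ng: DNA mass in nanograms.
    length_bp: total construct length (HPV genome plus vector) in base pairs.
    """
    molar_mass = length_bp * AVG_BP_MASS   # g/mol for the whole construct
    moles = (mass_ng * 1e-9) / molar_mass  # convert ng to g, then to mol
    return moles * AVOGADRO


# Hypothetical example: 100 ng of a 10,900-bp construct
copies = genome_equivalents(100, 10_900)
print(f"{copies:.3e} genome equivalents")
```

The result scales linearly with mass and inversely with construct length, so the vector backbone length must be included when the full plasmid is weighed.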
Residual DNA extracts were retrieved from anonymous archived epidemiologic samples from studies of HPV, either from populations with a high HPV prevalence or from HPV-associated cancers. These were randomly selected without knowledge of prior HPV results to include 30 DNA extracts from cervical cells in Digene STM (Qiagen, Valencia, CA) and 30 from archived FFPE cancer tissues.
The selection of kits to be evaluated was based on the following criteria: (i) the ability to detect and individually identify all or the majority of the 13 high-risk HPV types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68b and the two low-risk types 6 and 11; (ii) no requirement for expensive instrumentation to perform and interpret the assay; and (iii) commercial availability in the United States. Meeting most of these criteria, the Linear Array HPV genotyping assay (LA) (Roche Diagnostics, Indianapolis, IN), INNO-LiPA HPV Genotyping Extra (Innogenetics, Ghent, Belgium), and the reverse hybridization probe assay (RH) (Qiagen, Valencia, CA) were chosen. All three tests are based on general HPV amplification via a low-stringency PCR amplification of the L1 region and subsequent detection via reverse line blot assay with type-specific probes. Technical details are compared in Table 2.
The LA uses PGMY 09/11 consensus primers and identifies a total of 37 individual types. The test was performed according to the manufacturer's specification, except that 40 μl DNase-free water was added with 10 μl template DNA to fill to the required volume. The hybridization and washing steps of the reverse line blot assay were done automatically with Beeblot instruments (Bee Robotics, Caernarfon, Gwynedd, United Kingdom). HPV52 is detected only by an “XR” probe which cross-hybridizes with HPV33, -35, and -58. In the presence of any of these three types, HPV52 cannot be identified unequivocally.
LiPA applies SPF10 primers and includes probes to detect 28 HPV types. The assay was performed in accordance with the manufacturer's protocol using an AutoBlot 3000H (MedTec, Buffalo, IL) for hybridization to the genotyping strips. Some types are defined by a single positive probe on the genotyping strip (e.g., HPV6, -11, and -16), but others are interpreted as a combination of two to four probes (e.g., HPV18, -33, and -58). While the manufacturer provides interpretation of type detection, including “possible” types, only those that were detected unequivocally were included in the analysis.
The RH was not officially available in the United States at the time that this study was conducted, and the kits were kindly provided by Qiagen. The test utilizes the GP5+/6+ primers and line probes to distinguish 18 high-risk types; it does not detect HPV6 and -11. PCR amplification and line blot detection were carried out as specified by Qiagen. Hybridization and washing of HPV genotyping strips were achieved manually using a Gemini Twin shaking water bath (SciGene, Sunnyvale, CA) and an aspiration system with 8-needle Stream Splitter (Art Robbins Instrument, Sunnyvale, CA).
A water blank and SiHa DNA were included as negative and positive controls, respectively, in every run with each assay to monitor validity. The results from these controls were not included in the analysis.
The 15 HPV types 6, 11, 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68 were considered for the analysis. Some calculations (positive samples, accuracy, and reproducibility) for RH results were restricted to the 13 high-risk types as indicated. The two technologists' results for each sample and platform were counted as independent events. For plasmid samples, accuracy was calculated as the percentage of correctly detected types (true positive and true negative) from the total number of types assessed in all samples (total types assessed = 15 types per assay × 33 samples × 2 assays = 990, or 858 if restricted to 13 high-risk types).
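The accuracy definition above treats every panel type in every test result as one call that is either correct (true positive or true negative) or not. A minimal sketch of that counting, assuming results are available as pairs of expected and detected type sets (the data layout and function name are hypothetical):

```python
PANEL = ("6", "11", "16", "18", "31", "33", "35", "39",
         "45", "51", "52", "56", "58", "59", "68")


def accuracy(results, panel=PANEL):
    """Fraction of per-type calls that match the expected plasmid content.

    results: iterable of (expected_types, detected_types) pairs,
             one pair per test result (sample x technologist).
    """
    correct = 0
    total = 0
    for expected, detected in results:
        for hpv_type in panel:
            # A call is correct if presence/absence matches expectation
            if (hpv_type in expected) == (hpv_type in detected):
                correct += 1
            total += 1
    return correct / total


# Toy example: one perfect result, one with a missed type and a false positive
toy = [({"16"}, {"16"}), ({"18", "58"}, {"18", "39"})]
print(accuracy(toy))
```

With 15 panel types and 66 test results (33 plasmid samples tested by two technologists), the denominator is 990 calls; restricting the panel to the 13 high-risk types gives 858.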
Results from STM and FFPE extracts were compared descriptively. Positive reproducibility (PR) was calculated as the percentage of HPV types identified in both replicates among the total number of types detected in either of the duplicate samples. Unweighted Cohen's kappa coefficients were calculated with SPSS 15.0 for Windows for all types assessed (positive and negative) in the relevant sample set.
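Both reproducibility measures can be computed directly from paired replicate results. The sketch below is an illustration under stated assumptions, not a reproduction of the SPSS procedure: each replicate is represented as a set of detected types, and kappa is the standard unweighted two-rater statistic over per-type positive/negative calls. Function names are hypothetical.

```python
def positive_reproducibility(pairs):
    """Types found in both replicates / types found in either replicate."""
    both = 0
    either = 0
    for rep_a, rep_b in pairs:
        both += len(rep_a & rep_b)
        either += len(rep_a | rep_b)
    return both / either if either else float("nan")


def cohen_kappa(pairs, panel):
    """Unweighted Cohen's kappa over all per-type calls (positive and negative)."""
    n11 = n10 = n01 = n00 = 0
    for rep_a, rep_b in pairs:
        for hpv_type in panel:
            a = hpv_type in rep_a
            b = hpv_type in rep_b
            if a and b:
                n11 += 1
            elif a:
                n10 += 1
            elif b:
                n01 += 1
            else:
                n00 += 1
    n = n11 + n10 + n01 + n00
    observed = (n11 + n00) / n
    # Chance agreement from the marginal positive/negative rates
    expected = ((n11 + n10) * (n11 + n01) + (n01 + n00) * (n10 + n00)) / n ** 2
    if expected == 1:
        return float("nan")  # kappa is undefined when all calls are identical
    return (observed - expected) / (1 - expected)


# Toy replicate pairs over a three-type panel
toy_pairs = [({"16", "18"}, {"16"}), ({"31"}, {"31"}), (set(), set())]
print(positive_reproducibility(toy_pairs), cohen_kappa(toy_pairs, ("16", "18", "31")))
```

Positive reproducibility ignores concordant negatives, whereas kappa includes them, which is why the two measures can rank assays differently.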
The results for detection of HPV16, HPV18, and the genomic control at all input concentrations of HPV16 are listed in Table 3. When HPV16 is present at below 5 × 10⁴ IU, all three targets are detected by all assays. While LA results were robust to increasing amounts of HPV16, LiPA and RH failed to detect 50 IU of HPV18 at HPV16 inputs of 5 × 10⁴ IU or more and, at the highest input of HPV16, also failed to detect the genomic control. The negative control containing only human gDNA gave expected results with all assays, as did the no-template control (data not shown).
Table 4 lists false-positive, false-negative, and inadequate results obtained by LBA among plasmid samples. Both technologists' results are reported, so the number of results is double the number of samples. In the single HPV plasmid challenges, the accuracy of LA was 29/30 (96.7%), that of LiPA was 27/30 (90.0%), and that of RH was 23/30 (76.7%). (RH detected 23/26 [88.5%] if HPV6 and -11 were excluded.) The challenges with multiple HPV plasmids included a total of 61 HPVs. Of these, LA identified all 122 types correctly (100%), LiPA 100/122 (82.0%), and RH 83/122 (68.0%). If HPV6 and -11 were excluded, 83/108 (76.9%) types were correctly found by RH.
The accuracy for all 15 types in the 66 plasmid samples was 99.9, 97.3, and 95.4% for LA, LiPA, and RH, respectively. Restricted to the 13 high-risk types, RH's accuracy was 96.7%.
Results for STM and FFPE samples are combined, as differences between the two sample types were negligible for all LBAs. From the 120 test results derived from each assay, the genomic control probe was positive in 116, 90, and 87 tests by LA, LiPA, and RH, respectively. However, results were adequate in 118, 118, and 111 tests by LA, LiPA, and RH, respectively, because some samples were positive for HPV but negative for the genomic control.
One or more HPV types were detected in 76, 90, and 69 samples by LA, LiPA, and RH, respectively. Complete type-specific agreement among all three LBAs was noted for 81 samples (67.5%). Of these, 42 were HPV positive and 39 HPV negative. An additional 24 samples (20%) had at least one type detected concurrently by all assays, and one sample was HPV positive in all assays but without type agreement.
The overall detection of individual types was very similar for the three LBAs, with the exception of HPV52, which was detected only by LiPA in 9 cases (Fig. 1). LA detected a total of 111 types (108 without HPV6 and -11) in 76 samples, LiPA detected 121 types (111 without HPV6 and -11) in 90 samples, and RH found 97 types in 69 samples. Instances of type concordances and discordances between the assays are illustrated in Fig. 2.
All test samples with the exception of the seven titration samples were included in the reproducibility assessment. In the 33 test samples prepared with single or multiple plasmids, expected results were obtained for a total of 76 instances of HPV type detection (67 without HPV6 and -11). Differences in reproducibility between STM and FFPE samples were not significant, and they are combined in Table 5.
Results from the plasmid standards revealed differences between the HPV typing assays. The titration of HPV16 in different amounts against constant HPV18 template numbers highlighted the robustness of the assays to competing types with different concentrations. Copy numbers of more than 5 × 10³ or a >100-fold-larger template amount of HPV16 suppressed amplification of HPV18 in LiPA and RH PCRs. These observations are in line with assessments by van Doorn et al. (9), who found that 100 copies of HPV18 were outcompeted by a 1,000-fold-higher concentration of HPV16. Only the LA was able to detect both types even at 100,000-times-higher HPV16 concentrations. The larger reaction volume (100 μl in LA versus 50 μl in the others) might be responsible for the superior tolerance to competition. The extreme differences in copy number are unlikely to occur in actual biologic samples; however, pushing the technical limits of the assays allowed for evaluation of robustness to a variety of challenges. All three LBAs generally could detect 50 IU or GE of the single high-risk HPV types. Only HPV68b was not detected by either RH duplicate, which is not surprising since the limit of detection (LOD) is stated as 10⁵ viral copies in the RH detection kit handbook. The sporadic lack of reproducibility may indicate that this input amount is in the range of the lowest LOD for at least some types (Table 3).
Significant deficiencies were seen in samples that included more than one type. LiPA failed to detect 18% of the types included in the 36 multiplasmid challenges. The 22 missed types consisted exclusively of HPV39, -58, -59, and -68b. In some instances, types 39 and 68b were identified as “possible” types due to the LiPA's multiprobe set (Table 4). Nevertheless, both HPV39 and -68b were also missed in other samples that were free of this ambiguity, suggesting unequal amplification efficiency or competition. Disregarding HPV6 and -11, the RH failed to detect 23.1% of the types in this subset (25 of 108). Besides HPV68b, types 52, 59, and 39 were missed in several instances. A critical limitation of the RH might result from the large discrepancy in detection sensitivities for different HPV types. According to the handbook, LODs differ 25,000-fold, ranging from four copies for HPV16 to 100,000 copies for HPV53 and -68.
Reproducibility of results for the plasmid challenges was greatest for the LA but was generally good in all assays. Among the results derived from the patient samples, the RH had the highest positive reproducibility (90.2%), but the denominator was also lowered to 51 types since HPV6 and -11 are not detected by this test. It was rather surprising that no significant differences were found between STM and FFPE extracts, as fractured and poor-quality DNA from archived tissues should favor shorter amplicon lengths as targeted by the SPF primers (8). The sample size may not have been sufficient to allow differences in assay performance to be identified. Generally, results from the patient samples were comparable between the assays, with at least partial agreement in 87.5%. Performance differences found with the plasmid samples might reflect rather extreme situations which are not relevant for the majority of real clinical specimens.
False-positive results were a particular concern. Two instances were observed among LiPA results in plasmid samples. In both cases, the algorithm for identifying the expected type (HPV18 or -58) requires that more than one probe hybridizes with the amplicon. As only one of the required probes hybridized, another type (HPV39 or -52) was falsely indicated. It is likely that individual LiPA probes differ in their affinity to the same amplicon and generate an incomplete band pattern at small target amounts, which would consequently lead to attributing the result to an incorrect type. Detection of HPV52 by LiPA might be particularly vulnerable to this problem. In the LiPA, the HPV52 amplicon hybridizes to a single probe. However, that probe also hybridizes to amplicons of five additional HPV types that are distinguished by combinations with other probes. In this regard, it is noteworthy that HPV52 was detected in 9 of the 15 types found exclusively by LiPA among the patient samples (Fig. 1), and two of three patient samples that were exclusively HPV positive by LiPA had only type 52. It seems likely that at least some of these cases are false positives.
Conversely, some HPV52 may have been missed by the LA, as it has a similar shortcoming and detects HPV52 only through the cross-binding XR probe. While this ambiguity occurred in 28 cases among the LA results, only two of these samples tested positive for HPV52 by LiPA.
Every laboratory test is also influenced by the accuracy of human handling. The samples prepared from plasmid DNA yielded one inadequate result each by LA and LiPA, and the (single) HPV types not detected in these results were also counted as “missed.” Although the tests were performed with utmost care in a clinically certified laboratory, operator errors cannot be ruled out as a cause and may have lowered the real performances of the tests. However, this possible distortion should be minimal.
Analysis for this comparison study was restricted to the most relevant high-risk types as well as HPV6 and -11. It should be considered, however, that LA detected 50, LiPA 16, and RH 4 additional HPV types in the 60 patient samples. Depending on the number of types covered by each assay, the scope of HPV detection will always be limited and does not directly allow an “HPV-negative” interpretation.
For the majority of samples, HPV typing results will be very similar and comparable for the three LBAs evaluated. In difficult situations such as multiple infections, low copy numbers, or large difference in viral copies, LA has an advantage. Some limitations of LiPA are due to its multiple-probe detection system. RH is disadvantaged by low sensitivity for some types, particularly HPV53, -68, and -82. LA performed nearly perfectly and is hampered only by the cross-reacting XR probe for HPV52 detection.
This work was supported by the WHO via a project funded by the Bill and Melinda Gates Foundation.
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the supporting agencies.
Published ahead of print 22 February 2012