HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Genome Res. Author manuscript; available in PMC 2008 February 8.
Published in final edited form as:
PMCID: PMC2235196

Genome-wide Detection of Allelic Imbalance Using Human SNPs and High-density DNA Arrays


Most human cancers are characterized by genomic instability, the accumulation of multiple genetic alterations and allelic imbalance throughout the genome. Loss of heterozygosity (LOH) is a common form of allelic imbalance and the detection of LOH has been used to identify genomic regions that harbor tumor suppressor genes and to characterize tumor stages and progression. Here we describe the use of high-density oligonucleotide arrays for genome-wide scans for LOH and allelic imbalance in human tumors. The arrays contain redundant sets of probes for 600 genetic loci that are distributed across all human chromosomes. The arrays were used to detect allelic imbalance in two types of human tumors, and a subset of the results was confirmed using conventional gel-based methods. We also tested the ability to study heterogeneous cell populations and found that allelic imbalance can be detected in the presence of a substantial background of normal cells. The detection of LOH and other chromosomal changes using large numbers of single nucleotide polymorphism (SNP) markers should enable identification of patterns of allelic imbalance with potential prognostic and diagnostic utility.

Neoplastic progression is generally characterized by the accumulation of multiple genetic alterations including loss of tumor suppressor gene function. Identification of the alterations involved in initiation and progression of premalignant conditions to cancer will help address many questions concerning the mechanisms of neoplastic progression in vivo and facilitate the discovery of diagnostic and prognostic markers and potential therapeutic targets.

The classic mechanism of tumor suppressor gene inactivation is described by the two-hit model in which one allele is mutated and the other allele is lost through a number of possible mechanisms, resulting in loss of heterozygosity (LOH) at multiple loci (Knudson 1985; Hansen and Cavenee 1987; Brown 1997). LOH can arise by a variety of genetic mechanisms, including physical deletion, chromosome nondisjunction, mitotic nondisjunction followed by reduplication of the remaining chromosome, mitotic recombination and gene conversion. LOH is one example of allelic imbalance. Allelic imbalance can arise from the complete loss of an allele or from an increase in copy number of one allele relative to the other. Allelic imbalances can be detected by measuring the proportion of one allele relative to the other in cells from individuals that are constitutionally heterozygous at a given locus. LOH involves complete loss of one of the two alleles at a locus, but normal cell contamination can confound the distinction between true LOH and other mechanisms of allelic imbalance. However, studies using flow-cytometrically purified samples have shown that complete LOH can be clearly detected in tissue samples (Barrett et al. 1996; Boige et al. 1997; Paulson et al. 1999). Studies have shown that neoplastic progression is often associated with the accumulation of somatic-cell genetic changes as the tumor progresses to advanced stages (Vogelstein et al. 1989; Fults et al. 1990; Sato et al. 1990; Stanbridge 1990; Tsuchiya et al. 1992; Yamaguchi et al. 1992; Thrash-Bingham et al. 1995; Reid et al. 1996). Thus, characterization of genome-wide patterns of allelic imbalance may provide a molecular basis for prognosis as well as aid in the identification of specific regions that harbor tumor suppressor genes.

Large-scale LOH measurements are difficult to perform with conventional approaches that employ restriction fragment length polymorphism (RFLP) or polymorphic microsatellite markers (short tandem repeats or STRs). RFLP markers have low heterozygosity rates and are available in small numbers. Gel-based microsatellite assays are difficult to automate and are not readily scalable (Gruis et al. 1993). As a result, most genome-wide scans for LOH have been conducted at low resolution with a relatively small number of polymorphic markers. For example, an average of 120 STRs was used to determine the allelotypes of multiple different human neoplasms in a series of studies since 1995, and the highest density STR allelotypes used ~280 polymorphic markers (Field et al. 1995; Hahn et al. 1995; Takeuchi et al. 1995; Califano et al. 1996; Johns et al. 1996; Tamura et al. 1996; Baccichet et al. 1997; Boige et al. 1997; Gleeson et al. 1997; Kawanishi et al. 1997; Mori et al. 1997; Chambon-Pautas et al. 1998; Hatta et al. 1998; Piao et al. 1998; Shih et al. 1998; Mao et al. 1999; Yustein et al. 1999). Comparative genomic hybridization (CGH) and cDNA microarrays can be useful for measuring genome-wide increases or decreases in DNA copy number (Forozan et al. 1997; Pollack et al. 1999). However, beginning with the seminal study by Cavenee et al. (1983), several reports have indicated that LOH can occur by genetic mechanisms (e.g., mitotic recombination, mitotic nondisjunction followed by chromosome reduplication, gene conversion) that do not lead to changes in DNA copy number. For example, it has been shown that a large number of LOH events result from mitotic recombination (Gupta et al. 1997; Hagstron and Dryja 1999), which does not lead to DNA copy number changes but could be detected as LOH by use of genetic polymorphisms such as single nucleotide polymorphisms (SNPs).

The recent identification of large numbers of SNPs in the human genome provides a rich set of markers that can be used in a wide variety of genetic studies. Biallelic SNPs are highly abundant, estimated at more than 3 × 10^6 in the human genome (Kruglyak 1997). In addition, SNPs can be amplified by multiplex PCR (Wang et al. 1998) in contrast with microsatellite markers that generally require individual amplification reactions. The amplification step makes it possible to use only small amounts of genomic DNA, which is often essential when working with limited clinical material. Furthermore, SNP analysis can be performed on high-density oligonucleotide arrays (Wang et al. 1998), eliminating the need for gel-based analysis. This study describes the use of SNPs combined with oligonucleotide probe array technology (Fodor et al. 1991, 1993; Pease et al. 1994; Chee et al. 1996; Lockhart et al. 1996; Wodicka et al. 1997; Gunderson et al. 1998) to detect changes in allelic representation in human tumors in a reproducible, accurate, sensitive, scalable, and efficient manner.


SNP Array Design

The arrays were designed for the determination of the genotype of up to 600 biallelic SNPs (Figs. 1 and 2; a list of markers is available from the authors on request). On the basis of a previous study (Wang et al. 1998), we estimated that ~440 of the 600 loci are truly polymorphic. These polymorphic loci are distributed across all human chromosomes and have an average heterozygosity of 0.33. The basic approach to the genotyping of these markers is similar to that described by Wang et al. (1998), but the SNP array design and analysis algorithms used here are different. For each locus, the SNP array interrogates not only the polymorphic base (position 0) but also four additional bases for each allele, two on each side, flanking the polymorphic position (positions −4, −1, +1 and +4; Fig. 1A). This probe redundancy improves the confidence of the genotype calls. As shown in Figure 1A, the probe set for each interrogated base includes four oligonucleotide probes that differ only at the central position (referred to collectively as a tile and shown as four squares in the figure). Separate tiles are constructed for the A allele and the B allele at positions −4, −1, +1 and +4. At position 0 (polymorphic base), both alleles share a single tile (Fig. 1B). To increase accuracy further, both sense and antisense strands are queried on the array using the same type of probe sets. Genotypes for each locus were determined by calculation of the fraction of the A allele (P) in target samples, and chromosomal changes were assessed by measurement of the difference in P values between normal and tumor samples from the same individual (see Methods).

Figure 1
SNP array design. (A) Design for querying a locus. Target sequences (lowercase) for both A and B alleles are identical except for the polymorphic base (uppercase). Five positions at or near the polymorphic locus, indicated by −4, −1, 0, ...
Figure 2
Fluorescence images of the SNP array following hybridization of a tumor sample. (A) Low magnification view of the entire fluorescence hybridization image of the SNP array, (B) an enlarged portion of the hybridization pattern, and (C) block images for ...

A Test Case for SNP-based Detection of Allelic Imbalance

The ability to detect allelic imbalance was first demonstrated in a family case study with two unaffected parents and a child with two separate neurofibromatosis type 2 (NF-2) tumors. This case had been studied previously using conventional RFLP markers (Wolff et al. 1992), but the information about tumor type and the results of RFLP analysis were blinded prior to the SNP array experiments described here. The SNP-containing loci were amplified by multiplex PCR from genomic DNA derived from blood and genomic DNA from tumor tissues. PCR products were subsequently labeled with biotin and hybridized to the SNP arrays. As shown in Figure 3, one parent is heterozygous (AB) and the other is homozygous (BB) at one locus on chromosome 22, while the child is a heterozygote (AB). Tumor samples from two independent tumors taken from the child showed a clear loss of the A allele at this locus. The analysis identified only three SNPs that showed clear evidence of LOH. Those three SNPs were all located on chromosome 22, consistent with the previous RFLP analysis that also identified LOH only on chromosome 22 (Wolff et al. 1992; Seizinger et al. 1986).

Figure 3
The hybridization patterns for a SNP marker on chromosome 22. One parent is heterozygous (AB) and the other is homozygous (BB) at this marker. The child is heterozygous (AB) using DNA derived from blood, but scored as homozygous (BB) for the same locus ...

Reproducibility of SNP Array-based Allelic Imbalance Analysis

We tested the reproducibility of the SNP array-based allelic imbalance analysis by performing triplicate experiments with purified aneuploid DNA obtained from a patient with an esophageal adenocarcinoma. Three independent amplification and labeling reactions for 558 SNPs were performed on DNA derived from the patient’s normal cells and a purified aneuploid cell population that had been separated from the normal cells by DNA content flow cytometry. The three independent preparations for the two cell populations were hybridized to six separate SNP arrays. The genotypes for the triplicate experiments were determined by use of an algorithm that calculates the fraction of the A allele (P) for each marker in the target samples. The P values were calculated only for loci that passed the quality analysis, indicating sufficient signal and a clear hybridization pattern (see Methods). A total of 470 loci consistently passed the quality analysis for both the normal and the aneuploid samples across three independent preparations. One hundred and fifty loci were informative (i.e., clearly heterozygous in the normal sample) for this individual. The independently obtained P values were highly correlated (with linear correlation coefficient 0.99) for both the normal replicates (Fig. 4A) and the aneuploid replicates (Fig. 4B). In contrast, the P values were significantly different between the normal and aneuploid samples for a number of loci (Fig. 4C,D). Loci with P values that shift from the heterozygous range in the normal sample to the homozygous range in the aneuploid sample were scored as loci with a change in allelic representation. Of the 470 loci that passed the quality analysis in the triplicates, 33 were consistently scored as showing allelic imbalance and 434 were consistently scored either as showing no allelic imbalance (117) or as not informative (317). 
Thus, 22% of the informative loci showed allelic imbalance [fractional locus loss (FLL) of 0.22], which is similar to previously published fractional allelic loss (FAL) values of 0.22, 0.28, and 0.29 for esophageal adenocarcinoma (Barrett et al. 1996; Hammoud et al. 1996; Dolan et al. 1998). Only 3 out of the 470 loci (0.64%) gave inconsistent scoring across the three pairs of samples. The highly consistent results demonstrate that the SNP array-based analysis is reproducible with minimal variation introduced at each experimental step.
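The counts reported above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the function and variable names are illustrative and not part of the original analysis software.

```python
def fractional_locus_loss(n_imbalanced: int, n_informative: int) -> float:
    """Fraction of informative loci scored as showing allelic imbalance."""
    if n_informative <= 0:
        raise ValueError("n_informative must be positive")
    return n_imbalanced / n_informative


# Counts taken from the triplicate experiment described in the text.
passed = 470           # loci passing quality analysis in all replicates
imbalanced = 33        # consistently scored as allelic imbalance
no_imbalance = 117     # consistently scored as no allelic imbalance
not_informative = 317  # consistently scored as not informative

informative = imbalanced + no_imbalance
inconsistent = passed - (imbalanced + no_imbalance + not_informative)
fll = fractional_locus_loss(imbalanced, informative)
inconsistent_pct = 100.0 * inconsistent / passed

print(f"informative loci: {informative}")
print(f"FLL: {fll:.2f}")
print(f"inconsistent loci: {inconsistent} ({inconsistent_pct:.2f}%)")
```

The 33 imbalanced and 117 balanced loci together account for the 150 informative loci, and the remaining 3 of 470 loci are the ones with inconsistent scores.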

Figure 4
Reproducibility of the SNP array-based analysis. Loci were independently amplified and labeled three times from a pair of normal and aneuploid DNA samples. The paired samples generated by the three independent preparations were hybridized to six SNP arrays. ...

The extent of genome-wide chromosomal changes detected in the aneuploid population from esophageal adenocarcinoma (triplicate experiment) can be contrasted to that seen for the NF-2 tumor (Fig. 5). The significant difference in the number and location of events between the two tumor types may reflect the underlying biological differences between the benign NF-2 tumor and the malignant esophageal adenocarcinoma.

Figure 5
Genome-wide representation of the SNP-based analysis. (A) Genome-wide allelic imbalance detection using SNP markers in the same esophageal adenocarcinoma aneuploid population from the reproducibility experiment (Fig. 4). Of 558 SNP markers 470 passed ...

To confirm the array-based observations, we performed an independent analysis with polymorphic short tandem repeats (STRs) on the same aneuploid and normal DNA samples. We selected 81 STRs that mapped within or flanked SNP loci that had been scored as showing allelic imbalance in the triplicate experiment (for detailed criteria for scoring allelic imbalance, see Methods). The SNP arrays identified loci with allelic imbalance on nine chromosomes (4, 5, 6, 7, 8, 11, 12, 13, and 18; Fig. 5), and eight of the nine chromosome regions were confirmed to have allelic imbalance by STR analysis (Fig. 6). On multiple chromosomes, the losses extended across large regions. For example, on chromosome 7 the loss region identified by SNP analysis extended at least 92 cM, and STR analysis confirmed that the loss was contiguous throughout this entire region. In the single unconfirmed case (chromosome 13), the STR markers used in this region were not informative for this specific individual and, therefore, the event identified by the SNP array could not be confirmed by the STR analysis. For the STR analysis, rigorous criteria were used for calling allelic imbalance (see Methods). While we cannot rule out chromosome copy number changes for some loci with allelic imbalance, the majority (80%) of STR loci with allelic imbalance showed complete loss of one allele (Fig. 6). These data strongly suggest that, in the majority of cases, the observed allelic imbalance was the result of an LOH event.

Figure 6
Representative examples of LOH assessed by gel-based STR analysis. Shown are examples of loss (in the aneuploid populations) of the shorter allele of tetranucleotide repeats (A–C), loss of the longer allele (D–F) and loss with dinucleotides ...

Genome-wide Analysis in Esophageal Adenocarcinomas

We performed a genome-wide analysis with the SNP arrays on 10 patients with either high-grade dysplasia (HGD), the precursor to esophageal adenocarcinoma, or esophageal adenocarcinoma. For each patient, the normal DNA was derived from control gastric tissues whereas the tumor DNA was extracted from flow-cytometrically purified aneuploid populations. The aneuploid cell populations comprised, on average, 67% of the cells per biopsy, but after flow-cytometric cell sorting, the aneuploid populations were >95% pure. Figure 7 shows the SNPs with allelic imbalance for a subset of the aneuploid populations. In general, a larger number of chromosomal events were observed for patients who had developed cancer than for those with HGD, consistent with data from previous studies (Barrett et al. 1996). Previously published data suggest that premalignant tissues typically contain fewer chromosomal aberrations than cancers and that losses frequently involve regions on chromosomes 9p and 17p, which were detected with the SNP arrays (Paulson et al. 1999).

Figure 7
Allelic imbalance throughout the genome in aneuploid populations derived from high-grade dysplasia (HGD) and cancer (CA) cells. Genomic DNA was obtained from both a flow-purified aneuploid population and constitutional DNA from a gastric control biopsy ...

Next, we compared the array-based results with those obtained with a previously designed set of STR markers, composed primarily of tetranucleotide repeats. We performed an independent analysis on three chromosomes (9, 17, and 18) in the same 10 aneuploid populations. A high frequency of LOH, as evidenced by complete loss of one allele, on these three chromosomes is known to be associated with esophageal cancer (Reid et al. 1996), and the STR markers were previously selected to increase the sensitivity of detection in targeted regions on these chromosomes. The SNP markers, on the other hand, were chosen randomly with no bias toward targeted regions. In addition, because the STRs were not selected to be in regions covering or flanking the SNPs used on the array, we expected to see some degree of discordance. Nonetheless, the SNP array and the STR analysis show consistent identification of allelic imbalance events on 24 of 30 chromosomes (Fig. 8). For 5 chromosomes (patients 2 and 7 on chromosome 9; patients 2, 4 and 5 on chromosome 18) no loss was detected by either technique, even though there were many informative markers. On 4 of 30 chromosomes (13%), allelic imbalance was detected in the STR analysis but not detected by the SNPs, as a result of either the absence of informative markers (patients 4 and 9 on chromosome 17; patient 8 on chromosome 18) or a false negative (patient 1 on chromosome 9). On 2 of 30 chromosomes (6.7%), allelic imbalance was detected by a single SNP marker but was not confirmed by the STR analysis (patients 2 and 9 on chromosome 17). It is possible that the SNPs were mapped incorrectly or the STR analysis missed the events. Interstitial losses were also detected by both techniques on chromosome 18 (patient 9). Many examples of partial chromosomal losses were identified by both techniques (e.g., patients 3, 6, and 10 on chromosome 17).
The comparison between the standard gel-based results and the SNP array-based results shows that, given a sufficient number of polymorphic markers, the SNP arrays can be used to screen for both small and large chromosomal losses. A higher density of SNP markers will help increase coverage and resolution, allowing a greater fraction of the genome to be checked simultaneously for somatic cell chromosome abnormalities.

Figure 8
Comparison of the SNP array-based and microsatellite STR-based analyses for chromosomes 9, 17, and 18. For each normal and aneuploid pair, the SNP results are shown on the left and the STR results on the right. For the SNP data, allelic imbalance was ...

Detection of Allelic Imbalance in Heterogeneous Samples

Because premalignant and tumor samples are often heterogeneous, containing normal cells as well as neoplastic cell populations, it is important to be able to detect chromosomal changes in nonhomogeneous cell populations. Known loci with allelic imbalance (identified in the triplicate experiment and validated by STR analysis) were used to detect allelic imbalance in simulated heterogeneous samples. DNA from the flow-cytometrically purified aneuploid population was mixed into DNA from the same patient’s normal control sample in increasing amounts (0%, 5%, 10%, 25%, 50%, 75%, 90%, 95%, and 100%) to simulate the heterogeneity of biopsy samples. DNA was mixed either prior to or after the locus-specific multiplex amplification and labeling reactions to determine whether the amplification procedures affected the relative representation of the two alleles (Fig. 9A,B). Two sets of samples, nine mixed before and nine mixed after the PCR steps, were applied to 18 separate SNP arrays and hybridized under identical conditions. The same aneuploid population was used in this experiment as in the previous triplicate experiments, in which we identified 33 loci with allelic imbalance by comparing normal (0% aneuploid) and aneuploid (100%) samples. The mixing experiment was repeated three times, and 28 of the 33 loci passed the quality test for all 18 mixtures. Figure 9A shows that the P values for one of the markers change linearly as a function of the percentage aneuploid DNA in the sample. As expected, the genotype for this marker gradually shifts from being clearly heterozygous in the pure normal sample to being homozygous as the proportion of aneuploid DNA increases (Fig. 9A). To show the overall behavior of all 28 loci, the P values were averaged for the 13 loci shifting to an AA genotype and for the 15 loci shifting to a BB genotype from their initially heterozygous state (Fig. 9B).
Figure 9C shows a comparison of difference scans for the 50% mixture and the 100% aneuploid samples. The data show that the ΔP values for the 50% mixed sample decrease, compared with those for the 100% aneuploid sample, as expected. If the same difference threshold (ΔP = 20, as indicated by the dashed line) is applied to the data for the 50% mixed sample, 18 of the 28 loci show differences above the threshold. If the difference threshold is lowered to 15, 26 of the 28 loci are scored as allelic imbalance in the 50% mixed sample (Fig. 9C). However, lowering the threshold also resulted in three additional loci in the 100% aneuploid sample being scored as allelic imbalance. Further tests are required to determine whether these three are real and to determine the best threshold for investigations of heterogeneous samples.
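The effect of the threshold choice in mixed samples can be illustrated with a small sketch. It assumes a simple linear mixing model, in line with the linear change in P reported for Figure 9A, and uses two hypothetical loci whose P values are invented for illustration. The function names are likewise illustrative, and only the thresholds of 20 and 15 come from the text.

```python
def mixed_p(p_normal: float, p_aneuploid: float, aneuploid_fraction: float) -> float:
    """Expected P for a DNA mixture under an assumed linear mixing model."""
    if not 0.0 <= aneuploid_fraction <= 1.0:
        raise ValueError("aneuploid_fraction must be between 0 and 1")
    return (1.0 - aneuploid_fraction) * p_normal + aneuploid_fraction * p_aneuploid


def exceeds_threshold(p_normal: float, p_sample: float, threshold: float) -> bool:
    """True when |delta P| between sample and normal is above the threshold."""
    return abs(p_sample - p_normal) > threshold


# Hypothetical loci: (P in normal sample, P in 100% aneuploid sample).
loci = {"locus_1": (50.0, 5.0), "locus_2": (50.0, 15.0)}

calls = {}
for name, (p_norm, p_aneu) in loci.items():
    p_half = mixed_p(p_norm, p_aneu, 0.5)  # 50% aneuploid mixture
    calls[name] = {
        "p_50": p_half,
        "call_at_20": exceeds_threshold(p_norm, p_half, 20.0),
        "call_at_15": exceeds_threshold(p_norm, p_half, 15.0),
    }
    print(name, calls[name])
```

Under this model the difference at 50% is half the difference at 100%, so a locus needs a full-sample difference above 40 to pass a threshold of 20 in the 50% mixture. The second hypothetical locus falls short of that and is recovered only when the threshold is lowered to 15.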

Figure 9
Test of array-based difference detection in heterogeneous populations. The DNA derived from the aneuploid population was mixed into DNA derived from the same patient’s normal cells with increasing percentages of 0%, 5%, 10%, 25%, 50%, 75%, 90%, ...


We have demonstrated the feasibility of using SNPs and high-density oligonucleotide arrays in genome-wide screening for allelic imbalance in human tumors. The SNP array used here yielded ~150 informative loci per patient, comparable with the number of STRs used in current genome-wide LOH screens. For example, in 17 different genome-wide allelotype studies conducted over the past 5 years, the average number of STRs used was 120, and the study using the largest number of loci for LOH analysis included 280 STR polymorphisms (Field et al. 1995; Hahn et al. 1995; Takeuchi et al. 1995; Califano et al. 1996; Johns et al. 1996; Tamura et al. 1996; Baccichet et al. 1997; Boige et al. 1997; Gleeson et al. 1997; Kawanishi et al. 1997; Mori et al. 1997; Chambon-Pautas et al. 1998; Hatta et al. 1998; Piao et al. 1998; Shih et al. 1998; Mao et al. 1999; Yustein et al. 1999). However, for prognostic and diagnostic utility, genome-wide analysis will require a greater number of SNP markers that are more evenly distributed throughout the genome. In addition, because of the lower average heterozygosity rate of SNPs (0.33) compared with STRs, approximately three times the number of SNPs are required for an equivalent resolution (Kruglyak 1997). Higher density SNP arrays should greatly increase the ability to detect small regions of chromosomal changes and will provide more information regarding the boundaries of loss regions. In addition, more markers increase confidence in a detected event: If multiple adjacent SNPs all show a consistent change, the confidence in the call is much higher than if it is based on only a single SNP. It is clearly feasible to increase the density of SNP markers as SNPs are abundant in the human genome and SNP discovery and mapping is rapidly advancing (Wang et al. 1998; Cargill et al. 1999; Halushka et al. 1999). 
Because the array-based readout is parallel and scalable, larger numbers of markers can be assayed simultaneously without significant increases in time or labor.
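The figure of ~150 informative loci per patient is roughly what the array design predicts: about 440 polymorphic loci with an average heterozygosity of 0.33. The sketch below makes that arithmetic explicit and inverts it for a hypothetical target of 1000 informative loci. The target and the function names are assumptions for illustration only.

```python
import math


def expected_informative(n_polymorphic: int, mean_heterozygosity: float) -> float:
    """Expected number of heterozygous (informative) loci in one individual."""
    if n_polymorphic < 0 or not 0.0 <= mean_heterozygosity <= 1.0:
        raise ValueError("invalid locus count or heterozygosity")
    return n_polymorphic * mean_heterozygosity


def loci_needed(target_informative: int, mean_heterozygosity: float) -> int:
    """Polymorphic loci required to expect a given number of informative loci."""
    if target_informative < 0 or not 0.0 < mean_heterozygosity <= 1.0:
        raise ValueError("invalid target or heterozygosity")
    return math.ceil(target_informative / mean_heterozygosity)


expected = expected_informative(440, 0.33)  # values from the array design
needed = loci_needed(1000, 0.33)            # hypothetical target

print(f"expected informative loci: {expected:.0f}")
print(f"polymorphic loci needed for 1000 informative: {needed}")
```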

SNP arrays have many advantages for LOH detection compared with traditional techniques. The PCR products containing SNP loci are typically smaller and more readily amplified in parallel than with STRs, and may be better for amplifying DNA from formalin-fixed or compromised tissues. Also, the amount of cellular DNA required to interrogate a SNP on an array is significantly less than that required for standard STR analysis, providing an opportunity to evaluate limited clinical samples.

Surgically removed tumor tissues often contain some normal cells that can interfere with the detection of changes in tumor cells. Therefore, it is important to be able to detect chromosomal changes in heterogeneous samples in which the tumor cells may represent only a portion of the sampled cell population. We simulated a heterogeneous cell population by preparing a mixture of purified aneuploid DNA with normal control DNA from the same patient. With the SNP arrays, we were able to detect chromosomal changes in heterogeneous samples, and changes can be clearly and reproducibly identified in samples with a background of up to 50% normal DNA (Fig. 9A–C). As described previously, high sample purity is required to distinguish true LOH from other types of allelic imbalance because of the confounding effects of normal cell contamination (Barrett et al. 1996; Boige et al. 1997; Paulson et al. 1999). Our mixing experiments reinforce the importance of working with purified samples to distinguish between true LOH and other mechanisms of allelic imbalance.

At present the SNP-based method cannot distinguish between loss and gain of alleles. With higher density SNP arrays, it may be possible to use signal intensity differences between tumor and normal samples to indicate chromosomal loss or gain. In a recent study, 3360 mapped cDNAs were used in a microarray hybridization assay (Pollack et al. 1999). This technique provides an approach for the detection of DNA copy number changes, which is complementary to a SNP-based method that detects changes in allelic representation.

The identification and mapping of additional SNP markers is rapidly advancing, and array-based methods provide a scalable approach to the simultaneous genotyping of thousands of markers in parallel. The availability of more markers and higher capacity array designs will allow efficient, genome-wide, high-resolution searches for chromosomal changes associated with tumor initiation and progression. The patterns of chromosomal alterations may be useful for diagnostic purposes and to follow disease progression and guide patient care.


Flow-Cytometric Purification and STR Analysis in Esophageal Adenocarcinoma

Frozen endoscopic or surgical biopsies were processed by DNA content flow cytometry to purify aneuploid cells from normal cells as described previously (Paulson et al. 1999). Aneuploid populations separated by this method have a high degree of purity and typically represent clonal populations (Barrett et al. 1999). The use of purified aneuploid populations allows for detection of near 100% LOH at some loci, making these samples ideal for comparing different LOH detection methodologies in human biopsy samples. DNA was extracted using the Puregene DNA Isolation Kit (Gentra Systems, Inc.). STR polymorphisms used consisted of primarily tetranucleotide repeats shown previously to have a high degree of reproducibility for scoring LOH (Paulson et al. 1999). Locus-specific PCR reagents and conditions for STR amplification and analysis were described previously (Paulson et al. 1999). PCR products were analyzed on an ABI 377 DNA Sequencer, and data were processed by use of Genotyper software (PE Applied Biosystems). Allelic imbalance was assessed by measurement of the ratio of fluorescence intensity for the shorter allele A to that of the longer allele B (A/B) in the aneuploid sample, compared with a normal constitutive control. Ratios <0.4 or >2.5 (depending on which allele was lost) were considered to be indicative of allelic imbalance.
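The STR scoring rule can be sketched as follows. The text does not spell out how the aneuploid A/B ratio is "compared with" the normal control, so this sketch assumes the aneuploid ratio is divided by the normal ratio. That normalization, the function names, and the peak intensities in the usage lines are all assumptions; only the 0.4 and 2.5 cutoffs come from the text.

```python
import math


def normalized_ratio(tumor_a: float, tumor_b: float,
                     normal_a: float, normal_b: float) -> float:
    """(A/B in aneuploid sample) divided by (A/B in normal control).

    Dividing by the normal ratio is an assumed normalization.
    """
    if normal_a <= 0 or normal_b <= 0:
        raise ValueError("normal control must show both alleles")
    if tumor_a < 0 or tumor_b < 0 or (tumor_a == 0 and tumor_b == 0):
        raise ValueError("aneuploid sample needs at least one allele signal")
    if tumor_b == 0:
        return math.inf  # longer allele B completely lost
    return (tumor_a / tumor_b) / (normal_a / normal_b)


def str_allelic_imbalance(ratio: float, low: float = 0.4, high: float = 2.5) -> bool:
    """Score allelic imbalance with the cutoffs stated in the text."""
    return ratio < low or ratio > high


# Hypothetical peak intensities (A = shorter allele, B = longer allele).
loss_of_a = normalized_ratio(300.0, 1000.0, 1000.0, 1000.0)
balanced = normalized_ratio(1000.0, 1000.0, 1000.0, 1000.0)
loss_of_b = normalized_ratio(1000.0, 0.0, 1000.0, 1000.0)

for label, r in [("loss of A", loss_of_a), ("balanced", balanced), ("loss of B", loss_of_b)]:
    print(label, r, str_allelic_imbalance(r))
```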

Amplification and Hybridization of SNPs

SNP-containing loci were amplified by allele-specific multiplex PCR from both tumor and normal genomic DNA. The multiplex PCR was performed by use of 46 PCR primer pairs in a single reaction (Wang et al. 1998). Forward and reverse primers contained T7 and T3 sequences, respectively (Wang et al. 1998). The PCR was performed under conditions similar to those described by Wang et al. (1998). The volume of PCR was 20 μl, containing 7 ng of genomic DNA, 0.1 μM of each primer, 1 unit of AmpliTaq Gold (Perkin-Elmer), 1 mM deoxynucleotide triphosphates (dNTPs), 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 5 mM MgCl2. Thermocycling was performed with initial denaturation at 96°C for 10 min, followed by 30 cycles of denaturation at 96°C for 30 sec, primer annealing at 55°C for 2 min, and primer extension at 65°C for 2 min. After 30 cycles, a final extension reaction was carried out at 65°C for 5 min. The length of the amplified PCR products was from 100 to 150 bp. An aliquot (2 μl) of the multiplex PCR products was subjected to a second round of PCR with biotinylated T7 and T3 primers. The reactions were performed with 0.1 μM labeled primer, 1 unit of AmpliTaq Gold, 100 μM dNTPs, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 1.5 mM MgCl2. Thermocycling was carried out with initial denaturation at 96°C for 10 min, followed by 25 cycles of denaturation at 96°C for 30 sec, primer annealing at 55°C for 1 min, and primer extension at 72°C for 1 min. After 25 cycles, a final extension reaction was carried out at 72°C for 5 min. The biotin-labeled products were pooled and denatured at 99°C for 15 min and chilled on ice for 3 min before being added into a hybridization solution [3 M tetramethylammonium chloride, 10 mM Tris (pH 7.8), 0.01% Triton X-100 and 0.1 mg/ml herring sperm DNA]. Biotin-labeled control oligonucleotide was also added to the hybridization solution to produce fluorescence signals at the corners of the image for proper grid alignment and image analysis.
An aliquot of 200 μl of the hybridization mixture was added to the flow cell of the SNP arrays. Hybridization was carried out at 40°C for 16 hr on a rotisserie (50 rpm). Following hybridization, arrays were washed with 6× SSPE buffer [0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA (pH 7.4)] at room temperature. Then, the arrays were stained with phycoerythrin-conjugated streptavidin (Molecular Probes, 2 μg/ml in 6× SSPE buffer, 0.01% Triton, 0.5 mg/ml BSA) on a rotisserie (50 rpm) for 10 min at room temperature. The arrays were washed again with 6× SSPE buffer and scanned with a custom-made scanning confocal microscope at a resolution of 3.4 μm per pixel (Trulson et al. 1997).

Data Analysis

Typical data analysis consisted of three sequential steps. First, the data underwent a quality analysis to reject loci lacking sufficiently strong and specific hybridization patterns. Second, the A-allele fraction (P) was calculated for loci that passed the quality analysis. Third, significant changes were assessed by calculating the difference of P values between the tumor sample and the corresponding normal sample at each locus.

Data Quality Analysis

The quality analysis was designed to identify and ignore loci that do not yield sufficiently clear hybridization patterns. This analysis is based on the idea that if a SNP marker is present in a target sample, it should hybridize to its complementary sequences tiled on the array and produce a specific hybridization pattern in which perfect match (PM) probes have higher intensity than mismatch (MM) probes. The intensity difference (PM − MM) and ratio (PM/MM) are calculated for each allele at each locus. For a given allele, if PM − MM > Difference Threshold (DT) and PM/MM > Ratio Threshold (RT), the allele is scored as present.
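A minimal sketch of the per-allele presence test follows. The threshold values used here are placeholders, since the text does not state the DT and RT that were finally selected, and the function name is illustrative.

```python
# Placeholder thresholds: the actual DT and RT values are not given in the text.
DT = 50.0  # hypothetical difference threshold (intensity units)
RT = 1.5   # hypothetical ratio threshold


def allele_present(pm: float, mm: float, dt: float = DT, rt: float = RT) -> bool:
    """Score an allele as present when PM - MM > DT and PM/MM > RT.

    The ratio test is written as pm > rt * mm, which is equivalent to
    PM/MM > RT for positive MM and avoids dividing by a zero MM value.
    """
    return (pm - mm) > dt and pm > rt * mm


print(allele_present(500.0, 100.0))    # strong, specific signal
print(allele_present(120.0, 100.0))    # fails the difference test
print(allele_present(1100.0, 1000.0))  # passes difference, fails ratio
```

Requiring both tests guards against two different failure modes: weak signals that happen to have a high ratio, and bright but nonspecific signals where PM and MM are nearly equal.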

The appropriate values of DT and RT were developed and optimized through a series of analyses with a set of known control samples. In these experiments, 558 SNP markers from three individuals were amplified by single PCRs. In all three individuals, 510 of these PCRs yielded a single specific product of the expected size. These 510 loci were used as controls for false negative scores because they should give a specific hybridization pattern and be scored as present. To test for false positive scores, 42 SNPs were deliberately not amplified and therefore should give no hybridization pattern and be scored as absent. The products of these single PCRs for the three individuals were pooled and hybridized to three separate SNP arrays. False positive and false negative rates were measured for different combinations of DT and RT values, and the thresholds that gave the lowest overall false positive and false negative rates were selected.
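
One way to picture this threshold selection is as a grid search over candidate (DT, RT) pairs. The sketch below is an assumption-laden illustration, not the procedure used here: the function names, the toy intensities, and the choice to minimize the simple sum of the two error rates are all hypothetical.

```python
from itertools import product


def is_present(pm, mm, dt, rt):
    """Presence call: PM - MM > DT and PM / MM > RT (ratio test rearranged)."""
    return (pm - mm) > dt and pm > rt * mm


def error_rates(positives, negatives, dt, rt):
    """Return (false_positive_rate, false_negative_rate) for one threshold pair.

    positives: (PM, MM) pairs for loci expected to be scored present.
    negatives: (PM, MM) pairs for loci expected to be scored absent.
    """
    fn = sum(not is_present(pm, mm, dt, rt) for pm, mm in positives) / len(positives)
    fp = sum(is_present(pm, mm, dt, rt) for pm, mm in negatives) / len(negatives)
    return fp, fn


def select_thresholds(positives, negatives, dt_grid, rt_grid):
    """Pick the (DT, RT) pair with the lowest FP + FN (assumed criterion)."""
    return min(
        product(dt_grid, rt_grid),
        key=lambda pair: sum(error_rates(positives, negatives, *pair)),
    )


if __name__ == "__main__":
    # Toy intensities, invented for illustration.
    positives = [(500, 100), (300, 120), (150, 60)]
    negatives = [(120, 100), (60, 50), (200, 150)]
    print(select_thresholds(positives, negatives, [10, 40, 80, 200], [1.1, 1.5, 3.0]))
```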

To score a locus, we also analyzed all probes that represented the marker. As shown in Fig. 1B, both the A- and B-allele tiles together at each position define a miniblock. If the signal for both alleles failed the DT and RT criteria, the miniblock was ignored. One strand of the marker was represented by 5 miniblocks (positions −4, −1, 0, +1, and +4). If 3 miniblocks failed, the block was ignored. Both strands were queried for each marker using the same block structure. Therefore, if both the sense and antisense blocks failed, the marker was ignored.
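
The miniblock, block, and marker rules can be expressed as three nested checks. The following is a minimal sketch with hypothetical function names; it assumes that "3 miniblocks failed" means three or more, and it takes the per-allele presence calls as given inputs.

```python
# Assumption: a block is ignored once three or more of its five
# miniblocks fail, so at most two failures are tolerated.
MAX_FAILED_MINIBLOCKS = 2


def miniblock_passes(a_present, b_present):
    """A miniblock is ignored only when both alleles fail DT and RT."""
    return a_present or b_present


def block_passes(miniblocks):
    """miniblocks: five (a_present, b_present) pairs for one strand,
    corresponding to positions -4, -1, 0, +1, and +4."""
    failed = sum(not miniblock_passes(a, b) for a, b in miniblocks)
    return failed <= MAX_FAILED_MINIBLOCKS


def marker_passes(sense, antisense):
    """A marker is ignored only when both strand blocks fail."""
    return block_passes(sense) or block_passes(antisense)


if __name__ == "__main__":
    good = [(True, False)] * 5
    bad = [(False, False)] * 3 + [(True, True)] * 2
    print(marker_passes(good, bad))  # one usable strand is enough
    print(marker_passes(bad, bad))   # both strands fail
```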

Genotype Analysis

To determine the genotype, we used an algorithm that estimates the fraction of the A allele for each marker. The average percentage fraction of the A allele is defined as


where a and b represent the A and B allele, respectively, and MM is the average of the MM values as shown in Fig. 1B. Ideally, P = 100 (homozygous AA), P = 50 (heterozygous AB), or P = 0 (homozygous BB). To define the experimental deviation from ideality, a reference P range for each genotype was determined empirically by hybridizing samples from 39 unrelated individuals of known genotypes as described previously (Wang et al. 1998). The range of P values for each marker was defined by the presence of three distinct clusters representing the three genotypes. Although the absolute genotype calls are not crucial for the difference analysis, it is necessary to have clear distinctions between the P values for heterozygous calls in normal samples and homozygous calls in tumor samples.
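
As an illustration of how such a fraction can be computed, the sketch below uses an assumed form, P = 100 × (PM_a − MM) / [(PM_a − MM) + (PM_b − MM)], averaged over the miniblocks of a marker. This form is not taken from the text; it was chosen only because it reproduces the ideal values P = 100, 50, and 0 stated above. The function name and the clipping of negative background-subtracted signals to zero are likewise assumptions.

```python
def a_allele_fraction(miniblocks):
    """Average percentage fraction of the A allele over miniblocks.

    miniblocks: iterable of (pm_a, mm_a, pm_b, mm_b) intensities.
    Assumed form (not confirmed by the text):
        P = 100 * (PM_a - MM) / ((PM_a - MM) + (PM_b - MM)),
    where MM is the mean of the two mismatch intensities.
    Returns None when no miniblock has a positive total signal.
    """
    fractions = []
    for pm_a, mm_a, pm_b, mm_b in miniblocks:
        mm = (mm_a + mm_b) / 2.0
        # Assumption: negative background-subtracted signals are clipped to 0.
        signal_a = max(pm_a - mm, 0.0)
        signal_b = max(pm_b - mm, 0.0)
        total = signal_a + signal_b
        if total > 0:
            fractions.append(100.0 * signal_a / total)
    if not fractions:
        return None
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    print(a_allele_fraction([(1100, 100, 100, 100)]))  # ideal AA
    print(a_allele_fraction([(600, 100, 600, 100)]))   # ideal AB
    print(a_allele_fraction([(100, 100, 1100, 100)]))  # ideal BB
```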

Analysis of Allelic Imbalance

Allelic imbalance was assessed by measuring the difference in P values between normal and tumor samples from the same individual. The difference value is defined as ΔP = |P_N − P_T|, where N denotes the normal sample and T denotes the tumor sample. Criteria for scoring allelic imbalance were developed with a training data set containing two normal samples and two tumor samples with known deletions (data not shown) and confirmed by the triplicate experiment (Figs. 4 and 5). First, for a marker to be considered potentially informative, the P value for the normal sample had to be in the heterozygous range (75 ≥ P_N ≥ 25). For a marker to be considered changed, the P value for the tumor sample had to be in the homozygous range (P_T > 75 or P_T < 25), and ΔP had to exceed 20. Finally, a change had to be consistent across the five miniblocks of each probe set, and if both strands of a marker passed the quality analysis, the change had to be called consistently for both strands.
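
The scoring criteria can be summarized in a short sketch. The function names are hypothetical, and two points are assumptions rather than statements from the text: the homozygous range is taken as the strict complement of the heterozygous range (P_T > 75 or P_T < 25), and "consistent" is interpreted as every miniblock on every passing strand being called changed in the same direction.

```python
HET_LOW, HET_HIGH = 25.0, 75.0  # heterozygous range for the normal sample
MIN_DELTA_P = 20.0              # minimum |P_N - P_T| for a change


def is_informative(p_normal):
    """A marker is potentially informative when the normal sample is heterozygous."""
    return HET_LOW <= p_normal <= HET_HIGH


def is_changed(p_normal, p_tumor):
    """Changed: informative, tumor P in the homozygous range, and delta P > 20."""
    if not is_informative(p_normal):
        return False
    tumor_homozygous = p_tumor > HET_HIGH or p_tumor < HET_LOW
    return tumor_homozygous and abs(p_normal - p_tumor) > MIN_DELTA_P


def marker_imbalanced(strands):
    """strands: one list per strand that passed quality analysis, each
    holding (p_normal, p_tumor) pairs, one pair per miniblock."""
    pairs = [pair for strand in strands for pair in strand]
    if not pairs or not all(is_changed(pn, pt) for pn, pt in pairs):
        return False
    # Assumed reading of "consistent": all shifts point the same way.
    directions = {pt > pn for pn, pt in pairs}
    return len(directions) == 1


if __name__ == "__main__":
    print(marker_imbalanced([[(50, 90)] * 5, [(48, 88)] * 5]))
    print(marker_imbalanced([[(50, 90)] * 4 + [(50, 60)]]))
```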


We thank David Wang and Eric Lander for providing multiplex primer pools, PCR conditions, and the SNP map, and Don Morris for designing the SNP array. This work was partially supported by National Institutes of Health Grants R01 CA61202 and RFA CA78855 to P.C.G. and B.J.R.


  • Baccichet A, Qualman SK, Sinnett D. Allelic loss in childhood acute lymphoblastic leukemia. Leuk Res. 1997;21:817–823.
  • Barrett MT, Galipeau PC, Sanchez CA, Emond MJ, Reid BJ. Determination of the frequency of loss of heterozygosity in esophageal adenocarcinoma by cell sorting, whole genome amplification and microsatellite polymorphisms. Oncogene. 1996;12:1873–1878.
  • Barrett MT, Sanchez CA, Prevo LJ, Wong DJ, Galipeau PC, Rabinovitch PS, Reid BJ. Evolution of neoplastic cell lineages in Barrett oesophagus. Nature Genet. 1999;23:106–109.
  • Boige V, Laurent-Puig P, Fouchet P, Flejou JF, Monges G, Bedossa P, Bioulac-Sage P, Capron F, Schmitz A, Olschwang S, et al. Concerted nonsyntenic allelic losses in hyperploid hepatocellular carcinoma as determined by a high-resolution allelotype. Cancer Res. 1997;57:1986–1990.
  • Brown MA. Tumor suppressor genes and human cancer. Adv Genet. 1997;36:45–135.
  • Califano JA, Johns MM, III, Westra WH, Lango MN, Eisele D, Saji M, Zeiger MA, Udelsman R, Koch WM, Sidransky D. An allelotype of papillary thyroid cancer. Int J Cancer. 1996;69:442–444.
  • Cargill M, Altshuler D, Ireland J, Sklar P, Ardlie K, Patil N, Lane CR, Lim EP, Kalayanaraman N, Nemesh J, et al. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nature Genet. 1999;22:231–238.
  • Cavenee WK, Dryja TP, Phillips RA, Benedict WF, Godbout R, Gallie BL, Murphree AL, Strong LC, White RL. Expression of recessive alleles by chromosomal mechanisms in retinoblastoma. Nature. 1983;305:779–784.
  • Chambon-Pautas C, Cave H, Gerard B, Guidal-Giroux C, Duval M, Vilmer E, Grandchamp B. High-resolution allelotype analysis of childhood B-lineage acute lymphoblastic leukemia. Leukemia. 1998;12:1107–1113.
  • Chee MS, Yang R, Hubbell E, Berno A, Huang XC, Stern D, Winkler J, Lockhart DJ, Morris MS, Fodor SP. Accessing genetic information with high-density DNA arrays. Science. 1996;274:610–614.
  • Dolan K, Garde J, Gosney J, Sissons M, Wright T, Kingsnorth AN, Walker SJ, Sutton R, Meltzer SJ. Allelotype analysis of oesophageal adenocarcinoma: loss of heterozygosity occurs at multiple sites. Cancer Res. 1998;78:950–957.
  • Field JK, Kiaris H, Risk JM, Tsiriyotis C, Adamson R, Zoumpourlis V, Rowley H, Taylor K, Whittaker J, Howard P, et al. Allelotype of squamous cell carcinoma of the head and neck: Fractional allele loss correlates with survival. Br J Cancer. 1995;72:1180–1188.
  • Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, Solas D. Light-directed, spatially addressable parallel chemical synthesis. Science. 1991;251:767–773.
  • Fodor SP, Rava RP, Huang XC, Pease AC, Holmes CP, Adams CL. Multiplexed biochemical assays with biological chips. Nature. 1993;364:555–556.
  • Forozan F, Karhu R, Kononen J, Kallioniemi A, Kallioniemi OP. Genome screening by comparative genomic hybridization. Trends Genet. 1997;13:405–409.
  • Fults D, Pedone CA, Thomas GA, White R. Allelotype of human malignant astrocytoma. Cancer Res. 1990;50:5784–5789.
  • Gleeson CM, Sloan JM, McGuigan JA, Ritchie AJ, Weber JL, Russell SE. Allelotype analysis of adenocarcinoma of the gastric cardia. Br J Cancer. 1997;76:1455–1465.
  • Gruis NA, Abeln EC, Bardoel AF, Devilee P, Frants RR, Cornelisse CJ. PCR-based microsatellite polymorphisms in the detection of loss of heterozygosity in fresh and archival tumor tissue. Br J Cancer. 1993;68:308–313.
  • Gunderson KL, Huang XC, Morris MS, Lipshutz RJ, Lockhart DJ, Chee MS. Mutation detection by ligation to complete n-mer DNA arrays. Genome Res. 1998;8:1142–1153.
  • Gupta PK, Sahota A, Boyadjiev SA, Bye S, Shao C, O’Neill JP, Hunter TC, Albertini RJ, Stambrook PJ, Tischfield JA. High frequency in vivo loss of heterozygosity is primarily a consequence of mitotic recombination. Cancer Res. 1997;57:1188–1193.
  • Hagstrom SA, Dryja TP. Mitotic recombination map of 13cen-13q14 derived from an investigation of loss of heterozygosity in retinoblastomas. Proc Natl Acad Sci. 1999;96:2952–2957.
  • Hahn SA, Seymour AB, Hoque AT, Schutte M, da Costa LT, Redston MS, Caldas C, Weinstein CL, Fischer A, Yeo CJ, et al. Allelotype of pancreatic adenocarcinoma using xenograft enrichment. Cancer Res. 1995;55:4670–4675.
  • Halushka MK, Fan JB, Bentley K, Hsie L, Shen N, Weder A, Cooper R, Lipshutz R, Chakravarti A. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nature Genet. 1999;22:239–247.
  • Hansen MF, Cavenee WK. Genetics of cancer predisposition. Cancer Res. 1987;47:5518–5527.
  • Hatta Y, Yamada Y, Tomonaga M, Said JW, Miyosi I, Koeffler HP. Allelotype analysis of adult T-cell leukemia. Blood. 1998;92:2113–2117.
  • Hammoud ZT, Kaleem Z, Cooper JD, Sundaresan RS, Patterson AG, Goodfellow PJ. Allelotype analysis of esophageal adenocarcinomas: evidence for the involvement of sequences on the long arm of chromosome 4. Cancer Res. 1996;56:4499–4502.
  • Johns MM, III, Westra WH, Califano JA, Eisele D, Koch WM, Sidransky D. Allelotype of salivary gland tumors. Cancer Res. 1996;56:1151–1154.
  • Kawanishi M, Kohno T, Otsuka T, Adachi J, Sone S, Noguchi M, Hirohashi S, Yokota J. Allelotype and replication error phenotype of small cell lung carcinoma. Carcinogenesis. 1997;18:2057–2062.
  • Knudson AG. Hereditary cancer, oncogenes, and anti-oncogenes. Cancer Res. 1985;45:1437–1443.
  • Kruglyak L. The use of a genetic map of biallelic markers in linkage studies. Nature Genet. 1997;17:21–24.
  • Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, et al. Expression monitoring by hybridization to oligonucleotide arrays. Nature Biotechnol. 1996;14:1675–1680.
  • Mao X, Barfoot R, Hamoudi RA, Easton DF, Flanagan AM, Stratton MR. Allelotype of uterine leiomyomas. Cancer Genet Cytogenet. 1999;114:89–95.
  • Mori N, Morosetti R, Lee S, Spira S, Ben-Yehuda D, Schiller G, Landolfi R, Mizoguchi H, Koeffler HP. Allelotype analysis in the evolution of chronic myelocytic leukemia. Blood. 1997;90:2010–2014.
  • Paulson TG, Galipeau PC, Reid BJ. Loss of heterozygosity analysis using whole genome amplification, cell sorting, and fluorescence-based PCR. Genome Res. 1999;9:482–491.
  • Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci. 1994;91:5022–5026.
  • Piao Z, Park C, Park JH, Kim H. Allelotype analysis of hepatocellular carcinoma. Int J Cancer. 1998;75:29–33.
  • Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, Jeffrey SS, Botstein D, Brown PO. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nature Genet. 1999;23:41–46.
  • Reid BJ, Barrett MT, Galipeau PC, Sanchez CA, Neshat K, Cowan DS, Levine DS. Barrett’s esophagus: Ordering of the events that lead to cancer. European J Cancer Prev. 1996;5:57–65.
  • Sato T, Tanigami A, Yamakawa K, Akiyama F, Kasumi F, Sakamoto G, Nakamura Y. Allelotype of breast cancer: Cumulative allele losses promote tumor progression in primary breast cancer. Cancer Res. 1990;50:7184–7189.
  • Seizinger BR, Martuza RL, Gusella JF. Loss of genes on chromosome 22 in tumorigenesis of human acoustic neuroma. Nature. 1986;322:644–647.
  • Shih YC, Kerr J, Hurst TG, Khoo SK, Ward BG, Chenevix-Trench G. No evidence for microsatellite instability from allelotype analysis of benign and low malignant potential ovarian neoplasms. Gynecol Oncol. 1998;69:210–213.
  • Stanbridge EJ. Human tumor suppressor genes. Annu Rev Genet. 1990;24:615–657.
  • Takeuchi S, Bartram CR, Wada M, Reiter A, Hatta Y, Seriu T, Lee E, Miller CW, Miyoshi I, Koeffler HP. Allelotype analysis of childhood acute lymphoblastic leukemia. Cancer Res. 1995;55:5377–5382.
  • Tamura G, Sakata K, Nishizuka S, Maesawa C, Suzuki Y, Terashima M, Eda Y, Satodate R. Allelotype of adenoma and differentiated adenocarcinoma of the stomach. J Pathol. 1996;180:371–377.
  • Thrash-Bingham CA, Greenberg RE, Howard S, Bruzel A, Bremer M, Goll A, Salazar H, Freed JJ, Tartof KD. Comprehensive allelotyping of human renal cell carcinomas using microsatellite DNA probes. Proc Natl Acad Sci. 1995;92:2854–2858.
  • Trulson MO, Stern D, Walton ID, Suseno AD, Rava RP. Fluorescence microscopy of oligonucleotide probe arrays. Proc SPIE. 1997;2980:145–148.
  • Tsuchiya E, Nakamura Y, Weng SY, Nakagawa K, Tsuchiya S, Sugano H, Kitagawa T. Allelotype of non-small cell lung carcinoma-comparison between loss of heterozygosity in squamous cell carcinoma and adenocarcinoma. Cancer Res. 1992;52:2478–2481.
  • Vogelstein B, Fearon ER, Kern SE, Hamilton SR, Preisinger AC, Nakamura Y, White R. Allelotype of colorectal carcinomas. Science. 1989;244:207–210.
  • Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, et al. Large-scale identification, mapping, and genotyping of single nucleotide polymorphisms in the human genome. Science. 1998;280:1077–1082.
  • Wodicka L, Dong H, Mittmann M, Ho MH, Lockhart DJ. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nature Biotechnol. 1997;15:1359–1372.
  • Wolff RK, Frazer KA, Jackler RK, Lanser MJ, Pitts LH, Cox DR. Analysis of chromosome 22 deletions in neurofibromatosis type 2-related tumors. Am J Hum Genet. 1992;51:478–485.
  • Yamaguchi T, Toguchida J, Yamamuro T, Kotoura Y, Takada N, Kawaguchi N, Kaneko Y, Nakamura Y, Sasaki MS, Ishizaki K. Allelotype analysis in osteosarcomas: Frequent allele loss on 3q, 13q, 17p, and 18q. Cancer Res. 1992;52:2419–2423.
  • Yustein AS, Harper JC, Petroni GR, Cummings OW, Moskaluk CA, Powell SM. Allelotype of gastric adenocarcinoma. Cancer Res. 1999;59:1437–1441.