Figure S1: BAC FISH Analysis of Gene Predicted to Be Reduced Only in Orangutan
FISH images of BAC probe RP11-93K3 containing sequences from IMAGE cDNA clone 1882505.
(A) Normal human control, PHA-stimulated lymphocytes. Multiple (10–12) signals present: 9p12, 9q12, 4q arm, and two acrocentric p arm regions. The Chromosome 9 signals appear to flank the 9q heterochromatin and centromere regions, with the p arm signal appearing as a double signal.
(B) Bonobo fibroblast culture. Multiple (10–12) signals.
(C) Chimpanzee fibroblast culture. Multiple (10–12) signals.
(D) Gorilla fibroblast culture. Multiple (12–15) signals.
(E) Orangutan fibroblast culture. Two signals present on a pair of homologues (i.e., single copy in haploid genome). Also shown is a TreeView image (pseudocolor scale indicated) for IMAGE cDNA clone 1882505.
(3.29 MB EPS).
Figure S2: TreeView Image of cDNAs Selected Using Relaxed HLS Criteria
Figure shows a TreeView image of blocks of HLS genes selected using increasingly relaxed selection criteria. The top-most group represents HLS genes selected using the standard 0.5 cutoff value described in Materials and Methods, while successive groups (separated by gray bars) represent additional cDNAs that were selected as the cutoff was progressively reduced to 0.45, 0.4, 0.35, and 0.3.
(1.9 MB EPS).
Figure S3: FISH Analysis with BAC Probe RP11-432G15 Containing the FLJ22004 Gene
(A) Normal human control, PHA-stimulated lymphocytes. Signals at Chromosome 2q13 and 22qtel.
(B) Bonobo fibroblast culture. Four signals.
(C) Chimpanzee fibroblast culture. Four signals.
(D) Gorilla fibroblast culture. More than 30 signals. Hybridization to most subtelomeric regions.
(E) Orangutan fibroblast culture. No apparent red signal. Probe BAC RP11-1007701 in green included as internal hybridization control. Also shown are aCGH TreeView images (pseudocolor scale indicated) for three FLJ22004 cDNAs.
(6.94 MB EPS).
Figure S4: GO Categories
Pie graphs showing the distribution of GO molecular function categories within HLS, LS, and whole genome gene lists. The top 22 categories are named in the legend in descending order of representation for all three groups. Categories were ranked by normalizing each category value for HLS and LS lists to the genome-wide list and then ranking the sum of these values for each category. Less well-represented categories were omitted from the graphs in order to enhance legibility, and zero values are not listed.
(1.1 MB EPS).
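The category ranking described for Figure S4 can be sketched as follows. This is a minimal illustration, not the authors' script. It assumes that "normalizing to the genome-wide list" means dividing each HLS and LS category value by the genome-wide value for the same category. The function name, category names, and counts are hypothetical.

```python
def rank_categories(hls, ls, genome):
    """Rank GO categories by the sum of genome-normalized HLS and LS values.

    Each argument maps a category name to its count in that gene list.
    Assumption: normalization is a simple division by the genome-wide count.
    """
    scores = {}
    for category, genome_count in genome.items():
        if genome_count == 0:
            continue  # zero values cannot be normalized and are skipped
        hls_norm = hls.get(category, 0) / genome_count
        ls_norm = ls.get(category, 0) / genome_count
        scores[category] = hls_norm + ls_norm
    # Highest combined score first
    return sorted(scores, key=lambda c: scores[c], reverse=True)


# Hypothetical counts, for illustration only
genome_counts = {"binding": 100, "kinase": 50, "transport": 20}
hls_counts = {"binding": 10, "kinase": 10, "transport": 1}
ls_counts = {"binding": 20, "kinase": 10}

ranking = rank_categories(hls_counts, ls_counts, genome_counts)
```

Truncating `ranking` to its first 22 entries would mirror the legend's cutoff.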
Figure S5: Interhominoid RT-PCR Analysis
RT-PCR was used to provide an independent confirmation of interspecies cDNA aCGH data for three genes in which aCGH signals were different in the African great apes compared to human and orangutan. The chromosomal location, IMAGE clone ID, and GenBank accession numbers are provided for each cDNA. The species average log2 ratios for each cDNA clone and the copy number ratio of the test gene to the CFTR (control) gene, as determined by RT-PCR, are shown for the indicated species. Also shown are TreeView images of interhominoid aCGH results for the indicated cDNAs and a graphical depiction of the correlation between aCGH signal and copy number ratio to CFTR (RT-PCR).
(A) PLA2G4B cDNA clone located on human Chromosome 15 using the UCSC November 2002 human genome assembly. The correlation between RT-PCR and aCGH-based copy number estimates is 0.94.
(B) FLJ31659 cDNA clone located on human Chromosome 4 using the UCSC November 2002 human genome assembly. As in (A), the correlation between RT-PCR and aCGH data is 0.97.
(C) BC040199 transcript located on human Chromosome 7 using the UCSC November 2002 human genome assembly. As in (A), the correlation between RT-PCR and aCGH data is 0.97.
(1.29 MB EPS).
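For readers who want to reproduce this kind of comparison, the sketch below computes a correlation between species-average aCGH log2 ratios and RT-PCR copy number ratios to CFTR. The legend does not state which coefficient was used, so Pearson's r is an assumption here, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Calling `pearson(acgh_log2_by_species, rtpcr_ratio_by_species)` with one value per species in each list would yield a single coefficient per cDNA, comparable in form to the values reported in panels (A) through (C).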
Figure S6: FISH Analysis with BAC Probe RP11-23P13 Containing the Human PLA2G4B and SPTBN5 Genes
(A) Normal human control, PHA-stimulated lymphocytes. Two signals localized to Chromosome 15q15.1.
(B) Chimpanzee fibroblast culture. Two signals on the chromosome syntenic to human Chromosome 15 (at arrows). Multiple additional signals in the subtelomeric regions.
(C) Gorilla fibroblast culture. Two signals on the chromosome syntenic to Chromosome 15 (at arrows). Two additional signals on a large metacentric chromosome, which in interphase appear as amplified signals.
(D) Orangutan fibroblast culture. Two signals on the chromosome syntenic to human Chromosome 15.
(4.39 MB EPS).
Protocol S1: How to View aCGH Data Using TreeView
(2 KB TXT).
Table S1: Genome-Wide Interhominoid cDNA aCGH Gene Dataset
Values are provided for genes (cDNAs) queried by interhominoid aCGH. For each IMAGE clone queried, log2 aCGH values are listed for the human versus human samples (n = 5), human versus bonobo samples (n = 3), human versus chimpanzee (n = 4), human versus gorilla (n = 3), and human versus orangutan (n = 3). This table is tab-delimited and can be opened in Microsoft Excel to view the raw numbers or can be browsed using TreeView (see Protocol S1). Column B provides information regarding IMAGE clone number, chromosome, and nucleotide position (UCSC November 2002 freeze), Unigene ID, EST accession numbers, and known gene information.
(12.84 MB TXT).
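As an alternative to Excel or TreeView, a tab-delimited table of this kind can be read programmatically. The sketch below averages the replicate log2 values per species for each clone. The column headers (`CLONE`, `chimp_1`, and so on) and the clone identifier are hypothetical placeholders, since the actual header names in Table S1 are not given here.

```python
import csv
import io
from statistics import mean


def species_means(handle, species_columns, id_column="CLONE"):
    """Average replicate log2 values per species for every row.

    species_columns maps a species name to the list of column headers
    holding its replicates. Empty cells are ignored.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    result = {}
    for row in reader:
        averages = {}
        for species, columns in species_columns.items():
            values = [float(row[c]) for c in columns
                      if (row.get(c) or "").strip()]
            averages[species] = mean(values) if values else None
        result[row[id_column]] = averages
    return result


# Hypothetical two-species excerpt, for illustration only
SAMPLE = (
    "CLONE\tchimp_1\tchimp_2\tgorilla_1\n"
    "clone_A\t0.5\t1.5\t-1.0\n"
    "clone_B\t\t0.25\t\n"
)
columns = {"chimpanzee": ["chimp_1", "chimp_2"], "gorilla": ["gorilla_1"]}
means = species_means(io.StringIO(SAMPLE), columns)
```

For the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, for example `open(path, newline="")`.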
Table S2: Detailed Comparison of HLS Gene and WSSD Datasets
For each IMAGE clone of the HLS genes, one or more EST sequences were used as a query for a BLAST search against the WSSD dataset. An expect value cutoff of e–20 was used and the best hit is reported in the table. Query refers to the HLS gene EST sequences; subject refers to the WSSD sequences. Score, expect value, and percent identity (ID) are reported for the best BLAST hit, while the start and stop positions and length for both query and subject are also reported.
(434 KB DOC).
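The best-hit selection described for Table S2 can be illustrated with a short sketch. It assumes the stated cutoff corresponds to an expect value of 1e-20 and that "best hit" means the highest-scoring hit per query. Both are assumptions, as are the function name and the dictionary field names.

```python
def best_hits(hits, max_evalue=1e-20):
    """Keep, for each query, the highest-scoring hit that passes the cutoff.

    hits is an iterable of dicts with keys "query", "subject", "score",
    and "evalue" (hypothetical field names).
    """
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue  # fails the expect value cutoff
        current = best.get(hit["query"])
        if current is None or hit["score"] > current["score"]:
            best[hit["query"]] = hit
    return best


# Hypothetical hits, for illustration only
example_hits = [
    {"query": "est_1", "subject": "wssd_a", "score": 310.0, "evalue": 1e-80},
    {"query": "est_1", "subject": "wssd_b", "score": 450.0, "evalue": 1e-120},
    {"query": "est_2", "subject": "wssd_c", "score": 60.0, "evalue": 1e-5},
]
selected = best_hits(example_hits)
```

In this example `est_2` has no reportable hit because its only alignment fails the cutoff.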
Table S3: Satellite Repeat Subclass Analysis for LS Gene Clusters
For each of the 23 LS gene clusters, Satellite repeat subclass analysis was performed. The table lists the cluster's cytogenetic region, the chromosome and start and stop positions, and the adjusted length after accounting for gaps in the genomic sequence. The percent content for 24 subclasses of Satellite repeats is listed for each of the 23 gene clusters. Summary information includes the average content of the 24 subclasses of Satellite repeats for all of the clusters as well as the average for the entire human genome. The difference and fold change are calculated based on comparing the cluster averages to the entire human genome averages.
(111 KB DOC).
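The summary columns of Table S3 follow from simple arithmetic, sketched below for one Satellite repeat subclass. The function name and the numbers are hypothetical, and the sketch assumes that fold change is the cluster average divided by the genome-wide average.

```python
def compare_to_genome(cluster_percents, genome_percent):
    """Return (cluster average, difference, fold change) for one subclass.

    cluster_percents holds the percent content of the subclass in each
    gene cluster; genome_percent is the genome-wide average.
    """
    average = sum(cluster_percents) / len(cluster_percents)
    difference = average - genome_percent
    # Fold change is undefined when the genome-wide average is zero
    fold_change = average / genome_percent if genome_percent else None
    return average, difference, fold_change


# Hypothetical percent contents for two clusters
summary = compare_to_genome([2.0, 4.0], 1.5)
```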
Table S4: LS Gene Datasets
Similar to Table S1, but only IMAGE clones with LS characteristics are listed, and each is ranked based on average fluorescence signal (highest to lowest) within each lineage.
(269 KB TXT).
Table S5: GO Analysis Comparing HLS and LS Genes to the Whole Genome
(52 KB DOC).
Table S6: Functional Assessment of Copies of HLS Genes
Presented are pertinent data from GO analysis with DAVID, including numbers of classified and unclassified genes in each gene list, as well as the data returned for each of the 22 most represented molecular function categories. Listed are GO identification numbers (GOIDs) and names for each of the top 22 categories, as well as raw values and relative percent values for HLS, LS, and genome classifications. Relative percent columns are taken as the ratio of the number of classifications in each category to the number of genes classified in the list. The average percent is also provided as the average of these relative percent values across the three groups. This is intended as a metric to help gauge deviations in group relative percent values from the combined average value.
(81 KB DOC).
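The relative percent and average percent values described above can be computed as in the following sketch. It follows the stated definition (classifications in a category divided by genes classified in the list, averaged across the three groups). The function names and counts are hypothetical.

```python
def relative_percent(category_count, genes_classified):
    """Percent of classified genes in a list that fall in one category."""
    return 100.0 * category_count / genes_classified


def average_percent(category_counts, classified_totals):
    """Mean of the relative percent values across gene lists.

    Both arguments map a list name (e.g. "HLS", "LS", "genome") to a count.
    """
    percents = [
        relative_percent(category_counts[name], classified_totals[name])
        for name in classified_totals
    ]
    return sum(percents) / len(percents)


# Hypothetical counts for a single GO category
counts = {"HLS": 10, "LS": 5, "genome": 200}
totals = {"HLS": 100, "LS": 50, "genome": 1000}
avg = average_percent(counts, totals)
```

Comparing each list's relative percent with `avg` gives the deviation that the average percent column is intended to help gauge.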
Table S7: Comparison of Human HLS Genes to Chimpanzee Genomic Sequence
The table has three sections: a summary showing the percentages of blocks in each respective chimpanzee homology scoring class; a table with the HLS versus chimpanzee data; and a table with the random versus chimpanzee data. The HLS versus chimpanzee and random versus chimpanzee tables have columns derived from both parsing the BLAT PSL data and from the chimpanzee homology comparison. The table lists the IMAGE clone and the EST accession number used as a query, the hit number, the score and percent identities, the start and stop positions in the query, the chromosome and chromosome start and stop positions, the number of blocks of alignment for the hit, the numbers of blocks that fall into each chimpanzee homology scoring class, and finally the respective chimpanzee scaffold(s) for each hit, if available.
(3.58 MB DOC).