The chemical composition of persistent organic pollutants (POPs) in the environment is not uniform throughout the world, and these contaminants comprise many structurally different lipophilic compounds. In a well-defined study cohort in the Slovak Republic, the POP chemicals present in the peripheral blood of exposed children were chemically analyzed. The chemical analysis data revealed that the relative concentration and profile of structurally different organic pollutants, including polychlorinated biphenyls (PCBs), 2,2’-bis(4-chlorophenyl)-1,1-dichloroethylene (p,p’-DDE), 2,2’-bis(4-chlorophenyl)-1,1,1-trichloroethane (p,p’-DDT), hexachlorobenzene (HCB) and β-hexachlorocyclohexane (β-HCH), may vary from individual to individual, even within the same exposure area. These chemicals can be broadly classified into two groups. The first group, the PCB congeners, originated primarily from industrial compounds and their byproducts. The second group of compounds originated from or was commonly used in the agricultural sector (e.g., DDT, HCB). The objective of this study was to examine the effects of the two POP exposure profiles on gene expression. For the study population, we selected prepubertal girls (mean age of 46.2 ± 1.4 months) with high POP concentrations in their blood (> 75th percentile of total POP). We classified them in the high ‘PCB’ group when the total PCB concentration was significantly higher than the total concentration of other POP components, and in the ‘Other Than PCB’ (OTP) group when the total PCB concentration was significantly lower than the concentration of the other major POP constituents. A matched control group of girls (< 25th percentile of total POP) was selected for comparison purposes (n = 5 per group). Our aims were to determine whether there were any common effects of high POP exposure at a toxicogenomic level and to investigate how exposure may affect physiological functions of the children in two different exposure scenarios.
Global gene expression analysis using a microarray platform (Affymetrix GeneChip Human Genome U133 Plus 2.0 Array) was conducted on the total RNA of peripheral blood mononuclear cells from the girls. The results were analyzed with Partek Genomics Suite (Partek GS; Partek Inc., St. Louis, MO), which identified twelve genes (ATAD2B, BIVM, CD96, CXorf39, CYTH1, ETNK1, FAM13A, HIRA, INO80B, ODZ1, RAD23B, and TSGA14) and two unidentified probe sets as differentially regulated in both the PCB and OTP groups relative to the control group. The qRT-PCR method was used to validate the microarray results. The Ingenuity Pathway Analysis (IPA) software package identified the possible molecular impairments and disease risks associated with each gene set. Connective tissue disorders, genetic disorders, skeletal muscular disorders and neurological diseases were associated with the 12 common genes. The data therefore identified the potential molecular effects of POP exposure on a genomic level. This report underscores the importance of further study to validate the results in a random population and to evaluate the use of the identified genes as biomarkers for POP exposure.
Persistent organic pollutant (POP) is a collective term for any organic, lipid-soluble compound that resists physicochemical and biological degradation and persists in the environment. These chemicals often bioaccumulate in fatty tissues and maintain a close equilibrium with the lipid content of the extracellular fluid of the body due to their slow metabolic clearance. The production and use of many POP compounds, especially organochlorine insecticides and industrial chemicals, have been banned in developed countries. The sources, release, and environmental levels of POPs are not yet completely understood, particularly in underdeveloped and developing countries (Bouwman 2003). Global transport is common for POP compounds, so the effects of these chemicals on human health are not restricted by national boundaries.
These bioaccumulated POP compounds may have different chemical structures. Even within a particular group of chemicals, such as polychlorinated biphenyls (PCBs), structures differ in the degree of chlorination and in the positions of substitution on the biphenyl nucleus. The relative distribution of each POP chemical in the body may also differ, depending on several factors, including the source of the contamination, dietary habits, and the timing of the exposure (e.g., pre- or post-natal) (Webster et al., 2004).
Previous epidemiological studies reported an association between POP exposure and higher risks of different diseases (Sergeev and Carpenter 2010), including cancer (Brody et al., 2007; Bencko et al., 2009). In vitro and in vivo animal models have been used to understand the mechanisms of action responsible for the health risks of POP components (Zhu et al., 2009). The structural similarity of these POP chemicals to naturally occurring metabolites, particularly steroid and thyroid hormones, may induce pseudo-hormonal behavior and may cause imbalances in normal physiological processes through receptor-mediated interactions (Kelce et al., 1995). Additionally, these compounds and their metabolites may also produce oxidative stress (Lin et al., 2009).
Microarray-based gene expression studies conducted in our lab and by other groups have so far used specific POP compound(s) in different combinations to study the effects of the exposure. These studies were mainly conducted in cultured cells, rodents or aquatic models (Dutta et al., 2008; De et al., 2010; Ghosh et al., 2007, 2010, 2011; Lyche et al., 2010). Royland (2008 a,b) investigated the effects of a commercial PCB mixture on the expression of genes related to developmental neurobiology in rats. However, the composition of actual bioaccumulated exposure profiles is different from the composition of the commercial chemical. Moreover, the species differences between rodents and humans also affect the pharmacokinetics of the toxicity. In our recent work, we have established that differences in gene expression due to PCB exposure can help elucidate the molecular mechanisms behind the associated disease processes (Dutta et al., 2011). In actual exposure, different lipophilic organochemicals are present at the same time. Evaluating the integrated toxicogenomic impact of these compounds is essential for understanding the molecular mechanism of the associated disease risk in individuals. Other investigators delineated the effects of a cocktail of serum POPs in the Greenlandic Inuit by evaluating receptor interactions in an in vivo setup. They found that the overall effect of the exposure was endocrine disruption in that particular population (Kruger et al., 2008; Long et al., 2007). Toxicological potential other than endocrine disruption has also been observed for these chemicals (Faroon et al., 2001; Farnandes et al., 2010).
So far, no data are available on the effects of POP on global gene expression or the comparative effects of different POP profiles in humans. A single study has been performed with pooled serum samples to understand the effects of POP exposure on a moderate number of genes (908 genes) using an array-based method. This study detected the down regulation of tumor suppressor genes (Tsai et al., 2008). However, their work did not provide information about the effects of POP exposure on the whole genome, or the effects of variations in the POP profile on gene expression.
In the present study, our goal was to examine the effects of POP exposure on global gene expression and determine how the variation in POP components in actual human exposure may affect gene expression in the children of a well-studied cohort to expand our knowledge about the molecular mechanisms responsible for different POP exposure-related pathophysiological effects.
We selected subjects for this study from a unique cohort of mothers and children from the Slovak Republic in Eastern Europe. This cohort has already been explicitly documented for differential pathophysiological effects due to high POP exposure (Jan et al., 2007; Park et al., 2009; Radikova et al., 2008; Trnovec et al., 2010; Ukropec et al., 2010). Each child was evaluated using standardized clinical assessments. We analyzed only the girls’ microarray results (mean age 46.2 months) because there were few boys for whom both the Affymetrix expression data and blood POP concentrations were available. We classified these subjects into two groups: the ‘high PCB’ group (Group A), for which the total PCB concentration was significantly higher than the total concentration of other POP components, and the ‘Other Than PCB’ (OTP) group (Group B), for which the total PCB concentration was significantly lower than the concentration of other major POP constituents. A matched control group of girls (< 25th percentile of total POP, Group C) was selected for comparison purposes (n = 5 per group) (Figure 1). This study was approved by the Howard University Institutional Review Board (IRB-07-GSAS-30), and written consent in the approved format was obtained from all study subjects prior to the collection of blood.
The details of the estimation of polychlorinated biphenyl and pesticide concentrations have been described previously (Ukropec et al., 2010). The concentrations of fifteen PCB congeners (IUPAC numbers 28, 52, 101, 105, 114, 118, 123, 138+163, 153, 156+171, 157, 167, 170, 180 and 189) and of p,p’-DDE, 2,2’-bis(4-chlorophenyl)-1,1,1-trichloroethane (p,p’-DDT), HCB and β-hexachlorocyclohexane (β-HCH) were determined in serum using a high-resolution gas chromatography device (HP 6890; Agilent, Santa Clara, CA, USA) equipped with a Ni-63 micro-electron capture detector and a 60-m DB-5 capillary column (J&W Scientific, Folsom, CA, USA) (Kocan et al., 1994, 2001). For congeners that were below the detection limit, values corresponding to half the limit of detection were used. The detection limit of the PCB congeners varied from 3.9 ng/g lipid for PCB-157 to 7.5 ng/g lipid for PCB-52. For organochlorine pesticides, the limits of detection were between 2.8 ng/g for γ-HCH and 7.4 ng/g for β-HCH (Petrik et al., 2006; Kocan et al., 1994). The total PCB concentration was calculated as the sum of all 15 individual congeners, with half the limit of detection substituted for nondetected congeners. The organochlorine concentrations were adjusted to the total lipid level. Total cholesterol, non-esterified cholesterol, phospholipids and triacylglycerol were measured using enzymatic methods (Ukropec et al., 2010; Petrik et al., 2006).
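The substitution rule for nondetected congeners can be illustrated with a minimal sketch. The function name and the measured values are hypothetical; only the two detection limits (3.9 and 7.5 ng/g lipid) are taken from the text, and the sketch assumes that a missing value or a value below the limit counts as a nondetect.

```python
from __future__ import annotations


def sum_pcbs(measured: dict[str, float | None], lod: dict[str, float]) -> float:
    """Sum congener concentrations (ng/g lipid), using LOD/2 for nondetects."""
    total = 0.0
    for congener, limit in lod.items():
        value = measured.get(congener)
        # Assumption: None or a value below the limit of detection is a nondetect.
        if value is None or value < limit:
            value = limit / 2.0
        total += value
    return total


# Detection limits from the text; the measured values are invented for illustration.
lod = {"PCB-157": 3.9, "PCB-52": 7.5}
measured = {"PCB-157": 12.0, "PCB-52": None}

total_pcb = sum_pcbs(measured, lod)  # 12.0 + 7.5 / 2
```

In the study the same rule would be applied across all 15 congeners before the lipid adjustment.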
Whole blood was collected in PAXgene® Blood RNA tubes in Slovakia and dispatched to the Molecular Genetics Laboratory at Howard University, Washington, DC, USA according to the specific protocol of the manufacturer (Qiagen, MD). RNA was extracted from the PAXgene tubes using the PAXgene Blood RNA Kit IVD (Qiagen, Cat. # 762164) according to manufacturer’s instructions. Contaminating DNA was removed using the Ambion DNA-free kit. The RNA content was determined spectrophotometrically on a nanodrop at 230, 260 and 280 nm. RNA quality was also verified with an Agilent bioanalyzer using an RNA 6000 nanochip before the microarray chip hybridization; the RNA was stored at −80 °C.
The microarray analysis was performed at the Center for Genetic Medicine, Children’s National Medical Center, Washington, DC, USA. Expression profiling data was obtained for each sample individually using the GeneChip® Human Genome U133 Plus 2.0 array with standard operating procedures and following the protocols of Zhao and Hoffman (2004) for the quality control procedure. The expression data sets are available at the following link: www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=vtcvzaqsusyysdy&acc=GSE28805.
To summarize the information obtained in the microarray analysis and to check data quality, we used principal component analysis (PCA) and plotted the PCA scores. The aim of PCA is to reduce a complex, multi-dimensional problem to a few understandable components. The plot shows one point per array, and each color represents one experimental factor (grouping). This allows the separation between replicates within a group and between arrays of different groups to be viewed. In the 3D PCA plot, the PCA scores (components) are represented on the X, Y, and Z axes and are numbered PC1, PC2, and PC3 in order of decreasing significance. PC1 is the first axis and accounts for the most variability in the data. The proportion of variance accounted for by each subsequent axis, PC2 and PC3 (in three-dimensional presentations), decreases sequentially.
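As an illustration of what "PC1 accounts for the most variability" means, the following sketch computes the first principal component of a tiny invented data set by power iteration on the covariance matrix. It is not the Partek implementation; the data, function names, and the choice of power iteration are assumptions made for the example.

```python
import math


def center(rows):
    """Subtract each column (probe set) mean from every row (array)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_component(cov, iterations=500):
    """Power iteration: unit-length PC1 direction and the variance along it."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return v, eigenvalue


# Invented toy data: 4 arrays (rows) measured on 2 probe sets (columns).
arrays = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
centered = center(arrays)
cov = covariance(centered)
pc1, variance_pc1 = first_component(cov)

# Proportion of total variance explained by PC1.
total_variance = sum(cov[j][j] for j in range(len(cov)))
explained_pc1 = variance_pc1 / total_variance

# PC1 score of each array: the coordinate plotted on the first axis.
scores_pc1 = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
```

Because the two toy probe sets are almost perfectly correlated, nearly all of the variance falls on PC1; real expression data spread variance across many more components.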
To identify differentially expressed gene sets, we analyzed the microarray results using a one-way ANOVA model in Partek GS. Briefly, the raw microarray values from the Affymetrix .CEL files were imported into Partek Genomics Suite, version 6.5, release 6.10.0915 (Partek Inc., St. Louis, MO). The probe summarization and probe set normalization were done using the GC-RMA algorithm (Wu and Irizarry 2004), which included GC-RMA background correction, quantile normalization, log2 transformation and median polish probe set summarization. We used a one-way ANOVA model and the method of moments (Eisenhart, 1947). The model was $Y_{ij} = \mu + (\text{PCB Other Control})_i + \varepsilon_{ij}$, where "PCB Other Control" is the grouping factor, $Y_{ij}$ represents the $j$th observation at the $i$th level of that factor, $\mu$ is the common effect for the whole experiment, and $\varepsilon_{ij}$ represents the random error present in the $j$th observation at the $i$th level. The errors $\varepsilon_{ij}$ are assumed to be normally and independently distributed with mean 0 and standard deviation $\delta$ for all measurements. The following comparisons were performed: Other than PCB vs. Control; PCB vs. Control.
The actual variation among the groups is expressed using the F ratio and presented in the ‘Source of Variation Plot’. If the null hypothesis that there is no variation between the groups is correct, we expect F to be about 1, whereas a large F ratio indicates a likely significant effect. The bar chart shows the variation contributed by effects across all the test variables (response variables) in the ANOVA model; the X axis represents the factors or interactions in the ANOVA model and the Y axis represents the F ratio.
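The F ratio for a single probe set can be sketched as the between-group mean square divided by the within-group mean square. The function name and the expression values below are invented for illustration (three observations per group rather than the five used in the study), and the sketch is the textbook one-way ANOVA, not Partek's code.

```python
def one_way_f(groups):
    """F ratio of a one-way ANOVA: between-group MS over within-group MS."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented log2 expression values for one probe set in three groups.
control = [1.0, 2.0, 3.0]
pcb = [2.0, 3.0, 4.0]
otp = [5.0, 6.0, 7.0]

f_ratio = one_way_f([control, pcb, otp])
```

Here the group means (2, 3 and 6) differ far more than the scatter within each group, so the ratio is well above 1.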
With more than 54,000 probe sets, whole-genome microarray studies are susceptible to type I errors because they test many hypotheses simultaneously. Our goal was to identify a true positive gene set associated with POP exposure. We employed the False Discovery Rate (FDR) correction, a widely used method of controlling type I error in such studies. The FDR is the expected proportion of true null hypotheses among all the rejected null hypotheses (Kaizer et al., 2007), and we kept it below five percent.
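The text does not name the specific FDR procedure. As an assumption for illustration, the sketch below uses the Benjamini-Hochberg step-up adjustment, which is one common way to control the FDR; the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for four probe sets.
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw_p)

# Probe sets retained at FDR < 0.05.
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

With these four values only the first probe set passes the threshold, even though three raw p-values are below 0.05, which shows how the correction tightens the selection.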
We selected gene sets that represented at least a 1.5-fold change in expression level because at this level of altered gene expression, the biological impact of chemical exposure could be reliably deduced (Guo et al., 2006). This analysis generated two highly significant sets of differentially expressed genes that originated from two different high-POP exposure conditions relative to low POP exposure (FDR < 0.05 and Fold Change > 1.5 or < -1.5).
The differential expression of the genes under POP exposure is presented using a volcano plot, an effective and easy-to-interpret graph that summarizes both fold change and level of significance. It is a scatter plot of the negative log10-transformed p-values against the log2 fold change. Genes with statistically significant differential expression lie above a horizontal threshold line, and genes with large fold-change values lie outside a pair of vertical threshold lines. The significant genes therefore tend to be located in the upper left or upper right parts of the plot (Cui and Churchill, 2003).
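A minimal sketch of how one gene is placed on a volcano plot and classified with the thresholds used here (FDR < 0.05 and fold change > 1.5 or < -1.5). The signed fold-change convention (a halving reported as -2) and all function names are assumptions made for the example.

```python
import math


def signed_fold_change(log2_ratio):
    """Convert a log2 ratio to a signed fold change (assumed convention):
    +2 means doubled, -2 means halved."""
    if log2_ratio >= 0:
        return 2.0 ** log2_ratio
    return -(2.0 ** (-log2_ratio))


def volcano_point(log2_ratio, p_value):
    """Coordinates on the plot: x = log2 fold change, y = -log10(p)."""
    return log2_ratio, -math.log10(p_value)


def classify(log2_ratio, fdr, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Label a gene as 'up', 'down' or 'unchanged' under both thresholds."""
    fc = signed_fold_change(log2_ratio)
    if fdr < fdr_cutoff and fc > fc_cutoff:
        return "up"
    if fdr < fdr_cutoff and fc < -fc_cutoff:
        return "down"
    return "unchanged"


# Invented examples: a halved gene, a doubled gene, and a small change.
labels = [classify(-1.0, 0.01), classify(1.0, 0.01), classify(0.3, 0.01)]
x, y = volcano_point(-1.0, 0.001)
```

A gene must clear both thresholds to be selected, which is why the selected genes sit in the upper left and upper right corners of the plot.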
The differentially expressed genes common to the two types of exposure are represented in a Venn diagram. Venn diagrams, often used in mathematics to show relationships between sets, are useful for examining similarities and differences between two sets of genes. The genes belonging to both the PCB set and the OTP set appear in the intersection of the circles.
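The Venn diagram corresponds to ordinary set operations. The sketch below uses short, truncated gene lists for illustration only: they are not the full supplemental tables, and placing a gene in only one list here is an assumption of the example rather than a reported result.

```python
# Truncated, illustrative gene lists (not the complete differentially
# expressed sets from the supplemental tables).
pcb_genes = {"APC", "CAPN2", "ETNK1", "HIRA"}
otp_genes = {"ALCAM", "CDK6", "CDK8", "ETNK1", "HIRA"}

common = pcb_genes & otp_genes      # intersection of the two circles
pcb_only = pcb_genes - otp_genes    # left crescent
otp_only = otp_genes - pcb_genes    # right crescent
```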
The data mining software Ingenuity Pathway Analysis (IPA) was used for further analysis of the gene lists obtained from the Partek GS analysis. The gene identifiers (called focus molecules) of the lists were mapped to their corresponding gene objects and overlaid onto a global molecular network developed from information contained in the Ingenuity knowledge base. Gene identifiers were defined as value parameters for the analysis, which identified the relationship between gene expression changes and related changes in biofunctions in three categories: molecular and cellular functions; physiological system development and function; and diseases and disorders.
Biofunction analysis identified the biofunctions that were significantly associated with the data set, based on the association between the data set and the information in the IPA library. This association was calculated as the ratio of the number of genes from the data set that are associated with the biofunction to the total number of molecules that are associated with the function. The probability that each biological function and/or disease assigned to the data set is due to chance alone was calculated using a right-tailed Fisher’s exact test. Overrepresentation of the molecules in a given process was considered statistically significant when P < 0.05. The overrepresented functional or pathway processes are those that have more focus molecules than expected by chance.
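For a 2x2 table, a right-tailed Fisher's exact test reduces to the upper tail of a hypergeometric distribution. The sketch below shows that calculation with invented counts; the function name and the size of the reference set are hypothetical and do not reflect how IPA defines its background.

```python
from math import comb


def right_tailed_fisher(overlap, list_size, annotated, background):
    """P(X >= overlap), where X is the number of annotated molecules found in
    a list of `list_size` drawn from `background` molecules, of which
    `annotated` carry the biofunction."""
    upper = min(list_size, annotated)
    tail = sum(comb(annotated, k) * comb(background - annotated, list_size - k)
               for k in range(overlap, upper + 1))
    return tail / comb(background, list_size)


# Invented counts: 20 molecules in the reference set, 5 annotated to a
# biofunction, a gene list of 4, of which 3 are annotated.
p_value = right_tailed_fisher(overlap=3, list_size=4, annotated=5, background=20)
```

With these counts the tail probability is 155/4845, about 0.032, so the biofunction would be called overrepresented at P < 0.05.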
The genes identified by microarray analysis were validated by qRT-PCR. The sets of forward and reverse primers are given in Supplemental Table 3. The amplification efficiency of the primer pairs is 100 ± 10%, as stated in the Applied Biosystems user’s guide. Total RNA was DNase treated to eliminate contaminating DNA using a DNA-free kit (AM1906) from Ambion (Austin, TX). The reverse transcription reaction was performed with a High Capacity RNA-to-cDNA Kit (P/N 4387949) from Applied Biosystems (Foster City, CA) according to the manufacturer’s protocol. The PCR amplification was carried out using TaqMan Universal Master Mix II (Applied Biosystems, Foster City, CA). All reactions were run in 3 technical replicates. The endogenous control GAPDH (glyceraldehyde-3-phosphate dehydrogenase) was measured from the same cDNA preparations. All primers were designed to amplify only from cDNA. All quantitative PCR reactions were performed in a 7900 HT instrument. The reactions with TaqMan Universal Master Mix II and specific primers were carried out for 10 min at 95 °C to allow for polymerase activation, followed by 40 cycles of 15 sec at 95 °C for denaturation and 1 min at 60 °C for the annealing and extension step. Values from the 3 technical replicates for each expression assay were averaged and normalized to the expression level of GAPDH from the same cDNA preparation. Relative differences were calculated using the ΔΔCt method (Livak and Schmittgen, 2001).
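The ΔΔCt calculation can be sketched as follows. The Ct values are invented, the function name is hypothetical, and the formula assumes an amplification efficiency close to 100%, so that each cycle corresponds to a doubling.

```python
def relative_expression(ct_target_exposed, ct_ref_exposed,
                        ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method: 2 ** -(ddCt).
    Each argument is a list of technical-replicate Ct values."""
    def mean(values):
        return sum(values) / len(values)

    # Normalize the target gene to the reference gene (GAPDH) in each sample.
    delta_exposed = mean(ct_target_exposed) - mean(ct_ref_exposed)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the exposed sample with the control sample.
    delta_delta = delta_exposed - delta_control
    return 2.0 ** (-delta_delta)


# Invented Ct values, three technical replicates each.
ratio = relative_expression(
    ct_target_exposed=[25.0, 25.1, 24.9],
    ct_ref_exposed=[18.0, 18.0, 18.0],
    ct_target_control=[24.0, 24.0, 24.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

The target gene appears one cycle later in the exposed sample after normalization, which corresponds to roughly half the control expression, that is, about a two-fold downregulation.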
The relative distribution of the two groups of POP components (PCB and OTP) in each subject is presented in Figure 1. The median, range, and mean (±SD) values of the three groups are as follows. The median POP concentration of the high PCB group is 4.8 ng/mg of lipid, with a range of 1.8-7.4 ng/mg of lipid and a mean (±SD) of 4.5 ± 1.9 ng/mg of serum lipid. The median POP concentration of the OTP group is 3.3 ng/mg of lipid, with a range of 2.9-7.5 ng/mg of lipid and a mean (±SD) of 4.2 ± 1.6 ng/mg of serum lipid. The control group has a median of 0.15 ng/mg of lipid, with a range of 0.07-0.4 ng/mg of serum lipid and a mean (±SD) of 0.18 ± 0.1 ng/mg of lipid.
To determine whether overall gene expression differed between the three groups, we conducted principal component analysis on the microarray data set. Figure 2 shows the relationships among the groups in the 3D PCA plot, with red for the control group, blue for the OTP group and green for the PCB group. The ellipsoids, constructed with the Partek software, show the possible variation among the replicates within each group. In this microarray experiment, PC1 accounted for 34.5% of the variability in the data, followed by PC2 (13.2%) and PC3 (9.22%). The distribution and clustering of the samples in the PCA plot indicated that the OTP and PCB groups were mostly separated from the control group, although there was a small partial overlap. It also indicated that the two different exposure profiles may have different impacts on gene expression. However, the lack of complete separation of the two high-exposure groups from the control group indicated that the effect on gene expression is modest.
We identified 38 differentially expressed genes between group A and C, of which 19 were up regulated and 21 were down regulated. The notable genes in this category are APC, ETNK1, and CAPN2 (for the complete list see Supplemental Table 1). With the same statistical stringency, we identified 82 genes that were differentially expressed between group B and C. The up regulated (22) and down regulated (60) gene sets with their location and respective fold change are included in the table. The notable genes in this category are ALCAM, CDK6, and CDK8 (for the complete list see Supplemental Table 2).
The effects of the exposure were tested using the F-ratio between the groups. In an ANOVA model, the F-ratio is the statistic used to test the hypothesis that the effects are real (i.e., the means are significantly different from one another). If the difference between the means is due only to chance, and there are no real effects, then the expected value of the F-ratio is one (1.00). The F-ratios in our analysis were 27.32 and 23.30 for OTP vs. the control group and PCB vs. the control group, respectively (Supplemental Figure 3). The volcano plot (Supplemental Figure 4) represents the fold changes as well as the statistical significance of the difference, in terms of the P value, for the two gene sets. The red circles represent upregulated (> 1.5-fold) genes and the blue circles downregulated (< -1.5-fold) genes.
The two POP distribution patterns yielded two different gene sets with partial overlap on a Venn diagram (Figure 3). A large proportion of the genes that were significantly altered were POP profile-specific. The common area in the Venn diagram (Figure 3) represents 14 Affymetrix probe IDs common to the two differentially expressed gene sets (Table 1). In two cases, the IDs did not represent any known gene. Among the twelve known genes, 10 were downregulated and only 2 were upregulated. The heat map of the 14 gene IDs is shown in Figure 4. The hierarchical clustering pattern of these differentially expressed genes shows an admixture of subjects from the two POP groups, although both are significantly different from the lower exposure group. Table 1 shows the differential expression patterns in the two groups side by side. The genes changed in the same direction in both groups, and the changes were highly correlated.
Of the 38 gene IDs from the differentially expressed gene set in Group A, 31 IDs were successfully mapped to the molecules in the Ingenuity Knowledge Base. The molecules formed 8 different networks (Table 2). The functions associated with network 1, which contains 12 focus molecules, are cellular growth and proliferation, tissue development, DNA replication, recombination, and repair. The biological functions associated with network 2 are cellular growth and proliferation, cell cycle, hepatic system development and function. Network 2 contains six focus molecules. These two networks are interconnected and in total contain 18 focus molecules.
In Group B, 70 gene IDs out of 82 were mapped to the Ingenuity Knowledge Base. These molecules formed 7 different networks (Table 3). The biological functions associated with the three major interconnected networks are cancer, gastrointestinal disease, and cellular growth and proliferation for network 1; cancer, developmental disorder, and genetic disorder for network 2; and cellular development, cellular growth and proliferation, and hematological system development and function for network 3. The three interconnected networks contain 38 focus molecules in total. Another two major networks contain 11 and 8 focus molecules.
To validate the array data, the expression levels of selected genes that were significantly changed in the two exposure groups were measured by quantitative real-time polymerase chain reaction (qRT-PCR) (Supplemental Figure 1). The PCR results showed effects of POP exposure on expression similar to those observed in the microarray experiments, although the array data tended to show higher fold changes than the PCR data.
The selection of the subjects was very important for this study. The girls were selected from a well-defined cohort of children from the Slovak Republic. This eastern part of Slovakia is one of the most POP-contaminated areas due to negligence in the proper disposal of organochemicals from chemical manufacturing units. The details of the cohort have been described previously (Park et al., 2009, 2010; Wimmerova et al., 2010). The POP profiles were dominated by two different classes of chemicals, which represented two different sources of environmental contamination. Chemical analysis of extracellular fluid to identify the accumulated POPs in these children demonstrated that the POP profiles varied between children (Figure 1). A t test between the total POP levels in the OTP and PCB groups demonstrated no statistically significant difference (p = 0.94). However, the POP concentration of each of these groups differed significantly from that of the control group. The prior epidemiological study of this cohort indicated that the common POP markers showed different association patterns with the incidence of certain diseases (Ukropec et al., 2010). The wide inter-individual variability in exposure (Llop et al., 2010) and the involvement of different known covariates are of major concern. Because of the cost constraints of microarray experiments, we did not consider confounding factors such as education, socio-economic conditions, and other epidemiological settings when interpreting the gene expression results, and the sample size in each category was too small for a statistically meaningful analysis on that basis. However, the health parameters of the girls and their mothers were within normal physiological limits at the time of birth. The subjects were also healthy and within normal physiological limits at the time the blood was drawn. A physician examined the girls and found no manifestation of disease, including diabetes and obesity.
The pre- and post-natal routes of bioaccumulation are important determinants of the POP exposure profile. Upon lipophilic chemical exposure, the concentration of lipids in extracellular fluid determines the bioavailability and metabolic clearance of these compounds. Therefore, the concentration of lipids is an important factor that can influence the perturbation of physiological functions. We compared the lipid levels and did not find any significant difference between the groups; the lipid concentrations of the subjects were in the same range in all three groups. A high correlation between plasma POP concentration and lipid-adjusted POP concentration was observed in these study subjects (Supplemental Figure 2). The free POP level, which has also been reported as being important for some physiological functions, was expected to be proportional to the POPs in the extracellular fluid of the subjects (Wigle 2008). Similarly, age and gender are consistent determinants of serum POPs (Jonsson et al., 2005). We included only girls in this study, with a mean age of 46.2 ± 1.4 months, and there was no significant difference in age between the groups. The girls were below the age of menarche. During this period of life, the gonadal and thyroid hormones maintain a steady level. Therefore, any differential changes in endocrine parameters and their functional effects are expected to be due only to the differences in POP exposure between the groups. These girls, one group with low blood POP levels and the other with high blood POP levels divided into two different exposure profiles, therefore represented a good match for the comparison analysis. The well-defined subjects are one of the major strengths of this study.
We identified a common set of 14 genes that were differentially expressed in the PBMC cells of the two groups of girls with different POP exposure profiles compared with the girls who had low POP exposure levels (Table 1). This finding suggested that there may be a set of common genes associated with POP exposure, regardless of the profile of the accumulated POPs in an individual subject or the exposure scenario. Our analysis also indicated that this common toxicogenomic response may explain some general functional effects common to POP exposure, and the specific functional manifestation and pathophysiology would depend on the common toxicogenomic perturbation and other, profile-specific genes.
The common differentially expressed genes in the two groups with high POP levels were mostly downregulated. The direction of the change in each gene was the same in both groups (Table 1). HIRA, a transcriptional regulator gene located in the nucleus, was down-regulated approximately two-fold in both cases. This gene encodes a histone chaperone that preferentially places the variant histone H3.3 in nucleosomes and plays an important role in the formation of the senescence-associated heterochromatin foci. These foci are thought to mediate the irreversible cell cycle changes that occur in senescent cells. Insufficient production of this gene may disrupt normal embryonic development (Lorain et al., 1998). An animal study has provided evidence supporting the observed associations between behavioral and learning disabilities and prenatal exposure to PCBs in humans (Nakagami et al., 2011). Du et al. (2008) showed in an animal model that HIRA may play a role in the development of embryos.
The two other nuclear genes, RAD23B and INO80B, each also showed the same expression pattern in the two groups. RAD23B was found to be a component of the protein complex that specifically complements the NER (nucleotide excision repair) defect (Neher et al., 2010). This protein has also been shown to interact with and elevate the nucleotide excision activity of 3-methyladenine-DNA glycosylase (MPG), which is involved in DNA damage recognition in base excision repair. This protein is also involved in other biological processes, including spermatogenesis and proteasomal ubiquitin-dependent protein catabolism. The behavior of this gene under stressful environmental conditions, especially oxidative damage induced by different classes of chemical pollutants, has been studied (Valavanidis et al., 2006). It has also been reported that some environmentally persistent organochlorines may alter the expression of tumor suppressor genes that are important in the DNA repair process (Rattenborg et al., 2002). INO80B was another nuclear gene, and it was upregulated in both groups with high POP levels. Although very little is known about the biological function of this gene, the activator of its nucleolar binding protein, Pim-1, is a proto-oncogene. It has been used as a prognostic marker in prostate cancer, which suggests a link between this group of proteins and prostate carcinogenesis. INO80B induces growth arrest by halting the cell cycle at the G1 phase (Miyako et al., 2009; Kuroda et al., 2004). The effect of the POP chemical HCB was studied in detail in NIH 3T3 (mouse) and WS1 (human) embryonic cells. Exposure of both cell types to HCB results in cell membrane damage, a short-term decrease in cell number, increased DNA strand breaks, and a long-term decrease in colony survival. This study demonstrated that relevant environmental concentrations of HCB have significant effects on mammalian embryonic cells in culture (Salmon et al., 2002). Therefore, our results support the in vitro mechanistic findings.
Among the differentially regulated cytoplasmic genes, ETNK1 and TSGA14 were downregulated, and CYTH1 was upregulated in both sets of subjects with high POP levels. The ETNK1 gene encodes a kinase, which is the first enzyme in the phosphatidylethanolamine synthesis pathway. This cytosolic enzyme is specific for ethanolamine and exhibits negligible kinase activity on choline. The specific function of TSGA14 is unknown, but a possible link between the gene and a centrosomal function in familial autism has been reported (Korvatska et al., 2011). The protein encoded by the CYTH1 gene is a member of the PSCD family and appears to mediate the regulation of protein sorting and membrane trafficking. This gene is highly expressed in natural killer and peripheral T cells and regulates the adhesiveness of integrins at the plasma membrane of lymphocytes. Molecular functions of CYTH1 include ARF guanyl-nucleotide exchange factor activity and protein binding. These genes were associated with vesicle-mediated transport, regulation of cell adhesion, and regulation of ARF protein signal transduction.
CD96 and ODZ1, whose products have been tentatively located at the plasma membrane, were also downregulated in both groups with high POP levels. The protein encoded by the CD96 gene belongs to the immunoglobulin superfamily. This protein is a type I membrane protein and may play a role in the adhesive interactions between activated T and NK cells during the late phase of the immune response. CD96 may also have a role in antigen presentation. C (Opitz trigonocephaly) syndrome can be caused by disruption of the CD96 gene by a missense mutation (Kaname et al., 2007), although the effects of this deficiency have not been discussed in the literature. The protein encoded by the ODZ1 gene belongs to the tenascin family and teneurin subfamily. It is expressed in neurons and may function as a cellular signal transducer (Zhou et al., 2003). The expression patterns of these genes are not completely consistent with the gene expression analysis of Royland et al. (2008a, b) in neural tissue of animals after PCB exposure. However, neurobehavioral effects have been reported in an epidemiological study of this particular population (Park et al., 2009).
The four other common genes, for which the subcellular localization is not known, were ATAD2B, BIVM, CXorf39 and FAM13A. ATAD2B overexpression has been demonstrated by immunostaining in human breast carcinoma. In tumors, ATAD2B appears to be cytoplasmic or membrane bound, not nuclear. ATAD2B is a phylogenetically conserved nuclear protein expressed during neuronal differentiation and tumorigenesis (Leachman et al., 2010). BIVM possesses virtually no sequence similarity to any currently described protein, making the prediction of a function challenging. It is highly likely that BIVM is essential for some aspect of basic cellular functioning and is expressed in a near-ubiquitous manner. The presence of a CpG island at the 5’ end of BIVM and its wide tissue distribution suggest that it may function as a housekeeping gene (Yoder et al., 2002). CXorf39 (FAM199Y) is an uncharacterized gene. However, FAM13A variants are associated with lung function (Pillai et al., 2010). At present, no known pathophysiology has been associated with these genes in epidemiological studies.
In our effort to understand the functional consequences of the differential expression induced by high POP exposure, we used the IPA software to identify some key mechanisms involved in POP-related toxicity (http://people.mbi.ohio-state.edu/baguda/PathwayAnalysis/ipa_help_manual_5.5_v1.pdf). The IPA analysis identified the functions and/or diseases that were most significant to the data set. Fisher’s exact test was used to calculate a p-value giving the probability that the association between the data set and each assigned biological function and/or disease is due to chance alone.
The IPA analysis explained some of the cellular effects and aberrations in pathology commonly associated with POP exposure. The analysis of the gene sets differentially expressed in the OTP and PCB groups indicated some common disease processes, but these were associated with different gene sets. The involvement of different toxicogenomic pathways induced by the POP chemicals has been described in the literature (Sergent et al., 1989; Khan and Thomas 2001; Langer et al., 2008; Mariussen and Fonnum 2006; Sagerup et al., 2009; Park et al., 2009; Ruzzin et al., 2010; Shi et al., 2010; De et al., 2010). The effects of POP exposure can be better understood by studying the networks formed by the two groups of genes. For example, the functional effect categorized by Ingenuity as ‘cellular growth and proliferation’ can be associated with three different networks of genes (Network 1, Network 2, Network 3) in the IPA analysis of the gene set from the OTP group (Group A) (Table 2) and with two different networks (Network 1, Network 3) in the PCB group (Group B) (Table 3). In the OTP group, the first two of eight networks are interconnected, and in the PCB group, the first three of seven networks are interconnected. A change in one gene in a network may affect the function of the other connected genes and influence the related biological functions. The IPA analysis of the genes in the PCB group did not include HIRA in any of the three interconnected networks. This particular gene, along with seven other focus molecules (RTNK1, CDK6MSH1, TRIM24 etc.), formed another network with a different set of functions (Table 3). The IPA analysis of the genes from the OTP group demonstrated that five common genes (CD96, HIRA, CYTH1, RAD23A and INO80B) were incorporated in network 1.
The second network formed by the genes from the OTP group was interconnected with the first one, through the common relationship of a single gene that was not present in either of the PCB or OTP sets; i.e., it was not significantly affected by high POP exposure. This can be explained by the fact that a particular function, e.g., ‘cellular growth and proliferation’, can involve different gene sets. At the same time a particular gene (HIRA) can be associated with different genes, depending on the cellular conditions perturbed by different chemical exposures. It seems that the combination of altered gene(s) resulting from the specific distribution of the POP chemicals determined the particular phenotypic manifestation and pathophysiology of the exposure.
To understand the common effects of POP exposure and to identify a common marker for different exposure scenarios, we compared the disease and disorder biofunctions in IPA for the two gene sets. There were 12 disease and disorder categories common to the two gene sets. However, the associated genes were not the same for the two groups. Some diseases (Table 4) were associated with group-specific gene sets. The inflammatory response, developmental disorders, cardiovascular disorders, and reproductive system diseases were associated with the genes that were differentially expressed in the PCB group (Group B). Respiratory disease was also associated with the same group (Table 4). We also performed the same analysis for the 14 genes regulated in common (Table 5). This analysis identified many important disease and disorder functions. Some of the disease processes associated with POP exposure have been reported in epidemiological association studies (Howsome et al., 2004; Hertz-Picciotto et al., 2008; Brody et al., 2007; Ruzzin et al., 2010; Sagiv et al., 2010; Trnovec et al., 2010; Sergeev and Carpenter, 2010). Immunotoxicity has long been regarded as one of the most sensitive endpoints: the human immune system is vulnerable to environmental contaminants, and effects on it may be detrimental to health (Tryphonus 2001). Our results indicated that in both groups the differentially expressed genes are highly associated with immunological disease. To understand the specific contribution of the common gene set to disease, we performed a comparison analysis using the IPA platform, comparing the common gene set with the OTP- and PCB-specific gene sets. We found only four biofunctions that were significantly associated with all three gene sets and independent of the POP profile. These were connective tissue disorders, genetic disorders, neurological diseases, and skeletal and muscular disorders (Figure 5).
These four disease and disorder functions may be associated with common functions of the genes in these two POP profiles. This does not necessarily exclude an association with other diseases and disorders reported to be caused by POP-related exposure. One such disease is cancer, which is a controversial topic in PCB/POP-related toxicity (Brody et al., 2007). Our analysis suggested that the genes known to be responsible for cancer differ between the two groups with high POP exposure (Table 4). Similarly, immunological disease, which is also a common complication of POP exposure, was associated with two different gene sets (Table 4). A particular gene can also exhibit different connections in separate groups, as was the case for HIRA, which is common to the two gene sets but did not show the same type of association and biological function in the two exposure scenarios.
The gene sets altered in the PCB and OTP groups share 14 genes, and this overlap merits emphasis even though each exposure profile also alters many other genes independently. This is important given the similarities of some adverse health effects of PCB and pesticide exposures. The exact POP composition and the individual toxic contribution of each component to disease development are difficult to measure in an actual exposure situation. Both of our exposure groups contain chemicals of varied structure. Moreover, there are other lipophilic substances, with unknown effects, that may be migrating selectively in the PCB and OTP groups. Even among PCBs, which are a mixture of different congeners, quite different patterns of gene induction have been observed for dioxin-like compared with non-dioxin-like congeners (Vezina et al., 2001), so complexity should be expected in POP exposure as well.
While the number of Functions/Pathways/Lists Eligible molecules associated with a given function/pathway is an important measure when calculating the p-value for Functional Analyses, the p-value does not depend on this number alone. According to the IPA algorithm, the p-value for a given function is calculated by considering: 1) the number of Functions/Pathways/Lists Eligible molecules that participate in that annotation; 2) the total number of knowledge base molecules known to be associated with that function; 3) the total number of Functions/Pathways/Lists Eligible molecules; and 4) the total number of molecules in the Reference Set of the IPA library. The number of genes differentially expressed in the high PCB group was higher than in the OTP group under the same statistical stringency. This may introduce a positive bias in the p-values of the IPA functional analysis for the PCB group, so the functional effects in the high PCB group may appear more pronounced because of the additional focus molecules in that group. However, we applied identical cutoff values for the fold change and the statistical analysis of the microarray data in both cases. We did not consider the effects of other molecules that may have a substantial influence on a disease process but did not pass the selection cutoff. In an actual physiological situation, every single alteration in a pathway would contribute to the resulting disease process.
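The four quantities listed above are the inputs of a right-tailed Fisher's exact test, which can be written as a hypergeometric tail sum. The following Python sketch is illustrative only: it is not IPA's own implementation, the function name `enrichment_p_value` is hypothetical, and all counts in the usage example are invented to show how a longer list of eligible molecules can yield a smaller p-value at the same proportional overlap.

```python
from math import comb


def enrichment_p_value(k: int, K: int, n: int, N: int) -> float:
    """Right-tailed Fisher's exact test written as a hypergeometric tail.

    k: eligible molecules annotated to the function (item 1)
    K: knowledge base molecules associated with the function (item 2)
    n: total eligible molecules in the data set (item 3)
    N: molecules in the reference set (item 4)
    Returns the probability of observing k or more annotated molecules
    when n molecules are drawn at random from the reference set.
    """
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when more items are requested than are available,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: same 10% overlap, but twice as many eligible molecules.
p_small = enrichment_p_value(k=3, K=200, n=30, N=20000)
p_large = enrichment_p_value(k=6, K=200, n=60, N=20000)
print(p_small, p_large)
```

With these invented counts the longer list gives the smaller p-value, which is the kind of bias toward the larger PCB gene set discussed above.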
In the present work, we used advanced data mining software with an enriched database and a sophisticated algorithm (IPA) to understand the possible overall effects associated with altered gene expression under two different high POP exposure scenarios compared with the low exposure level. A single study like this one neither provides conclusive evidence of the affected biofunctions nor excludes the possibility of other physiological impairments, unless validated through other methodological and detailed mechanistic pathway investigations. The present study is the first of its kind to reveal a comparative picture of gene expression under different POP exposure profiles in the same exposed population.
This study demonstrated a comparative toxicogenomic effect of two distinct POP exposure profiles, in which a common set of 14 genes (including 2 hypothetical probes) was differentially expressed regardless of the POP profile. Functional analysis provided an overview of the molecular mechanisms involved in POP exposure and helped explain the associated disease processes. This information is essential for the identification of a robust biomarker for POP exposure. It will also help to explain the molecular mechanisms responsible for the different pathophysiological effects reported in epidemiological studies. The array data tended to show higher fold changes than the PCR data, although each gene showed the same directionality in the two methods. Random validation of the gene panel in a larger sample is needed, and the robustness of the expression changes should also be checked with a different method such as RNA sequencing.
This result also underscores the importance of a future multicenter global validation study of other POP exposure conditions and the incorporation of different ethnic groups as well. This would increase the study power and the probability of identifying genomic biomarker(s) for POP exposure, regardless of the specific POP constituents in an individual exposure scenario. Complementary to the gene expression study, the analysis of epigenetic components will be required to understand the gene-environment interactions and the underlying disease risk for an individual due to chronic POP exposure.
Comparison of gene expression in the microarray and RT-PCR experiments. RT-PCR was performed on a subset of the differentially expressed genes common to the two groups (ATAD2B, ETNK1, INO80B, HIRA, and RAD23B).
Correlation between total PCB concentration (y-axis, ng/ml of blood) and lipid-adjusted PCB concentration (x-axis, ng/mg of serum lipid) in the 45-month-old children selected for microarray analysis.
The effect of exposure was tested using the F ratio between the groups. A. High POP (OTP group) vs. low POP control group. B. High POP (PCB group) vs. low POP control group.
The volcano plots represent differential expression in terms of both fold change (x-axis) and statistical significance of the difference (-log10 of the p-value, y-axis). The red circles represent upregulated (>1.5-fold) genes and the blue circles downregulated (<-1.5-fold) genes. A. Comparison of the control group and the OTP group. B. Comparison of the control group and the PCB group.
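The colour coding in the volcano plots can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `classify_gene` is hypothetical, fold changes are assumed to be signed (downregulation expressed as a negative fold change), and the p-value cutoff of 0.05 is an assumption because the legend states only the 1.5-fold threshold.

```python
from math import log10


def classify_gene(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Return (label, x, y) for one gene on a volcano plot.

    fold_change: signed fold change (negative values mean downregulation)
    p_value:     p-value of the group comparison
    p_cutoff:    assumed significance cutoff, not taken from the legend
    """
    x = fold_change
    y = -log10(p_value)  # larger y means stronger statistical evidence
    if p_value < p_cutoff and fold_change > fc_cutoff:
        label = "up"         # drawn as a red circle
    elif p_value < p_cutoff and fold_change < -fc_cutoff:
        label = "down"       # drawn as a blue circle
    else:
        label = "unchanged"  # below either threshold
    return label, x, y


# Hypothetical values for three genes.
for fc, p in [(2.0, 0.01), (-2.0, 0.01), (1.2, 0.01)]:
    print(classify_gene(fc, p))
```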
This study was supported by grant 1UO1ES016127-01 from the National Institute of Environmental Health Sciences (NIEHS/NIH). Thanks are also due to the General Clinical Research Center (GCRC) of Howard University for their assistance with the blood collection from healthy donors, as per approved HU IRB # IRB-07-GSAS-30. The contents of this report are solely the responsibility of the authors.
Authors’ contributions: PM developed the POP work and performed the statistical analysis of the microarray results and the functional analysis of the results. He also wrote the manuscript. SG and SZ handled the human subjects’ specimens and isolated the RNA for the microarray studies. SG helped with critical revisions of the manuscript. SGM ran the microarray experiments. TT, LP and ES were responsible for the collection of the human subjects’ blood and data. IHP and DS provided the required background data for the human subjects in this work. EH and SKD provided support and direction and revised the manuscript. SKD holds the NIEHS/UO1 grant.