Breeding chickens that are genetically resistant to Marek’s disease (MD) is a vital strategy for poultry health. To find markers underlying genetic resistance to MD, copy number variation (CNV) was examined in inbred MD-resistant and -susceptible chicken lines. A total of 45 CNVs were found in four lines of chickens, and 28 were potentially involved in processes such as immune response and cell proliferation. Importantly, two CNVs related to MD resistance were transmitted to descendant recombinant congenic lines that differ in susceptibility to MD. Our findings may lead to better strategies for genetic improvement of disease resistance in poultry.
CNV; disease resistance; Marek’s disease; chicken
Marek’s disease (MD) is a neoplastic disease in chickens caused by the MD virus (MDV). Successful vaccine development against MD has resulted in increased virulence of MDV and the understanding of genetic resistance to the disease is, therefore, crucial to long-term control strategies. Also, epigenetic factors are believed to be one of the major determinants of disease response.
Here, we carried out comprehensive analyses of the epigenetic landscape induced by MDV, utilizing genome-wide histone H3 lysine 4 and lysine 27 trimethylation maps from chicken lines with varying resistance to MD. Differential chromatin marks were observed on genes previously implicated in the disease such as MX1 and CTLA-4 and also on genes reported in other cancers including IGF2BP1 and GAL. We detected bivalent domains on immune-related transcriptional regulators BCL6, CITED2 and EGR1, which underwent dynamic changes in both lines as a result of MDV infection. In addition, putative roles for GAL in the mechanism of MD progression were revealed.
Our results confirm the presence of widespread epigenetic differences induced by MD in chicken lines with different levels of genetic resistance. A majority of the observed epigenetic changes were indicative of increased levels of viral infection in the susceptible line, symptomatic of lowered immunocompetence in these birds caused by early cytolytic infection. The GAL system, which has known anti-proliferative effects in other cancers, is also revealed to be potentially involved in MD progression. Our study provides further insight into the mechanisms of MD progression while revealing a complex landscape of epigenetic regulatory mechanisms that varies depending on host factors.
Histone modifications; Thymus; Differential marks; Bivalent domain; Chromatin signature; Marek’s disease
Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) is a genome-wide analysis technique that can be used to detect various epigenetic phenomena such as transcription factor binding sites and histone modifications. Histone modification profiles can be either punctate or diffuse, which makes it difficult to distinguish regions of enrichment from background noise. With the discovery of histone marks having a wide variety of enrichment patterns, there is an urgent need for analysis methods that are robust to various data characteristics and capable of detecting a broad range of enrichment patterns.
To address these challenges we propose WaveSeq, a novel data-driven method of detecting regions of significant enrichment in ChIP-Seq data. Our approach utilizes the wavelet transform, is free of distributional assumptions and is robust to diverse data characteristics such as low signal-to-noise ratios and broad enrichment patterns. Using publicly available datasets we showed that WaveSeq compares favorably with other published methods, exhibiting high sensitivity and precision for both punctate and diffuse enrichment regions even in the absence of a control data set. The application of our algorithm to a complex histone modification data set helped make novel functional discoveries which further underlined its utility in such an experimental setup.
WaveSeq is a highly sensitive method capable of accurate identification of enriched regions in a broad range of data sets. WaveSeq can detect both narrow and broad peaks with a high degree of accuracy even in low signal-to-noise ratio data sets. WaveSeq is also suited for application in complex experimental scenarios, helping make biologically relevant functional discoveries.
Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases.
In this study, using the high density BovineHD SNP array, we performed high resolution CNV analyses on both Btau_4.0 and UMD3.1 with 674 animals of 27 cattle breeds. We first compared CNV results derived from these two different SNP array platforms on Btau_4.0. With two thirds of the animals shared between studies, on Btau_4.0 we identified 3,346 candidate CNV regions (CNVRs) representing 142.7 megabases (~4.70%) of the genome. With a similar total length but about five times as many regions, the average CNVR length of the current Btau_4.0 dataset is significantly shorter than that of the previous one (42.7 kb vs. 205 kb). Although subsets of these two results overlapped, 64% (91.6 megabases) of the current dataset was not present in the previous study. We also performed similar analyses on UMD3.1 using these BovineHD SNP array results. Approximately 50% more and 20% longer CNVs were called on UMD3.1 than on Btau_4.0. However, a comparable set of CNVRs (3,438 regions with a total length of 146.9 megabases) was obtained. We suspect that these results are due to the UMD3.1 assembly's efforts to place unplaced contigs and remove unmerged alleles. Selected CNVs were further experimentally validated, achieving a 73% PCR validation rate, which is considerably higher than the previous validation rate. About 20-45% of CNV regions overlapped with cattle RefSeq genes and Ensembl genes. Panther and IPA analyses indicated that these genes participate in a wide spectrum of biological processes involving the immune system, lipid metabolism, and cell, organism and system development.
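The summary figures above can be checked with a short calculation. The sketch below is illustrative only: `merge_cnvs` is a hypothetical helper that assumes a CNV region is the union of overlapping CNV calls on the same chromosome (a common convention, not something this abstract specifies), and the toy coordinates are invented. Only the totals and region counts in the arithmetic are taken from the text.

```python
from collections import defaultdict


def merge_cnvs(calls):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions.

    Assumption: calls that overlap or share a boundary position on the
    same chromosome belong to one region.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in calls:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_end is None or start > cur_end:
                # No overlap with the open region: close it, start a new one.
                if cur_end is not None:
                    regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        regions.append((chrom, cur_start, cur_end))
    return regions


def mean_length_kb(total_mb, n_regions):
    """Average region length in kb from a total length in Mb."""
    return total_mb * 1000.0 / n_regions


# Invented toy calls, for illustration only.
toy_calls = [("chr1", 100, 500), ("chr1", 400, 900),
             ("chr1", 2000, 2500), ("chr2", 50, 80)]
toy_regions = merge_cnvs(toy_calls)

# Figures quoted in the text.
current_kb = mean_length_kb(142.7, 3346)   # BovineHD dataset on Btau_4.0
previous_kb = mean_length_kb(139.8, 682)   # earlier BovineSNP50 dataset
novel_fraction = 91.6 / 142.7              # share absent from the earlier study

print(toy_regions)
print(round(current_kb, 1), round(previous_kb, 1), round(novel_fraction, 2))
```

Recomputing from the rounded totals reproduces the reported averages to within rounding, which suggests the quoted lengths are simple means over CNV regions.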
We present a comprehensive result of cattle CNVs at a higher resolution and sensitivity. We identified over 3,000 candidate CNV regions on both Btau_4.0 and UMD3.1, further compared current datasets with previous results, and examined the impacts of genome assemblies on CNV calling.
Cattle genome; Breed; Copy number variation (CNV); Single nucleotide polymorphism (SNP)
Marek's disease (MD) is a lymphoproliferative disease in chickens induced by Marek's disease virus (MDV). Although studies have focused on the genetic differences between resistant and susceptible chickens, less is known about the role of epigenetic factors in MD. In this study, genome-wide histone modifications in the non-MHC-associated resistant and susceptible chicken lines were examined. We found that enrichment of tri-methylation at histone H3 Lys4 (H3K4me3) is positively correlated with the expression of protein-coding genes as well as microRNA (miRNA) genes, whereas tri-methylation at histone H3 Lys27 (H3K27me3) exhibits a negative correlation. By identifying line-specific histone modifications in MDV infection, we found that unique H3K4me3 islands in the resistant chickens marked activated genes related to immune response and cell adhesion. Interestingly, we also found some miRNAs with unique H3K27me3 patterns in the susceptible chickens that targeted genes involved in 5-hydroxytryptamine (5-HT)-receptor and adrenergic receptor pathways. In conclusion, dynamic line-specific histone modifications in response to MDV infection suggest that intrinsic epigenetic mechanisms may play a role in MD-resistance and -susceptibility.
Gene expression in lymphocytes is influenced by histone methylation in mammals, and trimethylation of lysine 27 on histone H3 (H3K27me3) normally represses gene expression. Peripheral blood lymphocytes are the main source of somatic cells in the milk of dairy cows, which vary frequently in response to infection or injury of the mammary gland and to the number of parities.
The genome-wide status of H3K27me3 modifications in blood lymphocytes of lactating Holsteins was profiled using a ChIP-Seq approach. Combined with the digital gene expression (DGE) technique, the regulatory effects of H3K27me3 on gene expression were analyzed.
The ChIP-Seq results showed that the peaks of H3K27me3 in cow lymphocytes were mainly enriched in the up20K (∼50%), down20K (∼30%) and intron (∼28%) regions of genes. Only ∼3% of peaks were enriched in exon regions. Moreover, the highest H3K27me3 modification levels were mainly found around 2 kb upstream of the transcriptional start sites (TSS) of genes. Using conjoint analysis with DGE data, we found that H3K27me3 marks tended to repress target gene expression throughout the whole gene region, acting especially on the promoter region. A total of 53 differentially expressed genes were detected in third-parity cows compared to first-parity cows, and the 25 down-regulated genes (PSEN2, etc.) were negatively correlated with H3K27me3 levels in the up2Kb to up1Kb region of the genes, while the up-regulated genes did not show this relationship.
The first blueprint of bovine H3K27me3 marks that mediate gene silencing was generated. H3K27me3 plays its repressive role mainly in the regulatory regions of genes in bovine lymphocytes. The up2Kb to up1Kb region of the down-regulated genes in third-parity cows could be a potential target of H3K27me3 regulation. Further studies are warranted to understand how H3K27me3 regulates somatic cell count increases and milk losses in later parities of cows.
miRNAs are a class of small, single-stranded, non-coding RNAs that perform post-transcriptional repression of target genes by binding to 3’ untranslated regions. Research has found that miRNAs are involved in the regulation of many metabolic processes. Here we found that the beef quality of Angus cattle diverged sharply after acute stress. miRNA microarray analysis showed that 13 miRNAs were significantly differentially expressed in the stressed group compared to the control group. Using a bioinformatics method, 135 protein-coding genes were predicted as targets of the significantly differentially expressed miRNAs. Gene Ontology (GO) term analysis and Ingenuity Pathway Analysis (IPA) indicated that these target genes are involved in several important pathways that may affect meat quality and beef tenderness.
miRNA; Bovine; Beef tenderness; Stress
To explore gene-environment interactions, we analyzed gene and treatment information from temporal gene expression data and inferred interaction networks accordingly. The main idea is that gene expression reflects the response of genes to environmental factors, assuming that variations of gene expression occur under different conditions. We then classified experimental conditions into several subgroups based on the similarity of temporal gene expression profiles. This procedure is useful because it allows us to combine diverse gene expression data as they become available and, especially, to place the regulatory relationships on a concrete biological basis. By estimating the activation points, we can visualize gene behavior, obtain a consensus gene activation order, and hence describe conditional regulatory relationships. The estimation of activation points and the building of synthetic genetic networks may yield important new insights in the ongoing endeavor to understand the complex network of gene regulation.
Beef is one of the leading sources of protein, B vitamins, iron, and zinc in human food. Beef palatability is based on three general criteria: tenderness, juiciness, and flavor, of which tenderness is thought to be the most important factor. In this study, we found that beef tenderness, measured by the Warner-Bratzler shear force (WBSF), was dramatically increased by acute stress. Microarray analysis and qPCR identified a variety of genes that were differentially expressed. Pathway analysis showed that these genes were involved in immune response and regulation of metabolic processes as activators or repressors. Further analysis indicated that these changes may be related to CpG methylation of several genes. Therefore, the results from this study provide an enhanced understanding of the mechanisms by which genetic and epigenetic regulation control meat quality and beef tenderness.
Marek’s disease (MD) is a lymphoproliferative disease induced by Marek’s disease virus (MDV) infection. To augment vaccination measures in MD control, host genetic resistance to MD is becoming increasingly important. To elucidate the mechanism of MD-resistance, most research has focused on the genetic differences between resistant and susceptible chickens. However, epigenetic differences between MD-resistant and -susceptible chickens are poorly characterized. Using bisulfite pyrosequencing, we found that some candidate genes have higher promoter methylation in the MD-susceptible (L72) chickens than in the MD-resistant (L63) chickens. The hypermethylated genes, which are involved in cellular component organization, response to stimulus, cell adhesion, and immune system process, may play an important role in susceptibility to disease through their deregulation. MDV infection induced expression changes in all three methyltransferase genes (DNMT1, DNMT3a, and DNMT3b) in both lines of chickens. DNMT1 was up-regulated in L72, whereas DNMT3b was down-regulated in L63 at 21 dpi. Interestingly, a dynamic change of promoter methylation was observed during the MDV life cycle. Some genes, including HDAC9, GH, STAT1, CIITA, FABP3, LATS2, and H2Ac, showed differential methylation behaviors between the two lines of chickens. In summary, the findings from this study suggest that the two lines of chickens differ both in DNA methylation heterogeneity and in the methylation alterations induced by MDV infection. Therefore, epigenetic mechanisms may be involved in modulating resistance and/or susceptibility to MD in chickens.
chicken; Marek’s disease; MD-resistance; MD-susceptibility; DNA methylation
Marek's disease (MD) is a lymphoproliferative disease in chickens caused by Marek's disease virus (MDV) and characterized by T cell lymphoma and infiltration of lymphoid cells into various organs such as liver, spleen, peripheral nerves and muscle. Resistance to MD and disease risk have long been thought to be influenced both by genetic and environmental factors, the combination of which contributes to the observed outcome in an individual. We hypothesize that after MDV infection, genes related to MD-resistance or -susceptibility may exhibit different trends in transcriptional activity in chicken lines having a varying degree of resistance to MD.
In order to study the mechanisms of resistance and susceptibility to MD, we performed genome-wide temporal expression analysis in spleen tissues from MD-resistant line 63, susceptible line 72 and recombinant congenic strain M (RCS-M) that has a phenotype intermediate between lines 63 and 72 after MDV infection. Three time points of the MDV life cycle in chicken were selected for study: 5 days post infection (dpi), 10dpi and 21dpi, representing the early cytolytic, latent and late cytolytic stages, respectively. We observed similar gene expression profiles at the three time points in line 63 and RCS-M chickens that are both different from line 72. Pathway analysis using Ingenuity Pathway Analysis (IPA) showed that MDV can broadly influence the chickens irrespective of whether they are resistant or susceptible to MD. However, some pathways like cardiac arrhythmia and cardiovascular disease were found to be affected only in line 72; while some networks related to cell-mediated immune response and antigen presentation were enriched only in line 63 and RCS-M. We identified 78 and 30 candidate genes associated with MD resistance, at 10 and 21dpi respectively, by considering genes having the same trend of expression change after MDV infection in lines 63 and RCS-M. On the other hand, by considering genes with the same trend of expression change after MDV infection in lines 72 and RCS-M, we identified 78 and 43 genes at 10 and 21dpi, respectively, which may be associated with MD-susceptibility.
By testing temporal transcriptome changes using three representative chicken lines with different resistance to MD, we identified 108 candidate genes for MD-resistance and 121 candidate genes for MD-susceptibility over the three time points. Genes included in our resistance or susceptibility gene lists that are also involved in more than 5 biofunctions, such as CD8α, IL8, USP18, and CTLA4, are considered to be important genes involved in MD-resistance or -susceptibility. We were also able to identify several biofunctions related to immune response that we believe play an important role in MD-resistance.
Overexpression of ramA has been implicated in resistance to multiple drugs in several enterobacterial pathogens. In the present study, Salmonella Typhimurium strain LTL with constitutive expression of ramA was compared to its ramA-deletion mutant by employing both DNA microarrays and phenotype microarrays (PM). The mutant strain with the disruption of ramA showed differential expression of at least 33 genes involved in 11 functional groups. The study confirmed at the transcriptional level that the constitutive expression of ramA was directly associated with increased expression of the multidrug efflux pump AcrAB-TolC and decreased expression of the porin protein OmpF, thereby conferring a multidrug resistance phenotype. Compared to the parent strain constitutively expressing ramA, the ramA mutant had increased susceptibility to over 70 antimicrobials and toxic compounds. The PM analysis also revealed that the ramA mutant was better able to utilize 10 carbon sources and 5 phosphorus sources. This study suggests that the constitutive expression of the ramA locus regulates not only the multidrug efflux pump and accessory genes but also genes involved in carbon metabolic pathways.
Marek’s disease virus (MDV) is an oncovirus that induces lymphoid tumors in susceptible chickens, and may affect the epigenetic stability of the CD4 gene. The purpose of this study was to determine whether the effect of MDV infection on the DNA methylation status of the CD4 gene differed between MD-resistant (L63) and -susceptible (L72) chicken lines.
Chickens from each line were divided into two groups, with one group infected by MDV and the other group serving as uninfected controls. Promoter DNA methylation levels of the CD4 gene were then measured by pyrosequencing, and gene expression analysis was performed by quantitative PCR.
Promoter methylation of the CD4 gene was found to be down-regulated in L72 chickens only after MDV infection. The methylation down-regulation of the CD4 promoter is negatively correlated with up-regulation of CD4 gene expression in the L72 spleen at 21 dpi.
The methylation fluctuation and mRNA expression change of the CD4 gene induced by MDV infection suggest that a unique epigenetic mechanism exists in MD-susceptible chickens.
Copy number variation (CNV) represents another important source of genetic variation complementary to single nucleotide polymorphism (SNP). High-density SNP array data have been routinely used to detect human CNVs, many of which have significant functional effects on gene expression and human diseases. In the dairy industry, a large quantity of SNP genotyping results are becoming available and can be used for CNV discovery to understand and accelerate genetic improvement for complex traits.
We performed a systematic analysis of CNV using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the pedigree information, we identified 682 candidate CNV regions, which represent 139.8 megabases (~4.60%) of the genome. Selected CNVs were further experimentally validated and we found that copy number "gain" CNVs were predominantly clustered in tandem rather than existing as interspersed duplications. Many CNV regions (~56%) overlap with cattle genes (1,263), which are significantly enriched for immunity, lactation, reproduction and rumination. The overlap of this new dataset and other published CNV studies was less than 40%; however, our discovery of large, high frequency (> 5% of animals surveyed) CNV regions showed 90% agreement with other studies. These results highlight the differences and commonalities between technical platforms.
We present a comprehensive genomic analysis of cattle CNVs derived from SNP data which will be a valuable genomic variation resource. Combined with SNP detection assays, gene-containing CNV regions may help identify genes undergoing artificial selection in domesticated animals.
Duplicated sequences are an important source of gene innovation and structural variation within mammalian genomes. We performed the first systematic and genome-wide analysis of segmental duplications in the modern domesticated cattle (Bos taurus). Using two distinct computational analyses, we estimated that 3.1% (94.4 Mb) of the bovine genome consists of recently duplicated sequences (≥ 1 kb in length, ≥ 90% sequence identity). Similar to other mammalian draft assemblies, almost half (47% of 94.4 Mb) of these sequences have not been assigned to cattle chromosomes.
In this study, we provide the first experimental validation of large duplications and briefly compare their distribution on two independent bovine genome assemblies using fluorescent in situ hybridization (FISH). Our analyses suggest that most (75-90%) segmental duplications are organized into local tandem duplication clusters. Along with rodents and carnivores, these results now confidently establish tandem duplications as the most likely mammalian archetypical organization, in contrast to humans and great ape species, which show a preponderance of interspersed duplications. A cross-species survey of duplicated genes and gene families indicated that duplication, positive selection and gene conversion have shaped primates, rodents, carnivores and ruminants to different degrees for their speciation and adaptation. We found that bovine segmental duplications corresponding to genes are significantly enriched for specific biological functions such as immunity, digestion, lactation and reproduction.
Our results suggest that in most mammalian lineages segmental duplications are organized in a tandem configuration. Segmental duplications remain problematic for genome assembly, and we highlight genic regions that require higher quality sequence characterization. This study provides insights into mammalian genome evolution and generates a valuable resource for cattle genomics research.
Array-based comparative genomic hybridization (aCGH) has gained prevalence as an effective technique for measuring structural variations in the genome. Copy-number variations (CNVs) form a large source of genomic structural variation, but it is not known whether phenotypic differences between intra-species groups, such as divergent human populations or breeds of a domestic animal, can be attributed to CNVs. Several computational methods have been proposed to improve the detection of CNVs from array CGH data, but few population studies have used CGH data for identification of intra-species differences. In this paper we propose a novel method of genome-wide comparison and classification using CGH data that condenses whole genome information, aimed at quantification of intra-species variations and discovery of shared ancestry. Our strategy included smoothing CGH data using an appropriate denoising algorithm, extracting features via wavelets, quantifying the information via the wavelet power spectrum, and hierarchical clustering of the resultant profiles. To evaluate the classification efficiency of our method, we used simulated data sets. We applied it to aCGH data from human and bovine individuals and showed that it successfully detects existing intra-specific variations with additional evolutionary implications.
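The feature-extraction step described above can be pictured with a small example. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a Haar wavelet (the abstract does not name the wavelet family), omits the denoising step, and uses invented toy log-ratio profiles. `haar_power_spectrum` and `profile_distance` are hypothetical helper names.

```python
import math


def haar_power_spectrum(signal):
    """Per-scale power of a signal under an orthonormal Haar transform.

    Returns (powers, final_approx): powers run from the finest to the
    coarsest scale; final_approx is the last approximation coefficient.
    The signal length must be a power of two.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    powers = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        powers.append(sum(d * d for d in detail))
    return powers, approx[0]


def profile_distance(p, q):
    """Euclidean distance between two power-spectrum profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Invented toy log-ratio profiles: no variation vs. one copy-number step.
flat = [0.0] * 8
step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]

flat_power, _ = haar_power_spectrum(flat)
step_power, step_approx = haar_power_spectrum(step)
distance = profile_distance(flat_power, step_power)

print(flat_power, step_power, distance)
```

A matrix of such pairwise distances between individuals is the kind of input a hierarchical clustering routine would take. The step profile concentrates its power at the coarsest scale, which is how a power spectrum can condense a broad copy-number change into a single feature.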
Chicken Repeat 1 (CR1) repeats are the most abundant family of repeats in the chicken genome, with more than 200,000 copies accounting for ∼80% of the chicken interspersed repeats. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to the 22 CR1 subfamilies previously reported in Repbase. We performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the chicken genome. We identified and validated 57 chicken CR1 subfamilies and further analyzed the correlation between these subfamilies and their regional GC contents. We also discovered one novel lineage-specific CR1 subfamily in turkeys when compared with chickens. We built an evolutionary tree of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.
CR1 repeats; comparative genomics; chicken genome
Temporal gene expression data are of particular interest to researchers because they contain rich information for characterizing gene function and have been widely used in biomedical studies. However, extracting information and identifying treatment effects efficiently without loss of temporal information remain problematic. In this paper, we propose a method of classifying temporal gene expression curves in which each individual expression trajectory is modeled as longitudinal data with a changeable variance and covariance structure. The method, based mainly on a generalized mixed model, is illustrated with a dense temporal gene expression dataset from bacteria. We aimed at evaluating gene effects and treatments. The power and the time points of measurement are also characterized via the longitudinal mixed model. The results indicated that the proposed methodology is promising for the analysis of temporal gene expression data, and that it could be generally applicable to other high-throughput temporal gene expression analyses.
Clustering analysis is a common statistical tool for knowledge discovery. It is mainly conducted when a project is still in the exploratory phase without any a priori hypotheses. However, statistical significance testing between the clusters can help researchers assess whether the classification results from a clustering algorithm need to be improved, even after the cluster number has been determined by a well-established criterion. This is important when we want to identify highly specific patterns through classification.
We proposed to use a principal component (PC) test, which is an implementation of an exact F statistic for the measures at multiple endpoints based on elliptical distribution theory, to assess the statistical significance between clusters. A challenge in the implementation is the choice of the number (q) of principal components to be considered, which can severely influence the statistical power of the method. We optimized the determination via validation according to a permutation test based on the clustering to be evaluated. The method was applied to a public dataset in classifying genes according to their temporal gene expression profiles.
The results demonstrated that the PC test was useful for determining the optimal number of clusters.
Both epigenetic alterations and genetic variations play essential roles in tumorigenesis. The epigenetic modification of DNA methylation is catalyzed and maintained by the DNA methyltransferases (DNMT3a, DNMT3b and DNMT1). DNA mutations and DNA methylation profiles of the DNMTs themselves, and their relationships with chicken neoplastic disease resistance and susceptibility, are not yet defined. In the present study, we analyzed the complexity of the DNA methylation variations and DNA mutations in the first exon of three DNMT genes over generations, tissues, and ages among chickens of two highly inbred White Leghorn lines, Marek's disease-resistant line 63 and -susceptible line 72, and six recombinant congenic strains (RCSs). Tissue-specific methylation patterns of DNMT3a were found in spleen, liver, and hypothalamus in lines 63 and 72. The methylation level of DNMT3b on four CpG sites was not significantly different among four tissues of the two lines. However, two line-specific DNA transition mutations, CpG→TpG (Chr20:10203733 and 10203778), were discovered in line 72 compared to line 63 and the RCSs. The methylation contents of DNMT1 in blood cells showed significant epimutations in the first CpG site among the two inbred lines and the six RCSs (P<0.05). Age-specific methylation of DNMT1 was detected in comparisons between 15-month-old and 2-month-old chickens in both lines except in spleen samples from line 72. No DNA mutations were discovered in the studied regions of DNMT1 and DNMT3a among the two lines and the six RCSs. Moreover, we developed a novel method that can effectively test the significance of DNA methylation patterns consisting of continuous CpG sites. Taken together, these results highlight the potential of epigenetic alterations in DNMT1 and DNMT3a, as well as the DNA mutations in DNMT3b, as epigenetic and genetic factors in neoplastic diseases of chickens.
Chicken endogenous viruses, ALVE (Avian Leukosis Virus subgroup E), are inherited as LTR (long terminal repeat) retrotransposons, which are negatively correlated with disease resistance, and any changes in DNA methylation may contribute to the susceptibility to neoplastic disease. The relationship between ALVE methylation status and neoplastic disease in the chicken is undefined. White Leghorn inbred lines 72 and 63 at the ADOL have been respectively selected for resistance and susceptibility to tumors that are induced by avian viruses. In this study, the DNA methylation patterns of 3∼6 CpG sites of four conserved regions in ALVE, including one unique region in ALVE1, and of the promoter region in the TVB (tumor virus receptor of ALV subgroup B, D and E) locus, were analyzed in the two lines using pyrosequencing methods in four tissues, i.e., liver, spleen, blood and hypothalamus. A significant CpG hypermethylation level was seen in line 72 in all four tissues, e.g., 91.86±1.63% for ALVE region 2 in blood, whereas the same region was hemimethylated (46.16±2.56%) in line 63. CpG methylation contents of the ALVE regions were significantly lower in line 63 than in line 72 in all tissues (P<0.01) except the ALVE region 3/4 in liver. RNA expression levels of ALVE regions 2 and 3 (PPT-U3) were significantly higher in line 63 than in line 72 (P<0.01). The methylation levels of six recombinant congenic strains (RCSs) closely resembled those of the background line 63 in ALVE region 2, which implies that the methylation pattern of ALVE region 2 may be a biomarker in breeding for disease resistance. The methylation level of the promoter region in the TVB differed significantly in blood (P<0.05) and hypothalamus (P<0.0001). Our data disclosed a hypermethylation pattern of ALVE that may be relevant for resistance against ALV-induced tumors in chickens.
Predictive classification on the basis of gene expression profiles has recently emerged as an attractive strategy for identifying the biological functions of genes. Gene Ontology (GO) provides a valuable source of knowledge for model training and validation. The increasing collection of microarray data represents a valuable source for generating functional hypotheses for uncharacterized genes.
This study focused on using support vector machines (SVM) to predict GO biological processes from individual or multiple-tissue transcriptional profiles of aging in Drosophila melanogaster. Ten-fold cross validation was implemented to evaluate the prediction. A one-tailed Fisher's exact test was conducted on each cross validation, and multiple testing was addressed using the Benjamini-Hochberg (BH) FDR procedure. The results showed that, of the 148 pursued GO biological processes, fifteen terms each had at least one model with an FDR-adjusted p-value (Adj.p) <0.05 and six had values between 0.05 and 0.25. Furthermore, all these models had a prediction sensitivity (SN) over 30% and a specificity (SP) over 80%.
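The multiple-testing step named above can be illustrated with a short, self-contained sketch of the standard Benjamini-Hochberg step-up adjustment. The raw p-values are invented for illustration and are not results from the study; the two cut-offs (0.05 and 0.25) are the ones quoted in the text, and the tier labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def tier(adj_p):
    """Label an adjusted p-value using the thresholds quoted in the text."""
    if adj_p < 0.05:
        return "Adj.p < 0.05"
    if adj_p <= 0.25:
        return "0.05 to 0.25"
    return "above 0.25"


# Invented raw p-values, one per hypothetical GO-term model.
raw = [0.001, 0.012, 0.045, 0.1, 0.6]
adjusted = benjamini_hochberg(raw)
tiers = [tier(q) for q in adjusted]

for p, q, t in zip(raw, adjusted, tiers):
    print(f"p={p:<6} adj={q:.3f}  {t}")
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum from the largest p-value downward keeps the adjusted values monotone.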
We proposed the concept of term-tissue specific models, reflecting the fact that the majority of the optimized prediction models were trained on individual tissue data. Furthermore, we observed that the memberships of the genes involved in all three pursued child biological processes on mitochondrial electron transport could be predicted from the transcriptional profiles of aging (Adj.p < 0.01). This finding may be important in biology because the genes of mitochondria play a critical role in the longevity of C. elegans and D. melanogaster.
The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy that adopts the false discovery rate (FDR) and estimates motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method, we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron-responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. SAS code for the simulation and analysis of gene expression data is available from the first author upon request.
Gene expression; motif; FDR; mixed models
Chromosomal DNA replication in bacteria starts at the origin (ori) and the two replicores propagate in opposite directions up to the terminus (ter) region. We hypothesize that the two replicores need to reach ter at the same time to maintain a physical balance; DNA insertion would disrupt such a balance, requiring chromosomal rearrangements to restore the balance. To test this hypothesis, we needed to demonstrate that ori and ter are in a physical balance in bacterial chromosomes. Using wavelet analysis, we documented GC skew, AT skew, purine excess and keto excess on the published bacterial genomic sequences to locate the turning (minimum and maximum) points on the curves. Previously, the minimum point had been supposed to correlate with ori and the maximum to correlate with ter.
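The relationship between skew curves and turning points can be sketched as follows. This is a simplified illustration under stated assumptions: it computes a windowed cumulative GC skew and takes the plain minimum and maximum of the curve, rather than the wavelet analysis used in the study, and the 600-bp toy sequence is invented. `cumulative_gc_skew` and `turning_points` are hypothetical helper names.

```python
def cumulative_gc_skew(sequence, window):
    """Return (window_end, cumulative_skew) for non-overlapping windows.

    Per-window skew is (G - C) / (G + C); windows without G or C add 0.
    """
    points = []
    total = 0.0
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window].upper()
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c:
            total += (g - c) / (g + c)
        points.append((start + window, total))
    return points


def turning_points(points):
    """Return the positions of the curve minimum and maximum."""
    min_pos = min(points, key=lambda p: p[1])[0]
    max_pos = max(points, key=lambda p: p[1])[0]
    return min_pos, max_pos


# Invented toy sequence: G-rich, then C-rich, then G-rich again.
g_rich = "GGC" * 20  # 60 bp with skew +1/3
c_rich = "GCC" * 20  # 60 bp with skew -1/3
toy = g_rich * 2 + c_rich * 5 + g_rich * 3

points = cumulative_gc_skew(toy, window=60)
ori_guess, ter_guess = turning_points(points)
separation_fraction = abs(ori_guess - ter_guess) / len(toy)

print(ori_guess, ter_guess, separation_fraction)
```

On this toy input the minimum (the putative ori under the convention described above) and the maximum (the putative ter) lie half a sequence length apart, which is the kind of balance the hypothesis predicts.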
We observed a strong tendency of the bacterial chromosomes towards a physical balance, with the minima and maxima corresponding to the known or putative ori and ter and being about half chromosome separated in most of the bacteria studied. A nonparametric method based on wavelet transformation was employed to perform significance tests for the predicted loci.
The wavelet approach can reliably predict the ori and ter regions and the bacterial chromosomes have a strong tendency towards a physical balance between ori and ter.