DNase I hypersensitive sites (DHSs) mark diverse classes of cis-regulatory regions, such as promoters and enhancers. MSB1, derived from chicken Marek's disease (MD) lymphomas, is an MDV-transformed CD4+ T-cell line used for MD studies. Previously, DHSs were studied mainly in mammalian, particularly human, cell types. To capture the regulatory elements specific to MSB1 cells and explore the molecular mechanisms of T-cell transformation caused by MDV in MD, we generated a high-quality DHS map and gene expression profile for functional analysis in the MSB1 cell line. A total of 21,724 significant DHS peaks were identified from around 40 million short reads. DHS distribution varied between chromosomes, with enrichment in gene-rich chromosomes. More interestingly, DHSs appeared to be scarce in regions abundant in CpG islands. In addition, we integrated DHSs with the gene expression data and found that DHSs tended to be enriched in highly expressed genes throughout whole gene regions, whereas DHSs did not show significant changes for lowly expressed and silent genes. Furthermore, the correlation of DHSs with lincRNA expression was also calculated, which implied that enhancer-associated lincRNAs probably originate from enhancer-like DHS regions. Together, our results indicate that DHSs correlate strongly with active gene expression in MSB1 cells, suggesting that DHSs can be considered as markers to identify the cis-regulatory elements associated with chicken Marek's disease.
DNase I; DHS; intergenic DHSs; MSB1; CpG islands; gene expressions; long non-coding RNAs; Marek's disease (MD)
Milk production is an economically important sector of global agriculture. Much attention has been paid to the identification of quantitative trait loci (QTL) associated with milk, fat, and protein yield and the genetic and molecular mechanisms underlying them. Copy number variation (CNV) is an emerging class of variants which may be associated with complex traits.
In this study, we performed a genome-wide association analysis between CNVs and milk production traits in 26,362 Holstein bulls and cows. A total of 99 candidate CNVs were identified using Illumina BovineSNP50 array data, and association tests for each production trait were performed using linear regression analysis with PCA correction. A total of 34 CNVs on 22 chromosomes were significantly associated with at least one milk production trait after false discovery rate (FDR) correction. Some of those CNVs were located within or near known QTL for milk production traits. We further investigated the relationship between the associated CNVs and neighboring SNPs. For all 82 combinations of traits and CNVs (less than 400 kb in length), we found 17 cases where CNVs directly overlapped with tag SNPs and 40 cases where CNVs were adjacent to tag SNPs. In 5 cases, CNVs were in strong linkage disequilibrium with tag SNPs, located either within or adjacent to the same haplotype block. There were an additional 20 cases where CNVs did not have a significant association with SNPs, suggesting that the effects of those CNVs were probably not captured by tag SNPs.
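The FDR correction step mentioned above can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg procedure is assumed here, and the p-values and the 0.05 cutoff are hypothetical values chosen only for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for five CNV-trait tests (not from the study).
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
qvals = benjamini_hochberg(pvals)
# Assumed significance cutoff of 0.05 on the adjusted values.
significant = [q <= 0.05 for q in qvals]
```

In this toy input only the first two tests survive correction, although four of the five raw p-values are below 0.05.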
We conclude that combining CNV with SNP analyses reveals more genetic variations underlying milk production traits than those revealed by SNPs alone.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-683) contains supplementary material, which is available to authorized users.
Copy number variation (CNV); dPTA; Association; Milk production traits
Marek’s disease (MD) is characterized as a T cell lymphoma induced by a cell-associated α-herpesvirus, Marek’s disease virus type 1 (MDV1). As with many viral infectious diseases, DNA methylation variations were observed in the progression of MD; these variations are thought to play an important role in host-virus interactions. We observed that DNA methyltransferase 3a (DNMT3a) and 3b (DNMT3b) were differentially expressed in chicken MD-resistant line 63 and MD-susceptible line 72 at 21 d after MDV infection. To better understand the role of methylation variation induced by MDV infection in both chicken lines, we mapped the genome-wide DNA methylation profiles in each line using Methyl-MAPS (methylation mapping analysis by paired-end sequencing). Collectively, the data sets collected in this study provide a more comprehensive picture of the chicken methylome. Overall, methylation levels were reduced in chickens from the resistant line 63 after MDV infection. We identified 11,512 infection-induced differential methylation regions (iDMRs). The number of iDMRs was larger in line 72 than in line 63, and most of the iDMRs found in line 63 overlapped with the iDMRs found in line 72. We further showed that in vitro methylation levels were associated with MDV replication, and found that MDV propagation in the infected cells was restricted by pharmacological inhibition of DNA methylation. Our results suggest that DNA methylation in the host may be associated with disease resistance or susceptibility. The methylation variations induced by viral infection may consequently change the host transcriptome and result in diverse disease outcomes.
DNA methylation; Marek’s disease; chicken; epigenetics; tumor; viral infection
Background: Muscle development and lipid metabolism play important roles during fetal development stages. The commercial Texel sheep are more muscular than the indigenous Ujumqin sheep.
Results: We performed serial transcriptomics assays and systems biology analyses to investigate the dynamics of gene expression changes in fetal longissimus muscles during different fetal stages in two sheep breeds. In total, we identified 1472 differentially expressed genes across fetal stages using time-series expression analysis. A systems biology approach, weighted gene co-expression network analysis (WGCNA), was used to detect modules of correlated genes among these 1472 genes. Dramatically different gene modules were identified in four merged datasets, corresponding to the mid fetal stage in Texel and Ujumqin sheep and the late fetal stage in Texel and Ujumqin sheep, respectively. We further detected gene modules significantly correlated with fetal weight, and constructed networks and pathways using genes with high significance. In these gene modules, we identified genes such as TADA3, LMNB1, TGF-β3, EEF1A2, FGFR1, MYOZ1, and FBP2 that were correlated with fetal weight.
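The core idea of a weighted co-expression network can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used in the study: it computes an unsigned soft-threshold adjacency, |r|^β, from Pearson correlations. The gene names, expression values and the power β = 6 are hypothetical, and module detection is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency |r|**beta for every pair of genes."""
    genes = sorted(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


# Hypothetical expression profiles over four time points.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
adjacency = soft_threshold_adjacency(profiles)
```

Raising the correlation to a power keeps strongly co-expressed pairs (geneA and geneB) close to 1 while pushing weak correlations (geneA and geneC, r = -0.4) toward 0.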
Conclusion: Our study revealed the complex network characteristics involved in muscle development and lipid metabolism during fetal development stages. Diverse patterns of the network connections observed between breeds and fetal stages could involve some hub genes, which play central roles in fetal development, correlating with fetal weight. Our findings could provide potential valuable biomarkers for selection of body weight-related traits in sheep and other livestock.
Serial expression analysis; WGCNA; fetal development stages; fetal weight.
Breeding chickens that are genetically resistant to Marek’s disease (MD) is a vital strategy for poultry health. To find markers underlying genetic resistance to MD, copy number variation (CNV) was examined in inbred MD-resistant and -susceptible chicken lines. A total of 45 CNVs were found in four lines of chickens, and 28 were potentially involved in processes such as immune response and cell proliferation. Importantly, two CNVs related to MD resistance were transmitted to descendant recombinant congenic lines that differ in susceptibility to MD. Our findings may lead to better strategies for genetic improvement of disease resistance in poultry.
CNV; disease resistance; Marek’s disease; chicken
Marek’s disease (MD) is a neoplastic disease in chickens caused by the MD virus (MDV). Successful vaccine development against MD has resulted in increased virulence of MDV and the understanding of genetic resistance to the disease is, therefore, crucial to long-term control strategies. Also, epigenetic factors are believed to be one of the major determinants of disease response.
Here, we carried out comprehensive analyses of the epigenetic landscape induced by MDV, utilizing genome-wide histone H3 lysine 4 and lysine 27 trimethylation maps from chicken lines with varying resistance to MD. Differential chromatin marks were observed on genes previously implicated in the disease such as MX1 and CTLA-4 and also on genes reported in other cancers including IGF2BP1 and GAL. We detected bivalent domains on immune-related transcriptional regulators BCL6, CITED2 and EGR1, which underwent dynamic changes in both lines as a result of MDV infection. In addition, putative roles for GAL in the mechanism of MD progression were revealed.
Our results confirm the presence of widespread epigenetic differences induced by MD in chicken lines with different levels of genetic resistance. A majority of observed epigenetic changes were indicative of increased levels of viral infection in the susceptible line symptomatic of lowered immunocompetence in these birds caused by early cytolytic infection. The GAL system that has known anti-proliferative effects in other cancers is also revealed to be potentially involved in MD progression. Our study provides further insight into the mechanisms of MD progression while revealing a complex landscape of epigenetic regulatory mechanisms that varies depending on host factors.
Histone modifications; Thymus; Differential marks; Bivalent domain; Chromatin signature; Marek’s disease
Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) is a genome-wide analysis technique that can be used to detect various epigenetic phenomena, such as transcription factor binding sites and histone modifications. Histone modification profiles can be either punctate or diffuse, which makes it difficult to distinguish regions of enrichment from background noise. With the discovery of histone marks having a wide variety of enrichment patterns, there is an urgent need for analysis methods that are robust to various data characteristics and capable of detecting a broad range of enrichment patterns.
To address these challenges we propose WaveSeq, a novel data-driven method of detecting regions of significant enrichment in ChIP-Seq data. Our approach utilizes the wavelet transform, is free of distributional assumptions and is robust to diverse data characteristics such as low signal-to-noise ratios and broad enrichment patterns. Using publicly available datasets we showed that WaveSeq compares favorably with other published methods, exhibiting high sensitivity and precision for both punctate and diffuse enrichment regions even in the absence of a control data set. The application of our algorithm to a complex histone modification data set helped make novel functional discoveries which further underlined its utility in such an experimental setup.
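The way a wavelet transform separates broad enrichment from bin-level noise can be sketched with a toy example. This is not the WaveSeq algorithm: the abstract does not specify the wavelet or the significance test, so the Haar wavelet is swapped in here for simplicity, and the binned read counts are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeatedly split the signal into coarse approximation and detail bands."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical read counts in eight adjacent genomic bins.
counts = [2, 0, 1, 1, 8, 10, 9, 9]
approx, details = haar_decompose(counts, levels=2)

# The orthonormal transform preserves total energy (sum of squares).
energy_in = sum(c * c for c in counts)
energy_out = sum(a * a for a in approx) + sum(d * d for band in details for d in band)
```

The two coarse coefficients summarize the low-count and high-count halves of the window, while the detail bands carry the small bin-to-bin fluctuations.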
WaveSeq is a highly sensitive method capable of accurate identification of enriched regions in a broad range of data sets. WaveSeq can detect both narrow and broad peaks with a high degree of accuracy even in low signal-to-noise ratio data sets. WaveSeq is also suited for application in complex experimental scenarios, helping make biologically relevant functional discoveries.
Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases.
In this study using the high density BovineHD SNP array, we performed high resolution CNV analyses on both Btau_4.0 and UMD3.1 with 674 animals of 27 cattle breeds. We first compared CNV results derived from these two different SNP array platforms on Btau_4.0. With two thirds of the animals shared between studies, on Btau_4.0 we identified 3,346 candidate CNV regions representing 142.7 megabases (~4.70%) of the genome. With a similar total length but 5 times more event counts, the average CNVR length of the current Btau_4.0 dataset is significantly shorter than that of the previous one (42.7 kb vs. 205 kb). Although subsets of these two results overlapped, 64% (91.6 megabases) of the current dataset was not present in the previous study. We also performed similar analyses on UMD3.1 using these BovineHD SNP array results. Approximately 50% more and 20% longer CNVs were called on UMD3.1 as compared to those on Btau_4.0. However, a comparable result of CNVRs (3,438 regions with a total length of 146.9 megabases) was obtained. We suspect that these results are due to the UMD3.1 assembly's efforts of placing unplaced contigs and removing unmerged alleles. Selected CNVs were further experimentally validated, achieving a 73% PCR validation rate, which is considerably higher than the previous validation rate. About 20-45% of CNV regions overlapped with cattle RefSeq genes and Ensembl genes. Panther and IPA analyses indicated that these genes participate in a wide spectrum of biological processes, including immune system function, lipid metabolism, and cell, organism and system development.
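The relationship between CNV calls and CNV regions (CNVRs) can be sketched as below. The abstract does not define how CNVRs were built; merging overlapping calls on the same chromosome is assumed here as a common convention, and the coordinates are hypothetical. The last two lines recompute the average CNVR lengths from the rounded totals quoted above, giving about 42.6 kb and 205 kb, in line with the reported 42.7 kb and 205 kb.

```python
def merge_into_regions(cnvs):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions."""
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            last = regions[-1]
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


# Hypothetical CNV calls from several animals.
calls = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 2500),
    ("chr2", 100, 300),
]
regions = merge_into_regions(calls)
mean_length = sum(end - start for _, start, end in regions) / len(regions)

# Average CNVR length in kb from the totals quoted in the abstract.
mean_kb_bovinehd = 142.7e3 / 3346   # BovineHD dataset on Btau_4.0
mean_kb_bovinesnp50 = 139.8e3 / 682  # previous BovineSNP50 dataset
```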
We present a comprehensive result of cattle CNVs at a higher resolution and sensitivity. We identified over 3,000 candidate CNV regions on both Btau_4.0 and UMD3.1, further compared current datasets with previous results, and examined the impacts of genome assemblies on CNV calling.
Cattle genome; Breed; Copy number variation (CNV); Single nucleotide polymorphism (SNP)
Marek's disease (MD) is a lymphoproliferative disease in chicken induced by Marek's disease virus (MDV). Although studies have focused on the genetic differences between the resistant and susceptible chicken, less is known about the role of epigenetic factors in MD. In this study, genome-wide histone modifications in the non-MHC-associated resistant and susceptible chicken lines were examined. We found that tri-methylation at histone H3 Lys4 (H3K4me3) enrichment is positively correlated with the expression of protein coding genes as well as microRNA (miRNA) genes, whereas tri-methylation at histone H3 Lys27 (H3K27me3) exhibits a negative correlation. By identifying line-specific histone modifications in MDV infection, we found unique H3K4me3 islands in the resistant chicken activated genes, which are related to immune response and cell adhesion. Interestingly, we also found some miRNAs from unique H3K27me3 patterns in the susceptible chickens that targeted genes involved in 5-hydroxytryptamine (5-HT)-receptor and adrenergic receptor pathways. In conclusion, dynamic line-specific histone modifications in response to MDV infection suggested that intrinsic epigenetic mechanisms may play a role in MD-resistance and -susceptibility.
Gene expression in lymphocytes has been found to be influenced by histone methylation in mammals, and trimethylation of lysine 27 on histone H3 (H3K27me3) normally represses gene expression. Peripheral blood lymphocytes are the main source of somatic cells in the milk of dairy cows, which vary frequently in response to infection or injury of the mammary gland and to the number of parities.
The genome-wide status of H3K27me3 modifications in blood lymphocytes of lactating Holsteins was profiled using a ChIP-Seq approach. Combined with the digital gene expression (DGE) technique, the regulatory effects of H3K27me3 on gene expression were analyzed.
The ChIP-seq results showed that H3K27me3 peaks in cow lymphocytes were mainly enriched in the up20K (∼50%), down20K (∼30%) and intron (∼28%) regions of genes. Only ∼3% of peaks were enriched in exon regions. Moreover, the highest H3K27me3 modification levels were mainly found around 2 kb upstream of the transcriptional start sites (TSS) of genes. Using conjoint analysis with DGE data, we found that H3K27me3 marks tended to repress target gene expression throughout whole gene regions, acting especially on the promoter region. A total of 53 differentially expressed genes were detected in third-parity cows compared to first-parity cows, and the 25 down-regulated genes (PSEN2 etc.) were negatively correlated with H3K27me3 levels in the up2Kb to up1Kb region of the genes, while the up-regulated genes did not show this relationship.
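Locating a peak relative to a TSS depends on the strand of the gene, which the following minimal sketch makes explicit. It assumes that "up2Kb to up1Kb" means the window 1 to 2 kb upstream of the TSS; the function names and coordinates are hypothetical and not taken from the study.

```python
def upstream_distance(peak_mid, tss, strand):
    """Distance of a peak midpoint upstream of the TSS (negative = downstream)."""
    # On the minus strand, upstream corresponds to larger coordinates.
    return tss - peak_mid if strand == "+" else peak_mid - tss


def in_up2kb_to_up1kb(peak_mid, tss, strand):
    """True if the peak lies 1-2 kb upstream of the TSS (assumed definition)."""
    return 1000 <= upstream_distance(peak_mid, tss, strand) <= 2000


# Hypothetical peaks around a TSS at position 10,000.
plus_hit = in_up2kb_to_up1kb(8500, 10000, "+")    # 1.5 kb upstream
minus_hit = in_up2kb_to_up1kb(11500, 10000, "-")  # 1.5 kb upstream on minus strand
too_close = in_up2kb_to_up1kb(9500, 10000, "+")   # only 0.5 kb upstream
```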
The first blueprint of bovine H3K27me3 marks, which mediate gene silencing, was generated. H3K27me3 exerts its repressive role mainly in regulatory regions in bovine lymphocytes. The up2Kb to up1Kb region of the down-regulated genes in third-parity cows could be a potential target of H3K27me3 regulation. Further studies are warranted to understand the regulatory mechanisms of H3K27me3 underlying somatic cell count increases and milk losses in later parities of cows.
miRNAs are a class of small, single-stranded, non-coding RNAs that perform post-transcriptional repression of target genes by binding to 3’ untranslated regions. Research has found that miRNAs are involved in the regulation of many metabolic processes. Here we found that the beef quality of Angus cattle varied sharply after acute stress. miRNA microarray analysis showed that 13 miRNAs were significantly differentially expressed in the stressed group compared to the control group. Using a bioinformatics method, 135 protein-coding genes were predicted as targets of the significantly differentially expressed miRNAs. Gene Ontology (GO) term analysis and Ingenuity Pathway Analysis (IPA) indicated that these target genes are involved in some important pathways, which may have an impact on meat quality and beef tenderness.
miRNA; Bovine; Beef tenderness; Stress
To explore gene-environment interactions based on temporal gene expression information, we analyzed gene and treatment information intensively and inferred interaction networks accordingly. The main idea is that gene expression reflects the response of genes to environmental factors, assuming that variations of gene expression occur under different conditions. We then classified experimental conditions into several subgroups based on the similarity of temporal gene expression profiles. This procedure is useful because it allows us to combine diverse gene expression data as they become available and, in particular, to place the regulatory relationships on a concrete biological basis. By estimating the activation points, we can visualize gene behavior, obtain a consensus gene activation order, and hence describe conditional regulatory relationships. The estimation of activation points and the building of synthetic genetic networks may result in important new insights in the ongoing endeavor to understand the complex network of gene regulation.
Beef is one of the leading sources of protein, B vitamins, iron, and zinc in human food. Beef palatability is based on three general criteria: tenderness, juiciness, and flavor, of which tenderness is thought to be the most important factor. In this study, we found that beef tenderness, measured by the Warner-Bratzler shear force (WBSF), was dramatically increased by acute stress. Microarray analysis and qPCR identified a variety of genes that were differentially expressed. Pathway analysis showed that these genes were involved in immune response and regulation of metabolic processes as activators or repressors. Further analysis indicated that these changes may be related to CpG methylation of several genes. Therefore, the results from this study provide an enhanced understanding of the mechanisms by which genetic and epigenetic regulation control meat quality and beef tenderness.
Marek’s disease (MD) is a lymphoproliferative disease induced by Marek’s disease virus (MDV) infection. To augment vaccination measures in MD control, host genetic resistance to MD is becoming increasingly important. To elucidate the mechanism of MD resistance, most research has focused on the genetic differences between resistant and susceptible chickens. However, epigenetic features that differ between MD-resistant and MD-susceptible chickens are poorly characterized. Using bisulfite pyrosequencing, we found that some candidate genes have higher promoter methylation in the MD-susceptible (L72) chickens than in the MD-resistant (L63) chickens. The hypermethylated genes, which are involved in cellular component organization, response to stimulus, cell adhesion, and immune system process, may play an important role in disease susceptibility through their deregulation. MDV infection induced expression changes in all three methyltransferase genes (DNMT1, DNMT3a, and DNMT3b) in both lines of chickens. DNMT1 was up-regulated in L72, whereas DNMT3b was down-regulated in L63 at 21 dpi. Interestingly, a dynamic change of promoter methylation was observed during the MDV life cycle. Some genes, including HDAC9, GH, STAT1, CIITA, FABP3, LATS2, and H2Ac, showed differential methylation behaviors between the two lines of chickens. In summary, the findings from this study suggest that DNA methylation heterogeneity and differences in MDV infection-induced methylation alterations exist between the two lines of chickens. Therefore, epigenetic mechanisms may be involved in modulating resistance and/or susceptibility to MD in chickens.
chicken; Marek’s disease; MD-resistance; MD-susceptibility; DNA methylation
Marek's disease (MD) is a lymphoproliferative disease in chickens caused by Marek's disease virus (MDV) and characterized by T cell lymphoma and infiltration of lymphoid cells into various organs such as liver, spleen, peripheral nerves and muscle. Resistance to MD and disease risk have long been thought to be influenced both by genetic and environmental factors, the combination of which contributes to the observed outcome in an individual. We hypothesize that after MDV infection, genes related to MD-resistance or -susceptibility may exhibit different trends in transcriptional activity in chicken lines having a varying degree of resistance to MD.
In order to study the mechanisms of resistance and susceptibility to MD, we performed genome-wide temporal expression analysis in spleen tissues from MD-resistant line 63, susceptible line 72 and recombinant congenic strain M (RCS-M) that has a phenotype intermediate between lines 63 and 72 after MDV infection. Three time points of the MDV life cycle in chicken were selected for study: 5 days post infection (dpi), 10dpi and 21dpi, representing the early cytolytic, latent and late cytolytic stages, respectively. We observed similar gene expression profiles at the three time points in line 63 and RCS-M chickens that are both different from line 72. Pathway analysis using Ingenuity Pathway Analysis (IPA) showed that MDV can broadly influence the chickens irrespective of whether they are resistant or susceptible to MD. However, some pathways like cardiac arrhythmia and cardiovascular disease were found to be affected only in line 72; while some networks related to cell-mediated immune response and antigen presentation were enriched only in line 63 and RCS-M. We identified 78 and 30 candidate genes associated with MD resistance, at 10 and 21dpi respectively, by considering genes having the same trend of expression change after MDV infection in lines 63 and RCS-M. On the other hand, by considering genes with the same trend of expression change after MDV infection in lines 72 and RCS-M, we identified 78 and 43 genes at 10 and 21dpi, respectively, which may be associated with MD-susceptibility.
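The candidate-gene selection described above amounts to a set operation on directions of expression change, which can be sketched as follows. This is a minimal reading of the abstract: genes are kept when they change in the same non-zero direction in two lines. The gene identifiers and directions are hypothetical, and any thresholds or additional filters used in the study are not modeled.

```python
def same_trend_genes(changes_a, changes_b):
    """Genes that change in the same non-zero direction in both lines."""
    return {
        gene
        for gene, direction in changes_a.items()
        if direction != 0 and changes_b.get(gene) == direction
    }


# Hypothetical directions of change after infection:
# +1 = up-regulated, -1 = down-regulated, 0 = unchanged.
line_63 = {"gene1": +1, "gene2": -1, "gene3": 0, "gene4": +1}
rcs_m = {"gene1": +1, "gene2": +1, "gene3": 0, "gene4": +1}
line_72 = {"gene1": -1, "gene2": +1, "gene3": 0, "gene4": +1}

resistance_candidates = same_trend_genes(line_63, rcs_m)
susceptibility_candidates = same_trend_genes(line_72, rcs_m)
```

In this toy input gene4 lands in both sets because the sketch does not exclude genes that behave the same way in all three lines; whether the study applied such a filter is not stated in the abstract.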
By testing temporal transcriptome changes using three representative chicken lines with different resistance to MD, we identified 108 candidate genes for MD-resistance and 121 candidate genes for MD-susceptibility over the three time points. Genes included in our resistance or susceptibility genes lists that are also involved in more than 5 biofunctions, such as CD8α, IL8, USP18, and CTLA4, are considered to be important genes involved in MD-resistance or -susceptibility. We were also able to identify several biofunctions related with immune response that we believe play an important role in MD-resistance.
Overexpression of ramA has been implicated in resistance to multiple drugs in several enterobacterial pathogens. In the present study, Salmonella Typhimurium strain LTL with constitutive expression of ramA was compared to its ramA-deletion mutant by employing both DNA microarrays and phenotype microarrays (PM). The mutant strain with the disruption of ramA showed differential expression of at least 33 genes involved in 11 functional groups. The study confirmed at the transcriptional level that the constitutive expression of ramA was directly associated with increased expression of the multidrug efflux pump AcrAB-TolC and decreased expression of the porin protein OmpF, thereby conferring a multidrug resistance phenotype. Compared to the parent strain constitutively expressing ramA, the ramA mutant had increased susceptibility to over 70 antimicrobials and toxic compounds. The PM analysis also showed that the ramA mutant was better at utilizing 10 carbon sources and 5 phosphorus sources. This study suggests that constitutive expression of the ramA locus regulates not only the multidrug efflux pump and accessory genes but also genes involved in carbon metabolic pathways.
Marek’s disease virus (MDV) is an oncovirus that induces lymphoid tumors in susceptible chickens, and may affect the epigenetic stability of the CD4 gene. The purpose of this study was to determine whether the effect of MDV infection on the DNA methylation status of the CD4 gene differed between MD-resistant (L63) and -susceptible (L72) chicken lines.
Chickens from each line were divided into two groups with one group infected by MDV and the other group as uninfected controls. Then, promoter DNA methylation levels of the CD4 gene were measured by Pyrosequencing; and gene expression analysis was performed by quantitative PCR.
Promoter methylation of the CD4 gene was found to be down-regulated in L72 chickens only after MDV infection. The methylation down-regulation of the CD4 promoter is negatively correlated with up-regulation of CD4 gene expression in the L72 spleen at 21 dpi.
The methylation fluctuation and mRNA expression change of the CD4 gene induced by MDV infection suggest that a unique epigenetic mechanism exists in MD-susceptible chickens.
Copy number variation (CNV) represents another important source of genetic variation complementary to single nucleotide polymorphism (SNP). High-density SNP array data have been routinely used to detect human CNVs, many of which have significant functional effects on gene expression and human diseases. In the dairy industry, a large quantity of SNP genotyping results are becoming available and can be used for CNV discovery to understand and accelerate genetic improvement for complex traits.
We performed a systematic analysis of CNV using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the pedigree information, we identified 682 candidate CNV regions, which represent 139.8 megabases (~4.60%) of the genome. Selected CNVs were further experimentally validated and we found that copy number "gain" CNVs were predominantly clustered in tandem rather than existing as interspersed duplications. Many CNV regions (~56%) overlap with cattle genes (1,263), which are significantly enriched for immunity, lactation, reproduction and rumination. The overlap of this new dataset and other published CNV studies was less than 40%; however, our discovery of large, high frequency (> 5% of animals surveyed) CNV regions showed 90% agreement with other studies. These results highlight the differences and commonalities between technical platforms.
We present a comprehensive genomic analysis of cattle CNVs derived from SNP data which will be a valuable genomic variation resource. Combined with SNP detection assays, gene-containing CNV regions may help identify genes undergoing artificial selection in domesticated animals.
Duplicated sequences are an important source of gene innovation and structural variation within mammalian genomes. We performed the first systematic and genome-wide analysis of segmental duplications in the modern domesticated cattle (Bos taurus). Using two distinct computational analyses, we estimated that 3.1% (94.4 Mb) of the bovine genome consists of recently duplicated sequences (≥ 1 kb in length, ≥ 90% sequence identity). Similar to other mammalian draft assemblies, almost half (47% of 94.4 Mb) of these sequences have not been assigned to cattle chromosomes.
In this study, we provide the first experimental validation of large duplications and briefly compare their distribution on two independent bovine genome assemblies using fluorescent in situ hybridization (FISH). Our analyses suggest that the majority (75-90%) of segmental duplications are organized into local tandem duplication clusters. Along with rodents and carnivores, these results now confidently establish tandem duplications as the most likely mammalian archetypical organization, in contrast to humans and great ape species, which show a preponderance of interspersed duplications. A cross-species survey of duplicated genes and gene families indicated that duplication, positive selection and gene conversion have shaped primates, rodents, carnivores and ruminants to different degrees for their speciation and adaptation. We found that bovine segmental duplications corresponding to genes are significantly enriched for specific biological functions such as immunity, digestion, lactation and reproduction.
Our results suggest that in most mammalian lineages segmental duplications are organized in a tandem configuration. Segmental duplications remain problematic for genome assembly, and we highlight genic regions that require higher quality sequence characterization. This study provides insights into mammalian genome evolution and generates a valuable resource for cattle genomics research.
Array-based comparative genomics hybridization (aCGH) has gained prevalence as an effective technique for measuring structural variations in the genome. Copy-number variations (CNVs) form a large source of genomic structural variation, but it is not known whether phenotypic differences between intra-species groups, such as divergent human populations, or breeds of a domestic animal, can be attributed to CNVs. Several computational methods have been proposed to improve the detection of CNVs from array CGH data, but few population studies have used CGH data for identification of intra-species differences. In this paper we propose a novel method of genome-wide comparison and classification using CGH data that condenses whole genome information, aimed at quantification of intra-species variations and discovery of shared ancestry. Our strategy included smoothing CGH data using an appropriate denoising algorithm, extracting features via wavelets, quantifying the information via wavelet power spectrum and hierarchical clustering of the resultant profile. To evaluate the classification efficiency of our method, we used simulated data sets. We applied it to aCGH data from human and bovine individuals and showed that it successfully detects existing intra-specific variations with additional evolutionary implications.
Chicken Repeat 1 (CR1) repeats are the most abundant family of repeats in the chicken genome, with more than 200,000 copies accounting for ∼80% of the chicken interspersed repeats. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to the 22 CR1 subfamilies previously reported in Repbase. We performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the chicken genome. We identified and validated 57 chicken CR1 subfamilies and further analyzed the correlation between these subfamilies and their regional GC contents. We also discovered one novel lineage-specific CR1 subfamily in turkeys when compared with chickens. We built an evolutionary tree of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.
CR1 repeats; comparative genomics; chicken genome
Temporal gene expression data are of particular interest to researchers because they contain rich information for characterizing gene function, and they have been widely used in biomedical studies. However, extracting information and identifying treatment effects efficiently without losing temporal information remains a problem. In this paper, we propose a method of classifying temporal gene expression curves in which each individual expression trajectory is modeled as longitudinal data with a changeable variance and covariance structure. The method, based mainly on a generalized mixed model, is illustrated with a dense temporal gene expression data set from bacteria. We aimed to evaluate gene effects and treatment effects. The power and the time points of measurement are also characterized via the longitudinal mixed model. The results indicate that the proposed methodology is promising for the analysis of temporal gene expression data and that it could be generally applicable to other high-throughput temporal gene expression analyses.
Clustering analysis is a common statistical tool for knowledge discovery. It is mainly conducted while a project is still in the exploratory phase, without any a priori hypotheses. However, testing the statistical significance of differences between clusters can help researchers assess whether the classification produced by a clustering algorithm needs to be improved, even after the number of clusters has been determined by a well-established criterion. This is important when we want to identify highly specific patterns through classification.
We proposed using a principal component (PC) test, an implementation of an exact F statistic for measures at multiple endpoints based on elliptical distribution theory, to assess the statistical significance of differences between clusters. A challenge in the implementation is the choice of the number (q) of principal components to be considered, which can severely influence the statistical power of the method. We optimized this choice by validation with a permutation test based on the clustering to be evaluated. The method was applied to a public dataset to classify genes according to their temporal gene expression profiles.
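To make the idea of testing cluster separation on principal-component scores more concrete, here is a minimal sketch. It is not the exact F test described above: the elliptical-distribution theory is not reproduced, q is fixed at 1 as an assumption, and the p-value comes from permuting cluster labels rather than from an F distribution. All function names and the toy expression profiles are hypothetical.

```python
import random

def first_pc_scores(rows, iters=200):
    """Scores on the leading principal component, found by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [float(j + 1) for j in range(p)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]

def f_statistic(scores, labels):
    """One-way ANOVA F statistic of the scores across cluster labels."""
    groups = {}
    for s, g in zip(scores, labels):
        groups.setdefault(g, []).append(s)
    n, k = len(scores), len(groups)
    grand = sum(scores) / n
    between = sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in groups.values())
    within = sum((s - sum(v) / len(v)) ** 2 for v in groups.values() for s in v)
    return (between / (k - 1)) / (within / (n - k))

def permutation_p_value(scores, labels, n_perm=999, seed=0):
    """Share of label shuffles whose F is at least as large as the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(scores, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical expression of 10 genes at 4 time points: rising vs. falling.
profiles = [
    [0.1, 1.0, 2.1, 2.9], [0.0, 1.2, 1.9, 3.1], [-0.1, 0.9, 2.0, 3.0],
    [0.2, 1.1, 2.2, 2.8], [0.0, 1.0, 1.8, 3.2],
    [3.0, 2.1, 0.9, 0.1], [2.9, 1.8, 1.1, 0.0], [3.1, 2.0, 1.0, -0.1],
    [2.8, 2.2, 1.2, 0.2], [3.2, 1.9, 0.8, 0.0],
]
cluster_labels = [0] * 5 + [1] * 5  # assumed output of some clustering algorithm

scores = first_pc_scores(profiles)
observed_f, p_value = permutation_p_value(scores, cluster_labels)
print(observed_f, p_value)
```

The principal component is computed from all genes without reference to the cluster labels, so the scores themselves do not depend on the grouping being tested. Because the labels come from a clustering that already seeks separated groups, a plain label permutation like this one can overstate significance, so the sketch should be read as an illustration only.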
The results demonstrated that the PC test was useful for determining the optimal number of clusters.
Both epigenetic alterations and genetic variations play essential roles in tumorigenesis. The epigenetic modification of DNA methylation is catalyzed and maintained by the DNA methyltransferases (DNMT3a, DNMT3b and DNMT1). DNA mutations and DNA methylation profiles of the DNMTs themselves, and their relationships with resistance and susceptibility to neoplastic disease in chickens, are not yet defined. In the present study, we analyzed the complexity of DNA methylation variations and DNA mutations in the first exon of the three DNMT genes across generations, tissues, and ages among chickens of two highly inbred White Leghorn lines, Marek's disease-resistant line 63 and -susceptible line 72, and six recombinant congenic strains (RCSs). Among them, tissue-specific methylation patterns of DNMT3a were disclosed in spleen, liver, and hypothalamus in lines 63 and 72. The methylation level of DNMT3b at four CpG sites was not significantly different among four tissues of the two lines. However, two line-specific DNA transition mutations, CpG→TpG (Chr20:10203733 and 10203778), were discovered in line 72 compared with line 63 and the RCSs. The methylation contents of DNMT1 in blood cells showed significant epimutations at the first CpG site among the two inbred lines and the six RCSs (P<0.05). Age-specific methylation of DNMT1 was detected in comparisons between 15-month-old and 2-month-old chickens in both lines, except in spleen samples from line 72. No DNA mutations were discovered in the studied regions of DNMT1 and DNMT3a among the two lines and the six RCSs. Moreover, we developed a novel method that can effectively test the significance of DNA methylation patterns consisting of continuous CpG sites. Taken together, these results highlight the potential of epigenetic alterations in DNMT1 and DNMT3a, as well as the DNA mutations in DNMT3b, as epigenetic and genetic factors contributing to neoplastic diseases of chickens.
Chicken endogenous viruses, ALVE (Avian Leukosis Virus subgroup E), are inherited as LTR (long terminal repeat) retrotransposons, which are negatively correlated with disease resistance, and any changes in DNA methylation may contribute to susceptibility to neoplastic disease. The relationship between ALVE methylation status and neoplastic disease in the chicken is undefined. White Leghorn inbred lines 72 and 63 at the ADOL have been respectively selected for resistance and susceptibility to tumors that are induced by avian viruses. In this study, the DNA methylation patterns of 3∼6 CpG sites in four conserved regions of ALVE, including one unique region in ALVE1, and in the promoter region of the TVB (tumor virus receptor of ALV subgroups B, D and E) locus were analyzed in the two lines by pyrosequencing in four tissues, i.e., liver, spleen, blood and hypothalamus. Significant CpG hypermethylation was seen in line 72 in all four tissues, e.g., 91.86±1.63% for ALVE region 2 in blood, whereas the same region was hemimethylated (46.16±2.56%) in line 63. CpG methylation contents of the ALVE regions were significantly lower in line 63 than in line 72 in all tissues (P<0.01), except for ALVE region 3/4 in liver. RNA expression of ALVE regions 2 and 3 (PPT-U3) was significantly higher in line 63 than in line 72 (P<0.01). The methylation levels of six recombinant congenic strains (RCSs) closely resembled those of the background line 63 in ALVE region 2, which implies that the methylation pattern of ALVE region 2 may be a biomarker in breeding for disease resistance. The methylation level of the promoter region in the TVB locus was significantly different in blood (P<0.05) and hypothalamus (P<0.0001). Our data disclosed a hypermethylation pattern of ALVE that may be relevant for resistance against ALV-induced tumors in chickens.