1.  A clinical study of lung cancer dose calculation accuracy with Monte Carlo simulation 
The accuracy of dose calculation is crucial to the quality of treatment planning and, consequently, to the dose delivered to patients undergoing radiation therapy. Current general calculation algorithms such as Pencil Beam Convolution (PBC) and Collapsed Cone Convolution (CCC) have shortcomings in regard to severe inhomogeneities, particularly in those regions where charged particle equilibrium does not hold. The aim of this study was to evaluate the accuracy of the PBC and CCC algorithms in lung cancer radiotherapy using Monte Carlo (MC) technology.
Methods and materials
Four treatment plans were designed using Oncentra Masterplan TPS for each patient. Two intensity-modulated radiation therapy (IMRT) plans were developed using the PBC and CCC algorithms, and two three-dimensional conformal therapy (3DCRT) plans were developed using the PBC and CCC algorithms. The DICOM-RT files of the treatment plans were exported to the Monte Carlo system to recalculate. The dose distributions of GTV, PTV and ipsilateral lung calculated by the TPS and MC were compared.
For both 3DCRT and IMRT plans, the mean dose difference for the GTV between CCC and MC increased as the GTV volume decreased. The mean dose differences were higher for IMRT than for 3DCRT. The CCC algorithm overestimated the GTV mean dose by approximately 3% for IMRT. For 3DCRT plans, when the GTV volume was greater than 100 cm3, the mean doses calculated by CCC and MC showed almost no difference. PBC showed large deviations from MC. For the dose to the ipsilateral lung, the CCC algorithm overestimated the dose to the entire lung, and the PBC algorithm overestimated V20 but underestimated V5; the difference in V10 was not statistically significant.
PBC substantially overestimates the dose to the tumour, whereas CCC gives results similar to those of the MC simulation. It is recommended that treatment plans for lung cancer be developed using an advanced dose calculation algorithm other than PBC. MC can accurately calculate the dose distribution in lung cancer and provides an effective tool for benchmarking the performance of other dose calculation algorithms within patients.
PMCID: PMC4276018  PMID: 25511623
3-Dimensional conformal radiation therapy; Collapsed cone convolution; Pencil beam convolution; Lung cancer; Monte Carlo; Intensity-modulated radiation therapy
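The lung metrics compared in entry 1 (V5, V10, V20) are conventionally read as the percentage of an organ's volume receiving at least that many gray. The following is a minimal sketch of that calculation and of a relative mean-dose difference, assuming equal-volume voxels; the function names and the sample doses are hypothetical and are not taken from the study.

```python
def v_x(doses_gy, threshold_gy):
    """Percentage of voxels receiving at least threshold_gy (equal-volume voxels assumed)."""
    if not doses_gy:
        raise ValueError("dose list is empty")
    hit = sum(1 for d in doses_gy if d >= threshold_gy)
    return 100.0 * hit / len(doses_gy)


def mean_dose_difference_pct(tps_doses, mc_doses):
    """Mean-dose difference (TPS minus MC) as a percentage of the MC mean dose."""
    tps_mean = sum(tps_doses) / len(tps_doses)
    mc_mean = sum(mc_doses) / len(mc_doses)
    return 100.0 * (tps_mean - mc_mean) / mc_mean


# Hypothetical voxel doses in Gy, for illustration only.
lung_doses = [2.0, 4.0, 6.0, 12.0, 21.0, 25.0, 30.0, 1.0]
print("V20 =", v_x(lung_doses, 20.0), "%")
print("V5  =", v_x(lung_doses, 5.0), "%")
```

A positive value from `mean_dose_difference_pct` would correspond to the planning algorithm overestimating the dose relative to MC, which is the direction the abstract reports for CCC in IMRT plans.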
2.  The mitochondrial genome of the land snail Camaena cicatricosa (Müller, 1774) (Stylommatophora, Camaenidae): the first complete sequence in the family Camaenidae 
ZooKeys  2014;33-48.
The complete mitochondrial (mt) genome of the snail Camaena cicatricosa (Müller, 1774) has been sequenced and annotated in this study. The entire circular genome is 13,843 bp in size and represents the first camaenid mt genome, with a base composition of 31.9% A, 37.9% T, 13.5% C and 16.7% G. Gene content, codon usage and base organization are largely similar to those of the previously sequenced stylommatophoran mt genomes, whereas the gene order differs from them, especially in the positions of tRNACys, tRNAPhe, COII, tRNAAsp, tRNAGly, tRNAHis and tRNATrp. All protein coding genes use the standard initiation codons ATN, except for COII, which uses GTG as its start signal. Conventional stop codons TAA and TAG have been assigned to all protein coding genes. All tRNA genes possess the typical clover leaf structure, except that the TψC arm of tRNAAsp and the dihydrouridine arm of tRNASer(AGN) form only a simple loop. Shorter intergenic spacers have been found in this mt genome. A phylogenetic study based on protein coding genes shows a close relationship between Camaenidae and Bradybaenidae. The presented phylogeny is consistent with the monophyly of Stylommatophora.
PMCID: PMC4258619  PMID: 25493046
Camaena cicatricosa; Camaenidae; Stylommatophora; mitochondrial genome; secondary structure
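The base composition quoted in entry 2 can be checked with simple arithmetic, and the same percentages can be derived from any sequence string. This is a minimal sketch: the function name is hypothetical, the reported percentages are copied from the abstract, and the A+T total is derived here rather than stated in the source.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, C and G in a nucleotide sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ATCG"}


# Percentages as reported in the abstract.
reported = {"A": 31.9, "T": 37.9, "C": 13.5, "G": 16.7}
at_content = round(reported["A"] + reported["T"], 1)  # derived A+T content
total = round(sum(reported.values()), 1)              # should account for all bases
print("A+T content:", at_content, "% of", total, "%")
```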
3.  Craniocerebral injury promotes the repair of peripheral nerve injury 
Neural Regeneration Research  2014;9(18):1703-1708.
The increase in neurotrophic factors after craniocerebral injury has been shown to promote fracture healing. Moreover, neurotrophic factors play a key role in the regeneration and repair of peripheral nerve. However, whether craniocerebral injury alters the repair of peripheral nerve injuries remains poorly understood. Rat injury models were established by transecting the left sciatic nerve and using a free-fall device to induce craniocerebral injury. Compared with sciatic nerve injury alone after 6–12 weeks, rats with combined sciatic and craniocerebral injuries showed decreased sciatic functional index, increased recovery of gastrocnemius muscle wet weight, recovery of sciatic nerve ganglia and corresponding spinal cord segment neuron morphologies, and increased numbers of horseradish peroxidase-labeled cells. These results indicate that craniocerebral injury promotes the repair of peripheral nerve injury.
PMCID: PMC4211192  PMID: 25374593
nerve regeneration; craniocerebral injury; peripheral nerve; sciatic nerve; sciatic nerve injury; nerve repair; horseradish peroxidase tracer technique; neural regeneration
4.  Identification of Important Nodes in Directed Biological Networks: A Network Motif Approach 
PLoS ONE  2014;9(8):e106132.
Identification of important nodes in complex networks has attracted increasing attention over the last decade. Various measures have been proposed to characterize the importance of nodes in complex networks, such as the degree, betweenness and PageRank. Different measures consider different aspects of complex networks. Although numerous results have been reported on undirected complex networks, few have been reported on directed biological networks. Based on network motifs and principal component analysis (PCA), this paper introduces a new measure to characterize node importance in directed biological networks. Investigations on five real-world biological networks indicate that the proposed method can robustly identify genuinely important nodes in different networks, such as command interneurons, global regulators, and non-hub but evolutionarily conserved important nodes. Receiver Operating Characteristic (ROC) curves for the five networks indicate remarkable prediction accuracy of the proposed measure. The proposed index provides an alternative complex network metric. Potential implications of the related investigations include identifying network control and regulation targets, biological network modeling and analysis, as well as networked medicine.
PMCID: PMC4149525  PMID: 25170616
5.  Opposite Effects of Neuropeptide FF on Central Antinociception Induced by Endomorphin-1 and Endomorphin-2 in Mice 
PLoS ONE  2014;9(8):e103773.
Neuropeptide FF (NPFF) is known to be an endogenous opioid-modulating peptide. Nevertheless, very few studies have focused on the interaction between NPFF and endogenous opioid peptides. In the present study, we investigated the effects of the NPFF system on the supraspinal antinociceptive effects induced by the endogenous µ-opioid receptor agonists, endomorphin-1 (EM-1) and endomorphin-2 (EM-2). In the mouse tail-flick assay, intracerebroventricular injection of EM-1 induced antinociception via the µ-opioid receptor, while the antinociception of intracerebroventricularly injected EM-2 was mediated by both µ- and κ-opioid receptors. In addition, central administration of NPFF significantly reduced EM-1-induced central antinociception, but enhanced EM-2-induced central antinociception. The results obtained with selective NPFF1 and NPFF2 receptor agonists indicated that the EM-1-modulating action of NPFF was mainly mediated by the NPFF2 receptor, while NPFF potentiated EM-2-induced antinociception via both NPFF1 and NPFF2 receptors. To further investigate the roles of the µ- and κ-opioid systems in the opposite effects of NPFF on the central antinociception of endomorphins, the selective µ- and κ-opioid receptor agonists DAMGO and U69593, respectively, were used. Our results showed that NPFF could reduce the central antinociception of DAMGO via the NPFF2 receptor and enhance the central antinociception of U69593 via both NPFF1 and NPFF2 receptors. Taken together, our data demonstrate that NPFF exerts opposite effects on the central antinociception of endomorphins and provide the first evidence that NPFF potentiates the antinociception of EM-2, which might result from the interaction between the NPFF and κ-opioid systems.
PMCID: PMC4121275  PMID: 25090615
6.  Demonstrating the feasibility of large-scale development of standardized assays to quantify human proteins 
Nature methods  2013;11(2):149-155.
The successful application of MRM in biological specimens raises the exciting possibility that assays can be configured to measure all human proteins, resulting in an assay resource that would promote advances in biomedical research. We report the results of a pilot study designed to test the feasibility of a large-scale, international effort in MRM assay generation. We have configured, validated across three laboratories, and made publicly available as a resource to the community 645 novel MRM assays representing 319 proteins expressed in human breast cancer. Assays were multiplexed in groups of >150 peptides and deployed to quantify endogenous analyte in a panel of breast cancer-related cell lines. Median assay precision was 5.4%, with high inter-laboratory correlation (R2 >0.96). Peptide measurements in breast cancer cell lines were able to discriminate amongst molecular subtypes and identify genome-driven changes in the cancer proteome. These results establish the feasibility of a scaled, international effort.
PMCID: PMC3922286  PMID: 24317253
7.  Comparative Mitogenomics of Plant Bugs (Hemiptera: Miridae): Identifying the AGG Codon Reassignments between Serine and Lysine 
PLoS ONE  2014;9(7):e101375.
Insect mitochondrial genomes are important for understanding molecular evolution and for phylogenetic and phylogeographic studies of insects. The Miridae are the largest family of Heteroptera, encompassing more than 11,000 described species, and are of great economic importance. To better understand the diversity and evolution of plant bugs, we sequenced five new mitochondrial genomes and present the first comparative analysis of the nine mirid mitochondrial genomes available to date. Our results showed that gene content, gene arrangement, base composition and sequences of mitochondrial transcription termination factor were conserved in plant bugs. Species within a genus shared more conserved genomic characteristics, such as nucleotide and amino acid composition of protein-coding genes, secondary structure and anticodon mutations of tRNAs, and non-coding sequences. The control region possessed several distinct characteristics, including variable size, abundant tandem repetitions, and intra-genus conservation, and was useful in evolutionary and population genetic studies. The AGG codon reassignments between serine and lysine were investigated in the genus Adelphocoris and other cimicomorphans. Our analysis revealed correlated evolution between reassignments of the AGG codon and specific point mutations at the anticodons of tRNALys and tRNASer(AGN). Phylogenetic analysis indicated that mitochondrial genome sequences were useful in resolving family-level relationships of Cimicomorpha. Comparative evolutionary analysis of plant bug mitochondrial genomes allowed the identification of previously neglected coding genes or non-coding regions as potential molecular markers. The finding of the AGG codon reassignments between serine and lysine indicated the parallel evolution of the genetic code in Hemiptera mitochondrial genomes.
PMCID: PMC4079613  PMID: 24988409
8.  A penalized EM algorithm incorporating missing data mechanism for Gaussian parameter estimation 
Biometrics  2014;70(2):312-322.
Missing data rates could depend on the targeted values in many settings, including mass spectrometry-based proteomic profiling studies. Here we consider mean and covariance estimation under a multivariate Gaussian distribution with non-ignorable missingness, including scenarios in which the dimension (p) of the response vector is equal to or greater than the number (n) of independent observations. A parameter estimation procedure is developed by maximizing a class of penalized likelihood functions that entails explicit modeling of missing data probabilities. The performance of the resulting ‘penalized EM algorithm incorporating missing data mechanism (PEMM)’ estimation procedure is evaluated in simulation studies and in a proteomic data illustration.
PMCID: PMC4061266  PMID: 24471933
Expectation-maximization (EM) algorithm; maximum penalized likelihood estimate; not-missing-at-random (NMAR)
9.  Altered Contractile Phenotypes of Intestinal Smooth Muscle in Mice Deficient in Myosin Phosphatase Target Subunit 1 
Gastroenterology  2013;144(7):1456-1465.e5.
The regulatory subunit of myosin light chain phosphatase, MYPT1, has been proposed to control smooth muscle contractility by regulating phosphorylation of the Ca2+-dependent myosin regulatory light chain. We generated mice with a smooth muscle–specific deletion of MYPT1 to investigate its physiologic role in intestinal smooth muscle contraction.
We used the Cre-loxP system to establish Mypt1-floxed mice, with the promoter region and exon 1 of Mypt1 flanked by 2 loxP sites. These mice were crossed with SMA-Cre transgenic mice to generate mice with smooth muscle–specific deletion of MYPT1 (Mypt1SMKO mice). The phenotype was assessed by histologic, biochemical, molecular, and physiologic analyses.
Young adult Mypt1SMKO mice had normal intestinal motility in vivo, with no histologic abnormalities. On stimulation with KCl or acetylcholine, intestinal smooth muscles isolated from Mypt1SMKO mice produced robust and increased sustained force due to increased phosphorylation of the myosin regulatory light chain compared with muscle from control mice. Additional analyses of contractile properties showed reduced rates of force development and relaxation, and decreased shortening velocity, compared with muscle from control mice. Permeable smooth muscle fibers from Mypt1SMKO mice had increased sensitivity and contraction in response to Ca2+.
MYPT1 is not essential for smooth muscle function in mice but regulates the Ca2+ sensitivity of force development and contributes to intestinal phasic contractile phenotype. Altered contractile responses in isolated tissues could be compensated by adaptive physiologic responses in vivo, where gut motility is affected by lower intensities of smooth muscle stimulation for myosin phosphorylation and force development.
PMCID: PMC3782749  PMID: 23499953
Mouse Model; Development; Calcium Signaling; Phosphorylation
10.  A Randomized, Prospective Pilot Study of Patient Expectancy and Antidepressant Outcome 
Psychological medicine  2012;43(5):975-982.
This study is a randomized, prospective, investigation of the relationships between clinical trial design, patient expectancy, and the outcome of treatment with antidepressant medication.
Adult outpatients with Major Depressive Disorder (MDD) were randomized to either Placebo-Controlled (PC, 50% probability of receiving active medication) or Comparator (COMP, 100% probability of receiving active medication) administration of antidepressant medication. Independent samples t tests and analysis of covariance (ANCOVA) were employed to determine whether the probability of receiving active medication influenced patient expectancy and to compare medication response in the PC vs. COMP conditions. We also tested the correlations between baseline expectancy score and final improvement in depressive symptoms across study groups.
Subjects randomized to the COMP condition reported greater expectancy of improvement compared to subjects in the PC condition (t = 2.60, df = 27, p = 0.015). There were no statistically significant differences in the analyses comparing antidepressant outcomes between subjects receiving medication in the COMP condition and those receiving medication in the PC condition. Higher baseline expectancy of improvement was correlated with lower final depression severity scores (r = 0.53, p = 0.021) and greater improvement in depressive symptoms over the course of the study (r = 0.44, p = 0.058).
The methods described represent a promising way of subjecting patient expectancy to scientific study. Expectancy of improvement is affected by the probability of receiving active antidepressant medication and appears to influence antidepressant response.
PMCID: PMC3594112  PMID: 22971472
antidepressant; placebo effect; expectancy; clinical trial; depression
11.  Laminoplasty and Laminectomy Hybrid Decompression for the Treatment of Cervical Spondylotic Myelopathy with Hypertrophic Ligamentum Flavum: A Retrospective Study 
PLoS ONE  2014;9(4):e95482.
To report the outcomes of a posterior hybrid decompression protocol for the treatment of cervical spondylotic myelopathy (CSM) associated with hypertrophic ligamentum flavum (HLF).
Laminoplasty is widely used in patients with CSM; however, for CSM patients with HLF, traditional laminoplasty does not include resection of a pathological ligamentum flavum.
This study retrospectively reviewed 116 CSM patients with HLF who underwent hybrid decompression with a minimum of 12 months of follow-up. The procedure consisted of reconstruction of the C4 and C6 laminae using CENTERPIECE plates with spinous process autografts, and resection of the C3, C5, and C7 laminae. Surgical outcomes were assessed using Japanese Orthopedic Association (JOA) score, recovery rate, cervical lordotic angle, cervical range of motion, spinal canal sagittal diameter, bone healing rates on both the hinge and open sides, dural sac expansion at the level of maximum compression, drift-back distance of the spinal cord, and postoperative neck pain assessed by visual analog scale.
No hardware failure or restenosis was noted. Postoperative JOA score improved significantly, with a mean recovery rate of 65.3±15.5%. Mean cervical lordotic angle had decreased 4.9 degrees by 1 year after surgery (P<0.05). Preservation of cervical range of motion was satisfactory postoperatively. Bone healing rates 6 months after surgery were 100% on the hinge side and 92.2% on the open side. Satisfactory decompression was demonstrated by a significantly increased sagittal canal diameter and cross-sectional area of the dural sac together with a significant drift-back distance of the spinal cord. The dural sac was also adequately expanded at the time of the final follow-up visit.
Hybrid laminectomy and autograft laminoplasty decompression using Centerpiece plates may facilitate bone healing and produce a comparatively satisfactory prognosis for CSM patients with HLF.
PMCID: PMC3989326  PMID: 24740151
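Entry 11 reports a mean JOA recovery rate without stating how it was computed. A commonly used definition is the Hirabayashi method, which expresses the gain in JOA score as a fraction of the maximum possible gain on the 17-point scale. The sketch below assumes that definition; the function name and the example scores are hypothetical.

```python
def joa_recovery_rate(pre_score, post_score, full_score=17.0):
    """Recovery rate (%) as (post - pre) / (full - pre) * 100.

    Assumes the Hirabayashi method and a 17-point JOA scale; the abstract
    does not say which formula the authors used.
    """
    if pre_score >= full_score:
        raise ValueError("preoperative score must be below the full score")
    return 100.0 * (post_score - pre_score) / (full_score - pre_score)


# Hypothetical patient: preoperative 9 points, postoperative 13 points.
print(joa_recovery_rate(9, 13))
```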
12.  Correlation between serum IGF-1 and blood lead level in short stature children and adolescent with growth hormone deficiency 
This study aimed to investigate the correlation between serum insulin-like growth factor-1 (IGF-1) and blood lead level in short stature children with growth hormone deficiency (GHD), and to investigate IGF-1 signaling molecules in lead-exposed rats. Our findings may provide evidence for clarifying the pathogenesis of lead-induced short stature in children. Methods: 880 short stature children were recruited from clinics and divided into a GHD group and an idiopathic short stature (ISS) group according to the GH peak in the growth hormone stimulation test. The height, body weight, serum IGF-1 level and blood lead level were determined. A rat model of lead poisoning was established, and western blot assay was employed to detect the phosphorylation of signaling molecules (MAPK and PI3K/Akt) related to the IGF-1 signaling pathway. Results: In the GHD group, the height, body weight and serum IGF-1 level were significantly lower, but the blood lead level was significantly higher, than in the ISS group (P<0.05). Western blot assay confirmed that the protein expression of phosphorylated ERK1/2, JNK, p38, Akt473 and Akt308 increased significantly (P<0.01) in lead-exposed rats. Conclusion: Our study suggests that the reduction in IGF-1 in children with GHD is associated with blood lead level. Lead exposure may induce expression of phosphorylated MAPK and Akt signaling molecules. The activation of these molecules may influence the binding of IGF-1 to its tyrosine kinase receptor IGFIR, which regulates cell growth via the MAPK and Akt signaling pathways, and may thereby interfere with the growth-promoting effect of IGF-1 in short children.
PMCID: PMC4057833  PMID: 24955154
Growth hormone deficiency; insulin-like growth factor I; lead exposure animal model; short stature; signaling pathway
13.  Unusual Compression Behavior of Nanocrystalline CeO2 
Scientific Reports  2014;4:4441.
An x-ray diffraction study of 12 nm CeO2 was carried out up to ~40 GPa using angle-dispersive synchrotron radiation in a diamond-anvil cell with different pressure-transmitting media (PTM) (4:1 methanol:ethanol mixture, silicone oil and none) at room temperature. While the cubic fluorite-type structure of CeO2 was retained to the highest pressure, there was progressive broadening and intensity reduction of the reflections with increasing pressure. At pressures above 12 GPa, an unusual change in the compression curve was detected in all experiments. Significantly, apparent negative volume compressibility was observed at P = 18–27 GPa with silicone oil as PTM; however, it was not detected under the other conditions. The expansion of the unit cell volume of cubic CeO2 was about 1% at pressures of 15–27 GPa. To explain this abnormal phenomenon, a dual structure model (hard amorphous shell and relatively soft crystalline core) has been proposed.
PMCID: PMC3963033  PMID: 24658049
14.  A 13-gene signature prognostic of HPV-negative OSCC: discovery and external validation 
To identify a prognostic gene signature for HPV-negative OSCC patients.
Experimental Design
Two gene expression datasets were used; a training dataset from the Fred Hutchinson Cancer Research Center (FHCRC) (n=97), and a validation dataset from the MD Anderson Cancer Center (MDACC) (n=71). We applied L1/L2-penalized Cox regression models to the FHCRC data on the 131–gene signature previously identified to be prognostic in OSCC patients to identify a prognostic model specific for high-risk HPV-negative OSCC patients. The models were tested with the MDACC dataset using a receiver operating characteristic analysis.
A 13-gene model was identified as the best predictor of HPV-negative OSCC-specific survival in the training dataset. The risk score for each patient in the validation dataset was calculated from this model and dichotomized at the median. The estimated 2-year mortality (± SE) of patients with high risk scores was 47.1 (±9.24)% compared with 6.35 (± 4.42)% for patients with low risk scores. ROC analyses showed that the areas under the curve for the age, gender, and treatment modality-adjusted models with risk score (0.78, 95%CI: 0.74-0.86) and risk score plus tumor stage (0.79, 95%CI: 0.75-0.87) were substantially higher than for the model with tumor stage (0.54, 95%CI: 0.48-0.62).
We identified and validated a 13-gene signature that is considerably better than tumor stage in predicting survival of HPV-negative OSCC patients. Further evaluation of this gene signature as a prognostic marker in other populations of patients with HPV-negative OSCC is warranted.
PMCID: PMC3593802  PMID: 23319825
gene signature; prognosis; HPV-negative; OSCC
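The validation step in entry 14 computes a risk score per patient from the fitted model and splits patients at the median score. The sketch below illustrates that step only, assuming the score is the usual Cox linear predictor (a coefficient-weighted sum of expression values). The gene names, coefficients, and the handling of ties at the median are hypothetical and are not taken from the paper.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def dichotomize_at_median(scores):
    """Label each score 'high' if above the median, else 'low' (ties go to 'low')."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical two-gene model and three patients, for illustration only.
coefs = {"GENE_A": 0.5, "GENE_B": -1.0}
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 4.0, "GENE_B": 0.5},
    {"GENE_A": 1.0, "GENE_B": 2.0},
]
scores = [risk_score(p, coefs) for p in patients]
print(list(zip(scores, dichotomize_at_median(scores))))
```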
The annals of applied statistics  2013;7(1):391-417.
Gaussian Graphical Models (GGMs) have been used to construct genetic regulatory networks where regularization techniques are widely used since the network inference usually falls into a high–dimension–low–sample–size scenario. Yet, finding the right amount of regularization can be challenging, especially in an unsupervised setting where traditional methods such as BIC or cross-validation often do not work well. In this paper, we propose a new method — Bootstrap Inference for Network COnstruction (BINCO) — to infer networks by directly controlling the false discovery rates (FDRs) of the selected edges. This method fits a mixture model for the distribution of edge selection frequencies to estimate the FDRs, where the selection frequencies are calculated via model aggregation. This method is applicable to a wide range of applications beyond network construction. When we applied our proposed method to building a gene regulatory network with microarray expression breast cancer data, we were able to identify high-confidence edges and well-connected hub genes that could potentially play important roles in understanding the underlying biological processes of breast cancer.
PMCID: PMC3930359  PMID: 24563684
high dimensional data; GGM; model aggregation; mixture model; FDR
Journal of statistical research  2010;44(1):103-107.
In a recent paper [4], Efron pointed out that an important issue in large-scale multiple hypothesis testing is that the null distribution may be unknown and may need to be estimated. Consider a Gaussian mixture model, where the null distribution is known to be normal but both null parameters (the mean and the variance) are unknown. We address the problem with a method based on Fourier transformation. The Fourier approach was first studied by Jin and Cai [9], who focus on the scenario where any non-null effect has either the same or a larger variance than that of the null effects. In this paper, we review the main ideas in [9], and propose a generalized Fourier approach to tackle the problem under another scenario: any non-null effect has a larger mean than that of the null effects, but no constraint is imposed on the variance. This approach and that in [9] complement each other: each approach is successful in a wide class of situations where the other fails. Also, we extend the Fourier approach to estimate the proportion of non-null effects. The proposed procedures perform well both in theory and on simulated data.
PMCID: PMC3928715  PMID: 24563569
empirical null; Fourier transformation; generalized Fourier transformation; proportion of non-null effects; sample size calculation
17.  Nephroprotective effect of astaxanthin against trivalent inorganic arsenic-induced renal injury in wistar rats 
Inorganic arsenic (iAs) is a toxic metalloid found ubiquitously in the environment. In humans, exposure to iAs can result in toxicity and cause toxicological manifestations. Arsenic trioxide (As2O3) has been used in the treatment of acute promyelocytic leukemia. The kidney is the critical target organ of trivalent inorganic As (iAsIII) toxicity. We examined whether oral administration of astaxanthin (AST) has protective effects on nephrotoxicity and oxidative stress induced by As2O3 exposure (via intraperitoneal injection) in rats. Markers of renal function, histopathological changes, Na+-K+ ATPase, sulfhydryl, oxidative stress, and As accumulation in kidneys were evaluated as indicators of As2O3 exposure. AST showed a significant protective effect against As2O3-induced nephrotoxicity. These results suggest that the mechanisms by which AST reduces nephrotoxicity may include antioxidant protection against oxidative injury and reduction of As accumulation. These findings might be of therapeutic benefit in humans or animals suffering from exposure to iAsIII from natural sources or cancer therapy.
PMCID: PMC3944156  PMID: 24611105
Astaxanthin; trivalent inorganic arsenic; arsenic accumulation; nephrotoxicity; oxidative stress
18.  Regularized Multivariate Regression for Identifying Master Predictors with Application to Integrative Genomics Study of Breast Cancer 
In this paper, we propose a new method remMap — REgularized Multivariate regression for identifying MAster Predictors — for fitting multivariate response regression models under the high-dimension-low-sample-size setting. remMap is motivated by investigating the regulatory relationships among different biological molecules based on multiple types of high dimensional genomic data. Particularly, we are interested in studying the influence of DNA copy number alterations on RNA transcript levels. For this purpose, we model the dependence of the RNA expression levels on DNA copy numbers through multivariate linear regressions and utilize proper regularization to deal with the high dimensionality as well as to incorporate desired network structures. Criteria for selecting the tuning parameters are also discussed. The performance of the proposed method is illustrated through extensive simulation studies. Finally, remMap is applied to a breast cancer study, in which genome wide RNA transcript levels and DNA copy numbers were measured for 172 tumor samples. We identify a trans-hub region in cytoband 17q12–q21, whose amplification influences the RNA expression levels of more than 30 unlinked genes. These findings may lead to a better understanding of breast cancer pathology.
PMCID: PMC3905690  PMID: 24489618
sparse regression; MAP(MAster Predictor) penalty; DNA copy number alteration; RNA transcript level; v-fold cross validation
19.  Revision of three camaenid and one bradybaenid species (Gastropoda, Stylommatophora) from China based on morphological and molecular data, with description of a new bradybaenid subspecies from Inner Mongolia, China 
ZooKeys  2014;1-16.
We have revised the taxonomy of three camaenid and one bradybaenid species from China and described one new subspecies of the genus Bradybaena (Family Bradybaenidae) from Inner Mongolia, China. The genitalia of three Satsuma (Family Camaenidae) species, S. mellea stenozona (Moellendorff, 1884), S. meridionalis (Moellendorff, 1884), comb. n. and S. uncopila (Heude, 1882), comb. n., previously assigned to the genus Bradybaena, lack a dart sac and mucous glands. Moreover, the molecular phylogeny has revealed close relationships between the three species and the genus Satsuma. Two species, S. stenozona (Moellendorff, 1884) from Fuzhou and Ganesella citrina Zilch, 1940 from Wuyi Mountain, are considered synonymous and should be treated as a subspecies of S. mellea mellea (Pfeiffer, 1866) because of their morphological and molecular similarities. Meanwhile, the other two are placed in the genus Satsuma: S. meridionalis (Moellendorff, 1884), comb. n. and S. uncopila (Heude, 1882), comb. n. G. virgo Pilsbry, 1927 differs from species of the genera Ganesella and Satsuma not only in its shell, but also in anatomical characters, such as having a dart sac and mucous gland, and lacking a flagellum. Additionally, phylogenetic analyses strongly support its sister relationship with other Bradybaena species. Thus, placement of G. virgo Pilsbry, 1927 in the genus Bradybaena is suggested.
PMCID: PMC3909801  PMID: 24493955
Satsuma; Ganesella; Bradybaena; revision; new subspecies
20.  Alkaloids from the Mangrove-Derived Actinomycete Jishengella endophytica 161111 
Marine Drugs  2014;12(1):477-490.
A new alkaloid, 2-(furan-2-yl)-6-(2S,3S,4-trihydroxybutyl)pyrazine (1), along with 12 known compounds, 2-(furan-2-yl)-5-(2S,3S,4-trihydroxybutyl)pyrazine (2), (S)-4-isobutyl-3-oxo-3,4-dihydro-1H-pyrrolo[2,1-c][1,4]oxazine-6-carbaldehyde (3), (S)-4-isopropyl-3-oxo-3,4-dihydro-1H-pyrrolo[2,1-c][1,4]oxazine-6-carbaldehyde (4), (4S)-4-(2-methylbutyl)-3-oxo-3,4-dihydro-1H-pyrrolo[2,1-c][1,4]oxazine-6-carbaldehyde (5), (S)-4-benzyl-3-oxo-3,4-dihydro-1H-pyrrolo[2,1-c][1,4]oxazine-6-carbaldehyde (6), flazin (7), perlolyrine (8), 1-hydroxy-β-carboline (9), lumichrome (10), 1H-indole-3-carboxaldehyde (11), 2-hydroxy-1-(1H-indol-3-yl)ethanone (12), and 5-(methoxymethyl)-1H-pyrrole-2-carbaldehyde (13), were isolated and identified from the fermentation broth of an endophytic actinomycete, Jishengella endophytica 161111. The new structure 1 and the absolute configurations of 2–6 were determined by spectroscopic methods, the J-based configuration analysis (JBCA) method, the lactone sector rule, and electronic circular dichroism (ECD) calculations. Compounds 8–11 were active against the influenza A virus subtype H1N1 with IC50 and selectivity index (SI) values of 38.3(±1.2)/25.0(±3.6)/39.7(±5.6)/45.9(±2.1) μg/mL and 3.0/16.1/3.1/11.4, respectively. The IC50 and SI values of the positive control, ribavirin, were 23.1(±1.7) μg/mL and 32.2, respectively. The results showed that compound 9 could be a promising new hit for anti-H1N1 drugs. The absolute configurations of 2–5, the 13C nuclear magnetic resonance (NMR) data, and the specific rotations of 3–6 are also reported here for the first time.
PMCID: PMC3917282  PMID: 24451190
mangrove; actinomycete; Jishengella endophytica 161111; pyrazine derivative; anti-H1N1 virus activity
21.  Network Based Prediction Model for Genomics Data Analysis* 
Statistics in biosciences  2012;4(1):10.1007/s12561-012-9056-7.
Biological networks, such as genetic regulatory networks and protein interaction networks, provide important information for studying gene/protein activities. In this paper, we propose a new method, NetBoosting, for incorporating a priori biological network information in analyzing high-dimensional genomics data. Specifically, we are interested in constructing prediction models for disease phenotypes of interest based on genomics data, and at the same time identifying disease susceptible genes. We employ the gradient descent boosting procedure to build an additive tree model and propose a new algorithm to utilize the network structure in fitting small tree weak learners. We illustrate by simulation studies and a real data example that, by making use of the network information, NetBoosting outperforms a few existing methods in terms of accuracy of prediction and variable selection.
PMCID: PMC3859188  PMID: 24348880
22.  Transgenerational Variations in DNA Methylation Induced by Drought Stress in Two Rice Varieties with Distinguished Difference to Drought Resistance 
PLoS ONE  2013;8(11):e80253.
Adverse environmental conditions have large impacts on plant growth and crop production. One of the crucial mechanisms that plants use in variable and stressful natural environments is gene expression modulation through epigenetic modification. In this study, two rice varieties with different drought resistance levels were cultivated under drought stress from the tillering stage to the seed filling stage for six successive generations. The variations in DNA methylation of the original generation (G0) and the sixth generation (G6) of these two varieties under normal conditions (CK) and under drought stress (DT) at the seedling stage were assessed using the Methylation Sensitive Amplification Polymorphism (MSAP) method. The results revealed that drought stress had a cumulative effect on the DNA methylation pattern of both varieties, but the two varieties responded differently to drought stress in terms of DNA methylation. The DNA methylation levels of II-32B (sensitive) and Huhan-3 (resistant) were around 39% and 32%, respectively. Genome-wide DNA methylation variations among generations or treatments accounted for around 13.1% of total MSAP loci in II-32B, but for only approximately 1.3% in Huhan-3. In II-32B, 27.6% of total differentially methylated loci (DML) were directly induced by drought stress and 3.2% of total DML stably transmitted their changed DNA methylation status to the next generation. In Huhan-3, the numbers were 48.8% and 29.8%, respectively. Therefore, entrainment had a greater effect on Huhan-3 than on II-32B. Sequence analysis revealed that the DML were widely distributed on all 12 rice chromosomes and that they mainly occurred in gene promoter and exon regions. Some genes with DML respond to environmental stresses. The inheritance of epigenetic variations induced by drought stress may provide a new way to develop drought-resistant rice varieties.
PMCID: PMC3823650  PMID: 24244664
23.  Analyzing LC-MS/MS data by spectral count and ion abundance: two case studies 
Statistics and its interface  2012;5(1):75-87.
In comparative proteomics studies, LC-MS/MS data is generally quantified using one or both of two measures: the spectral count, derived from the identification of MS/MS spectra, or some measure of ion abundance derived from the LC-MS data. Here we contrast the performance of these measures and show that ion abundance is the more sensitive. We also examine how the conclusions of a comparative analysis are influenced by the manner in which the LC-MS/MS data is ‘rolled up’ to the protein level, and show that divergent conclusions obtained using different rollups can be informative. Our analysis is based on two publicly available reference data sets, BIATECH-54 and CPTAC, which were developed for the purpose of assessing methods used in label-free differential proteomic studies. We find that the use of the ion abundance measure reveals properties of both data sets not readily apparent using the spectral count.
PMCID: PMC3806317  PMID: 24163717
mass spectrometry; comparative proteomics; ion abundance; spectral count; ion competition
24.  Estimating the Causal Effect of Randomization Versus Treatment Preference in a Doubly Randomized Preference Trial 
Psychological methods  2012;17(2):244-254.
Although randomized studies have high internal validity, generalizability of the estimated causal effect from randomized clinical trials to real-world clinical or educational practice may be limited. We consider the implication of randomized assignment to treatment, as compared with choice of preferred treatment as it occurs in real-world conditions. Compliance, engagement, or motivation may be better with a preferred treatment, and this can complicate the generalizability of results from randomized trials. The doubly randomized preference trial (DRPT) is a hybrid randomized and nonrandomized design that allows for estimation of the causal effect of randomization versus treatment preference. In the DRPT, individuals are first randomized to either randomized assignment or choice assignment. Those in the randomized assignment group are then randomized to treatment or control, and those in the choice group receive their preference of treatment versus control. Using the potential outcomes framework, we apply the algebra of conditional independence to show how the DRPT can be used to derive an unbiased estimate of the causal effect of randomization versus preference for each of the treatment and comparison conditions. Also, we show how these results can be implemented using full matching on the propensity score. The methodology is illustrated with a DRPT of introductory psychology students who were randomized to randomized assignment or preference of mathematics versus vocabulary training. We found a small to moderate benefit of preference versus randomization with respect to the mathematics outcome for those who received mathematics training.
PMCID: PMC3772621  PMID: 22563844
generalizability; causal inference; conditional independence; propensity score matching; treatment preference
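The two-stage assignment described in entry 24 can be sketched in a few lines. This is a minimal illustration of the allocation step only, not the authors' estimation procedure (which relies on full matching on the propensity score). The function name `assign_drpt`, the equal allocation probabilities, and the condition labels are assumptions made for illustration.

```python
import random

def assign_drpt(preferences, seed=0):
    """Two-stage DRPT allocation (hypothetical helper).

    preferences: each participant's stated preference,
                 "treatment" or "control".
    Returns a list of (arm, condition) tuples.
    Assumption: both randomization steps use probability 1/2.
    """
    rng = random.Random(seed)
    assignments = []
    for preferred in preferences:
        # Stage 1: randomize to the randomized arm or the choice arm.
        arm = rng.choice(["randomized", "choice"])
        if arm == "randomized":
            # Stage 2a: condition is assigned at random.
            condition = rng.choice(["treatment", "control"])
        else:
            # Stage 2b: participant receives the preferred condition.
            condition = preferred
        assignments.append((arm, condition))
    return assignments

prefs = ["treatment", "control"] * 50
result = assign_drpt(prefs, seed=1)
```

In the choice arm the received condition always equals the stated preference, whereas in the randomized arm it is independent of preference; comparing outcomes across the two arms within a given condition is what the design makes possible.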
25.  A regularized Hotelling’s T2 test for pathway analysis in proteomic studies 
Recent proteomic studies have identified proteins related to specific phenotypes. In addition to marginal association analysis for individual proteins, analyzing pathways (functionally related sets of proteins) may yield additional valuable insights. Identifying pathways that differ between phenotypes can be conceptualized as a multivariate hypothesis testing problem: whether the mean vector μ of a p-dimensional random vector X is μ0. Proteins within the same biological pathway may correlate with one another in a complicated way, and type I error rates can be inflated if such correlations are incorrectly assumed to be absent. The inflation tends to be more pronounced when the sample size is very small or there is a large amount of missingness in the data, as is frequently the case in proteomic discovery studies. To tackle these challenges, we propose a regularized Hotelling’s T2 (RHT) statistic together with a non-parametric testing procedure, which effectively controls the type I error rate and maintains good power in the presence of complex correlation structures and missing data patterns. We investigate asymptotic properties of the RHT statistic under pertinent assumptions and compare the test performance with four existing methods through simulation examples. We apply the RHT test to a hormone therapy proteomics data set, and identify several interesting biological pathways for which blood serum concentrations changed following hormone therapy initiation.
PMCID: PMC3755504  PMID: 23997374
proteomics; pathway analysis; regularization; Hotelling’s T2
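To make the idea in entry 25 concrete, the sketch below computes a ridge-type regularized Hotelling's T2 statistic for the one-sample hypothesis that the mean vector equals μ0, together with a simple sign-flip resampling p-value. The abstract does not give the exact form of the RHT statistic or of the non-parametric procedure, so the ridge form (adding `lam` times the identity to the sample covariance), the sign-flip scheme (which assumes observations are symmetric about μ0 under the null), and the names `regularized_t2` and `sign_flip_pvalue` are all assumptions made for illustration.

```python
import numpy as np

def regularized_t2(X, mu0, lam):
    """Ridge-regularized one-sample Hotelling-type statistic (assumed form):
    n * (xbar - mu0)^T (S + lam*I)^{-1} (xbar - mu0)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    diff = X.mean(axis=0) - mu0
    # Sample covariance; singular when p >= n, hence the ridge term.
    S = np.atleast_2d(np.cov(X, rowvar=False))
    return float(n * diff @ np.linalg.solve(S + lam * np.eye(p), diff))

def sign_flip_pvalue(X, mu0, lam, n_perm=999, seed=0):
    """Resampling p-value obtained by randomly flipping the sign of each
    centered observation (assumes symmetry about mu0 under the null)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    centered = X - mu0
    observed = regularized_t2(X, mu0, lam)
    exceed = 0
    for _ in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(X.shape[0], 1))
        if regularized_t2(centered * signs + mu0, mu0, lam) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Example with more variables than samples (p > n), where the
# unregularized statistic would be undefined.
rng = np.random.default_rng(42)
X = rng.normal(size=(8, 20))
stat = regularized_t2(X, np.zeros(20), lam=1.0)
pval = sign_flip_pvalue(X, np.zeros(20), lam=1.0, n_perm=199)
```

The ridge term keeps the covariance estimate invertible when the number of proteins in a pathway exceeds the sample size, which is the small-sample setting the abstract highlights. How `lam` should be chosen is not addressed here.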

