1.  Towards microbiome transplant as a therapy for periodontitis: an exploratory study of periodontitis microbial signature contrasted by oral health, caries and edentulism 
BMC Oral Health  2015;15:125.
Conventional periodontal therapy aims at controlling supra- and subgingival biofilms. Although periodontal therapy has been shown to improve periodontal health, it does not completely arrest the disease. Almost all subjects compliant with periodontal maintenance continue to experience progressive clinical attachment loss, and a fraction of them lose teeth. An oral microbial transplant, inspired by fecal transplant, may be a new alternative for treating periodontitis. First, it must be established that the microbiomes of oral health and periodontitis are distinct. In that case, the health-associated microbiome could be introduced into the oral cavity of periodontitis patients. This relates to the goals of our study: (i) to assess whether microbial communities of the entire oral cavity of subjects with periodontitis differ from those of oral health, contrasted with the microbiotas of caries and edentulism patients; (ii) to test in vitro whether a safe concentration of sodium hypochlorite could be used for initial eradication of the original oral microbiota, followed by safe neutralization of the hypochlorite prior to transplantation.
Sixteen systemically healthy white adults with clinical signs of one of the following oral conditions were enrolled: periodontitis, established caries, edentulism, or oral health. Oral biofilm samples were collected from sub- and supragingival sites and from oral mucosae. DNA was extracted and 16S rRNA genes were amplified. Amplicons from the same patient were pooled, sequenced and quantified. A volunteer’s oral plaque was treated with saline, 16 mM NaOCl, or NaOCl neutralized by ascorbate buffer, followed by plating on blood agar.
Ordination plots of rRNA gene abundances revealed distinct groupings for the oral microbiomes of subjects with periodontitis, edentulism, or oral health. The oral microbiome in subjects with periodontitis showed the greatest diversity, harboring 29 bacterial species at significantly higher abundance compared to subjects with the other assessed conditions. In healthy subjects, 10 microbial species were significantly more abundant than in the other conditions. NaOCl showed strong antimicrobial properties; nontoxic ascorbate was capable of neutralizing the hypochlorite.
Distinct oral microbial signatures were found in subjects with periodontitis, edentulism, or oral health. This finding opens up a potential for a new therapy, whereby a health-related entire oral microbial community would be transplanted to the diseased patient.
PMCID: PMC4607249  PMID: 26468081
Bacteriotherapy; Microbial transplant; Caries; Edentulism; Periodontitis; Red complex
2.  SELDI-TOF MS Whole Serum Proteomic Profiling with IMAC Surface Does Not Reliably Detect Prostate Cancer 
Clinical chemistry  2007;54(1):53-60.
The analysis of bodily fluids using SELDI-TOF MS has been reported to identify signatures of spectral peaks that can be used to differentiate patients with a specific disease from normal or control patients. This report is the 2nd of 2 companion articles describing a validation study of a SELDI-TOF MS approach with IMAC surface sample processing to identify prostatic adenocarcinoma.
We sought to derive a decision algorithm for classification of prostate cancer from SELDI-TOF MS spectral data from a new retrospective sample cohort of 400 specimens. This new cohort was selected to minimize possible confounders identified in the previous study described in the companion paper.
The resulting new classifier failed to separate patients with prostate cancer from biopsy-negative controls; nor did it separate patients with prostate cancer with Gleason scores <7 from those with Gleason scores ≥7.
In this, the 2nd stage of our planned validation process, the SELDI-TOF MS-based protein expression profiling approach did not perform well enough to advance to the 3rd (prospective study) stage. We conclude that the results from our previous studies, in which differentiation between prostate cancer and noncancer was demonstrated, are not generalizable. Earlier study samples likely had biases in sample selection; once these were removed, as in the present study, the technique was unable to discriminate cancer from noncancer cases.
PMCID: PMC4332515  PMID: 18024530
3.  Analyzing LC-MS/MS data by spectral count and ion abundance: two case studies 
Statistics and its interface  2012;5(1):75-87.
In comparative proteomics studies, LC-MS/MS data is generally quantified using one or both of two measures: the spectral count, derived from the identification of MS/MS spectra, or some measure of ion abundance derived from the LC-MS data. Here we contrast the performance of these measures and show that ion abundance is the more sensitive. We also examine how the conclusions of a comparative analysis are influenced by the manner in which the LC-MS/MS data is ‘rolled up’ to the protein level, and show that divergent conclusions obtained using different rollups can be informative. Our analysis is based on two publicly available reference data sets, BIATECH-54 and CPTAC, which were developed for the purpose of assessing methods used in label-free differential proteomic studies. We find that the use of the ion abundance measure reveals properties of both data sets not readily apparent using the spectral count.
PMCID: PMC3806317  PMID: 24163717
mass spectrometry; comparative proteomics; ion abundance; spectral count; ion competition
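The protein-level "rollup" mentioned in entry 3 can be illustrated with a minimal sketch. The records, field order, and the choice of summation are assumptions made for illustration only; the abstract does not say which rollup rules the authors compared.

```python
from collections import defaultdict

# Hypothetical peptide-level records (illustrative values, not from the paper):
# (protein, peptide, spectral_count, ion_abundance)
records = [
    ("P1", "AAK", 3, 1.2e6),
    ("P1", "LLR", 1, 4.0e5),
    ("P2", "GGK", 2, 9.0e5),
]


def rollup(records):
    """Sum spectral counts and ion abundances per protein.

    Summation is only one possible rollup; a median or a top-N mean
    could be substituted and may lead to different conclusions.
    """
    totals = defaultdict(lambda: {"spectral_count": 0, "ion_abundance": 0.0})
    for protein, _peptide, count, abundance in records:
        totals[protein]["spectral_count"] += count
        totals[protein]["ion_abundance"] += abundance
    return dict(totals)


protein_level = rollup(records)
```

Keeping both measures side by side at the protein level makes it straightforward to see where the two quantifications disagree.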
4.  The clathrin adaptor Dab2 recruits EH domain scaffold proteins to regulate integrin β1 endocytosis 
Molecular Biology of the Cell  2012;23(15):2905-2916.
Dab2 binds EH domain proteins. This interaction is required for integrin β1 but not TfnR endocytosis. β1 and TfnR do not colocalize, even though their adaptors sort to the same pits. The data suggest that Dab2 selectively drives β1 endocytosis. It is proposed that specific cargo–adaptor–EH domain protein complexes are needed for efficient endocytosis.
Endocytic adaptor proteins facilitate cargo recruitment and clathrin-coated pit nucleation. The prototypical clathrin adaptor AP2 mediates cargo recruitment, maturation, and scission of the pit by binding cargo, clathrin, and accessory proteins, including the Eps-homology (EH) domain proteins Eps15 and intersectin. However, clathrin-mediated endocytosis of some cargoes proceeds efficiently in AP2-depleted cells. We found that Dab2, another endocytic adaptor, also binds to Eps15 and intersectin. Depletion of EH domain proteins altered the number and size of clathrin structures and impaired the endocytosis of the Dab2- and AP2-dependent cargoes, integrin β1 and transferrin receptor, respectively. To test the importance of Dab2 binding to EH domain proteins for endocytosis, we mutated the EH domain–binding sites. This mutant localized to clathrin structures with integrin β1, AP2, and reduced amounts of Eps15. Of interest, although integrin β1 endocytosis was impaired, transferrin receptor internalization was unaffected. Surprisingly, whereas clathrin structures contain both Dab2 and AP2, integrin β1 and transferrin localize in separate pits. These data suggest that Dab2-mediated recruitment of EH domain proteins selectively drives the internalization of the Dab2 cargo, integrin β1. We propose that adaptors may need to be bound to their cargo to regulate EH domain proteins and internalize efficiently.
PMCID: PMC3408417  PMID: 22648170
5.  Statistical Methods for Tissue Array Images – Algorithmic Scoring and Co-training 
The annals of applied statistics  2012;6(3):1280-1305.
Recent advances in tissue microarray technology have allowed immunohistochemistry to become a powerful medium-to-high throughput analysis tool, particularly for the validation of diagnostic and prognostic biomarkers. However, as study size grows, the manual evaluation of these assays becomes a prohibitive limitation; it vastly reduces throughput and greatly increases variability and expense. We propose an algorithm, Tissue Array Co-Occurrence Matrix Analysis (TACOMA), for quantifying cellular phenotypes based on textural regularity summarized by local inter-pixel relationships. The algorithm can be easily trained for any staining pattern, has no sensitive tuning parameters, and can report the salient pixels in an image that contribute to its score. Pathologists’ input via informative training patches is an important aspect of the algorithm, allowing it to be trained for any specific marker or cell type. With co-training, the error rate of TACOMA can be reduced substantially for a very small training sample (e.g., with size 30). We give theoretical insights into the success of co-training via thinning of the feature set in a high dimensional setting when there is “sufficient” redundancy among the features. TACOMA is flexible, transparent and provides a scoring process that can be evaluated with clarity and confidence. In a study based on an estrogen receptor (ER) marker, we show that TACOMA is comparable to, or outperforms, pathologists’ performance in terms of accuracy and repeatability.
PMCID: PMC3441061  PMID: 22984376
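The "local inter-pixel relationships" in entry 5 refer to a gray-level co-occurrence matrix. The following is a minimal sketch of how such a matrix can be counted for one pixel offset; the function name, the toy patch, and the single horizontal offset are assumptions for illustration and do not reproduce the TACOMA scoring procedure itself.

```python
def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Count pairs (i, j): gray level i at a pixel, level j at the pixel
    displaced by `offset` (row step, column step)."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    glcm = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[image[r][c]][image[r2][c2]] += 1
    return glcm


# Hypothetical 3x3 patch quantized to 3 gray levels (0, 1, 2).
patch = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
glcm = cooccurrence_matrix(patch, levels=3)
```

In practice the entries of such a matrix (possibly pooled over several offsets) would serve as the feature vector passed to a classifier.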
6.  Structured penalties for functional linear models—partially empirical eigenvectors for regression 
One of the challenges with functional data is incorporating geometric structure, or local correlation, into the analysis. This structure is inherent in the output from an increasing number of biomedical technologies, and a functional linear model is often used to estimate the relationship between the predictor functions and scalar responses. Common approaches to the problem of estimating a coefficient function typically involve two stages: regularization and estimation. Regularization is usually done via dimension reduction, projecting onto a predefined span of basis functions or a reduced set of eigenvectors (principal components). In contrast, we present a unified approach that directly incorporates geometric structure into the estimation process by exploiting the joint eigenproperties of the predictors and a linear penalty operator. In this sense, the components in the regression are ‘partially empirical’ and the framework is provided by the generalized singular value decomposition (GSVD). The form of the penalized estimation is not new, but the GSVD clarifies the process and informs the choice of penalty by making explicit the joint influence of the penalty and predictors on the bias, variance and performance of the estimated coefficient function. Laboratory spectroscopy data and simulations are used to illustrate the concepts.
PMCID: PMC3358792  PMID: 22639702
Penalized regression; generalized singular value decomposition; regularization; functional data
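The penalized estimation referred to in entry 6 can be written out as a short worked form. The notation below ($X$ for the $n \times p$ matrix of discretized predictor functions, $y$ for the responses, $L$ for the penalty operator, $\alpha > 0$ for the tuning parameter, and the GSVD factors $U$, $V$, $S$, $M$, $Z$) is assumed for illustration and may differ from the paper's own symbols.

```latex
% Penalized least squares with a linear penalty operator L
\hat{\beta}_{\alpha,L}
  = \arg\min_{\beta}\; \lVert y - X\beta \rVert^{2} + \alpha \lVert L\beta \rVert^{2}
  = \left( X^{\top}X + \alpha L^{\top}L \right)^{-1} X^{\top} y ,
% assuming X^T X + alpha L^T L is invertible.

% With a GSVD of the pair (X, L):
%   X = U S Z^{-1},  L = V M Z^{-1},  S^T S + M^T M = I,
% the same estimate becomes
\hat{\beta}_{\alpha,L}
  = Z \left( S^{\top}S + \alpha M^{\top}M \right)^{-1} S^{\top} U^{\top} y .
```

The second form shows how the penalty and the predictors act jointly: each component is damped according to the relative sizes of the corresponding entries of $S$ and $M$, which is the sense in which the components are "partially empirical".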
7.  Cruciferous vegetable supplementation in a controlled diet study alters the serum peptidome in a GSTM1-genotype dependent manner 
Nutrition Journal  2011;10:11.
Cruciferous vegetable intake is inversely associated with the risk of several cancers. Isothiocyanates (ITC) are hypothesized to be the major bioactive constituents contributing to these cancer-preventive effects. The polymorphic glutathione-S-transferase (GST) gene family encodes several enzymes which catalyze ITC degradation in vivo.
We utilized high throughput proteomics methods to examine how human serum peptides (the "peptidome") change in response to cruciferous vegetable feeding in individuals of different GSTM1 genotypes. In two randomized, crossover, controlled feeding studies (EAT and 2EAT) participants consumed a fruit- and vegetable-free basal diet and the basal diet supplemented with cruciferous vegetables. Serum samples collected at the end of the feeding period were fractionated and matrix assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry spectra were obtained. Peak identification/alignment computer algorithms and mixed effects models were used to analyze the data.
After analysis of spectra from EAT participants, 24 distinct peaks showed statistically significant differences associated with cruciferous vegetable intake. Twenty of these peaks were driven by the participants' GSTM1 genotype (i.e., GSTM1+ or GSTM1-null). When data from EAT and 2EAT participants were compared by joint processing of spectra to align a common set, 6 peaks showed consistent changes in both studies in a genotype-dependent manner. The peaks at 6700 m/z and 9565 m/z were identified as an isoform of transthyretin (TTR) and a fragment of zinc α2-glycoprotein (ZAG), respectively.
Cruciferous vegetable intake in GSTM1+ individuals led to changes in circulating levels of several peptides/proteins, including TTR and a fragment of ZAG. TTR is a known marker of nutritional status and ZAG is an adipokine that plays a role in lipid mobilization. The results of this study present evidence that the GSTM1-genotype modulates the physiological response to cruciferous vegetable intake.
PMCID: PMC3042379  PMID: 21272319
8.  Detecting genomic aberrations using products in a multiscale analysis 
Biometrics  2010;66(3):684-693.
Genomic instability, such as copy-number losses and gains, occurs in many genetic diseases. Recent technology developments enable researchers to measure copy numbers at tens of thousands of markers simultaneously. In this paper, we propose a non-parametric approach for detecting the locations of copy-number changes and provide a measure of significance for each change point. The proposed test is based on seeking scale-based changes in the sequence of copy numbers, which is ordered by the marker locations along the chromosome. The method leads to a natural way to estimate the null distribution for the test of a change point and adjusted p-values for the significance of a change point using a step-down maxT permutation algorithm to control the family-wise error rate. A simulation study investigates the finite sample performance of the proposed method and compares it with a more standard sequential testing method. The method is illustrated using two real data sets.
PMCID: PMC2942992  PMID: 19817738
Array-based comparative genomic hybridization; Change point; Copy number variation; Multiple comparison; Multiscale product; p-value; Wavelet
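The idea of seeking scale-based changes in entry 8 can be illustrated with a simplified multiscale product. This is a Haar-like sketch under assumed scales and a made-up signal, not the paper's exact test statistic, and it omits the step-down maxT permutation step used to obtain adjusted p-values.

```python
def scale_difference(x, t, s):
    """Mean of the s values starting at index t minus the mean of the s values before t."""
    left = x[t - s:t]
    right = x[t:t + s]
    return sum(right) / s - sum(left) / s


def multiscale_product(x, scales=(1, 2, 4)):
    """Multiply the Haar-like differences across scales at each admissible position.

    A true change point produces a large difference at every scale, so the
    product is large there, while isolated noise spikes tend to be damped.
    """
    smax = max(scales)
    products = {}
    for t in range(smax, len(x) - smax + 1):
        p = 1.0
        for s in scales:
            p *= scale_difference(x, t, s)
        products[t] = p
    return products


# Hypothetical noise-free log-ratio sequence with a copy-number gain starting at index 6.
signal = [0.0] * 6 + [1.0] * 6
products = multiscale_product(signal)
best = max(products, key=lambda t: abs(products[t]))
```

On real data the positions would be ordered by marker location along the chromosome, and the significance of each candidate would come from a permutation null rather than from the raw product.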
9.  Occurrence of Autoantibodies to Annexin I, 14-3-3 Theta and LAMR1 in Prediagnostic Lung Cancer Sera 
Journal of Clinical Oncology  2008;26(31):5060-5066.
We have implemented a high throughput platform for quantitative analysis of serum autoantibodies, which we have applied to lung cancer for discovery of novel antigens and for validation in prediagnostic sera of autoantibodies to antigens previously defined based on analysis of sera collected at the time of diagnosis.
Materials and Methods
Proteins from human lung adenocarcinoma cell line A549 lysates were subjected to extensive fractionation. The resulting 1,824 fractions were spotted in duplicate on nitrocellulose-coated slides. The microarrays produced were used in a blinded validation study to determine whether the annexin I, PGP9.5, and 14-3-3 theta antigens, previously found to be targets of autoantibodies in newly diagnosed patients with lung cancer, are associated with autoantibodies in sera collected at the presymptomatic stage, and to determine whether additional antigens may be identified in prediagnostic sera. Individual sera collected from 85 patients within 1 year before a diagnosis of lung cancer and from 85 matched controls from the Carotene and Retinol Efficacy Trial (CARET) cohort were hybridized to individual microarrays.
We present evidence for the occurrence in lung cancer sera of autoantibodies to annexin I, 14-3-3 theta, and a novel lung cancer antigen, LAMR1, which precede onset of symptoms and diagnosis.
Our findings suggest potential utility of an approach to diagnosis of lung cancer before onset of symptoms that includes screening for autoantibodies to defined antigens.
PMCID: PMC2652098  PMID: 18794547

