Related Articles

1.  AucPR: An AUC-based approach using penalized regression for disease prediction with high-dimensional omics data 
BMC Genomics  2014;15(Suppl 10):S1.
It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing AUC-based work in a high-dimensional context depends mainly on a non-parametric, smooth approximation of the AUC; no work has used a parametric AUC-based approach for high-dimensional data.
We propose an AUC-based approach using penalized regression (AucPR), which is a parametric method for obtaining a linear combination that maximizes the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, which is used in a low-dimensional context, into a regression framework and thus apply the penalized regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach can avoid some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and the need for a prudent choice of the smoothing parameter. We apply the proposed AucPR for gene selection and classification using four real microarray datasets and synthetic data. Through numerical studies, AucPR is shown to perform better than penalized logistic regression and the nonparametric AUC-based method in terms of AUC and sensitivity for a given specificity, particularly when there are many correlated genes.
We propose a powerful, parametric, and easily implementable linear classifier, AucPR, for gene selection and disease prediction with high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.
PMCID: PMC4304290  PMID: 25559769
AUC; high-dimensional data; penalized regression; ROC curve
2.  A boosting method for maximizing the partial area under the ROC curve 
BMC Bioinformatics  2010;11:314.
The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a suitable range of the false positive rate according to various clinical situations. However, existing pAUC-based methods only handle a few markers and do not take nonlinear combinations of markers into consideration.
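As an illustration of the AUC and pAUC quantities described above, the following minimal Python sketch builds an empirical ROC curve and integrates it up to a chosen false positive rate. The function names and the toy scores are assumptions made for illustration only; this is not the boosting method of the paper.

```python
def roc_points(scores_controls, scores_cases):
    """Empirical ROC curve as (FPR, TPR) points, from strictest to loosest threshold."""
    thresholds = sorted(set(scores_controls) | set(scores_cases), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # A subject is called positive when its score is at least the threshold.
        fpr = sum(s >= t for s in scores_controls) / len(scores_controls)
        tpr = sum(s >= t for s in scores_cases) / len(scores_cases)
        points.append((fpr, tpr))
    return points


def partial_auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr].

    With max_fpr=1.0 this is the ordinary (full) AUC.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker scores (higher means more likely to be a case).
controls = [0.1, 0.2, 0.3, 0.4]
cases = [0.35, 0.5, 0.6, 0.7]
curve = roc_points(controls, cases)
full_auc = partial_auc(curve)            # area over the whole FPR range
p_auc = partial_auc(curve, max_fpr=0.25)  # area restricted to FPR <= 0.25
```

Restricting `max_fpr` mirrors the clinical motivation given above: only the low false-positive-rate part of the curve contributes to the criterion.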
We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentially for maximizing the pAUC in the boosting algorithm using natural cubic splines or decision stumps (single-level decision trees), according to the values of markers (continuous or discrete). We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with those of other existing methods, and demonstrate the utility using real data sets. As a result, we have much better discrimination performances in the sense of the pAUC in both simulation studies and real data analysis.
The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in a high-dimensional setting. Hence, it provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers, where nonlinearity is known to be necessary in general for the maximization of the pAUC. The method also emphasizes classification accuracy as well as interpretability of the association, by offering simple and smooth resultant score plots for each marker.
PMCID: PMC2898798  PMID: 20537139
3.  Minimalist ensemble algorithms for genome-wide protein localization prediction 
BMC Bioinformatics  2012;13:157.
Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms.
This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors.
We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at
PMCID: PMC3426488  PMID: 22759391
Protein subcellular localization; Ensemble algorithms; Classifiers; Logistic regression
4.  The discriminative ability of FRAX, the WHO algorithm, to identify women with prevalent asymptomatic vertebral fractures: a cross-sectional study 
A Moroccan model for the FRAX tool to determine the absolute risk of osteoporotic fracture at 10 years has been established recently. The study aimed to assess the discriminative capacity of FRAX in identifying women with prevalent asymptomatic vertebral fractures (VFs).
We enrolled 908 post-menopausal women in this cross-sectional study, with a mean age of 60.9 ± 7.7 years (range 50 to 91) and no prior known diagnosis of osteoporosis. Subjects were recruited from asymptomatic women selected from the general population. Lateral VFA images and scans of the lumbar spine and proximal femur were obtained using a GE Healthcare Lunar Prodigy densitometer. VFs were defined using a combination of the Genant semiquantitative (SQ) approach and morphometry. We calculated the absolute risk of major fracture and hip fracture with and without bone mineral density (BMD) using the FRAX website. The overall discriminative value of the different risk scores was assessed by calculating the areas under the ROC curve (AUC).
VFA images showed that 179 of the participants (19.7%) had at least one grade 2/3 VF. The group of women with VFs had statistically significantly higher FRAX scores for major and hip fractures with and without BMD, and lower weight, height, and lumbar spine and hip BMD and T-scores than those without a VFA-identified VF. The AUC ROC of FRAX for major fracture without BMD was 0.757 (CI 95%; 0.718-0.797) and 0.736 (CI 95%; 0.695-0.777) with BMD, being 0.756 (CI 95%; 0.716-0.796) and 0.747 (CI 95%; 0.709-0.785), respectively, for FRAX hip fracture without and with BMD. The AUC ROC of lumbar spine T-score and femoral neck T-score were 0.660 (CI 95%; 0.611-0.708) and 0.707 (CI 95%; 0.664-0.751), respectively.
In asymptomatic post-menopausal women, the FRAX risk for major fracture without BMD had a better discriminative capacity in identifying the women with prevalent VFs than lumbar spine and femoral neck T-scores suggesting its usefulness in identifying women in whom VFA could be indicated.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2474-15-365) contains supplementary material, which is available to authorized users.
PMCID: PMC4226884  PMID: 25366306
FRAX; Bone density; Female; Vertebral fractures; VFA; DXA; Bone; Osteoporosis; Postmenopausal; Menopause; Risk factors; Sensitivity and specificity
5.  On the analysis of glycomics mass spectrometry data via the regularized area under the ROC curve 
BMC Bioinformatics  2007;8:477.
Novel molecular and statistical methods are in rising demand for disease diagnosis and prognosis with the help of recent advanced biotechnology. High-resolution mass spectrometry (MS) is one of those biotechnologies that show great promise for improving health outcomes. Previous studies have identified some proteomics biomarkers that can distinguish healthy patients from cancer patients using MS data. In this paper, an MS study is demonstrated which uses glycomics to identify ovarian cancer. Glycomics is the study of glycans and glycoproteins. The glycans on the proteins may differ between a cancer cell and a normal cell and may be visible in the blood. High-resolution MS has been applied to measure relative abundances of potential glycan biomarkers in human serum. Multiple potential glycan biomarkers are measured in MS spectra. With the objective of maximizing the empirical area under the ROC curve (AUC), an analysis method was considered which combines potential glycan biomarkers for the diagnosis of cancer.
Maximizing the empirical AUC of glycomics MS data is a large-dimensional optimization problem. The technical difficulty is that the empirical AUC function is not continuous. Instead, it is in fact an empirical 0–1 loss function with a large number of linear predictors. An approach was investigated that regularizes the area under the ROC curve while replacing the 0–1 loss function with a smooth surrogate function. The constrained threshold gradient descent regularization algorithm was applied, where the regularization parameters were chosen by cross-validation, and the confidence intervals of the regression parameters were estimated by the bootstrap method. The method is called the TGDR-AUC algorithm. The properties of the approach were studied through a numerical simulation study, which incorporates the positive values of mass spectrometry data and the correlations between measurements within a person. The simulation indicated that the estimated AUC approaches the true AUC asymptotically. Finally, serum glycan mass spectrometry data for ovarian cancer diagnosis were analyzed. The optimal combination based on the TGDR-AUC algorithm yields plausible results, and the detected biomarkers are confirmed based on biological evidence.
The TGDR-AUC algorithm relaxes the normality and independence assumptions made in the previous literature. In addition to its flexibility and easy interpretability, the algorithm yields good performance in combining potential biomarkers and is computationally feasible. Thus, the TGDR-AUC approach is a plausible algorithm for classifying disease status on the basis of multiple biomarkers.
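The idea of replacing the discontinuous 0–1 loss with a smooth surrogate can be sketched as follows. This minimal Python example assumes a sigmoid surrogate with a bandwidth `h`; the abstract does not state which surrogate function the authors used, and the threshold gradient descent regularization step itself is not implemented here. All names and toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_score(beta, x):
    """Linear combination of biomarkers with coefficients beta."""
    return sum(b * v for b, v in zip(beta, x))


def empirical_auc(beta, cases, controls):
    """Fraction of (case, control) pairs ranked correctly: a 0-1 loss, not continuous in beta."""
    wins = sum(
        1.0 if linear_score(beta, xc) > linear_score(beta, xn) else 0.0
        for xc in cases
        for xn in controls
    )
    return wins / (len(cases) * len(controls))


def smoothed_auc(beta, cases, controls, h=0.1):
    """Same pairwise average, with the indicator replaced by sigmoid(difference / h).

    The result is differentiable in beta, so gradient-based methods can be applied.
    """
    total = sum(
        sigmoid((linear_score(beta, xc) - linear_score(beta, xn)) / h)
        for xc in cases
        for xn in controls
    )
    return total / (len(cases) * len(controls))


# Hypothetical two-biomarker data.
cases = [[2.0, 1.0], [1.5, 0.5]]
controls = [[0.5, 1.0], [1.0, 0.0]]
beta = [1.0, 0.0]
exact = empirical_auc(beta, cases, controls)
approx = smoothed_auc(beta, cases, controls, h=0.1)
sharper = smoothed_auc(beta, cases, controls, h=0.01)
```

Shrinking `h` makes the surrogate track the empirical AUC more closely, at the cost of a less smooth objective.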
PMCID: PMC2211327  PMID: 18076765
6.  Accurate Prediction of Immunogenic T-Cell Epitopes from Epitope Sequences Using the Genetic Algorithm-Based Ensemble Learning 
PLoS ONE  2015;10(5):e0128194.
T-cell epitopes play an important role in the T-cell immune response, and they are critical components in epitope-based vaccine design. Immunogenicity is the ability to trigger an immune response. The accurate prediction of immunogenic T-cell epitopes is significant for designing useful vaccines and understanding the immune system.
In this paper, we attempt to differentiate immunogenic epitopes from non-immunogenic epitopes based on their primary structures. First, we explore a variety of sequence-derived features and analyze their relationship with epitope immunogenicity. To utilize the various features effectively, a genetic algorithm (GA)-based ensemble method is proposed to determine the optimal feature subset and develop a high-accuracy ensemble model. In the GA optimization, a chromosome represents a feature subset in the search space. For each feature subset, the selected features are used to construct the base predictors, and an ensemble model is developed by taking the average of the outputs from the base predictors. The objective of the GA is to search for the optimal feature subset, which leads to the ensemble model with the best cross-validation AUC (area under the ROC curve) on the training set.
Two datasets named ‘IMMA2’ and ‘PAAQD’ are adopted as the benchmark datasets. Compared with the state-of-the-art methods POPI, POPISK, PAAQD and our previous method, the GA-based ensemble method produces much better performances, achieving the AUC score of 0.846 on IMMA2 dataset and the AUC score of 0.829 on PAAQD dataset. The statistical analysis demonstrates the performance improvements of GA-based ensemble method are statistically significant.
The proposed method is a promising tool for predicting the immunogenic epitopes. The source codes and datasets are available in S1 File.
PMCID: PMC4447411  PMID: 26020952
7.  A Comparison of Prediction Models for Fractures in Older Women: Is More Better 
Archives of internal medicine  2009;169(22):2087-2094.
A web-based risk assessment tool (FRAX®) using clinical risk factors with and without femoral neck bone mineral density (BMD) has been incorporated into clinical guidelines regarding treatment to prevent fractures. Our objective is to determine whether prediction with FRAX® models is superior to that based on parsimonious models.
We conducted a prospective cohort study in 6252 women aged ≥65 years and compared the value of FRAX® models that include BMD to parsimonious models based on age and BMD alone for prediction of fractures. We also compared FRAX® models without BMD to simple models based on age and fracture history alone. Fractures (hip, major osteoporotic [hip, clinical vertebral, wrist, or humerus], and any clinical fracture) were ascertained during 10 years of follow-up. Area under the curve (AUC) statistics from receiver operating characteristic (ROC) curve analysis were compared between FRAX® models and simple models.
AUC comparisons revealed no differences between FRAX® models with BMD versus simple models with age and BMD alone in discriminating hip (AUC=0.75 for FRAX® model and 0.76 for simple model, p=0.26); major osteoporotic (AUC=0.68 for FRAX® model and 0.69 for simple model, p=0.51); or clinical fracture (AUC=0.64 for FRAX® model and 0.63 for simple model, p=0.16). Similarly, performance of parsimonious models containing age and fracture history alone was nearly identical to that of FRAX® models without BMD. The proportion of women in each quartile of predicted risk who actually experienced a fracture outcome did not differ between FRAX® and simple models (p≥0.16).
Simple models based on age and BMD alone or age and fracture history alone predicted 10-year risk of hip, major osteoporotic, and clinical fracture as well as more complex FRAX® models.
PMCID: PMC2811407  PMID: 20008691
8.  Assessment of Alveolar Bone Mineral Density as a Predictor of Lumbar Fracture Probability 
Advances in Therapy  2013;30(5):487-502.
Osteoporosis and tooth loss have been linked with advancing age, but no clear relationship between these conditions has been proven. Several studies of bone mineral density measurements of the jaw and spine have shown similarities in their rate of age-related deterioration. Thus, measurements of jawbone density may predict lumbar vertebral bone density. Using jawbone density as a proxy marker would circumvent the need for lumbar bone measurements and facilitate prediction of osteoporotic spinal fracture susceptibility at dental clinics. We aimed to characterize the correlation between bone density in the jaw and spine and the incidence of osteoporotic spinal fractures.
We used computerized radiogrammetry to measure alveolar bone mineral density (al-BMD) and dual-energy X-ray absorptiometry to measure lumbar bone mineral density (L-BMD). L-BMD and al-BMD in 30 female patients (average age: 59 ± 5 years) were correlated with various patient attributes. Statistical analysis included area under the curve (AUC) and probability of asymptomatic significance (PAS) in a receiver operating characteristic curve. The predictive strength of L-BMD T-scores (L-BMD[T]) and al-BMD measurements for fracture occurrence was then compared using multivariate analysis with category weight scoring.
L-BMD and al-BMD were significantly correlated with age, years since menopause, and alveolar bone thickness. Both were also negatively correlated with fracture incidence. Category weight scores were −0.275 for a L-BMD(T) <80%; +0.183 for a L-BMD(T) ≥80%; −0.860 for al-BMD <84.9 (brightness); and +0.860 for al-BMD ≥84.9. AUC and PAS analyses suggested that al-BMD had a higher association with fracture occurrence than L-BMD.
Our results suggest a possible association between al-BMD and vertebral fracture risk. Assessment of alveolar bone density may be useful in patients receiving routine dental exams to monitor the clinical picture and the potential course of osteoporosis in patients who may be at a higher risk of developing osteoporosis.
PMCID: PMC3680661  PMID: 23674163
Alveolar; Bone mineral density; Computerized; Fracture; Lumbar; Osteoporosis; Periodontitis; Predictive; Radiogrammetry
9.  Utilization of DXA Bone Mineral Densitometry in Ontario 
Executive Summary
Systematic reviews and analyses of administrative data were performed to determine the appropriate use of bone mineral density (BMD) assessments using dual energy x-ray absorptiometry (DXA), and the associated trends in wrist and hip fractures in Ontario.
Dual Energy X-ray Absorptiometry Bone Mineral Density Assessment
Dual energy x-ray absorptiometry bone densitometers measure bone density based on differential absorption of 2 x-ray beams by bone and soft tissues. It is the gold standard for detecting and diagnosing osteoporosis, a systemic disease characterized by low bone density and altered bone structure, resulting in low bone strength and increased risk of fractures. The test is fast (approximately 10 minutes) and accurate (exceeds 90% at the hip), with low radiation (1/3 to 1/5 of that from a chest x-ray). DXA densitometers are licensed as Class 3 medical devices in Canada. The World Health Organization has established criteria for osteoporosis and osteopenia based on DXA BMD measurements: osteoporosis is defined as a BMD that is more than 2.5 standard deviations below the mean BMD for normal young adults (i.e. T-score < –2.5), while osteopenia is defined as a BMD that is more than 1 standard deviation but not more than 2.5 standard deviations below the mean for normal young adults (i.e. T-score < –1 and ≥ –2.5). DXA densitometry is presently an insured health service in Ontario.
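The WHO thresholds quoted above translate directly into a small classification rule. The following Python sketch is illustrative only; the function name is an assumption, and the "normal" category (T-score of –1 or higher) is the complement of the two definitions given in the text.

```python
def who_bmd_category(t_score):
    """Classify a DXA T-score using the WHO cut-offs quoted in the text."""
    if t_score < -2.5:
        return "osteoporosis"  # more than 2.5 SD below the young-adult mean
    if t_score < -1.0:
        return "osteopenia"    # below -1 but not below -2.5
    return "normal"            # T-score of -1 or higher


# Boundary values fall into the milder category, matching "< -2.5" and "< -1".
examples = {t: who_bmd_category(t) for t in (-3.0, -2.5, -1.5, -1.0, 0.2)}
```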
Clinical Need
Burden of Disease
The Canadian Multicenter Osteoporosis Study (CaMos) found that 16% of Canadian women and 6.6% of Canadian men have osteoporosis based on the WHO criteria, with prevalence increasing with age. Osteopenia was found in 49.6% of Canadian women and 39% of Canadian men. In Ontario, it is estimated that nearly 530,000 people have some degree of osteoporosis. Osteoporosis-related fragility fractures occur most often in the wrist, femur and pelvis. These fractures, particularly those in the hip, are associated with increased mortality, and decreased functional capacity and quality of life. A Canadian study showed that at 1 year after a hip fracture, the mortality rate was 20%. Another 20% required institutional care, 40% were unable to walk independently, and there was lower health-related quality of life due to attributes such as pain, decreased mobility and decreased ability to self-care. The cost of osteoporosis and osteoporotic fractures in Canada was estimated to be $1.3 billion in 1993.
Guidelines for Bone Mineral Density Testing
With 2 exceptions, almost all guidelines address only women. None of the guidelines recommend blanket population-based BMD testing. Instead, all guidelines recommend BMD testing in people at risk of osteoporosis, predominantly women aged 65 years or older. For women under 65 years of age, BMD testing is recommended only if one major or two minor risk factors for osteoporosis exist. Osteoporosis Canada did not restrict its recommendations to women, and thus their guidelines apply to both sexes. Major risk factors are age greater than or equal to 65 years, a history of previous fractures, family history (especially parental history) of fracture, and medication or disease conditions that affect bone metabolism (such as long-term glucocorticoid therapy). Minor risk factors include low body mass index, low calcium intake, alcohol consumption, and smoking.
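The testing rule summarized above (age 65 or older, or at least one major or two minor risk factors) can be written as a short predicate. This Python sketch is a simplified reading of the paragraph, not a reproduction of any specific guideline; the function name and the representation of risk factors as counts are assumptions.

```python
def bmd_testing_recommended(age, n_major_risk_factors, n_minor_risk_factors):
    """Return True if BMD testing would be recommended under the summarized rule.

    n_major_risk_factors counts major factors other than age (for example a
    previous fracture); age >= 65 is handled separately because the text
    lists it as a major risk factor in its own right.
    """
    if age >= 65:
        return True
    return n_major_risk_factors >= 1 or n_minor_risk_factors >= 2


# Hypothetical cases.
case_a = bmd_testing_recommended(70, 0, 0)  # eligible on age alone
case_b = bmd_testing_recommended(58, 0, 1)  # a single minor factor is not enough
case_c = bmd_testing_recommended(58, 0, 2)  # two minor factors
case_d = bmd_testing_recommended(52, 1, 0)  # one major factor
```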
Current Funding for Bone Mineral Density Testing
The Ontario Health Insurance Program (OHIP) Schedule presently reimburses DXA BMD at the hip and spine. Measurements at both sites are required if feasible. Patients at low risk of accelerated bone loss are limited to one BMD test within any 24-month period, but there are no restrictions on people at high risk. The total fee including the professional and technical components for a test involving 2 or more sites is $106.00 (Cdn).
Method of Review
This review consisted of 2 parts. The first part was an analysis of Ontario administrative data relating to DXA BMD, wrist and hip fractures, and use of antiresorptive drugs in people aged 65 years and older. The Institute for Clinical Evaluative Sciences extracted data from the OHIP claims database, the Canadian Institute for Health Information hospital discharge abstract database, the National Ambulatory Care Reporting System, and the Ontario Drug Benefit database using OHIP and ICD-10 codes. The data was analyzed to examine the trends in DXA BMD use from 1992 to 2005, and to identify areas requiring improvement.
The second part included systematic reviews and analyses of evidence relating to issues identified in the analyses of utilization data. Altogether, 8 reviews and qualitative syntheses were performed, consisting of 28 published systematic reviews and/or meta-analyses, 34 randomized controlled trials, and 63 observational studies.
Findings of Utilization Analysis
Analysis of administrative data showed a 10-fold increase in the number of BMD tests in Ontario between 1993 and 2005.
OHIP claims for BMD tests are presently increasing at a rate of 6 to 7% per year. Approximately 500,000 tests were performed in 2005/06 with an age-adjusted rate of 8,600 tests per 100,000 population.
Women accounted for 90% of all BMD tests performed in the province.
In 2005/06, there was a 2-fold variation in the rate of DXA BMD tests across local integrated health networks, but a 10-fold variation between the county with the highest rate (Toronto) and that with the lowest rate (Kenora). The analysis also showed that:
With the increased use of BMD, there was a concomitant increase in the use of antiresorptive drugs (as shown in people 65 years and older) and a decrease in the rate of hip fractures in people age 50 years and older.
Repeat BMD made up approximately 41% of all tests. Most of the people (>90%) who had annual BMD tests in a 2-year or 3-year period were coded as being at high risk for osteoporosis.
18% (20,865) of the people who had a repeat BMD within a 24-month period and 34% (98,058) of the people who had one BMD test in a 3-year period were under 65 years, had no fracture in the year, and were coded as low-risk.
Only 19% of people age greater than 65 years underwent BMD testing and 41% received osteoporosis treatment during the year following a fracture.
Men accounted for 24% of all hip fractures and 21% of all wrist fractures, but only 10% of BMD tests. The rates of BMD tests and treatment in men after a fracture were only half of those in women.
In both men and women, the rate of hip and wrist fractures mainly increased after age 65 with the sharpest increase occurring after age 80 years.
Findings of Systematic Review and Analysis
Serial Bone Mineral Density Testing for People Not Receiving Osteoporosis Treatment
A systematic review showed that the mean rate of bone loss in people not receiving osteoporosis treatment (including postmenopausal women) is generally less than 1% per year. Higher rates of bone loss were reported for people with disease conditions or on medications that affect bone metabolism. In order to be considered a genuine biological change, the change in BMD between serial measurements must exceed the least significant change (variability) of the testing, which ranges from 2.77% to 8% for precisions of 1% to 3%, respectively. Progression in BMD was analyzed using different baseline BMD values, rates of bone loss, precision levels, and BMD thresholds for initiating treatment. The analyses showed that serial BMD measurements every 24 months (as per OHIP policy for low-risk individuals) are not necessary for people with no major risk factors for osteoporosis, provided that the baseline BMD is normal (T-score ≥ –1) and the rate of bone loss is less than or equal to 1% per year. The analyses showed that for someone with a normal baseline BMD and a rate of bone loss of less than 1% per year, the change in BMD is not likely to exceed the least significant change (even for a 1% precision) in less than 3 years after the baseline test, and is not likely to drop to a BMD level that requires initiation of treatment in less than 16 years after the baseline test.
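The least significant change figures quoted above are consistent with the standard formula LSC = 1.96 × √2 × precision (about 2.77 × precision) at 95% confidence. The Python sketch below reproduces that arithmetic and estimates how long a constant rate of bone loss would take to exceed the LSC. The function names and the assumption of a constant, linear rate of loss are illustrative simplifications, not the review's actual progression model.

```python
import math


def least_significant_change(precision_pct, z=1.96):
    """LSC (in %) for two measurements, each with the given precision error (in %).

    The sqrt(2) factor reflects the combined error of two measurements;
    z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(2.0) * precision_pct


def years_to_exceed_lsc(precision_pct, annual_loss_pct):
    """Years of constant bone loss needed before the change exceeds the LSC."""
    return least_significant_change(precision_pct) / annual_loss_pct


lsc_1pct = least_significant_change(1.0)  # about 2.77%
lsc_3pct = least_significant_change(3.0)  # about 8.3%
wait_years = years_to_exceed_lsc(1.0, 1.0)  # about 2.8 years at exactly 1% per year
```

At a loss rate below 1% per year the waiting time is longer still, which matches the statement that a detectable change is unlikely in less than 3 years.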
Serial Bone Mineral Density Testing in People Receiving Osteoporosis Therapy
Seven published meta-analyses of randomized controlled trials (RCTs) and 2 recent RCTs on BMD monitoring during osteoporosis therapy showed that although higher increases in BMD were generally associated with reduced risk of fracture, the change in BMD only explained a small percentage of the fracture risk reduction.
Studies showed that some people with small or no increase in BMD during treatment experienced significant fracture risk reduction, indicating that other factors such as improved bone microarchitecture might have contributed to fracture risk reduction.
There is conflicting evidence relating to the role of BMD testing in improving patient compliance with osteoporosis therapy.
Even though BMD may not be a perfect surrogate for reduction in fracture risk when monitoring responses to osteoporosis therapy, experts advised that it is still the only reliable test available for this purpose.
A systematic review conducted by the Medical Advisory Secretariat showed that the magnitude of increases in BMD during osteoporosis drug therapy varied among medications. Although most of the studies yielded mean percentage increases in BMD from baseline that did not exceed the least significant change for a 2% precision after 1 year of treatment, there were some exceptions.
Bone Mineral Density Testing and Treatment After a Fragility Fracture
A review of 3 published pooled analyses of observational studies and 12 prospective population-based observational studies showed that the presence of any prevalent fracture increases the relative risk for future fractures by approximately 2-fold or more. A review of 10 systematic reviews of RCTs and 3 additional RCTs showed that therapy with antiresorptive drugs significantly reduced the risk of vertebral fractures by 40 to 50% in postmenopausal osteoporotic women and osteoporotic men, and 2 antiresorptive drugs also reduced the risk of nonvertebral fractures by 30 to 50%. Evidence from observational studies in Canada and other jurisdictions suggests that patients who had undergone BMD measurements, particularly if a diagnosis of osteoporosis is made, were more likely to be given pharmacologic bone-sparing therapy. Despite these findings, the rate of BMD investigation and osteoporosis treatment after a fracture remained low (<20%) in Ontario as well as in other jurisdictions.
Bone Mineral Density Testing in Men
There are presently no specific Canadian guidelines for BMD screening in men. A review of the literature suggests that risk factors for fracture and the rate of vertebral deformity are similar for men and women, but the mortality rate after a hip fracture is higher in men compared with women. Two bisphosphonates had been shown to reduce the risk of vertebral and hip fractures in men. However, BMD testing and osteoporosis treatment were proportionately low in Ontario men in general, and particularly after a fracture, even though men accounted for 25% of the hip and wrist fractures. The Ontario data also showed that the rates of wrist fracture and hip fracture in men rose sharply in the 75- to 80-year age group.
Ontario-Based Economic Analysis
The economic analysis focused on analyzing the economic impact of decreasing future hip fractures by increasing the rate of BMD testing in men and women age greater than or equal to 65 years following a hip or wrist fracture. A decision analysis showed the above strategy, especially when enhanced by improved reporting of BMD tests, to be cost-effective, resulting in a cost-effectiveness ratio ranging from $2,285 (Cdn) per fracture avoided (worst-case scenario) to $1,981 (Cdn) per fracture avoided (best-case scenario). A budget impact analysis estimated that shifting utilization of BMD testing from the low risk population to high risk populations within Ontario would result in a saving of $0.85 million to $1.5 million (Cdn) to the health system. The potential net saving was estimated at $1.2 million to $5 million (Cdn) when the downstream cost-avoidance due to prevention of future hip fractures was factored into the analysis.
Other Factors for Consideration
There is a lack of standardization for BMD testing in Ontario. Two different standards are presently being used and experts suggest that variability in results from different facilities may lead to unnecessary testing. There is also no requirement for standardized equipment, procedure or reporting format. The current reimbursement policy for BMD testing encourages serial testing in people at low risk of accelerated bone loss. This review showed that testing every 2 years is not necessary in all cases. The lack of a database to collect clinical data on BMD testing makes it difficult to evaluate the clinical profiles of patients tested and outcomes of the BMD tests. There are ministry initiatives in progress under the Osteoporosis Program to address the development of a mandatory standardized requisition form for BMD tests to facilitate data collection and clinical decision-making. Work is also underway to develop guidelines for BMD testing in men and in perimenopausal women.
Increased use of BMD in Ontario since 1996 appears to be associated with increased use of antiresorptive medication and a decrease in hip and wrist fractures.
Data suggest that as many as 20% (98,000) of the DXA BMD tests in Ontario in 2005/06 were performed in people aged less than 65 years, with no fracture in the current year, and coded as being at low risk for accelerated bone loss; this is not consistent with current guidelines. Even though some of these people might have been incorrectly coded as low-risk, the number of tests in people truly at low risk could still be substantial.
Approximately 4% (21,000) of the DXA BMD tests in 2005/06 were repeat BMDs in low-risk individuals within a 24-month period. Even though this is in compliance with current OHIP reimbursement policies, evidence showed that biennial serial BMD testing is not necessary in individuals without major risk factors for fractures, provided that the baseline BMD is normal (T-score ≥ –1). In this population, BMD measurements may be repeated in 3 to 5 years after the baseline test to establish the rate of bone loss, and further serial BMD tests may not be necessary for another 7 to 10 years if the rate of bone loss is no more than 1% per year. Precision of the test needs to be considered when interpreting serial BMD results.
Although changes in BMD may not be the perfect surrogate for reduction in fracture risk as a measure of response to osteoporosis treatment, experts advised that it is presently the only reliable test for monitoring response to treatment and to help motivate patients to continue treatment. Patients should not discontinue treatment if there is no increase in BMD after the first year of treatment. Lack of response or bone loss during treatment should prompt the physician to examine whether the patient is taking the medication appropriately.
Men and women who have had a fragility fracture at the hip, spine, wrist or shoulder are at increased risk of having a future fracture, but this population is presently under-investigated and under-treated. Additional efforts have to be made to communicate to physicians (particularly orthopaedic surgeons and family physicians) and the public about the need for a BMD test after fracture, and for initiating treatment if low BMD is found.
Men had a disproportionately low rate of BMD tests and osteoporosis treatment, especially after a fracture. Evidence and fracture data showed that the risk of hip and wrist fractures in men rises sharply at age 70 years.
Some counties had BMD utilization rates that were only 10% of that of the county with the highest utilization. The reasons for low utilization need to be explored and addressed.
Initiatives such as aligning reimbursement policy with current guidelines, developing specific guidelines for BMD testing in men and perimenopausal women, improving BMD reports to assist in clinical decision making, developing a registry to track BMD tests, improving access to BMD tests in remote/rural counties, establishing mechanisms to alert family physicians of fractures, and educating physicians and the public, will improve the appropriate utilization of BMD tests, and further decrease the rate of fractures in Ontario. Some of these initiatives such as developing guidelines for perimenopausal women and men, and developing a standardized requisition form for BMD testing, are currently in progress under the Ontario Osteoporosis Strategy.
PMCID: PMC3379167  PMID: 23074491
10.  FRAX® tool, the WHO algorithm to predict osteoporotic fractures: the first analysis of its discriminative and predictive ability in the Spanish FRIDEX cohort 
The WHO has recently published the FRAX® tool to determine the absolute risk of osteoporotic fracture at 10 years. This tool has not yet been validated in Spain.
A prospective observational study was undertaken in women in the FRIDEX cohort (Barcelona) not receiving bone-active drugs at baseline. Baseline measurements comprised known risk factors, including those of FRAX®, and a DXA scan. Follow-up data on incident major fractures (hip, spine, humerus and wrist) were self-reported and verified against patient records. The absolute risks of major fracture and hip fracture were calculated using the FRAX® website. This work follows the guidelines of the STROBE initiative for cohort studies. The discriminative capacity of FRAX® was analyzed using the area under the receiver operating characteristic curve (AUC ROC) and the Hosmer-Lemeshow goodness-of-fit test. The predictive capacity was determined using the ratio of observed fractures to fractures expected by FRAX® (ObsFx/ExpFx).
The study subjects were 770 women from 40 to 90 years of age in the FRIDEX cohort. The mean age was 56.8 ± 8 years. The fractures were determined by structured telephone questionnaire and subsequent verification in medical records at 10 years. Sixty-five (8.4%) women presented major fractures (17 hip fractures). Women with fractures were older, had more previous fractures, more cases of rheumatoid arthritis and also more osteoporosis on the baseline DXA. The AUC ROC of FRAX® for major fracture was 0.693 (95% CI 0.622-0.763) without bone mineral density (BMD) and 0.716 (95% CI 0.646-0.786) with the T-score of the femoral neck (FN); for hip fracture it was 0.888 (95% CI 0.824-0.952) and 0.849 (95% CI 0.737-0.962), respectively. In the model with BMD alone, the AUC ROC was 0.661 (95% CI 0.583-0.739) and 0.779 (95% CI 0.631-0.929), respectively. In the model with age alone, it was 0.668 (95% CI 0.603-0.733) and 0.882 (95% CI 0.832-0.936), respectively. In neither case were there significant differences from the FRAX® model. The overall ObsFx/ExpFx ratio without BMD was 2.4 for major fracture and 2.8 for hip fracture; with BMD it was 2.2 and 2.3, respectively. The sensitivity of all four was always less than 50%. The Hosmer-Lemeshow test showed a good correlation only after calibration with the ObsFx/ExpFx ratio.
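The two measures reported above can be illustrated with a minimal sketch. It assumes the AUC is estimated as the probability that a fractured subject has a higher predicted risk than a non-fractured one (ties counted as one half), and that ObsFx/ExpFx is the number of observed fractures divided by the sum of predicted probabilities. The function names and all risk values are hypothetical and are not FRIDEX data.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC estimate: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def observed_expected_ratio(outcomes, predicted_probs):
    """Observed events divided by the expected number (sum of predicted probabilities)."""
    return sum(outcomes) / sum(predicted_probs)


# Hypothetical 10-year predicted risks, expressed as probabilities.
risk_fractured = [0.12, 0.08, 0.20, 0.05]
risk_not_fractured = [0.03, 0.05, 0.10, 0.02, 0.04, 0.06]

discrimination = auc(risk_fractured, risk_not_fractured)

outcomes = [1] * len(risk_fractured) + [0] * len(risk_not_fractured)
calibration = observed_expected_ratio(outcomes, risk_fractured + risk_not_fractured)

print(f"AUC = {discrimination:.3f}, ObsFx/ExpFx = {calibration:.2f}")
```

A ratio above 1 means that more fractures were observed than the model predicted, that is, the model underestimates risk.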
The current version of FRAX® for Spanish women without BMD, analysed by the AUC ROC, demonstrates poor discriminative capacity to predict major fractures but good discriminative capacity for hip fractures. Its predictive capacity is poorly calibrated, leading to underdiagnosis for both major and hip fracture predictions. Simple models based on age or BMD alone predicted similarly to the more complex FRAX® models.
PMCID: PMC3518201  PMID: 23088223
11.  A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers 
BMC Bioinformatics  2012;13:326.
Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble?
The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity.
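The two aggregation methods named above can be sketched as follows. This is only an illustration under assumed definitions: Average Probability is taken to mean the mean of the individual classifiers' probabilities compared with a cutoff, and Vote Threshold is taken to mean a positive call when at least a minimum number of classifiers exceed the cutoff. The cutoff of 0.5, the `min_votes` parameter and the probabilities are hypothetical; the paper's exact rules may differ.

```python
def average_probability(probs, cutoff=0.5):
    """Call rejection when the mean probability across classifiers reaches the cutoff."""
    mean_p = sum(probs) / len(probs)
    return mean_p, mean_p >= cutoff


def vote_threshold(probs, cutoff=0.5, min_votes=1):
    """Call rejection when at least min_votes classifiers individually reach the cutoff."""
    votes = sum(1 for p in probs if p >= cutoff)
    return votes, votes >= min_votes


# Hypothetical probabilities of acute rejection from four classifiers for one patient.
classifier_probs = [0.30, 0.45, 0.80, 0.20]

mean_p, avg_call = average_probability(classifier_probs)
votes, vote_call = vote_threshold(classifier_probs, min_votes=1)

print(f"Average Probability: mean={mean_p:.4f}, rejection={avg_call}")
print(f"Vote Threshold: votes={votes}, rejection={vote_call}")
```

With a low vote threshold a single confident classifier is enough for a positive call, which is consistent with the reported pattern of higher sensitivity at the cost of specificity.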
Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.
PMCID: PMC3575305  PMID: 23216969
Biomarkers; Computational; Pipeline; Genomics; Proteomics; Ensemble; Classification
12.  Clinical performance of osteoporosis risk assessment tools in women aged 67 years and older 
Clinical performance of osteoporosis risk assessment tools was studied in women aged 67 years and older. Weight was as accurate as two of the tools to detect low bone density. Discriminatory ability was slightly better for the OST risk tool, which is based only on age and weight.
Screening performance of osteoporosis risk assessment tools has not been tested in a large, population-based US cohort.
We conducted a diagnostic accuracy analysis of the Osteoporosis Self-assessment Tool (OST), Osteoporosis Risk Assessment Instrument (ORAI), Simple Calculated Osteoporosis Risk Estimation (SCORE), and individual risk factors (age, weight or prior fracture) to identify low central (hip and lumbar spine) bone mineral density (BMD) in 7779 US women aged 67 years and older participating in the Study of Osteoporotic Fractures.
The OST had the greatest area under the receiver operating characteristic curve (AUC 0.76, 95% CI 0.74, 0.77). Weight had an AUC of 0.73 (95% CI 0.72, 0.75), which was ≥AUC values for the ORAI, SCORE, age or prior fracture. Using cut points from the development papers, the risk tools had sensitivities ≥85% and specificities ≤48%. When new cut points were set to achieve a negative likelihood ratio of 0.1–0.2, the tools ruled out fewer than 1/4 of women without low central BMD.
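For readers less familiar with likelihood ratios, the standard definitions are LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below evaluates them at the boundary values quoted above (sensitivity 85%, specificity 48%); these are illustrative bounds, not the result for any specific tool.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


# Boundary values from the text, used purely for illustration.
sens, spec = 0.85, 0.48

lr_pos = positive_likelihood_ratio(sens, spec)
lr_neg = negative_likelihood_ratio(sens, spec)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

At these values LR- is about 0.31, so reaching the 0.1–0.2 range requires more sensitive cut points, and a more sensitive cut point rules out fewer women.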
Weight identified low central BMD as accurately as the ORAI and SCORE. The risk tools would be unlikely to show an advantage over simple weight cut points in an osteoporosis screening protocol for elderly women.
PMCID: PMC2562917  PMID: 18219434
Bone density; Female; Mass screening; Osteoporosis; Postmenopause; Risk assessment
13.  The WHO Absolute Fracture Risk Models (FRAX): Do Clinical Risk Factors Improve Fracture Prediction in Older Women Without Osteoporosis? 
Bone mineral density (BMD) is a strong predictor of fracture, yet most fractures occur in women without osteoporosis by BMD criteria. To improve fracture-risk prediction, the World Health Organization recently developed a country-specific fracture risk index of clinical risk factors (FRAX®) that estimates 10-year probabilities of hip and major osteoporotic fracture. Within differing baseline BMD categories, we evaluated 6252 women age 65 and older in the Study of Osteoporotic Fractures using FRAX 10-year probabilities of hip and major osteoporotic fracture (hip, clinical spine, wrist, humerus) compared to incidence of fractures over 10 years of follow-up. Overall ability of FRAX to predict fracture risk based on initial BMD T-score categories (normal, low bone mass, and osteoporosis) was evaluated with receiver-operating-characteristic (ROC) analyses using area-under-the-curve (AUC). Over 10 years of follow-up, 368 women incurred a hip fracture, and 1011 a major osteoporotic fracture. Women with low bone mass represented the majority (n=3791; 61%); they developed many hip (n=176; 48%) and major osteoporotic fractures (n=569; 56%). Among women with normal and low bone mass, FRAX (including BMD) was an overall better predictor of hip fracture risk (AUC = 0.78 and 0.70, respectively) than major osteoporotic fractures (AUC = 0.64 and 0.62). Simpler models (e.g., age+prior fracture) had similar AUCs to FRAX, including among women for whom primary prevention is sought (no prior fracture or osteoporosis by BMD). The FRAX, and simpler models, predict 10-year risk of incident hip and major osteoporotic fractures in older U.S. women with normal or low bone mass.
PMCID: PMC3622725  PMID: 21351144
osteopenia; osteoporosis; fracture; risk; prediction
14.  Comparison of Candidate Serologic Markers for Type I and Type II Ovarian Cancer 
Gynecologic oncology  2011;122(3):560-566.
To examine the value of individual and combinations of ovarian cancer associated blood biomarkers for the discrimination between plasma of patients with type I or II ovarian cancer and disease-free volunteers.
Levels of 14 currently promising ovarian cancer-related biomarkers, including CA125, macrophage inhibitory factor-1 (MIF-1), leptin, prolactin, osteopontin (OPN), insulin-like growth factor-II (IGF-II), autoantibodies (AAbs) to eight proteins: p53, NY-ESO-1, p16, ALPP, CTSD, B23, GRP78, and SSX, were measured in the plasma of 151 ovarian cancer patients, 23 with borderline ovarian tumors, 55 with benign tumors and 75 healthy controls.
When examined individually, seven candidate biomarkers (MIF, Prolactin, CA-125, OPN, Leptin, IGF-II and p53 AAbs) had significantly different plasma levels between type II ovarian cancer patients and healthy controls. Based on the receiver operating characteristic (ROC) curves constructed and area under the curve (AUC) calculated, CA125 exhibited the greatest power to discriminate the plasma samples of type II cancer patients from normal volunteers (AUC 0.9310), followed by IGF-II (AUC 0.8514), OPN (AUC 0.7888), leptin (AUC 0.7571), prolactin (AUC 0.7247), p53 AAbs (AUC 0.7033), and MIF (AUC 0.6992). p53 AAbs levels exhibited the lowest correlation with CA125 levels among the six markers, suggesting the potential of p53 AAbs as a biomarker independent of CA125. Indeed, p53 AAbs increased the AUC of ROC curve to the greatest extent when combining CA125 with one of the other markers. At a fixed specificity of 100%, the addition of p53 AAbs to CA125 increased sensitivity from 73.8% to 85.7% to discriminate type II cancer patients from normal controls. Notably, seropositivity of p53 AAbs is comparable in type II ovarian cancer patients with negative and positive CA125, but has no value for type I ovarian cancer patients.
p53 AAbs might be a useful blood-based biomarker for the detection of type II ovarian cancer, especially when combined with CA125 levels.
PMCID: PMC3152615  PMID: 21704359
15.  Bone fracture risk estimation based on image similarity 
Bone  2009;45(3):560-567.
We propose a fracture risk estimation technique based on image similarity. We employ image similarity indices to determine how images are similar to each other in their 3D bone mineral density distributions. Our premise for fracture risk estimation is that if a given scan is more similar to scans of subjects known to have fractures than to scans of control subjects, this subject is likely to have a higher degree of fracture risk. To test this hypothesis, we analyzed hip QCT scans of 37 patients with hip fractures and 38 age-matched controls. We divided the scans randomly into two groups: the Model Group and the Test Group. For each scan in the Test Group, the difference between the mean value of its image similarities to the Model fracture group and the mean value of its image similarities to the Model control group was used as an index of fracture risk. We then used the estimated fracture risk indices to discriminate the fractured patients and controls in the Test Group. A test scan with a larger mean value of image similarities with respect to the Model fracture group was classified as a scan from a fractured patient; otherwise it was classified as a scan from a control subject. Based on ROC analysis, we compared the discrimination performances using image similarity measures with that obtained by using bone mineral density (BMD). When using BMD measured in the femoral neck, with the optimal BMD cutoff, the sensitivity and specificity were 86.5% and 73.7%. For the image similarity measures, the sensitivity ranged between 86.5% and 100%, and specificity ranged between 63.2% and 76.3%. By combining BMD with image similarity measures, the sensitivity and specificity reached 94.6% and 76.3% using linear discriminant analysis (LDA) algorithm, or 91.9% and 81.6% using recursive partitioning and regression trees (RPART) algorithm. In the RPART approach, the AUC value of the ROC curve was 0.923, higher than the AUC value of 0.835 when using BMD alone (p-value: 0.0046).
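The risk index described above is a difference of two means, which the following sketch makes explicit. It assumes that the similarities between a test scan and each Model scan have already been computed by some image similarity measure; the function names and the similarity values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fracture_risk_index(sim_to_fracture_group, sim_to_control_group):
    """Mean similarity to Model fracture scans minus mean similarity to Model control scans."""
    return mean(sim_to_fracture_group) - mean(sim_to_control_group)


def classify(index):
    """A positive index means the scan resembles the fracture group more closely."""
    return "fracture" if index > 0 else "control"


# Hypothetical similarities between one Test Group scan and the Model Group scans.
sim_to_fracture = [0.62, 0.58, 0.66]
sim_to_control = [0.51, 0.55, 0.47]

index = fracture_risk_index(sim_to_fracture, sim_to_control)
label = classify(index)

print(f"risk index = {index:.2f}, classified as {label}")
```

The same index can be used as a continuous score in ROC analysis rather than thresholded at zero.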
Our results showed that combining BMD with image similarity measures resulted in improved hip fracture risk estimation.
PMCID: PMC2896043  PMID: 19414074
osteoporosis; proximal femur; QCT; mutual information; image registration
16.  FRAX®: Prediction of Major Osteoporotic Fractures in Women from the General Population: The OPUS Study 
PLoS ONE  2013;8(12):e83436.
The aim of this study was to analyse how well FRAX® predicts the risk of major osteoporotic and vertebral fractures over 6 years in postmenopausal women from the general population.
Patients and methods
The OPUS study was conducted in European women aged above 55 years, recruited in 5 centers from random population samples and followed over 6 years. The population for this study consisted of 1748 women (mean age 74.2 years) with information on incident fractures. 742 (43.1%) had a prevalent fracture; 769 (44%) and 155 (8.9%) of them received an antiosteoporotic treatment before and during the study respectively. We compared FRAX® performance with and without bone mineral density (BMD) using receiver operator characteristic (ROC) c-statistical analysis with ORs and areas under receiver operating characteristics curves (AUCs) and net reclassification improvement (NRI).
85 (4.9%) patients had incident major fractures over 6 years. FRAX® with and without BMD predicted these fractures with AUCs of 0.66 and 0.62, respectively. The AUCs were 0.60, 0.66 and 0.69 for history of low-trauma fracture alone, age and femoral neck (FN) BMD, and the combination of the 3 clinical risk factors, respectively. FRAX® with and without BMD predicted incident radiographic vertebral fracture (n = 65) with AUCs of 0.67 and 0.65, respectively. NRI analysis showed a significant improvement in risk assignment when BMD was added to FRAX®.
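As a reminder of what the NRI measures, the sketch below uses the usual categorical definition: the net proportion of subjects with events moved to a higher risk category plus the net proportion of subjects without events moved to a lower one. The abstract does not state which NRI variant was used, so this definition is an assumption, and the counts are hypothetical rather than OPUS data.

```python
def net_reclassification_improvement(events_up, events_down, n_events,
                                     nonevents_up, nonevents_down, n_nonevents):
    """Categorical NRI = (net upward moves among events) + (net downward moves among non-events)."""
    event_part = (events_up - events_down) / n_events
    nonevent_part = (nonevents_down - nonevents_up) / n_nonevents
    return event_part + nonevent_part


# Hypothetical reclassification counts after adding a predictor to a risk model.
nri = net_reclassification_improvement(
    events_up=20, events_down=10, n_events=100,
    nonevents_up=50, nonevents_down=90, n_nonevents=1000,
)

print(f"NRI = {nri:.2f}")
```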
This study shows that FRAX® with BMD, and to a lesser extent without FN BMD, predicts major osteoporotic and vertebral fractures in the general population.
PMCID: PMC3875449  PMID: 24386199
17.  Age-related frailty and its association with biological markers of ageing 
BMC Medicine  2015;13:161.
The relationship between age-related frailty and the underlying processes that drive changes in health is currently unclear. Considered individually, most blood biomarkers show only weak relationships with frailty and ageing. Here, we examined whether a biomarker-based frailty index (FI-B) allowed examination of their collective effect in predicting mortality compared with individual biomarkers, a clinical deficits frailty index (FI-CD), and the Fried frailty phenotype.
We analyzed baseline data and up to 7-year mortality in the Newcastle 85+ Study (n = 845; mean age 85.5). The FI-B combined 40 biomarkers of cellular ageing, inflammation, haematology, and immunosenescence. The Kaplan-Meier estimator was used to stratify participants into FI-B risk strata. Stability of the risk estimates for the FI-B was assessed using iterative, random subsampling of the 40 FI-B items. Predictive validity was tested using Cox proportional hazards analysis and discriminative ability by the area under receiver operating characteristic (ROC) curves.
The mean FI-B was 0.35 (SD, 0.08), higher than the mean FI-CD (0.22; SD, 0.12); no participant had an FI-B score <0.12. Higher values of each FI were associated with higher mortality risk. In a sex-adjusted model, each one percent increase in the FI-B increased the hazard ratio by 5.4 % (HR, 1.05; CI, 1.04–1.06). The FI-B was more powerful for mortality prediction than any individual biomarker and was robust to biomarker substitution. The ROC analysis showed moderate discriminative ability for 7-year mortality (AUC for FI-CD = 0.71 and AUC for FI-B = 0.66). No individual biomarker’s AUC exceeded 0.61. The AUC for combined FI-CD/FI-B was 0.75.
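A frailty index is conventionally the proportion of measured items on which a person shows a deficit, and under a Cox model a per-point hazard ratio compounds multiplicatively. The sketch below assumes that convention, with each biomarker reduced to a 0/1 deficit flag, and uses the 5.4% per-point increase quoted above. The individual's deficit pattern is hypothetical.

```python
def frailty_index(deficit_flags):
    """Proportion of measured items flagged as deficits (1 = deficit, 0 = no deficit)."""
    return sum(deficit_flags) / len(deficit_flags)


def hazard_multiplier(hr_per_point, points):
    """Hazard multiplier for an increase of `points`, assuming a log-linear (Cox) effect."""
    return hr_per_point ** points


# Hypothetical participant with deficits on 14 of 40 biomarker items.
flags = [1] * 14 + [0] * 26
fi_b = frailty_index(flags)

# A 10-percentage-point higher index at 5.4% per point.
ten_point_multiplier = hazard_multiplier(1.054, 10)

print(f"FI-B = {fi_b:.2f}, hazard multiplier for +10 points = {ten_point_multiplier:.2f}")
```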
Many biological processes are implicated in ageing. The systemic effects of these processes can be elucidated using the frailty index approach, which showed here that subclinical deficits increased the risk of death. In the future, blood biomarkers may indicate the nature of the underlying causal deficits leading to age-related frailty, thereby helping to expose targets for early preventative interventions.
Electronic supplementary material
The online version of this article (doi:10.1186/s12916-015-0400-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4499935  PMID: 26166298
Ageing; Biomarkers; Cellular ageing; Deficit accumulation; Frailty; Frailty index; Frailty phenotype; Immunosenescence; Inflammation; Newcastle 85+ study
18.  Prediction of protein-protein interaction sites using an ensemble method 
BMC Bioinformatics  2009;10:426.
Prediction of protein-protein interaction sites is one of the most challenging and intriguing problems in the field of computational biology. Although much progress has been achieved by using various machine learning methods and a variety of available features, the problem is still far from being solved.
In this paper, an ensemble method is proposed, which combines bootstrap resampling technique, SVM-based fusion classifiers and weighted voting strategy, to overcome the imbalanced problem and effectively utilize a wide variety of features. We evaluate the ensemble classifier using a dataset extracted from 99 polypeptide chains with 10-fold cross validation, and get an AUC score of 0.86, with a sensitivity of 0.76 and a specificity of 0.78, which are better than those of the existing methods. To improve the usefulness of the proposed method, two special ensemble classifiers are designed to handle the cases of missing homologues and structural information respectively, and the performance is still encouraging. The robustness of the ensemble method is also evaluated by effectively classifying interaction sites from surface residues as well as from all residues in proteins. Moreover, we demonstrate the applicability of the proposed method to identify interaction sites from the non-structural proteins (NS) of the influenza A virus, which may be utilized as potential drug target sites.
Our experimental results show that the ensemble classifiers are quite effective in predicting protein interaction sites. The Sub-EnClassifiers with resampling technique can alleviate the imbalanced problem and the combination of Sub-EnClassifiers with a wide variety of feature groups can significantly improve prediction performance.
PMCID: PMC2808167  PMID: 20015386
NeuroImage  2013;87:220-241.
Many neuroimaging applications deal with imbalanced imaging data. For example, in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, the mild cognitive impairment (MCI) cases eligible for the study are nearly two times the Alzheimer’s disease (AD) patients for structural magnetic resonance imaging (MRI) modality and six times the control cases for proteomics modality. Constructing an accurate classifier from imbalanced data is a challenging task. Traditional classifiers that aim to maximize the overall prediction accuracy tend to classify all data into the majority class. In this paper, we study an ensemble system of feature selection and data sampling for the class imbalance problem. We systematically analyze various sampling techniques by examining the efficacy of different rates and types of undersampling, oversampling, and a combination of over- and undersampling approaches. We thoroughly examine six widely used feature selection algorithms to identify significant biomarkers and thereby reduce the complexity of the data. The efficacy of the ensemble techniques is evaluated using two different classifiers, Random Forest and Support Vector Machines, based on classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity measures. Our extensive experimental results show that for various problem settings in ADNI, (1) a balanced training set obtained with K-Medoids technique based undersampling gives the best overall performance among different data sampling techniques and no sampling approach; and (2) sparse logistic regression with stability selection achieves competitive performance among various feature selection algorithms. Comprehensive experiments with various settings show that our proposed ensemble model of multiple undersampled datasets yields stable and promising results.
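The idea of an ensemble built from multiple undersampled datasets can be sketched as below. Note the substitution: the study found K-Medoids based undersampling to work best, whereas this sketch uses plain random undersampling for brevity. The sample IDs, the number of subsets and the per-subset predictions are hypothetical, and training of the individual classifiers is omitted.

```python
import random


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Draw n_subsets random undersamples of the majority class, each paired with all minority cases."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        sampled_majority = rng.sample(majority, len(minority))
        subsets.append((sampled_majority, list(minority)))
    return subsets


def majority_vote(predictions):
    """Combine 0/1 predictions from the per-subset classifiers; ties go to class 0."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Hypothetical sample IDs with a 2:1 imbalance (12 majority vs 6 minority cases).
mci_ids = list(range(100, 112))
ad_ids = list(range(200, 206))

subsets = balanced_subsets(mci_ids, ad_ids, n_subsets=5)

# One classifier would be trained per subset; here its predictions for a new case are made up.
ensemble_call = majority_vote([1, 0, 1, 1, 0])

print(len(subsets), "balanced subsets; ensemble call =", ensemble_call)
```

Each classifier sees a balanced training set, while the ensemble as a whole still uses most of the majority-class cases.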
PMCID: PMC3946903  PMID: 24176869
Alzheimer’s disease; classification; imbalanced data; undersampling; oversampling; feature selection
20.  Computerised analysis of osteoporotic bone patterns using texture parameters characterising bone architecture 
The British Journal of Radiology  2013;86(1021):20101115.
To evaluate the geometric change of osteoporotic bone trabecular patterns using root mean square (RMS) values, first moment power spectrum (FMP) values and fractal dimension values. With the use of these methods, we attempted computerised analysis of osteoporotic bone patterns using texture parameters characterising bone architecture and computer-aided diagnosis of osteoporosis.
32 patient cases from Korea University Guro Hospital were analysed. Patient ages ranged from 51 to 89 years, with a mean age of 65 years. Receiver operating characteristic curve analysis was performed with determination of the area under the curve (AUC).
The bone mineral density (BMD) measurement (AUC=0.78) was a better indicator of bone quantity than the RMS, FMP and fractal dimension values (AUC=0.72) for diagnosis; therefore the combination of RMS, FMP and fractal dimension values was a better indicator of bone quality.
Combining the BMD measurement with the RMS, FMP and fractal dimension values (AUC=0.85) produced better results than using the two parameter sets separately for a diagnosis of osteoporosis.
Advances in knowledge
For more effective application, additional study on more cases and data will be required.
PMCID: PMC3615401  PMID: 23239687
21.  Variable Importance and Prediction Methods for Longitudinal Problems with Missing Variables 
PLoS ONE  2015;10(3):e0120031.
We present prediction and variable importance (VIM) methods for longitudinal data sets containing continuous and binary exposures subject to missingness. We demonstrate the use of these methods for prognosis of medical outcomes of severe trauma patients, a field in which current medical practice involves rules of thumb and scoring methods that only use a few variables and ignore the dynamic and high-dimensional nature of trauma recovery. Well-principled prediction and VIM methods can provide a tool to make care decisions informed by the high-dimensional patient’s physiological and clinical history. Our VIM parameters are analogous to slope coefficients in adjusted regressions, but are not dependent on a specific statistical model, nor require a certain functional form of the prediction regression to be estimated. In addition, they can be causally interpreted under causal and statistical assumptions as the expected outcome under time-specific clinical interventions, related to changes in the mean of the outcome if each individual experiences a specified change in the variable (keeping other variables in the model fixed). Better yet, the targeted MLE used is doubly robust and locally efficient. Because the proposed VIM does not constrain the prediction model fit, we use a very flexible ensemble learner (the SuperLearner), which returns a linear combination of a list of user-given algorithms. Not only is such a prediction algorithm intuitively appealing, it has theoretical justification as being asymptotically equivalent to the oracle selector. The results of the analysis show effects whose size and significance would not have been found using a parametric approach (such as stepwise regression or LASSO). In addition, the procedure is even more compelling as the predictor on which it is based showed significant improvements in cross-validated fit, for instance area under the curve (AUC) for a receiver-operator curve (ROC).
Thus, given that 1) our VIM applies to any model fitting procedure, 2) under assumptions has meaningful clinical (causal) interpretations and 3) has asymptotic (influence-curve) based robust inference, it provides a compelling alternative to existing methods for estimating variable importance in high-dimensional clinical (or other) data.
PMCID: PMC4376910  PMID: 25815719
22.  Underestimation of the Calculated Area Under the Concentration-Time Curve Based on Serum Creatinine for Vancomycin Dosing 
Infection & Chemotherapy  2014;46(1):21-29.
The ratio of the steady-state 24-hour area under the concentration-time curve (ssAUC24) to the MIC (AUC24/MIC) for vancomycin has been recommended as the preferred pharmacodynamic index. The aim of this study was to assess whether the calculated AUC24 (cAUC24) using the creatinine clearance (CLcr) differs from the ssAUC24 based on the individual pharmacokinetic data estimated by a commercial software.
Materials and Methods
The cAUC24 was compared with the ssAUC24 with respect to age, body mass index, and trough concentration of vancomycin and the results were expressed as median and interquartile ranges. A correlation between the cAUC24 and ssAUC24 and the trough concentration of vancomycin was evaluated. The probability of reaching an AUC24/MIC of 400 or higher was compared between the cAUC24 and ssAUC24 for different MICs of vancomycin and different daily doses by simulation in a subgroup with a trough concentration of 10 mg/L and higher.
The cAUC24 was significantly lower than the ssAUC24 (392.38 vs. 418.32 mg·hr/L, P < 0.0001) and correlated more weakly with the trough concentration (r = 0.649 vs. r = 0.964). Assuming a MIC of 1.0 mg/L, the probability of reaching the value of 400 or higher was 77.5% for the cAUC24/MIC and 100% for the ssAUC24/MIC in patients with a trough concentration of 10 mg/L and higher. If the MIC increased to 2.0 mg/L, the probability was 57.7% for the cAUC24/MIC and 71.8% for the ssAUC24/MIC at a daily vancomycin dose of 4,000 mg.
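The target check itself is a simple ratio, sketched below using the two median AUC24 values quoted above. This is only an illustration of the AUC24/MIC ≥ 400 criterion at the cohort medians; the probabilities reported in the study come from simulation across patients, doses and MICs, which this sketch does not reproduce. The function names are hypothetical.

```python
TARGET_RATIO = 400.0


def auc24_mic_ratio(auc24_mg_h_per_l, mic_mg_per_l):
    """AUC24/MIC pharmacodynamic index."""
    return auc24_mg_h_per_l / mic_mg_per_l


def meets_target(auc24, mic, target=TARGET_RATIO):
    """True when AUC24/MIC reaches the target of 400 or higher."""
    return auc24_mic_ratio(auc24, mic) >= target


# Median values reported in the text (mg·hr/L).
median_auc24 = {"cAUC24": 392.38, "ssAUC24": 418.32}

results = {
    (label, mic): meets_target(value, mic)
    for label, value in median_auc24.items()
    for mic in (1.0, 2.0)
}

for (label, mic), ok in results.items():
    print(f"{label} at MIC {mic} mg/L: target met = {ok}")
```

At a MIC of 1.0 mg/L the median ssAUC24 meets the target while the median cAUC24 falls just short, which shows how an underestimated AUC24 can change the dosing conclusion.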
The cAUC24 using the calculated CLcr is usually underestimated compared with the ssAUC24 based on individual pharmacokinetic data. Therefore, to obtain a more accurate AUC24, therapeutic monitoring of vancomycin rather than a simple calculation based on the CLcr should be performed, and a more accurate biomarker for renal function is needed.
PMCID: PMC3970305  PMID: 24693466
Vancomycin; Pharmacodynamics; Area under curve; Drug monitoring, Therapeutic
23.  A Wild Bootstrap approach for the selection of biomarkers in early diagnostic trials 
In early diagnostic trials, particularly in biomarker studies, the aim is often to select diagnostic tests among several methods. In case of metric, discrete, or even ordered categorical data, the area under the receiver operating characteristic (ROC) curve (denoted by AUC) is an appropriate overall accuracy measure for the selection, because the AUC is independent of cut-off points.
For selection of biomarkers, the individual AUCs are compared with a pre-defined threshold. To keep the overall coverage probability or the multiple type-I error rate, simultaneous confidence intervals and multiple contrast tests are considered. We propose a purely nonparametric approach for the estimation of the AUCs with the corresponding confidence intervals and statistical tests. This approach uses the correlation among the statistics to account for multiplicity. For small sample sizes, a Wild-Bootstrap approach is presented. It is shown that the corresponding intervals and tests are asymptotically exact.
Extensive simulation studies indicate that the derived Wild-Bootstrap approach keeps and exploits the nominal type-I error at best, even for high accuracies and in case of small samples sizes. The strength of the correlation, the type of covariance structure, a skewed distribution, and also a moderate imbalanced case-control ratio do not have any impact on the behavior of the approach. A real data set illustrates the application of the proposed methods.
We recommend the new Wild Bootstrap approach for the selection of biomarkers in early diagnostic trials, especially for high accuracies and small samples sizes.
Electronic supplementary material
The online version of this article (doi:10.1186/s12874-015-0025-y) contains supplementary material, which is available to authorized users.
PMCID: PMC4426186  PMID: 25925052
AUC; Diagnostic study; Resampling; Simultaneous intervals; Wild bootstrap
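As a rough illustration of the nonparametric AUC that the wild bootstrap abstract above builds on, the sketch below computes the Mann-Whitney estimate, which needs no cut-off and handles ties in discrete or ordered categorical data. It does not implement the wild bootstrap, the simultaneous confidence intervals, or the multiple contrast tests; the marker names, values, and the 0.7 threshold are hypothetical, and comparing point estimates with the threshold is a simplification of the interval-based selection the abstract describes.

```python
def empirical_auc(diseased, healthy):
    """Mann-Whitney AUC estimate: P(D > H) + 0.5 * P(D == H)."""
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    total = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                total += 1.0
            elif d == h:
                total += 0.5  # ties count as half a concordant pair
    return total / (len(diseased) * len(healthy))


# Hypothetical ordinal scores: (diseased group, healthy group) per marker.
markers = {
    "marker_a": ([3, 4, 5, 5], [1, 2, 3, 4]),
    "marker_b": ([2, 3, 3, 4], [2, 3, 3, 4]),
}
threshold = 0.7  # hypothetical pre-defined AUC threshold

estimates = {name: empirical_auc(d, h) for name, (d, h) in markers.items()}
selected = [name for name, auc in estimates.items() if auc > threshold]

if __name__ == "__main__":
    print(estimates)
    print(selected)
```

With these toy values, marker_a separates the groups well and marker_b not at all, so only marker_a passes the threshold.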
24.  Biomarker selection for medical diagnosis using the partial area under the ROC curve 
BMC Research Notes  2014;7:25.
A biomarker is usually used as a diagnostic or assessment tool in medical research. Finding an ideal biomarker is not easy and combining multiple biomarkers provides a promising alternative. Moreover, some biomarkers based on the optimal linear combination do not have enough discriminatory power. As a result, the aim of this study was to find the significant biomarkers based on the optimal linear combination maximizing the pAUC for assessment of the biomarkers.
Under the binormality assumption we obtain the optimal linear combination of biomarkers maximizing the partial area under the receiver operating characteristic curve (pAUC). Related statistical tests are developed for assessment of a biomarker set and of an individual biomarker. Stepwise biomarker selections are introduced to identify those biomarkers of statistical significance.
Results from a simulation study and three real examples (Duchenne muscular dystrophy, heart disease, and breast tissue) show that our methods are best suited to biomarker selection for data sets with a moderate number of biomarkers.
Our proposed biomarker selection approaches can be used to find the significant biomarkers based on hypothesis testing.
PMCID: PMC3923449  PMID: 24410929
Discriminatory power; Hypothesis testing; Optimal linear combination; Partial area under ROC curve; Stepwise biomarker selection
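To make the pAUC in the abstract above concrete, here is a minimal sketch for a single score under the binormality assumption. It uses the standard binormal ROC curve, ROC(t) = Φ(a + b·Φ⁻¹(t)) with a = (μ1 − μ0)/σ1 and b = σ0/σ1, and integrates it numerically over false positive rates from 0 to a chosen limit. The function names, the midpoint rule, and the step count are assumptions made for illustration; the paper's optimal linear combination, tests, and stepwise selection are not reproduced.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def binormal_roc(fpr, a, b):
    """Sensitivity at a given false positive rate under the binormal model."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpr))


def binormal_pauc(a, b, fpr_max, steps=20000):
    """Partial AUC over FPR in (0, fpr_max], computed by the midpoint rule."""
    if not 0.0 < fpr_max <= 1.0:
        raise ValueError("fpr_max must lie in (0, 1]")
    width = fpr_max / steps
    # Midpoints avoid the endpoints 0 and 1, where inv_cdf is undefined.
    heights = (binormal_roc((i + 0.5) * width, a, b) for i in range(steps))
    return sum(heights) * width


if __name__ == "__main__":
    # Hypothetical marker: diseased ~ N(1, 1), healthy ~ N(0, 1), so a = 1, b = 1.
    print(binormal_pauc(a=1.0, b=1.0, fpr_max=0.2))
    print(binormal_pauc(a=1.0, b=1.0, fpr_max=1.0))  # full AUC
```

Setting fpr_max to 1 recovers the full AUC, which under this model equals Φ(a/√(1 + b²)); an uninformative marker (a = 0, b = 1) gives a pAUC of fpr_max²/2.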
25.  AMS 4.0: consensus prediction of post-translational modifications in protein sequences 
Amino Acids  2012;43(2):573-582.
We present here the 2011 update of the AutoMotif Service (AMS 4.0), which predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTM) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physicochemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising a different objective during training (for example the recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of machine learning algorithms and the data fusion of different training object representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7%, as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most difficult sequence motif types it is able to improve the prediction performance by almost 32% when compared with previously used single machine learning methods. In summary, the brainstorming consensus meta-learning methodology boosts the AUC score to around 89% on average over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with a comparison to previously published methods and state-of-the-art software tools.
The source code and precompiled binaries of brainstorming tool are available at under Apache 2.0 licensing.
Electronic supplementary material
The online version of this article (doi:10.1007/s00726-012-1290-2) contains supplementary material, which is available to authorized users.
PMCID: PMC3397139  PMID: 22555647
Post-translational modifications; AMS-4; High quality indices; MLP; Consensus
