
Results 1-7 (7)

1.  Fuzzy association rule mining and classification for the prediction of malaria in South Korea 
Malaria is the world’s most prevalent vector-borne disease. Accurate prediction of malaria outbreaks may lead to public health interventions that mitigate disease morbidity and mortality.
We describe an application of a method for creating prediction models utilizing Fuzzy Association Rule Mining (FARM) to extract relationships between epidemiological, meteorological, climatic, and socio-economic data from Korea. These relationships are in the form of rules, from which the best set of rules is automatically chosen and forms a classifier. Two classifiers have been built and their results fused to become a malaria prediction model. Future malaria cases are predicted as LOW, MEDIUM or HIGH, where these classes are defined as a total of 0–2, 3–16, and above 17 cases, respectively, for a region in South Korea during a two-week period. Based on user recommendations, HIGH is considered an outbreak.
Model accuracy is described by Positive Predictive Value (PPV), Sensitivity, and F-score for each class, computed on test data not previously used to develop the model. For predictions made 7–8 weeks in advance, model PPV and Sensitivity are 0.842 and 0.681, respectively, for the HIGH classes. The F0.5 and F3 scores (which combine PPV and Sensitivity) are 0.804 and 0.694, respectively, for the HIGH classes. The overall FARM results (as measured by F-scores) are significantly better than those obtained by Decision Tree, Random Forest, Support Vector Machine, and Holt-Winters methods for the HIGH class. For the MEDIUM class, Random Forest and FARM obtain comparable results, with FARM being better at F0.5, and Random Forest obtaining a higher F3.
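The F-scores quoted above can be reproduced from the reported PPV and Sensitivity. The following is a minimal sketch, assuming the standard F-beta definition (the abstract does not spell out the formula); the helper name `f_beta` is hypothetical.

```python
def f_beta(ppv: float, sensitivity: float, beta: float) -> float:
    """Standard F-beta score: a weighted harmonic mean of PPV and Sensitivity.

    beta < 1 gives PPV more weight; beta > 1 gives Sensitivity more weight.
    """
    b2 = beta * beta
    denom = b2 * ppv + sensitivity
    if denom == 0:
        return 0.0
    return (1 + b2) * ppv * sensitivity / denom


# Values reported in the abstract for the HIGH class, 7-8 weeks ahead.
ppv, sens = 0.842, 0.681
print(round(f_beta(ppv, sens, 0.5), 3))  # 0.804
print(round(f_beta(ppv, sens, 3.0), 3))  # 0.694
```

Both printed values match the F0.5 and F3 scores given in the abstract, which suggests the authors used this standard definition.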
A previously described method for creating disease prediction models has been modified and extended to build models for predicting malaria. In addition, some new input variables were used, including indicators of intervention measures. The South Korea malaria prediction models predict LOW, MEDIUM or HIGH cases 7–8 weeks in the future. This paper demonstrates that our data driven approach can be used for the prediction of different diseases.
PMCID: PMC4472166  PMID: 26084541
Malaria; Prediction; Association rule mining; Fuzzy logic; Classification; Environmental data; Socio-economic data; Epidemiological data
2.  A Meta-analysis of Point-of-care Laboratory Tests in the Diagnosis of Novel 2009 Swine-lineage Pandemic Influenza A(H1N1) 
This paper reviews fourteen published studies describing performance characteristics, including sensitivity and specificity, of commercially-available rapid, point-of-care (POC) influenza tests in patients affected by an outbreak of a novel swine-related influenza A (H1N1) that was declared a pandemic in 2009. Although these POC tests were not intended to be specific for this pandemic influenza strain, the non-specialized skills required and the timeliness of results make these POC tests potentially valuable for clinical and public health use. Pooled sensitivity and specificity for the POC tests studied were 68% and 81%, respectively, but published values were not homogeneous, with sensitivities and specificities ranging from 10–88% and 51–100%, respectively. Pooled positive and negative likelihood ratios were 5.94 and 0.42, respectively. These results support current recommendations for use of rapid POC tests when H1N1 is suspected, recognizing that positive results are more reliable than negative results in determining infection, especially when disease prevalence is high.
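The point about positive versus negative results can be illustrated by converting the pooled likelihood ratios into post-test probabilities. This is a sketch using the usual odds form of Bayes' rule; the prevalence values are hypothetical and chosen only for illustration, and the function name is an assumption.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Post-test probability from a pre-test probability and a likelihood ratio."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled likelihood ratios reported in the abstract.
LR_POS, LR_NEG = 5.94, 0.42

# Hypothetical prevalence values (not from the paper).
for prevalence in (0.05, 0.30, 0.60):
    after_positive = post_test_probability(prevalence, LR_POS)
    after_negative = post_test_probability(prevalence, LR_NEG)
    print(f"prevalence={prevalence:.2f}  "
          f"P(infected | positive)={after_positive:.3f}  "
          f"P(infected | negative)={after_negative:.3f}")
```

At the higher assumed prevalence values, the probability of infection after a negative result stays well above zero, which is why a negative rapid test is weaker evidence than a positive one.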
PMCID: PMC3058416  PMID: 21396538
Influenza A (H1N1); pandemic; swine-origin; rapid point-of-care test
3.  Preliminary Development of a Fiber Optic Sensor for Measuring Bilirubin 
Preliminary development of a fiber optic bilirubin sensor is described, where an unclad sensing portion is used to provide evanescent wave interaction of the transmitted light with the chemical environment. By using a wavelength corresponding to a bilirubin absorption peak, the Beer–Lambert Law can be used to relate the concentration of bilirubin surrounding the sensing portion to the amount of absorbed light. Initial testing in vitro suggests that the sensor response is consistent with the results of bulk absorption measurements as well as the Beer–Lambert Law. In addition, it is found that conjugated and unconjugated bilirubin have different peak absorption wavelengths, so that two optical frequencies may potentially be used to measure both types of bilirubin. Future development of this device could provide a means of real-time, point-of-care monitoring of intravenous bilirubin in critical care neonates with hyperbilirubinemia.
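The two-wavelength idea can be sketched with the Beer–Lambert Law, A = ε·l·c, applied to a mixture of two absorbers. Everything numeric below is hypothetical: the extinction coefficients, the effective path length, and the function names are assumptions for illustration, and an evanescent-wave sensor's effective path length is treated here as a single constant.

```python
def absorbance(epsilon: float, path_length: float, concentration: float) -> float:
    """Beer-Lambert Law for a single absorber: A = epsilon * l * c."""
    return epsilon * path_length * concentration


def solve_two_species(a1, a2, eps, path_length):
    """Recover two concentrations from absorbances at two wavelengths.

    eps = ((e_conj_w1, e_unconj_w1), (e_conj_w2, e_unconj_w2))
    Solves the 2x2 linear system by Cramer's rule.
    """
    (e11, e12), (e21, e22) = eps
    det = e11 * e22 - e12 * e21
    if det == 0:
        raise ValueError("wavelengths do not distinguish the two species")
    c_conj = (a1 * e22 - a2 * e12) / (det * path_length)
    c_unconj = (e11 * a2 - e21 * a1) / (det * path_length)
    return c_conj, c_unconj


# Hypothetical coefficients and concentrations, in arbitrary consistent units.
eps = ((2.0, 1.0), (0.5, 3.0))
length = 1.0
true_conj, true_unconj = 0.2, 0.4

# Forward model: total absorbance at each wavelength is the sum over species.
a1 = absorbance(eps[0][0], length, true_conj) + absorbance(eps[0][1], length, true_unconj)
a2 = absorbance(eps[1][0], length, true_conj) + absorbance(eps[1][1], length, true_unconj)

print(solve_two_species(a1, a2, eps, length))
```

The inversion only works when the two species have different absorption profiles at the chosen wavelengths, which is the property the abstract reports for conjugated and unconjugated bilirubin.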
PMCID: PMC4085104  PMID: 25057239
fiber optic sensor; bilirubin; point-of-care technology
4.  Prediction of High Incidence of Dengue in the Philippines 
Accurate prediction of dengue incidence levels weeks in advance of an outbreak may reduce the morbidity and mortality associated with this neglected disease. Therefore, models were developed to predict high and low dengue incidence in order to provide timely forewarnings in the Philippines.
Model inputs were chosen based on studies indicating variables that may impact dengue incidence. The method first uses Fuzzy Association Rule Mining techniques to extract association rules from these historical epidemiological, environmental, and socio-economic data, as well as climate data indicating future weather patterns. Selection criteria were used to choose a subset of these rules for a classifier, thereby generating a Prediction Model. The models predicted high or low incidence of dengue in a Philippines province four weeks in advance. The threshold between high and low was determined relative to historical incidence data.
Principal Findings
Model accuracy is described by Positive Predictive Value (PPV), Negative Predictive Value (NPV), Sensitivity, and Specificity computed on test data not previously used to develop the model. Selecting a model using the F0.5 measure, which gives PPV more importance than Sensitivity, gave these results: PPV = 0.780, NPV = 0.938, Sensitivity = 0.547, Specificity = 0.978. Using the F3 measure, which gives Sensitivity more importance than PPV, the selected model had PPV = 0.778, NPV = 0.948, Sensitivity = 0.627, Specificity = 0.974. The decision as to which model has greater utility depends on how the predictions will be used in a particular situation.
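For reference, the four accuracy measures named above are all derived from a 2x2 confusion matrix. The sketch below uses made-up counts that are not taken from the paper; the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """PPV, NPV, Sensitivity and Specificity from confusion-matrix counts.

    tp/fp: predicted HIGH and actually HIGH / actually LOW
    tn/fn: predicted LOW and actually LOW / actually HIGH
    """
    return {
        "PPV": tp / (tp + fp),          # of predicted HIGH, fraction truly HIGH
        "NPV": tn / (tn + fn),          # of predicted LOW, fraction truly LOW
        "Sensitivity": tp / (tp + fn),  # of truly HIGH, fraction predicted HIGH
        "Specificity": tn / (tn + fp),  # of truly LOW, fraction predicted LOW
    }


# Illustrative counts only (not from the paper).
for name, value in diagnostic_metrics(tp=40, fp=10, tn=180, fn=20).items():
    print(f"{name}: {value:.3f}")
```

A model tuned toward F0.5 would tend to trade some Sensitivity for fewer false positives, while one tuned toward F3 would accept more false positives to miss fewer HIGH periods.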
This method builds prediction models for future dengue incidence in the Philippines and is capable of being modified for use in different situations; for diseases other than dengue; and for regions beyond the Philippines. The Philippines dengue prediction models predicted high or low incidence of dengue four weeks in advance of an outbreak with high accuracy, as measured by PPV, NPV, Sensitivity, and Specificity.
Author Summary
A largely automated methodology is described for creating models that use past and recent data to predict dengue incidence levels several weeks in advance for a specific time period and a geographic region that can be sub-national. The input data include historical and recent dengue incidence, socioeconomic factors, and remotely sensed variables related to weather, climate, and the environment. Among the climate variables are those known to indicate future weather patterns that may or may not be seasonal. The final prediction models adhere to these principles: 1) the data used must be available at the time the prediction is made (avoiding a pitfall of studies that use recent data that, in actual practice, would not be available until after the date the prediction was made); and 2) the models are tested on data not used in their development (thereby avoiding overly optimistic measures of accuracy of the prediction). Local public health preferences for low numbers of false positives and negatives are taken into account. These models appear to be robust even when applied to nearby geographic regions that were not used in model development. The method may be applied to other vector-borne and environmentally affected diseases.
PMCID: PMC3983113  PMID: 24722434
5.  A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data 
Dengue is the most common arboviral disease of humans, with more than one third of the world’s population at risk. Accurate prediction of dengue outbreaks may lead to public health interventions that mitigate the effect of the disease. Predicting infectious disease outbreaks is a challenging task; truly predictive methods are still in their infancy.
We describe a novel prediction method utilizing Fuzzy Association Rule Mining to extract relationships between clinical, meteorological, climatic, and socio-political data from Peru. These relationships are in the form of rules. The best set of rules is automatically chosen and forms a classifier. That classifier is then used to predict future dengue incidence as either HIGH (outbreak) or LOW (no outbreak), where these values are defined as being above and below the mean previous dengue incidence plus two standard deviations, respectively.
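The HIGH/LOW rule described above can be sketched as follows. The incidence values are hypothetical, and two details are assumptions because the abstract does not state them: the sample standard deviation is used, and a value exactly equal to the threshold is treated as LOW.

```python
from statistics import mean, stdev


def outbreak_threshold(history):
    """Mean of previous incidence plus two (sample) standard deviations."""
    return mean(history) + 2 * stdev(history)


def classify(incidence, history):
    """Label an incidence value HIGH (outbreak) or LOW (no outbreak)."""
    return "HIGH" if incidence > outbreak_threshold(history) else "LOW"


# Hypothetical weekly incidence history (not data from the paper).
history = [1, 2, 3, 4, 5]
print(round(outbreak_threshold(history), 3))
print(classify(7, history), classify(6, history))
```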
Our automated method built three different fuzzy association rule models. Using the first two weekly models, we predicted dengue incidence three and four weeks in advance, respectively. The third prediction encompassed a four-week period, specifically four to seven weeks from time of prediction. Using previously unused test data for the period 4–7 weeks from time of prediction yielded a positive predictive value of 0.686, a negative predictive value of 0.976, a sensitivity of 0.615, and a specificity of 0.982.
We have developed a novel approach for dengue outbreak prediction. The method is general, could be extended for use in any geographical region, and has the potential to be extended to other environmentally influenced infections. The variables used in our method are widely available for most, if not all countries, enhancing the generalizability of our method.
PMCID: PMC3534444  PMID: 23126401
Dengue fever; Prediction; Association rule mining; Fuzzy logic; Predictor variables
6.  Developing open source, self-contained disease surveillance software applications for use in resource-limited settings 
Emerging public health threats often originate in resource-limited countries. In recognition of this fact, the World Health Organization issued revised International Health Regulations in 2005, which call for significantly increased reporting and response capabilities for all signatory nations. Electronic biosurveillance systems can improve the timeliness of public health data collection, aid in the early detection of and response to disease outbreaks, and enhance situational awareness.
As components of its Suite for Automated Global bioSurveillance (SAGES) program, The Johns Hopkins University Applied Physics Laboratory developed two open-source, electronic biosurveillance systems for use in resource-limited settings. OpenESSENCE provides web-based data entry, analysis, and reporting. ESSENCE Desktop Edition provides similar capabilities for settings without internet access. Both systems may be configured to collect data using locally available cell phone technologies.
ESSENCE Desktop Edition has been deployed for two years in the Republic of the Philippines. Local health clinics have rapidly adopted the new technology to provide daily reporting, thus eliminating the two-to-three week data lag of the previous paper-based system.
OpenESSENCE and ESSENCE Desktop Edition are two open-source software products with the capability of significantly improving disease surveillance in a wide range of resource-limited settings. These products, and other emerging surveillance technologies, can assist resource-limited countries in complying with the revised International Health Regulations.
PMCID: PMC3458896  PMID: 22950686
Electronic biosurveillance; Software development; Public health; Disease outbreak; Resource-limited settings
7.  Pediatric patient asthma-related emergency department visits and admissions in Washington, DC, from 2001–2004, and associations with air quality, socio-economic status and age group 
The District of Columbia (DC) Department of Health, under a grant from the US Centers for Disease Control and Prevention, established an Environmental Public Health Tracking Program. As part of this program, the goals of this contextual pilot study are to quantify short-term associations of daily pediatric emergency department (ED) visits and admissions for asthma exacerbations with ozone and particulate concentrations, and broader associations with socio-economic status and age group.
Data included daily counts of de-identified asthma-related pediatric ED visits for DC residents and daily ozone and particulate concentrations during 2001–2004. Daily temperature, mold, and pollen measurements were also obtained. After a cubic spline was applied to control for long-term seasonal trends in the ED data, a Poisson regression analysis was applied to the time series of daily counts for selected age groups.
Associations between pediatric asthma ED visits and outdoor ozone concentrations were significant and strongest for the 5–12 year-old age group, for which a 0.01-ppm increase in ozone concentration indicated a mean 3.2% increase in daily ED visits and a mean 8.3% increase in daily ED admissions. However, the 1–4 year-old age group had the highest rate of asthma-related ED visits. For 1–17 year-olds, the rates of both asthma-related ED visits and admissions increased logarithmically with the percentage of children living below the poverty threshold, slowing when this percentage exceeded 30%.
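To show how a per-0.01-ppm effect scales to other ozone changes, the sketch below assumes the usual log-linear form of a Poisson regression, in which effects multiply rather than add. That functional form is an assumption here, since the abstract reports only the per-0.01-ppm percentages; the function name and the 0.03-ppm example are hypothetical.

```python
import math


def rate_ratio(delta_ppm: float, increase_per_step: float, step_ppm: float = 0.01) -> float:
    """Expected multiplicative change in daily counts for an ozone change.

    increase_per_step is the fractional increase per step_ppm,
    e.g. 0.032 for a 3.2% increase per 0.01 ppm.
    """
    beta = math.log(1 + increase_per_step) / step_ppm  # log rate ratio per ppm
    return math.exp(beta * delta_ppm)


# Reported effects for the 5-12 year-old group.
visits = rate_ratio(0.03, 0.032)      # hypothetical 0.03 ppm rise
admissions = rate_ratio(0.03, 0.083)
print(f"ED visits: x{visits:.3f}, ED admissions: x{admissions:.3f}")
```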
Significant associations were found between ozone concentrations and asthma-related ED visits, especially for 5–12 year-olds. The result that the most significant ozone associations were not seen in the age group (1–4 years) with the highest rate of asthma-related ED visits may be related to the clinical difficulty in accurately diagnosing asthma among this age group. We observed real increases in relative risk of asthma ED visits for children living in higher poverty zip codes versus other zip codes, as well as similar logarithmic relationships for visits and admissions, which implies ED over-utilization may not be a factor. These results could suggest designs for future epidemiological studies that include more information on individual exposures and other risk factors.
PMCID: PMC1845147  PMID: 17376237
