Challenges exist in the study of social determinants of health (SDH) because of limited comparability of population-based U.S. data on SDH. This limitation is due to differences in disparity or equity measurements, as well as general data quality and availability. We reviewed the current SDH variables collected for HIV, viral hepatitis, sexually transmitted diseases, and tuberculosis at the Centers for Disease Control and Prevention through its population-based surveillance systems and assessed specific system attributes. Results were used to provide recommendations for a core set of SDH variables to collect that are both feasible and useful. We also conducted an environmental literature scan to determine the status of knowledge of SDH as underlying causes of disease and to inform the recommended core set of SDH variables.
Scientists, physicians, policy makers, and others are now considering the total ecology of population health outcomes, which include complex, integrated, and overlapping social structures and economic systems, collectively referred to as social determinants of health (SDH).1 SDH are the economic and social conditions that influence the health of people and communities as a whole.1–3 Research continues to show that personal choices or behaviors are only part of what determines individual health status.1,4,5 Five determinants of population health are generally recognized in the scientific literature: biology and genetics, individual behavior, social environment, physical environment, and health services.
The last three categories are associated with SDH. Genetics and individual behavior affect a person's environment in ways that are unique to that person, but the risk for disease is greater when these factors are combined with inequitable distribution of income, access to health care, and environmental concerns.4,5
SDH have been implicitly understood as underlying causes of disease, yet only in the last decade or so has rigorous research been conducted to better understand SDH variables that play major roles in population health (including minority health).8–10 SDH include social factors that strongly impact morbidity and mortality; discrimination on the basis of race, ethnicity, gender, or sexual orientation; cultural customs, traditions, language, beliefs, and norms; and access to education and health resources.3,11–21 Populations that typically experience lower income levels are more likely to have lower education levels; live in densely populated areas, remote rural areas, or areas with little or no access to healthier food outlets and markets; experience violence and poorer sexual health outcomes; have no or inadequate health insurance; and be employed in positions that are more labor-intensive with fewer opportunities for upward mobility.11,12,14–16,22–27 Having data to address and monitor the prevalence of these factors and their individual contributions to health outcomes is important in understanding disease incidence as well as developing interventions. However, despite this need, there is a lack of appropriate SDH and disease outcome analyses to quantify the contribution of SDH variables to specific outcomes of interest.28 The need for an increased focus on the science of SDH as an approach to achieving health equity has been identified as an area of importance by the World Health Organization (WHO), the Institute of Medicine, the U.S. Department of Health and Human Services, and the Centers for Disease Control and Prevention (CDC).3,7,28,29
A number of models describe the relationships between determinants of health and health status. WHO convened the Commission on Social Determinants of Health (CSDH) in 2005 to support partners in addressing SDH in their public health efforts. The CSDH created a model, released in 2008 with its final report, that describes the overlapping, multidirectional pathways that allow SDH—when combined with social capital, political influences, individual traits, and the health-care system—to affect health outcomes.3
Another popular SDH model developed by Ansari et al. demonstrates the relationship among health-care systems, SDH, behaviors, and health outcomes, and the dynamic relationship between psychological risks and the effects of socioeconomic determinants.1,30 Dahlgren and Whitehead also created a relatively simple model that builds upon the influence of biology, individual behaviors, SDH, and health outcomes.1,31
More recently, the Health Impact Pyramid, developed by Frieden, shows the influences on overall population health, beginning with socioeconomic factors. Each layer builds upon those factors, including changing the context of health decision-making, interventions, and education. Each higher level results in a lesser influence on health outcomes, but a more feasible target for individual-level interventions. It is important to note that in the Health Impact Pyramid, achieving improved population health requires the most attention to influences that are out of the individual's control, such as the underlying social and economic factors.32
Health disparities in human immunodeficiency virus (HIV), viral hepatitis, sexually transmitted diseases (STDs), and tuberculosis (TB) have been documented for racial and ethnic minority groups, sexual and gender minority groups, young people, females, and incarcerated people.33–35 SDH may explain the common co-occurrence of risk factors among these groups and, thus, the co-occurrence of diseases such as HIV, hepatitis, and STDs, and, in some populations, TB and HIV. Current challenges exist in the study of SDH due to limited comparability of population-based U.S. data on SDH because of differences in measurements of disparity and data quality.3,13,16,18,28,36–38 Multiple national health and science agencies are calling for increased surveillance capabilities and increased data reporting to obtain a more complete picture of population disease, to identify the underlying causes of morbidity and mortality, and to reduce the stigma associated with certain diseases.1,3,28
For this project, we identified SDH measures collected in CDC's population-based surveillance systems for HIV, viral hepatitis, STDs, and TB and provided recommendations for the collection of supplemental SDH variables. We also scanned the literature to determine the evidence for consistent associations between SDH and these four diseases and to help inform our recommendations.
We identified SDH variables collected in CDC population-based surveillance systems addressing HIV, viral hepatitis, STDs, and TB as of 2007 and assessed specific system attributes: timeliness, percent completeness as of the reporting year, and availability of published quality standards. CDC's Guidelines for Evaluating Public Health Surveillance Systems were used as the framework for the assessment. The guidelines provide a structure for evaluating systems to ensure that morbidity and mortality are being monitored effectively and efficiently.
The standards chosen for review are among a list of system attributes (including simplicity, flexibility, acceptability, sensitivity, predictive value positive, representativeness, and stability) that should be assessed annually to ensure surveillance system data quality.39 The systems reviewed were the enhanced HIV/AIDS Reporting System (HARS);40 the National Electronic Telecommunications System for Surveillance (NETSS)41 and the National Notifiable Disease Surveillance System (NNDSS)42 for viral hepatitis; NETSS for STDs; and Report of Verified Case of Tuberculosis (RVCT)43/NETSS for TB. (Information collected on the RVCT form is added to NETSS; we reviewed both the case report form and the information reported to NETSS.) These systems collect information from medical records and case reports, which typically do not include SDH information such as income or education.
We inventoried the current case-based systems; the 2007 annual reports for HIV, viral hepatitis, and STDs; and the 2008 annual report for TB to identify currently collected variables. We reviewed data-collection forms, variable proxies, data-collection routes, and data dictionaries. A variable proxy (a variable the system uses that may be similar or equivalent to a particular SDH variable) can be valuable because some desired data (e.g., patient's residence at time of diagnosis) may not be available for collection or analysis owing to confidentiality concerns. In such cases, other available data can serve as adequate substitutes or proxies (e.g., the county of the clinic or laboratory at diagnosis). Choosing proxies may be a subjective process that depends on the needs of the system and users.
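The proxy idea described above can be sketched in a few lines of code. This is a minimal illustration only, not a description of how any CDC system selects proxies: the field names `residence_county` and `facility_county` are hypothetical, and the preference order is an assumption that a real system would set according to its own needs.

```python
from typing import Optional, Tuple

# Hypothetical field names, listed in order of preference:
# the desired variable first, then its proxy.
COUNTY_FIELDS = ("residence_county", "facility_county")


def county_for_analysis(case: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (county, source_field), falling back to the proxy if needed."""
    for field in COUNTY_FIELDS:
        value = case.get(field)
        if value:  # skip missing or empty values
            return value, field
    return None, None


# Made-up example records
cases = [
    {"case_id": 1, "residence_county": "County A", "facility_county": "County B"},
    {"case_id": 2, "residence_county": None, "facility_county": "County B"},
    {"case_id": 3},
]
for case in cases:
    print(case["case_id"], county_for_analysis(case))
```

Returning the source field alongside the value keeps a record of which cases rely on the proxy, so analysts can report how often the substitute was used.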
Beginning with a small set of SDH variables discussed in the 2008 CSDH report and CDC's 2010 SDH report, the authors conducted an environmental scan of the literature to assess the depth of evidence available for SDH variables, as well as to build upon this initial set of SDH variables.1,3 We identified 68 articles from six databases that index medical and scientific literature, including PsycINFO®, PubMed, Ovid, Embase™, CINAHL®, and Google Scholar. Search terms included “U.S.,” “health disparities,” “health inequities,” “social determinants of health,” “social factors/determinants,” “health inequalities,” “minorities,” “disparities/differences,” “health inequities,” “structural factors/determinants,” “environmental factors/determinants,” “HIV/AIDS,” “sexually transmitted diseases,” “chlamydia,” “gonorrhea,” “syphilis,” “viral hepatitis,” and “tuberculosis.” We included articles published from 1990 through 2009 that discussed evidence of health inequities in the U.S. based on SDH for HIV, viral hepatitis, STDs, and TB. This review was not a comprehensive, systematic literature search, and the intent was to estimate the extent of evidence currently available. It was not intended to describe the strengths of associations between SDH variables and disease. Understanding the depth of evidence and data available is an important step in identifying the gaps in SDH information. It is understood that more information is needed on SDH and disease, yet it is unclear how and where those gaps lie.28 The main purpose of the environmental literature scan was to help inform a recommended core set of SDH variables for surveillance.
In addition to the literature scan, we reviewed external databases for potential linkage to surveillance data to obtain SDH variables using the Data Set Directory of Social Determinants of Health at the Local Level prepared by CDC.44 The Directory is a comprehensive, if not exhaustive, listing of datasets that can be used for geographic linkage of disease data and SDH variables.
The findings of this analysis are discussed separately for each surveillance system and are summarized in Figure 1, the Table, and Figure 2. Figure 1 shows the number of variables found in each system, categorized by health determinant. The Table lists SDH variables monitored in each surveillance system, the year the variable was first collected, and the percent completeness of each SDH variable. Figure 2 lists the recommended core set of variables for each surveillance system, which all include a basic set of eight variables consistent across the systems. This recommended list was informed by the environmental scan of the literature, discussion with CDC surveillance contacts, and current constraints in each surveillance system.
Most variables collected by HARS fall into the categories of (1) genetics and biology, and (2) individual behaviors. Possible SDH variables collected include one social environment variable (ethnicity), two physical environment variables (country of birth and residence, including three proxies that could substitute for residence: city, state, and county), and two health services variables (insurance status and date of initial health exam, although these are both incomplete).45,46 Country of birth appears on case report forms, and although foreign-born is not specifically stated as such on the form, these data can be extracted to form the foreign-born variable. The same situation is found for incarceration status, which can be derived from facility of diagnosis.
Our evaluation of the data showed a high level of percent completeness as of 2007, with most collected variables achieving 80% or higher completeness. CDC maintains quality standards for HIV data, including completeness and timeliness of case reporting and completeness and quality of information for individual data elements. HIV variables that are marked incomplete in Figure 2 reflect a quality standard maintained by the HIV surveillance system that states a variable must achieve greater than 85% completeness status before it is considered complete. Outcome and process standards are assessed annually.47,48
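The completeness standard described above can be illustrated with a short calculation. The following is a minimal sketch rather than CDC's actual procedure: the record layout, the field names, and the treatment of `None` and empty strings as missing are assumptions made for illustration. Only the greater-than-85% threshold comes from the text.

```python
def percent_complete(records, variable):
    """Percentage of records with a non-missing value for `variable`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(variable) not in (None, ""))
    return 100.0 * filled / len(records)


def meets_standard(pct, threshold=85.0):
    """A variable counts as complete only if it exceeds the threshold."""
    return pct > threshold


# Made-up records with hypothetical field names
records = [
    {"ethnicity": "Hispanic", "insurance_status": "private"},
    {"ethnicity": "Not Hispanic", "insurance_status": None},
    {"ethnicity": "Not Hispanic", "insurance_status": ""},
    {"ethnicity": "Hispanic", "insurance_status": "public"},
]
for var in ("ethnicity", "insurance_status"):
    pct = percent_complete(records, var)
    print(f"{var}: {pct:.1f}% complete, meets standard: {meets_standard(pct)}")
```

Because the standard is stated as "greater than 85%," the comparison is strict: a variable at exactly 85% would still be marked incomplete under this reading.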
NETSS/NNDSS data for hepatitis were incomplete regarding SDH information. (NETSS is a system of computerized record forms used to transmit NNDSS data from health departments to CDC. Hepatitis data are collected for NNDSS and then submitted to NETSS, which also contains data from both nationally notifiable and non-notifiable diseases.) The majority of collected variables address biology, genetics, and individual behavior. The system collects two social environment variables (ethnicity and medically related occupation), one physical environment variable (two proxies: state and county), and one health services variable (vaccination status). Medically related occupation refers to a medical employee who acquired hepatitis through blood contact. No information is available for other types of occupations. Evaluation of the hepatitis reporting system uncovered challenges that could pose threats to data validity. States are not required to report hepatitis diagnoses or additional information, such as exposure history and clinical information, to CDC. Because the system is passive, reported cases are not followed up, and asymptomatic cases are not identified.
Completeness of reported data as of 2007 also varied.49 For hepatitis A virus (HAV), almost 50% of risk factor data were not available; similar numbers were also found for hepatitis B virus (HBV) (52%) and hepatitis C virus (HCV) (52%). Percent completeness of the analyzed data (a measurement of states/territories reporting hepatitis data to CDC) ranged from 0% to 85%. Percent completeness for hepatitis is unique in that it refers to the overall percentage of states that report a specific variable (e.g., 0%–85% of all states reported data on ethnicity). No information was identified regarding quality standards for data analysis.50 States, laboratories, and health-care providers differ in how they define date of diagnosis. It may be defined as date of receipt of treatment, date of receipt of lab results, or date of testing. Although timeliness remains a challenge for hepatitis reporting because of these differences, data for 2007 were considered timely if reported before December 29, 2007.
NETSS data for STDs followed a pattern similar to that of the HIV data. One social environment variable (ethnicity), one variable addressing physical environment (residence at time of diagnosis and three proxies: city, state, and county), and one health services variable (date of initial health exam, first collected in 2008) are captured. Date of initial health exam reflects the first time the patient received treatment for STD-related issues. CDC has monitored all variables collected on an ongoing basis since at least 1987, with sentinel surveillance systems adding variables as recently as 2002.
Evaluation of percent completeness, timeliness, and quality found mixed results for STD data. Much of the data had high percent completeness as of 2007, mostly greater than 70%. However, for individual behavior data (including sexual behavior, injection drug use, number of sexual contacts, and concurrent partnerships), percent completeness was very low: less than 50% for all variables and less than 10% for most. No current quality standards were identified that serve as set guidelines for data analysis or reporting, but data for 2007 were considered timely if received by June 25, 2008.51
Currently, the TB surveillance system has the most complete set of SDH data. As of 2008, CDC collected four social environment variables (ethnicity, occupation, incarceration status, and two proxies for immigration status: foreign-born, and date of arrival in U.S.), three physical environment variables (homeless status, country of birth, and two proxies for residence at time of diagnosis: county and ZIP code), and three health services variables (three proxies for therapy received: date therapy started, date therapy stopped, reason therapy stopped; resident of long-term care facility; and two proxies for previous health-care visit: previous TB diagnosis, and previous HIV diagnosis).
All variables collected have been monitored on an ongoing basis since 1993, with the exception of “reason therapy stopped,” which has not been available since 2006.35 In 2009, additional variables on the RVCT form included immigration status at first entry to U.S., a variable titled “sex at birth,” and additional TB risk factors including diabetes status (which may serve as a proxy for determining contact with the health-care system).52,53 Data for 2008 were considered timely if received for analysis by May 20, 2009.54,55 The majority of data elements were considered complete as of 2008, with most more than 90% complete. Quality is not assessed routinely for the data, although pilot testing in sentinel sites is being conducted to evaluate the quality of the data. Results of the pilot quality assessment, however, will not be available for a few years.
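The timeliness criterion used for the STD and TB data amounts to comparing a receipt date with a fixed cutoff. The sketch below illustrates that comparison under stated assumptions: the two cutoff dates are taken from the text, but the record structure, the function names, and the reading of "by" as "on or before" are illustrative choices, and the receipt dates are made up.

```python
from datetime import date

# Cutoffs stated in the text: 2007 STD data received by June 25, 2008;
# 2008 TB data received by May 20, 2009.
CUTOFFS = {
    ("STD", 2007): date(2008, 6, 25),
    ("TB", 2008): date(2009, 5, 20),
}


def is_timely(system, reporting_year, received):
    """True if the report arrived on or before the cutoff (assumed inclusive)."""
    return received <= CUTOFFS[(system, reporting_year)]


def percent_timely(system, reporting_year, received_dates):
    """Percentage of reports that met the cutoff for one system and year."""
    if not received_dates:
        return 0.0
    on_time = sum(is_timely(system, reporting_year, d) for d in received_dates)
    return 100.0 * on_time / len(received_dates)


# Made-up receipt dates for 2008 TB reports
tb_received = [date(2009, 2, 1), date(2009, 5, 20), date(2009, 7, 4), date(2009, 3, 15)]
print(percent_timely("TB", 2008, tb_received))
```

The hepatitis criterion is worded differently in the text ("reported before" a date), so it is left out of the lookup table rather than forced into the same rule.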
The majority of the 68 articles reviewed were theoretical in nature, drawing conclusions from observational data regarding the relationships between SDH and health outcomes. Fewer than 20 articles discussed challenges in current methodologies, disparity measurement, guidance, and/or data collection of SDH variables. Minority health issues, racial/ethnic disparities, disparities in socioeconomic status, and HIV were the subjects for the majority of the articles. Searches for information regarding SDH and viral hepatitis revealed the least evidence of the four diseases.
The environmental scan of the literature and a review of a small set of variables recommended by WHO and CDC identified a number of core variables similar across the four diseases, including suggestions for proxies. The naming of SDH variables differed across surveillance systems; to simplify this naming, the authors recommend a core set of standard SDH variables common to each system, including proxies. The core set, by category of health determinant, included the following eight variables: (1) incarceration status, (2) income, (3) occupation, (4) educational attainment (for social environment), (5) homeless status (for physical environment), (6) receipt of treatment (for health services), (7) gender, and (8) sexual orientation. Additional recommended SDH variables specific to each disease are displayed in Figure 2.
All CDC surveillance systems reviewed in this article report SDH data, but each system varies as to the number of variables reported, as well as the availability of published quality standards and percent completeness of the data. It is apparent that most of the data collected are not considered SDH; in fact, they are categorized as individual behavior, and biology and genetics. Most of the collected SDH data are considered social environment variables or health services, with less emphasis on the collection of physical environment information. Again, consistent collection and quality measures of additional SDH variables across surveillance systems would enable public health practitioners and providers to first identify (e.g., through statistical modeling) and then address (e.g., through structural interventions) the underlying causes of disease. The main findings from this analysis suggest the adoption, in addition to those already collected, of a core set of SDH variables—incarceration status, income, occupation, educational attainment, homeless status, receipt of treatment, gender, and sexual orientation—to further enhance surveillance efforts.
Without the collection of other social environment SDH variables, a large portion of HIV data is incomplete in the context of broader population health. CDC does collect some HIV SDH variables, including four that were cited in the environmental literature scan: residence, insurance status, receipt of treatment, and ethnicity. Nevertheless, there is room for enhancement of these data. Interest in collecting SDH data is high, and many public health experts consider the reporting of SDH data in addition to disease outcome data an important initial step toward a more balanced prevention portfolio, one that includes individual behavioral and structural interventions.32,35 CDC currently uses an HIV surveillance system that attempts to adhere to published data quality standards and is evaluated annually.
For viral hepatitis, CDC captures four SDH variables cited in the environmental literature scan: ethnicity, occupation, residence (indirectly, through two proxies: state and county), and vaccination status. A passive collection system, along with sentinel surveillance systems such as CDC's Sentinel Counties Study of Acute Viral Hepatitis, has suggested acceptable reliability and accuracy of the data, although a strong case for improved data quality can be made for an active reporting system.50 Currently, there is a dearth of literature and data addressing associations between SDH and hepatitis, suggesting that awareness of the relationships between SDH and hepatitis infection is low. Unlike HIV or STDs, hepatitis can be transmitted and acquired through low-risk behaviors during international travel, which is a unique SDH variable encompassing issues of income and social mobility. Inclusion of additional SDH variables could help define more focused and targeted interventions for certain groups that are at risk of HAV, HBV, and HCV infection.
STD surveillance, which includes data on gonorrhea, chlamydia, and syphilis, collects three SDH variables—ethnicity, residence indirectly through three proxies (city, state, and county), and date of initial health exam. Challenges currently exist in the reporting of STD diagnoses to CDC. We found disparities in percent completeness and no published quality standards. Gonorrhea, chlamydia, and syphilis are the only STDs that require mandatory reporting, yet they are unique diagnoses with different surveillance systems, which are fed into NETSS. Recommendations exist for reporting STD diagnoses, yet states have the option to utilize state-created forms or CDC STD case report forms. This option creates information that is nonuniform and inconsistently reported by providers and labs. Because of this inconsistency, and because few SDH data are available in CDC's STD population-based systems, we reviewed other data sources. Sentinel systems such as the Gonococcal Isolate Surveillance Project,56 Corrections STD Prevalence Monitoring Project,57 and MSM [men who have sex with men] Prevalence Monitoring Project58 are collecting social environment variables including sexual orientation and incarceration status, as well as health services variables such as previous HIV or STD diagnosis. These special studies may serve as examples for SDH data collection in the future.
For TB, CDC collects 10 SDH variables found in the environmental literature scan: ethnicity, occupation, incarceration status, immigration status indirectly through foreign-born and date of arrival in the U.S., two proxies for residence at time of diagnosis (ZIP code and county), country of birth, homelessness, resident of long-term care facility at time of treatment, two proxies for medical visits (TB or HIV status), and three proxies for receipt of treatment (date therapy started, date therapy stopped, and reason therapy stopped). There is high interest in collecting TB-related SDH information, and CDC added SDH variables to the RVCT form in 2009.32,35 Timeliness of data reporting has improved since 2001, when there were considerable delays in reporting.59
Individual behavior data can be extremely valuable to researchers as proxies for SDH. Although sexual behavior is not the same as sexual orientation or sexual identity, for MSM, information on sexual behavior could serve as a close proxy for sexual orientation, which is highly influenced by social environment. A review of CDC's special studies and sentinel systems suggests the addition of two SDH variables: gender (which is a socially determined construct) and sexual orientation (which is not always representative of sexual behaviors and vice versa). There is evidence that these determinants are highly influential in determining health outcomes, although extensive discussion of gender and sexual orientation as social environment variables is outside the scope of this review. While there is disagreement in the public health community about the category in which these two variables belong, there is consensus that they should be collected regardless. The CDC Sexual and Gender Minorities Workgroup recommends data collection on three levels: gender identity (male, female, or transgender [male-to-female or female-to-male]), sexual orientation (gay, lesbian, bisexual, or heterosexual), and sexual behavior (MSM, women who have sex with women, men who have sex with women, or women who have sex with men).60
One of CDC's main roles is monitoring population health, which should include consistently collecting quality, comparable data on underlying causes of HIV, hepatitis, STDs, and TB. Presently, surveillance systems are only partially accomplishing this goal. Additional data would provide a holistic characterization of the communities affected by these diseases. A number of next steps must be considered before adding SDH variables to each of the population-based surveillance systems' reporting processes.
First, as evidenced by the Institute of Medicine report, the importance of monitoring key SDH variables cannot be overstated.28 The best procedures for monitoring additional SDH variables need to be established, as well as which core variables are indeed feasible for surveillance. Understandably, each proposed variable may not be fully incorporated in the next few years of data collection, but implementation would allow CDC to more effectively address prevention goals.
Second, adding SDH variables to case report forms is a lengthy process that requires review by the Office of Management and Budget and places an increased burden on data collectors from state and local health departments. In addition, SDH may not be captured from sources of surveillance data, such as medical records, where this type of information may not be recorded. Geographical linkage to external databases when collecting data is highly recommended, as SDH variables have been collected in some form or proxy by other research teams and data systems. The Data Set Directory of SDH at the Local Level, for example, provides a comprehensive list of SDH databases pertinent to CDC.44 Databases from the U.S. Census61 or American Housing Survey,62 Bureau of Labor Statistics,63 Bureau of Justice Statistics,64 Current Population Survey,65 and others would provide supplemental population-based information. However, linkage of individual-level disease data to external data sources that provide information on a geographic level (linkage by geographic variable) may address aggregates of place, not person.
When linking to other databases by a geographic variable (e.g., county, census-tract, or another geographic unit), care must be taken to not compromise confidentiality. CDC's HIV, Viral Hepatitis, STD, and TB Surveillance Workgroup is developing a security and confidentiality guidance document to articulate a vision on data sharing across these surveillance systems. It is also important to note that these databases are not all updated at regular intervals, some are not representative, and some data are not publicly available.
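Linkage by a geographic variable, as discussed above, can be sketched as a simple lookup keyed on an area code. This is a hedged illustration only: the county codes, field names, and SDH values are all invented, and a real linkage would need the confidentiality safeguards described in the text, which this sketch does not implement.

```python
# Made-up case records; "county_code" is a hypothetical field name.
cases = [
    {"case_id": 1, "county_code": "00001"},
    {"case_id": 2, "county_code": "00002"},
    {"case_id": 3, "county_code": "99999"},  # no match in the area-level table
]

# Made-up area-level SDH values keyed by the same county code.
area_sdh = {
    "00001": {"median_income": 56000, "pct_below_poverty": 15.2},
    "00002": {"median_income": 51000, "pct_below_poverty": 17.9},
}


def link_by_county(cases, area_sdh):
    """Attach area-level SDH values to each case without modifying the input."""
    linked = []
    for case in cases:
        area = area_sdh.get(case["county_code"])
        row = dict(case)  # copy so the original record is left unchanged
        row["area_matched"] = area is not None
        if area is not None:
            # The "area_" prefix marks these as place-level, not person-level, values.
            row.update({f"area_{name}": value for name, value in area.items()})
        linked.append(row)
    return linked


for row in link_by_county(cases, area_sdh):
    print(row)
```

Prefixing the linked fields with `area_` is a design choice meant to keep visible the caveat noted above: the attached values describe the place where a case was reported, not the person.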
Third, geocoding of data is highly recommended. CDC's mission includes monitoring the epidemiology of disease, which includes the reporting of person, place, and time. While CDC has mechanisms in place to annually report the person and time of disease occurrence, reporting place or location beyond the state level remains a challenge. Some state and local areas have been able to geocode data and use this information to help inform decision-making and strategic planning. Analyses will be limited to the geographic variable level available, which is expected to vary in different surveillance systems. Moving forward with geographic analyses requires strict attention to confidentiality issues. Multiple studies show the importance and utility of geocoding, which provides more information on spatial location and spread of disease and can help direct policy decisions.12,13,15,18,66
The creation and implementation of a core set of SDH variables can enhance CDC's population-based surveillance for HIV, viral hepatitis, STDs, and TB. Because CDC alone cannot implement this process, feedback from state and local jurisdictions will be solicited during the next few years. Going forward, work on SDH variables and database linkage should be consistent with the Patient Protection and Affordable Care Act,67 electronic health records, and patient privacy laws. Geocoding to the smallest level possible while ensuring appropriate confidentiality measures will afford programs the most flexibility in analyzing and displaying data.12,13,15,18,57 By creating a more comprehensive database for these diseases, with the addition of SDH variables, we can gain a more complete picture of disease epidemiology and social and environmental characteristics in affected populations. This increased understanding may lend more credibility to the science of SDH, and prevention efforts will be able to use and execute more contextually appropriate initiatives to reduce health disparities and promote health equity.
The authors acknowledge the following colleagues at the Centers for Disease Control and Prevention (CDC) for their contributions to this project: Ms. Danni Daniels, Dr. Niko Gaffga, Dr. Norma Harris, Ms. Alesia Harvey, Ms. Carla Jeffries, Dr. Ruth Jiles, Ms. Suzanne Marks, Ms. Lauren Payne, Dr. Valerie Robison, Ms. Gail Scogin, Dr. Tanya Telfair Sharpe, and Dr. Hillard Weinstock.
The findings and conclusions in this article are those of the authors and do not necessarily represent the official position of CDC.