We describe our approach to delineate slum and non-slum areas using satellite data for an impact evaluation of a reproductive health and family planning program in six cities of Uttar Pradesh, India. The urban focus of the program reflects the fact that the vast majority of global population growth over the next four decades is expected to occur in towns and cities in developing countries. Africa and Asia together will account for 86 percent of growth in the global urban population (UN Department of Economic and Social Affairs 2013). In Asia, more than half of the population will live in urban areas by 2020, and Africa is expected to reach this milestone in 2035 (UN Department of Economic and Social Affairs 2013). Poverty rates and income inequality are also highest in these regions (World Bank 2013). And yet official poverty lines may underestimate the extent of urban poverty; as shown in India, the number of urban residents living in poor conditions may be much larger than the number living under the official poverty line (Bapat 2009). Adverse health conditions among urban populations living in poor areas have been documented using health and household living indicators from survey data such as the Demographic and Health Surveys (Montgomery and Hewett 2003). Contraceptive use tends to be lower among the urban poor, sometimes even lower than use among rural women (Ezeh, Kodzi and Emina 2010). The highest birth rates are concentrated among the poorest populations, and a substantial number of pregnancies are unintended.
Under its Reproductive Health Strategy, the Bill & Melinda Gates Foundation established the Urban Reproductive Health Initiative (URHI). The initiative comprises reproductive health programs in four countries: Nigeria, Senegal, Kenya and India (in the state of Uttar Pradesh, UP). These programs aim to reduce maternal and child mortality and unintended pregnancies by increasing the use of modern contraceptives through access to high-quality, voluntary family planning services. The Urban Health Initiative (UHI) India is a consortium of international, national, nongovernmental, and community-based organizations that work together with the government to improve the health of the urban poor.
From the outset, URHI incorporated a rigorous evaluation component in order to track and assess programmatic impacts at the population level and among the urban poor in the program cities. The Measurement, Learning & Evaluation project (MLE) for the Urban Reproductive Health Initiative, implemented by the Carolina Population Center at the University of North Carolina – Chapel Hill, is conducting the external evaluations of all four URHI programs. To evaluate the UHI program in UP, India, a total of six cities were selected for the impact evaluation, encompassing the four initial intervention cities: Agra, Aligarh, Allahabad and Gorakhpur; and the delayed intervention cities of Moradabad and Varanasi (Guilkey, Speizer et al. 2009). Using a longitudinal survey design, a comparison of women at the beginning, middle and end of the project intervention will provide estimates of program impact in cities which experienced the full program efforts for the duration of the project period, as well as the impact in the delayed intervention cities. The longitudinal design will allow us to assess programmatic impact using powerful panel data analysis methods.
The establishment of the Millennium Development Goals (MDG) in 2000 brought monitoring of the urban poor to the forefront of global health priorities. MDG 7 made specific reference to improving the lives of slum dwellers. However, amidst efforts to improve health and wellbeing among slum dwellers, nation states and the development community have faced challenges in monitoring progress towards this goal. The main challenge has been to rapidly and systematically characterize slum areas, and then to collect population health data that can be compared across slums, cities and countries. Population-based sampling frames drawn from Censuses, such as those used for the Demographic and Health Surveys (DHS) or the Multiple Indicator Cluster Surveys (MICS), typically do not undertake stratification by slum status, with few exceptions (for example in the India National Family Health Survey in 2005/2006 this was done for eight cities, El-Zanaty and Way 2006; International Institute for Population Sciences and Macro International 2007). The Bangladesh 2005 slum census is a notable exception, and demographic surveillance sites (DSS) in slum areas, such as the Nairobi DSS, have provided extensive data which has been used to monitor the health and wellbeing of slum dwellers (Angeles, Lance et al. 2009; Zulu, Beguy et al. 2011). Urban demographic surveillance sites provide rich information on large cohorts of urban slum dwellers living in specific slums; however, surveillance data do not provide city-level estimates nor are such surveillance systems organized dynamically to capture new slums as they develop, or drop areas that may no longer be classified as slums.
Currently, there is little empirical evidence for urban India at the national or state level which permits the comparison of health and demographic outcomes between slum and non-slum areas. Yet recent studies suggest that in India, slum dwellers may be worse off than non-slum dwellers. For example, the 2005/06 National Family Health Survey (NFHS) provides one data source used to examine slum versus non-slum areas. The 2005/06 NFHS over-sampled slum populations in eight Indian cities, including one in Uttar Pradesh, based on the classification scheme of the national Census of 2001 (International Institute for Population Sciences and Macro International 2007; Gupta, Arnold et al. 2009). Descriptive analyses of the NFHS results indicated that not all slum dwellers were poor nor were they worse-off when compared to non-slum dwellers on a range of health outcomes (Gupta, Arnold et al. 2009). However, further multivariate analyses of the 2005/06 NFHS data demonstrated that slum dwellers were worse-off on reproductive health indicators including contraceptive use, delivery at a health facility, and skilled attendance at delivery, controlling for background individual and household characteristics (Hazarika 2010). More recent analyses of the 2005/06 NFHS data demonstrate that women in slum areas were more likely to be malnourished compared to their non-slum counterparts, who were also more likely to be overweight or obese compared to slum-dwellers (Gaur, Keshri et al. 2012). Finally, Rooban and colleagues found that male slum-dwellers were more likely to use tobacco and thus at greater risk of chronic and tobacco-related illness (Rooban, Joshua et al. 2012).
A major challenge in measuring and monitoring health and population change in slum areas is the availability of current information on the location and population size of slum areas. Because of the dynamic nature of slums and slum dwellers, frequent updates are needed to provide enough information for a population-based survey sample. City-level efforts to update the 2001 Census list of slum areas have been carried out by the Urban Health Resource Center (UHRC) in Delhi, Indore and Agra (see EHP 2004; Agarwal, Kaushik and Srivasatav, 2006; Taneja and Agarwal 2004). UHRC conducted a series of situation analyses to identify and assess slums from official lists and to assess and identify unlisted slums in selected cities (Agarwal and Taneja, 2005). The UHRC assessments involved first identifying slum areas from existing lists, and subsequent field visits to slum areas to collect qualitative information on vulnerability according to housing type, land tenure, availability of public services, employment, health status and access to health services, and other socioeconomic measures (Agarwal and Taneja, 2005). The method however did not include satellite or georeferenced datasets.
Previous work has attempted to identify slum areas from satellite data. Because satellite data are readily available at varying temporal and spatial scales, and can be analyzed systematically for an entire city on a desktop computer, a satellite data-based method could prove valuable both for identifying slum areas and for capturing changes in those areas over time. A study of Hyderabad, India developed and tested a method of statistically analyzing spectral signatures from high-resolution satellite data and found that slums of roughly 3600 square meters could be successfully identified with satellite data alone, but acknowledged that the method may not detect smaller slums (Kit et al, 2012). Object-oriented analysis techniques have been applied to settings in Africa (Kisumu, Kenya and Accra, Ghana), and while this family of methods performs well in classifying areas, the results were not fully validated against a ground-based list of all slums (Kohli et al, 2012; Stow et al, 2007; Stoler et al, 2012). Finally, a study by Jain (2007) used IKONOS satellite data to identify and validate areas of temporary structures in part of Dehradun, India. We did not identify any published study prior to 2010 (or to date) that used analysis of satellite data alone to validate the identification of slum areas previously identified from official records (for example, municipal records or Census offices) for an entire city.
Neither the 2005/06 NFHS nor reports from the 2001 Census indicate the level of geographic concentration of slum dwellers, or of the poor. In neighboring Bangladesh, a census of slums in six cities found that 35 percent of the population of those cities lived in slums, but that the slums occupied only four percent of the land area of the cities (Angeles, Lance et al. 2009). Similar data is currently not available for India. While empirical evidence suggests that on average, slum dwellers are worse-off in terms of health status, the available data tell us little about whether the slum dwellers are actually the poorest. The average health of slum dwellers may mask heterogeneity within slums, and household income may also vary greatly within slum areas. Currently available asset indicators such as those from the NFHS are known to be biased in urban areas, and may not paint a complete picture of urban poverty much less poverty within slum areas (Rutstein 2008). The lack of such data for India highlighted the need for an alternative strategy to delineate the slum and non-slum areas for the UHI study cities in order to measure differences in average changes among slum and non-slum dwellers, in addition to capturing total change across cities.
An additional challenge in measuring and monitoring the health and population of slum areas is whether slum areas are adequate proxies for identifying the urban poor. Given that there is no standard definition of slum and that slum areas constantly change and improve or become worse over time, whether or not slum areas can serve as a good proxy for the poor will inherently depend on the definition used for slums, and for the poor. Chandrasekhar and Montgomery (2009) reviewed the findings from two National Sample Survey Organization (NSSO) surveys carried out in India: the 61st round of the household expenditure survey and the 58th round of the survey on housing conditions to assess how changes in the poverty line could result in changes in affordability of minimally adequate accommodation. The authors found that a substantial proportion of the urban poor as defined by the official poverty line (23.4 percent) do not live in notified or non-notified slums (Chandrasekhar and Montgomery 2009). They also found that about half the population (51.7 percent) living in non-notified slums and about 44 percent of those living in notified slums were below the poverty line. And while the authors note that this distribution could be a result of misclassification of non-slum areas which are really slums, the results also suggest that a substantial percentage of the urban poor may live in non-slum areas, according to the definition of slums used in the NSSO surveys.
Because the UHI program is tasked with reaching the urban poor and is intended to target slum areas, the MLE project sought to capture changes in contraceptive use and other health outcomes among the urban poor and in slum areas in particular. Notably, the goal of the UHI program was to increase contraceptive use at the city level, thus both slum and non-slum areas were required for the evaluation. Therefore, a representative sample of women in slum and non-slum areas was necessary. A wealth index was also created to assess relative wealth at the household level. Since slum areas may not cover a large geographic proportion of the study cities, the sampling approach was designed to ensure that the slum dwellers would be adequately represented in the sample. To do this, the MLE project needed to over-sample slum areas to ensure enough women were included who were exposed to the program in the slum areas. Given that the 2001 Census data and Census slum designations were outdated, alternative sources of data were required in order to spatially delineate the cities so that the sample selection could be carried out. At baseline, a representative sample of women was selected and interviewed in 2010; a midterm survey was conducted in 2012, and an endline survey will be conducted in 2014 among the same women. The overall goal of the sample design for the baseline household survey was to create a population-based sample that would represent the slum and non-slum areas within each of the study cities.
Defining and delineating slum areas in the UHI study cities for the purposes of the evaluation would ideally be consistent with the local and governmental context. In part to facilitate international comparability in monitoring the lives of slum dwellers, UNHABITAT adopted a household-level definition of slum as “individuals living under the same roof in an urban area lacking one or more of the following five amenities: durable housing, sufficient living area, access to improved water, access to improved sanitation facilities, and secure tenure” (UNHABITAT 2002). This definition was applied to operationalize the slum classifications used in a subsequent series of DHS surveys to enable slum and non-slum comparisons in a nationally representative and internationally comparable dataset (UNHABITAT 2013). In India, however, the government uses a context-specific adaptation of the UNHABITAT definition, where a slum-like household meets all of the following characteristics: roofing material is not concrete, water source not available on the premises, no latrine facility within the household premises, and the household does not have closed drainage (Gupta et al, 2009).
Each state in India has adopted its own adaptation of the definition and classification of slum areas, however the Government of India consolidates this information in order to report on state and national level population trends in slum areas (Government of India and Ministry of Housing and Urban Poverty Alleviation 2010). The national level Government of India designation of slum areas for Census purposes is defined by the Registrar General, and this definition has been applied to the Census 2001 and 2011 to constitute one or more of the following conditions:
In Uttar Pradesh, the state definition of slum includes: areas where the majority of buildings are dilapidated and overcrowded, lack ventilation, light or sanitation facilities, or are otherwise unfit for human inhabitation (Government of India and Ministry of Housing and Urban Poverty Alleviation 2010).
In 2010, when the baseline evaluation survey was being designed, the last national Census of India had been conducted in 2001, and the available population sampling frame was almost 10 years old. Efforts to identify slum areas or areas of deprivation in the study cities using Census-based household indicators would be therefore out of date. To meet the goal of selecting a current representative sample of women from the slum and non-slum populations in each city, a new sample frame of slum and non-slum areas was needed. Furthermore, to draw a sufficiently sized sample of the urban poor that would capture changes in contraceptive use and other health outcomes in the target group of the program, oversampling of slum areas was necessary. This required the delineation of mutually exclusive slum and non-slum sampling domains in the six study cities, using a common definition of slum.
Three spatial datasets were used to develop the slum and non-slum sampling domains: slum area boundaries, ward boundaries, and QuickBird satellite imagery. The study area included the outer boundary of all populated areas of each ward in each city. Some areas of the study cities contained cantonments (permanent or semi-permanent military quarters); these areas were excluded from the survey study area.
The spatial dataset of slum areas was obtained from the Remote Sensing Applications Center (RSAC) of Uttar Pradesh (Tangri 2009). The slum dataset was created by the RSAC from a series of inputs. The RSAC first prepared digital base maps of the study cities using data from the Survey of India, supplemented by other government and private agency spatial data. A list of registered and notified, recognized and identified slums was obtained from the city administrations and city developmental authorities in each city. These lists were supplemented by information from the city health administration offices, which maintain lists of slum areas in order to provide health services to slum dwellers. The combined list was then merged with the spatial administrative dataset to locate the general location of each slum area. The RSAC overlaid these slum polygons on QuickBird satellite imagery to compare the accuracy of the boundaries with the footprints of structures visible on the imagery. The imagery allowed for the identification of slum areas using physical parameters including the shape and size of each individual hutment, clustering or high density of structures with or without a road network, irregular and haphazardly grouped temporary, poorly-constructed or semi-permanent households.
Using the spectral signature characteristics of the slum areas from the QuickBird imagery, some newly developed slums which were not on the consolidated list were added by RSAC. Likewise, some slum areas on the consolidated list had vanished and were replaced by organized residential areas, and these were eliminated by RSAC from the spatial dataset. The signatures associated with these characteristics were used to spatially identify all slum areas in each city. The RSAC conducted some field-based ground validation to verify the results of the satellite imagery processing and include any additional slum areas found in the vicinity of identified slum areas during validation.
To supplement the administrative boundary data, pre-rectified QuickBird satellite images were obtained by UNC for each study city. The images were taken between September 2007 and June 2009; no post-processing was done. The high-resolution QuickBird imagery has a spatial resolution of 60 centimeters, so each pixel covers 60 by 60 centimeters, which allows for the identification of any object with a spatial footprint of approximately 3,600 square centimeters (about four square feet). The images were then merged into a geographic information system (GIS) using ArcGIS along with the administrative map data. The slum areas identified from the lists and georeferenced on the administrative maps by RSAC were then overlaid at UNC with the 2009 QuickBird imagery. The final set of slum polygons for the study cities was compiled, representing the spatial footprint of the slum areas.
Administrative boundary, roads and landmark data for the cities were obtained from the third party vendor Map My India (MMI) (Map My India 2009). Ward data obtained from MMI included Census 2001 population information associated with each ward in each city. The cities contained 60 to 91 wards with an average of approximately 1,550 to 2,500 households per ward, based on 2001 Census counts (the 2001 Census counts were not used in this study).
The sampling design first called for each city to be partitioned into slum and non-slum sampling domains. Within each domain, the wards and slums were then divided into mutually exclusive enumeration areas or primary sampling units (PSU) of approximately 150 structures each, covering each entire city. In each city, a total of 64 slum and 64 non-slum PSUs were selected. Once the pre-determined number of PSUs was selected, each PSU was visited by a team of mappers and listers to create a set of base maps for the next stage of sampling of individual households.
The slum sampling domain was created from the set of slum polygons provided by RSAC and described above. Slum polygons which contained more than 150 structures as determined by manual observation in ArcGIS were then subdivided manually into PSUs of the target size of about 150 structures. Slum polygons which appeared to contain fewer than 150 structures based on manual observation with the QuickBird imagery in ArcGIS were merged with adjacent or nearby slum polygons to get a PSU in the right size range. This process was carried out for each study city, resulting in a final set of slum PSUs. The final set of slum PSUs were then numbered for later random selection.
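The splitting and merging described above were done manually by analysts working with the imagery. Purely as an illustration of the underlying arithmetic, the following Python sketch shows how one might plan the number of PSUs for a large polygon and group small polygons toward the 150-structure target. The function names, the rounding rule, and the assumption that polygons are listed in order of ground adjacency are all hypothetical and are not taken from the study.

```python
def pieces_needed(structure_count, target=150):
    """Number of PSUs of about `target` structures a polygon would be cut into.

    The rounding rule is an assumption of this sketch; in the study the
    subdivision was judged manually from the imagery.
    """
    return max(1, round(structure_count / target))


def merge_small_polygons(counts, target=150):
    """Greedily group consecutive small polygons until a group reaches the target.

    `counts` holds estimated structure counts, ordered so that neighbours in
    the list are neighbours on the ground (an assumption of this sketch).
    Returns a list of groups, each a list of indexes into `counts`.
    """
    groups, current, total = [], [], 0
    for index, count in enumerate(counts):
        current.append(index)
        total += count
        if total >= target:
            groups.append(current)
            current, total = [], 0
    if current:
        # Leftover polygons that never reach the target join the last group.
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


if __name__ == "__main__":
    print(pieces_needed(460))
    print(merge_small_polygons([80, 90, 60, 100, 40]))
```

A rule of this kind could only give a starting point; as noted in the text, the final boundaries depended on what was visible in the QuickBird imagery.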
To create the non-slum sampling domain, the slum polygons were overlaid on the ward polygons and then subtracted from the ward polygons of each study city area using ArcGIS. The remaining non-slum areas of each city were then manually subdivided into PSUs in ArcGIS. Using the QuickBird imagery, ward boundaries, and road vector data, analysts at UNC-Chapel Hill reviewed each study city and manually digitized non-slum PSU boundaries so that, like the slum PSUs, each non-slum PSU contained approximately 150 structures. Non-slum PSUs were created in contiguous fashion so that no PSU overlapped, and the outer boundaries of the PSUs were readily identifiable by field staff. Furthermore, PSU boundaries did not cut through buildings, across water bodies or other non-navigable areas, but did cross open land areas such as fields. As with the slum PSUs, the final set of non-slum PSUs were numbered for later random selection.
The resulting dataset for each city consisted of all slum polygons, including those that were merged or divided in order to meet the 150 structure target per PSU, and all non-slum polygons as created from the non-slum areas within the study areas. The number of slum PSUs within each of the study cities ranged from 66 in Moradabad to 565 in Varanasi, while the number of non-slum PSUs ranged from 426 in Gorakhpur to 868 in Allahabad. In cities where the ratio of slums to non-slums was geographically small, it took much more work to delineate the non-slum PSUs from the satellite data. Where the geographic coverage of slums was larger, these PSUs were already delineated, and it took less effort to divide up the rest of the study city into non-slum PSUs.
After the full set of slum and non-slum PSUs was delineated and numbered for each city, 64 PSUs were selected from each of the slum and non-slum domains in each city. An equal number of PSUs from slum and non-slum domains was selected to ensure that the full sample included a large enough number of slum residents, the target group for the program. Within a domain, PSUs were selected with equal probability. Sampling weights were later generated to account for the probability of selection for each household and woman in each stratum of each city. Detailed manual analysis was then repeated for the selected PSUs to prepare field maps for the household listing (see figures 4–6). An overview map was prepared to enable field teams to find the general location of each selected PSU, and detailed maps with latitude and longitude coordinates were provided to enable the field teams to determine the PSU boundaries. The overview map displayed the QuickBird imagery, latitude and longitude coordinates for vertexes of the outer PSU boundaries, landmarks, major street names, and the outline of the selected PSU. Supplemental maps showing only the PSU boundaries on a white background with line features of the major and residential roads were also provided to the teams so that they could draw in the structures and landmarks for the entire PSU as the households were listed. Field teams were organized on the ground, equipped with the area maps and PSU maps for the listing of households in each PSU. These maps allowed the teams to accurately determine the location and boundary of each PSU. The mapping and listing teams then drew in symbols for each structure and listed all households.
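The first-stage selection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual procedure: the function names and the use of a seeded random generator are hypothetical, and the weight shown covers only the PSU selection stage, whereas the study's weights also accounted for household and woman selection. The domain sizes in the example (66 and 565 slum PSUs) are the Moradabad and Varanasi figures quoted earlier.

```python
import random


def select_psus(psu_ids, n_select=64, seed=None):
    """Draw n_select PSUs from one domain with equal probability, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(psu_ids), n_select))


def psu_stage_weight(n_in_domain, n_select=64):
    """Inverse of the first-stage selection probability, n_select / n_in_domain."""
    return n_in_domain / n_select


if __name__ == "__main__":
    # Slum domain sizes quoted in the text: 66 PSUs (Moradabad), 565 PSUs (Varanasi).
    chosen = select_psus(range(1, 67), seed=1)
    print(len(chosen))
    print(psu_stage_weight(66), psu_stage_weight(565))
```

Because the same number of PSUs is drawn from domains of very different sizes, the first-stage weight differs sharply between cities and between domains, which is why weighting is needed before pooling slum and non-slum results.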
After the PSU selections were made, field teams found that some PSUs contained substantially more than 150 structures and were too large to be listed; these PSUs were segmented. In these cases, the field team divided the PSU into several segments that contained roughly the same population. One of the segments was then chosen randomly with equal probability of selection, and its households were then mapped and listed. After the PSUs were listed, approximately 25 households were systematically selected for interview for the second phase of survey fieldwork. The number of households listed in each cluster was divided by 25, and the resulting number became the sampling interval in that cluster. A random starting number was chosen (between 1 and the sampling interval), and subsequent households were selected based on the interval for that cluster.
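The systematic selection of households can be illustrated with a short Python sketch. Several details here are assumptions rather than facts from the study: the text does not say whether the sampling interval was rounded, so this version keeps a fractional interval; it takes every household when a listing has 25 or fewer; and the function name is hypothetical.

```python
import random


def systematic_sample(n_listed, n_target=25, seed=None):
    """Return 1-based listing positions of the households selected in one cluster.

    Assumptions of this sketch: a fractional sampling interval is used, and
    all households are taken when the listing has n_target or fewer.
    """
    if n_listed <= n_target:
        return list(range(1, n_listed + 1))
    rng = random.Random(seed)
    interval = n_listed / n_target        # sampling interval for this cluster
    start = rng.random() * interval       # random start within the first interval
    positions = []
    for k in range(n_target):
        # Step through the listing one interval at a time from the random start.
        position = int(start + k * interval) + 1
        positions.append(min(position, n_listed))
    return positions


if __name__ == "__main__":
    print(systematic_sample(150, seed=3))
```

With an interval greater than one, consecutive selections always fall on different households, so the function returns exactly 25 distinct positions spread across the whole listing.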
The results of the household survey have been described in depth elsewhere (Nanda, Achyut et al. 2011; Speizer, Nanda et al. 2012). Speizer and colleagues (2012) found that at the time of the baseline population survey in 2010, women in the poorest household wealth quintiles were less likely to be using contraceptives and had a greater unmet need for family planning than women in the richer quintiles. Results from Speizer and colleagues (2012) also show that women in slum areas were more likely to be using sterilization as their contraceptive method, and were generally less educated compared to women in non-slum areas.
In this paper, we demonstrate the use of satellite data combined with existing locally-produced slum area polygons to identify and delineate urban poor areas in UP, India and to develop a sampling frame for subsequent use. Based on the developed maps, we were able to select a large, representative sample of households to be included in the evaluation of the UHI program. Because the typical sampling frame for these types of studies -- the Census -- was out of date, and due to continued rural to urban growth in the study cities, this GIS-based sampling approach provided an alternative sampling design in a timely fashion. In urban areas where it is possible to obtain information on slum polygon boundaries and combine it with high-resolution satellite data, this approach may be a useful strategy to permit over-sampling of the poor or for undertaking a study of predominately poor areas. While we are unable to verify our slum sample domain with external data sources, based on the characteristics of the sample in the slum and non-slum areas, we generally find that the indicators for the slum population are worse off than those for the non-slum population (Nanda, Achyut et al. 2011; Speizer, Nanda et al. 2012).
This approach is not without limitations that would need to be considered prior to replicating it in other sites. In particular, in our study cities, slums were sometimes geographically small, and on the ground were not easily distinguishable from non-slum areas. Conversely, large slums were sometimes difficult to split into smaller PSUs even with the high-resolution imagery; this was largely because the slum areas often did not contain many roads or visible walking paths which could be used as navigable boundaries of the PSUs. There was no recent or reliable ancillary demographic data to help calibrate the PSU delineation. Future applications of this approach would require spending more time to carry out ground-based validation during the development of slum or non-slum PSU boundaries because it is virtually impossible to distinguish household structures from non-household structures strictly from satellite data. An additional phase of fieldwork could provide the opportunity to identify field-appropriate ways to divide the larger slums into PSUs of the target size for a similar sample design.
Our baseline results highlighted earlier in this paper suggest that slum areas in the six study cities are different from non-slum areas in terms of socioeconomic measures, and slightly worse in some health outcomes. Indeed, many of the slum areas in the study cities have existed for a long time, and very wealthy areas may be located within close proximity, as in many other urban settings around the world. Some slum dwellers might be able to afford to live in much nicer neighborhoods of the city, but choose to live in the slum areas for convenience or for family reasons. While we cannot delve into the within-slum differences in this study, the observed heterogeneity in our own results and in those of previous research, together with the lack of pronounced differences in health and wellbeing between slum and non-slum dwellers, suggests an important area of further research.
Identifying the size and geographic boundaries of each selected PSU was an important part of the overall quality of the sample, as accurate PSU boundaries meant that the listing teams could identify and count all households in the PSU. Feedback from India field teams provided insights into problems they encountered with the PSU maps, especially where there were large structures with multiple stories and significantly more than 150 households for listing; these PSUs were quickly segmented. Because an accurate listing of all households in the selected clusters was an essential part of the second stage of the sample design to inform the calculation of sample weights, having appropriately sized PSUs was crucial. In some cases, the PSUs (typically slum) had too few households to reach the sampling quota, and in other cases the PSUs (typically non-slum) had too many households; this slowed down the listing process and increased the duration and costs of fieldwork. This could have been remedied with earlier ground validation fieldwork; however, this too would have increased the costs of the survey.
Given the decennial schedule of population censuses, continued rapid population growth and the dynamic nature of cities in the less-developed world, demographic and health research will always face the challenges of updating sample frames for surveys. The increasing availability, quality and resolution of ancillary spatial datasets provide increasing opportunities to fill the gaps in up-to-date population sample frames, but extensive groundwork must still be carried out alongside the spatial data processing and analysis required to generate spatial representations of the current population. Applying a similar methodology in a setting where slum areas may be larger than those in UP, India, and easier to delineate with satellite data, such as in some sub-Saharan African cities, could prove promising. Using these types of novel sampling approaches will permit a better assessment of the health status of urban poor and non-poor populations to inform programs that seek to improve child, maternal, and urban health and well-being.
The authors gratefully acknowledge support from the Bill & Melinda Gates Foundation for the Measurement, Learning & Evaluation project, the evaluation component of the Urban Reproductive Health Initiative, and general support from the Carolina Population Center (R24 HD050924). The authors also thank the Remote Sensing Application Center for providing data used in this project.