The methods of recruiting young men into the Thai military have been described previously (3, 4). Approximately 50,000–60,000 men, mostly 21 years of age, are selected each year by lottery in their home province (the province in which they are listed on the house registration). The system produces a representative national sampling of Thai men. Because the recruits are young, HIV prevalence in these annual recruit classes may serve as a proxy for HIV incidence. Induction into the military occurs in May (M) and November (N) of each year. At the time when blood is collected, recruits provide information about the location of their main residence (including province and district) during the previous 2 years (3). Although the actual locations where infections occurred are unknown, residential data enable analysis of the association between HIV prevalence and this key geographic marker.
To refine the HIV prevalence analyses, geographic localization used districts, the first administrative subunit of provinces. When data were analyzed by using annual classes grouped at the district level, calculations of the percentage testing positive for HIV were statistically unreliable because the number of men tested in some rural districts was so small. Therefore, we merged data in two ways to decrease the variability in the prevalence figures attributable to small sample size: classes were combined across time, and districts were combined across space. Sixteen classes of men recruited from 1991 to 2000 were combined temporally into four larger classes, each representing a discrete 2-year period: 1) N91, M92, N92, M93; 2) N94, M95, N95, M96; 3) N96, M97, N97, M98; and 4) N98, M99, N99, M00. Data on classes recruited from the M91, N93, M94, and N00 lotteries were not available for analysis (M91, before full implementation; N93 and M94, protocol under revision; N00, completed after dataset closed). However, even after classes were combined into these 2-year periods, a number of districts still had numbers too low for statistical reliability.
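The temporal pooling described above can be illustrated with a minimal Python sketch. The four period definitions are taken from the text; the record layout (district, class code, number tested, number positive) and the function names are hypothetical and are not drawn from the study's dataset.

```python
from collections import defaultdict

# Combined 2-year periods, exactly as listed in the text.
PERIODS = {
    1: ("N91", "M92", "N92", "M93"),
    2: ("N94", "M95", "N95", "M96"),
    3: ("N96", "M97", "N97", "M98"),
    4: ("N98", "M99", "N99", "M00"),
}
CLASS_TO_PERIOD = {code: p for p, codes in PERIODS.items() for code in codes}


def pool_by_period(records):
    """Sum tested and positive counts per (district, period).

    `records` is an iterable of (district, class_code, n_tested, n_positive)
    tuples; this layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0])
    for district, class_code, tested, positive in records:
        period = CLASS_TO_PERIOD.get(class_code)
        if period is None:
            # Classes outside the four periods (M91, N93, M94, N00) are skipped.
            continue
        totals[(district, period)][0] += tested
        totals[(district, period)][1] += positive
    return dict(totals)


def prevalence_pct(tested, positive):
    """Percentage testing positive for one pooled cell."""
    return 100.0 * positive / tested
```

Pooling four induction classes into one cell raises the denominator for each district, which is the purpose of the temporal merge; cells that remain small after pooling still need the spatial merge.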
We also merged some districts with neighboring districts so that each had a minimum denominator of 20 in the HIV prevalence calculation. Numbers of ≥20 persons provide minimal, but acceptable, reliability in the percentage-positive calculations. Districts with <20 men tested were combined with other districts according to the following sequence of priorities: we combined districts if they were in the same province, had historic connections (formerly part of a single larger district), had similarly small numbers tested, had similar demographics, and had similar topography or other geographic features.
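A simplified sketch of the spatial merge is given below. It is not the protocol used in the study: it encodes only two of the listed priorities (same province, similarly small numbers tested), because historic connections, demographics, and topography require judgment that is not captured here. The input layout, the adjacency table `neighbors`, and the function name are hypothetical.

```python
MIN_DENOMINATOR = 20  # minimum number tested per mapped unit, from the text


def merge_small_districts(districts, neighbors):
    """Greedily merge districts with < MIN_DENOMINATOR men tested.

    districts: {name: {"province": str, "tested": int, "positive": int}}
    neighbors: {name: set of adjacent district names} (assumed input)
    Returns a list of merged units; units with no same-province neighbor
    are left unmerged for manual review.
    """
    units = {
        name: {"members": [name], "province": d["province"],
               "tested": d["tested"], "positive": d["positive"]}
        for name, d in districts.items()
    }
    adj = {name: set(neighbors.get(name, ())) for name in districts}

    while True:
        # Visit the smallest units first.
        small = sorted(
            (u for u in units if units[u]["tested"] < MIN_DENOMINATOR),
            key=lambda u: (units[u]["tested"], u),
        )
        merged = False
        for u in small:
            candidates = [
                v for v in adj[u]
                if v in units and units[v]["province"] == units[u]["province"]
            ]
            if not candidates:
                continue
            # Prefer the neighbor with the fewest men tested.
            v = min(candidates, key=lambda c: (units[c]["tested"], c))
            units[v]["members"] += units[u]["members"]
            units[v]["tested"] += units[u]["tested"]
            units[v]["positive"] += units[u]["positive"]
            # The merged unit inherits the neighbors of the absorbed district.
            for w in adj[u]:
                if w != v and w in adj:
                    adj[w].discard(u)
                    adj[w].add(v)
                    adj[v].add(w)
            adj[v].discard(u)
            del units[u]
            del adj[u]
            merged = True
            break
        if not merged:
            return list(units.values())
```

The function returns units still below the threshold when no same-province neighbor exists, so such cases are flagged for a manual decision rather than forced across a provincial boundary.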
For the GIS analysis, data tables provided in Excel files (Microsoft Corp., Redmond, WA) by the Armed Forces Research Institute of Medical Sciences were joined to district-level GIS maps obtained from the Thai Environmental Institute and National Statistical Office. We used ArcView 3.2a software (ESRI, Redlands, CA) to create dot-density and choropleth maps. (A choropleth map uses shades or colors to show the geographic distribution of a range of values.)