The present study analyzed epidemiological datasets of confirmed influenza cases diagnosed at Narita International Airport during the early stages of the 2009 pandemic and of a selected and suspected fraction of passengers screened from September 2009 to January 2010. In our retrospective assessment of the diagnostic performances of fever screening in detecting and diagnosing influenza at the main entrance airport to Japan, three key findings were notable. First, despite the small sample size, the sensitivity of fever (e.g. at a cut-off of 38.0°C) for detecting H1N1-2009 upon arrival was estimated to be as low as 22.2% among the confirmed cases with H1N1-2009. In addition, 5 of the 9 confirmed cases with H1N1-2009 (55.6%) were under antipyretic medications upon arrival. Second, the estimates of the diagnostic performances of the infrared thermoscanners in identifying fever among the selected and suspected fraction of passengers were lower than those in previously published studies, in which the samples were mostly drawn from general populations using prospective study designs and/or under ideal study conditions [15]. For example, the sensitivity and AUC for the cut-off level of 38.0°C in the present study were as low as 50.8% and 72.4%, respectively. Third, even though we examined a suspected fraction of passengers as our subjects (i.e. those who were theoretically more likely to be febrile than the remaining passengers), the PPV still appeared to be as low as 37.3-68.0%. Considering the total passengers arriving at Narita International Airport, the actual PPV will be smaller than our estimates (owing to the smaller prevalence of hyperthermia), implying more false-positive passengers during mass screening if one relies on infrared thermoscanners for active detection of hyperthermia [21]. In summary, our retrospective study demonstrates that reliance on fever alone is unlikely to be feasible as an entry screening measure.
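The dependence of the PPV on prevalence noted above can be illustrated with a minimal Python sketch based on Bayes' rule. The sensitivity of 50.8% is the value reported for the 38.0°C cut-off; the specificity and the three prevalence values are hypothetical placeholders chosen only for illustration and are not estimates from this study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.508  # reported in the text for the 38.0°C cut-off
SPECIFICITY = 0.90   # hypothetical value, NOT taken from the study

# Hypothetical prevalences of hyperthermia: a suspected fraction,
# an intermediate group, and all arriving passengers.
for prevalence in (0.30, 0.05, 0.001):
    value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={value:.3f}")
```

With sensitivity and specificity held fixed, the PPV falls as the prevalence falls, which is why estimates obtained from a suspected fraction of passengers would be expected to overstate the PPV among all arriving passengers.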
The most important caveat of the present study is that there are two independent processes when considering the diagnostic performances of fever screening at an international border [28]. The first is the sensitivity of fever for detecting influenza cases. Although influenza-like illness (e.g. defined as a temperature higher than 37.8°C plus either cough or sore throat) can be accurately found by clinical examinations, it is known that clinical findings do not permit the confirmation or exclusion of the diagnosis of influenza [29]. Whereas the sensitivity of fever alone is undoubtedly higher than that of influenza-like illness and fever screening may be useful for avoiding a substantial number of false-negatives [31], more critical studies on influenza-like illness have indicated that a high temperature (37.8°C or higher) is not the prime indicator of influenza [32]. Thus, even with these facts alone, it is evident that active identification of influenza cases by fever screening alone is unlikely to be feasible. In addition, our experience at Narita International Airport led us to realize that the axillary temperature tends to be readily modified by commercially available medications (e.g. antipyretics) in practical settings. Although the proportion of febrile cases among confirmed H1N1-2009 cases was reported to be 94% in the United States [34], no direct comparison can strictly be made because the fraction of febrile cases at an international border is different from that among the total number of confirmed cases in a community. Nevertheless, taken at face value, that figure of 94% and the figure of 22.2% obtained in our study suggest that the antipyretic medications taken by our study participants potentially reduced the risk of fever by 76.4% in relative terms.
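The figure of 76.4% can be reproduced as a relative reduction, under the assumption (which, as noted, does not strictly hold) that the 94% reported in the community and the 22.2% observed at the border are comparable. The symbols below are introduced here only for illustration.

```latex
\[
1 - \frac{p_{\text{border}}}{p_{\text{community}}}
  = 1 - \frac{0.222}{0.94}
  \approx 1 - 0.236
  = 0.764
\]
```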
Second, even if the diagnostic performances of the infrared thermoscanners in detecting fever were sufficiently high, the prevalence of hyperthermia would be very small among the total number of international passengers, and thus the PPV would be considerably lowered [20]. The finding our study adds to the literature on this point is that the PPV of infrared thermoscanners was still insufficient for actively detecting febrile passengers, even when our interest was restricted to a suspected fraction of passengers. The sensitivity of entry screening in correctly detecting and diagnosing symptomatic influenza is measured by the product of the above-mentioned two different sensitivities [28], i.e. the sensitivity of fever for detecting influenza cases and the sensitivity of a non-invasive device for detecting febrile passengers. The PPV of entry screening is therefore smaller than that of the infrared thermoscanners alone. Of course, a confirmatory diagnosis of influenza is further required to account for the limited sensitivity of the rapid diagnostic testing. The present study does not criticize the use of infrared thermoscanners, but does emphasize that reliance on their use during the entry screening of influenza is unlikely to be feasible. Such devices could be used for other purposes (e.g. estimation of true prevalence based on known estimates of sensitivity and specificity among the total passengers) or in other settings (e.g. screening of fever in a setting with a far greater prevalence of hyperthermia), because infrared thermoscanners improve the detection of fever and are especially useful in settings where the PPV and NPV do not matter [35].
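Two of the calculations mentioned in this paragraph can be sketched in Python. The first multiplies the two sensitivities; plugging in 22.2% and 50.8% is purely illustrative, because these figures come from different datasets in this study and the product assumes the two steps are independent. The second shows one common way of estimating true prevalence from known sensitivity and specificity, usually called the Rogan-Gladen estimator; the text does not say which method the authors have in mind, so this choice, the function names, and the example inputs are assumptions.

```python
def combined_sensitivity(sens_fever_given_flu, sens_device_given_fever):
    """Sensitivity of entry screening as the product of the two steps
    (assumes the steps act independently)."""
    return sens_fever_given_flu * sens_device_given_fever


def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Estimate true prevalence from the fraction screening positive.

    One common estimator, assumed here; not stated in the source text.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("test must perform better than chance")
    estimate = (apparent_prevalence + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))  # clamp to the valid range [0, 1]


# Illustrative only: figures quoted in the text, drawn from different datasets.
print(f"{combined_sensitivity(0.222, 0.508):.3f}")

# Hypothetical inputs: 17% screen positive, sensitivity 0.8, specificity 0.9.
print(f"{rogan_gladen(0.17, 0.8, 0.9):.3f}")
```

Under these illustrative inputs the combined sensitivity is roughly 11%, which shows how quickly the overall detection probability shrinks when each step is imperfect.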
Our estimates of the diagnostic performances must be interpreted with caution (Table ). The analyses of our second dataset were based on a retrospective non-random sample that was considered to represent a suspected fraction of passengers. In other words, the estimated sensitivity and specificity are not applicable to other passengers owing to the imposed selection criteria, and instead are only useful for the sample population that we examined. Nevertheless, given the previous reports of the sensitivity and specificity among a wider spectrum of the population [20], this restriction need not be regarded as a weakness. The scientific value of our retrospective study was to demonstrate that the diagnostic performances of infrared thermoscanners in detecting febrile passengers, especially the sensitivity, can be even worse among the suspected fraction of passengers than among all the passengers. In addition to previous studies indicating that the use of infrared thermoscanners for fever screening prior to voluntary self-reporting was not fully justified [20], our study has demonstrated that infrared thermoscanners were not useful for actively detecting fever, even among a selected and suspected fraction of passengers. Our restriction to a selected and suspected fraction of passengers, especially with the inclusion of those detected by the infrared thermoscanners, could partly explain the small estimates of the specificity. For example, because the sample represents the suspected fraction of passengers, there were few subjects with low axillary temperatures, thereby leading to small estimates of the specificity compared with all arriving passengers. Since the inclusion of cases detected by the infrared thermoscanners in our samples complicates an explicit interpretation of our estimates, we also examined the diagnostic performances only among the self-reported cases. The estimates of PPV and NPV among the self-reporting passengers did not differ significantly from those among our total subjects.
In addition to the limited diagnostic performance of fever screening in identifying febrile influenza cases, it should be remembered that the readings of infrared thermoscanners are known to be influenced by other confounding factors, most notably by age and outdoor temperature [15]. Although we were not able to adjust for room temperature owing to its variation depending on air-conditioning and individual routes (e.g. gate and satellite combinations), age was shown to be a confounding factor, even among the suspected fraction of passengers. There are two plausible explanations for these findings: (a) physiological reasons including age-dependent vascular reactivity (e.g. the temperature varies more easily among children than among elderly persons) [36] and (b) influenza H1N1-2009 has mainly been observed in younger individuals, most notably among school-age children [37]. Although no confirmatory diagnoses of H1N1-2009 were made during the screening from September 2009 to January 2010, it is likely that substantial numbers of undetected cases were allowed into Japan during the study period [41]. The above-mentioned point (b) poses a technical challenge, because the age-dependence of influenza epidemiology changes in real time, which introduces a time-dependency in the influence of age on the readings of the infrared thermoscanners (i.e. a simple statistical adjustment does not hold in such instances). As an additional complication, and perhaps one of the most important features among international passengers, our experience at Narita International Airport led us to realize that the use of antipyretics and antivirals is very likely among febrile passengers in practical settings, thereby greatly complicating detection owing to masked symptoms. Among those with any suspicious symptoms, it is natural that commercially available antipyretics are widely used without any restrictions, and the different timings, doses and medicines do not permit us to adjust for their influence by statistical modeling.
Except for cases of imminent public health risk, the revised International Health Regulations (IHR) in 2005 were intended to minimize interference with world travel, permitting only non-invasive and least intrusive medical examinations that could achieve a "public health objective" [42]. Although infrared thermoscanners are non-invasive and may detect a small portion of febrile influenza cases among the total passengers, our study has demonstrated fundamental problems in the reliance on fever in detecting and diagnosing influenza in international passengers. In addition to the issue of screening, the effectiveness of entry screening involves the presence of incubating individuals [5] and asymptomatic cases [7]. Given the limited information that we can gain from fever alone, one could further examine other vital signs to improve the detection during mass screening [44], along with efforts to promote self-reporting and improve its coverage. In addition to such devices, it is vital to reconsider the public health objectives of entry screening measures with a specific disease in mind (e.g. influenza) [45], and the way forward requires us to explicitly define the roles and purposes of international border control in the event of the next pandemic [46].