The evaluation of continuous glucose monitor (CGM) alert performance should reflect patient use in real time. By evaluating alerts as real-time events, their ability to both detect and predict low and high blood glucose (BG) events can be examined.
True alerts (TA) were defined as a CGM alert occurring within ± 30 minutes from the beginning of a low or a high BG event. The TA time to detection was calculated as [time of CGM alert] – [beginning of event]. False alerts (FA) were defined as a BG event outside of the alert zone within ± 30 minutes from a CGM alert. Analysis was performed comparing DexCom™ SEVEN® PLUS CGM data to BG measured with a laboratory analyzer.
Of 49 low glucose events (BG ≤70 mg/dl), with the CGM alert set to 90 mg/dl, the TA rate was 91.8%. For 50% of TAs, the CGM alert preceded the event by at least 21 minutes. The FA rate was 25.0%. Similar results were found for high alerts.
Continuous glucose monitor alerts are capable of both detecting and predicting low and high BG events. The setting of alerts entails a trade-off between predictive ability and FA rate. Realistic analysis of this trade-off will guide patients in the effective utilization of CGM.
Low and high glucose alerts are standard features on every continuous glucose monitor (CGM). The availability of alerts in real time may help patients reduce time spent in low and high glucose zones1–5 and reduce hemoglobin A1c (HbA1c) without increasing hypoglycemia risk.6–9 There is little consensus on methods to evaluate CGM alert performance.10 The first reported CGM alert study applied receiver–operator characteristic (ROC) curve analysis11 to paired CGM and blood glucose (BG) readings. The ROC describes alert performance in practical terms of true alert (TA) and false alert (FA) rates. However, as applied in this and other reports,12 the ROC did not account for the time series of CGM data. When a low glucose alert sounds, it indicates that CGM readings have crossed the alert level and are falling. Because glucose, when falling, requires time to slow down, turn around, and move in the other direction, CGM alerts are predictive in nature.13,14 The objectives of this article were to present evaluation methods that reflect (1) how CGM alerts behave and are used by patients in real time and (2) how the results can assist patients in setting their alerts. Because of their predictive capability, CGM alerts can be used not only to detect undesirable BG excursions after they occur, but to minimize or avoid these excursions. However, the setting of alerts requires patients to make trade-off decisions between the timeliness of excursion detection and the frequency of nuisance alerts. To demonstrate these trade-offs, the evaluation methods are applied to clinical data collected with the DexCom™ SEVEN® PLUS CGM.
The first CGM ROC analysis11 categorized alerts as true positive (TP), false positive (FP), true negative (TN), and false negative (FN) (Table 1). From the results of this categorization, alert sensitivity and specificity were derived: sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP).
When ROC analysis is applied to paired CGM-BG readings without accounting for the time of event occurrence, the results are misleading. In typical CGM studies, about 10% of BG values are ≤70 mg/dl.5,12 If the likelihood of collecting a low BG reading is 10%, a low glucose alert that is false 50% of the time will still appear highly specific. To illustrate, if 1000 BG readings are collected during a clinical study, 100 of which are ≤70 mg/dl, a CGM with 50 TPs and 50 FPs would have a sensitivity of 50%, as sensitivity = 50/100, and a specificity of 94%, as specificity = 850/900 (Table 2). Of the 100 CGM alerts, TP + FP, 50% are false. The 94% specificity, which suggests a very different likelihood of false alerts (6%), overestimates alert performance because the specificity calculation incorrectly assumes that all TNs are related to the low alert. In fact, the TN compartment is mainly composed of data unrelated to the timing of the alert.
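The arithmetic of this illustration can be checked with a short sketch. The function names are illustrative only and are not taken from the study's analysis code; the counts are the ones given in the example above (1000 readings, 100 of them ≤70 mg/dl, 50 TPs, 50 FPs).

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of low BG readings that were flagged by the alert."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-low BG readings that were not flagged."""
    return tn / (tn + fp)


def false_alert_fraction(tp: int, fp: int) -> float:
    """Fraction of all alerts (TP + FP) that were false."""
    return fp / (tp + fp)


# Counts from the illustration: 100 low readings (50 TP, 50 FN),
# 900 non-low readings (50 FP, 850 TN).
tp, fn, fp, tn = 50, 50, 50, 850

print(f"sensitivity          = {sensitivity(tp, fn):.1%}")
print(f"specificity          = {specificity(tn, fp):.1%}")
print(f"false alert fraction = {false_alert_fraction(tp, fp):.1%}")
```

The specificity is dominated by the 850 TNs, most of which are unrelated in time to any alert, which is why it looks favorable even though half of the alerts are false.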
By using paired CGM-BG readings, and not accounting for when alerts actually occur, alert performance can be underestimated as well. In an example from the current study (Figure 1), an alert sounded at 1:50 pm. Twenty-five minutes later, the BG confirmed the alert as a TA. If, however, individual paired CGM-BG readings are categorized as TP, FP, and so on, then three FPs (BG >100 mg/dl, CGM <100 mg/dl) would be reported more than 80 minutes after the low alert sounded. In fact, these three paired CGM-BG readings occurred when the alert was not sounding, and presumably had already been responded to, as glucose levels were rising. Such data points, which would have been reported as FPs in previous studies,11,12 should not be considered as a FA because they are not related in time with an actual alert.
To evaluate CGM alerts accurately, only data proximate in time to a low or high BG event or CGM alert should be considered relevant. If alerts are used as a means to avoid excursions into low and high glucose zones, not merely to detect such excursions, alerts should be evaluated for their predictive capability. This article investigated the effectiveness of alerts by measuring how often CGM alerts proximate to low or high glucose events, as identified by reference blood values, were predictive of, i.e., preceded, the excursions. We also investigated how often, when CGM alerts occurred, the proximate reference blood values determined the alert to be false.
Continuous glucose monitor (DexCom SEVEN PLUS, DexCom, Inc., San Diego, CA) and reference BG data were collected from 53 adults with insulin-dependent diabetes [43 (81%) type 1] across three sites within the United States, wearing 72 sensors (18 wore 2 sensors, and 1 sensor was replaced). Subject demographics (mean ± standard deviation, unless otherwise indicated):
Subjects were instructed to calibrate twice per day with a BG meter. Each subject wore the CGM for 7 days and underwent an 8-hour in-clinic session on day 1, 4, or 7. During the in-clinic session, venous plasma samples [measured with a Yellow Springs Instruments (YSI)-2300 (YSI Life Sciences, Yellow Springs, OH)] were taken every 15 minutes.
Low BG events were defined as follows:
Target BG thresholds of 70, 80, and 90 mg/dl were considered. CGM low alerts were defined as the time at which the CGM readings transitioned from > to ≤ low alert level.
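The alert definition above (a transition of CGM readings from > to ≤ the alert level) can be sketched as follows. This is a minimal illustration, not the study's MATLAB code: the function name and the sample readings are hypothetical, and a trace that starts at or below the alert level is assumed not to generate an alert until it has first risen above the level.

```python
from typing import List, Sequence


def low_alert_times(times: Sequence[float],
                    cgm: Sequence[float],
                    alert_level: float) -> List[float]:
    """Return the times at which CGM readings cross from > to <= alert_level."""
    alerts = []
    for i in range(1, len(cgm)):
        # An alert is a downward crossing between consecutive readings.
        if cgm[i - 1] > alert_level and cgm[i] <= alert_level:
            alerts.append(times[i])
    return alerts


# Hypothetical CGM trace sampled every 5 minutes (values in mg/dl).
times = [0, 5, 10, 15, 20, 25, 30]
cgm = [110, 100, 92, 88, 85, 93, 89]

print(low_alert_times(times, cgm, alert_level=90))  # [15, 30]
```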
Using time windows similar to McGarraugh and colleagues,15 CGM low glucose alerts were evaluated:
Low alert levels of 70, 80, and 90 mg/dl were evaluated.
High BG events were defined by similar criteria and were evaluated using TA, MA, FA, and BA criteria appropriate for high alerts. Target high BG thresholds and CGM alert levels of 140, 160, and 180 mg/dl were evaluated.
Analysis methods are illustrated in an example scenario (Figure 2). In this example, a low glucose alert is set at 90 mg/dl to detect a target low BG of 70 mg/dl. The alert analysis uses two frames of reference. The first frame of reference is based on reference BG values (Figure 2A). In this case, a low BG event occurs when reference values are below the target level (red line). If CGM values (black lines) fall below the alert level within ± 30 minutes from the beginning of the event, it is considered a TA; if not, it is considered an MA. MAs pose a clinical risk to the patient because they represent undetected BG excursions. The second frame of reference is based on the CGM alert (Figure 2B). The low alert sounds when CGM values fall below the alert level (black line). A BA occurs if, within ± 30 minutes from the alert, reference values (red lines) fall below the alert level, but not below the target level. An FA occurs if reference values do not fall below the alert level. FAs, which represent alerts that may not require corrective action, are a potential nuisance to the patient.
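The two frames of reference can be sketched in code. This is an illustrative reading of the scenario described above, not the study's implementation. The function names are hypothetical, "below" is taken as ≤ (as in the event counts reported in the results), and when several alerts fall inside the window the earliest one is assumed to define the time to detection.

```python
from typing import Optional, Sequence, Tuple

WINDOW_MIN = 30.0  # +/- 30-minute window used in both frames of reference


def classify_event(event_start: float,
                   alert_times: Sequence[float]) -> Tuple[str, Optional[float]]:
    """Event frame: 'TA' plus time to detection (minutes), or 'MA'."""
    in_window = [t for t in alert_times if abs(t - event_start) <= WINDOW_MIN]
    if not in_window:
        return "MA", None
    # Assumption: the earliest alert in the window counts.
    # Negative time to detection means the alert preceded the event.
    return "TA", min(in_window) - event_start


def classify_alert(alert_time: float,
                   ref_times: Sequence[float],
                   ref_bg: Sequence[float],
                   target: float,
                   alert_level: float) -> str:
    """Alert frame: 'TA', 'BA' (benign), or 'FA' from nearby reference BG."""
    nearby = [bg for t, bg in zip(ref_times, ref_bg)
              if abs(t - alert_time) <= WINDOW_MIN]
    if not nearby:
        return "unknown"  # no reference data close enough to judge the alert
    lowest = min(nearby)
    if lowest <= target:
        return "TA"
    if lowest <= alert_level:
        return "BA"
    return "FA"


# Hypothetical example: alert at t = 40 min, low BG event begins at t = 60 min.
print(classify_event(60, [40]))                      # ('TA', -20)
print(classify_alert(40, [30, 45, 60, 75],
                     [95, 88, 82, 85], 70, 90))      # 'BA'
```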
Analyses were performed under the assumption that each CGM device could be treated independently. Analyses were performed using MATLAB (MathWorks Inc., Natick MA) version 7.8.
There were 49 low glucose events less than or equal to 70 mg/dl across 44 sensors, 31 subjects; 51 low glucose events less than or equal to 80 mg/dl across 47 sensors, 32 subjects; and 57 low glucose events less than or equal to 90 mg/dl across 49 sensors, 34 subjects (Table 3).
The first evaluation was with the low alert set to the low BG target level. If set to 70 mg/dl to detect BG less than or equal to 70 mg/dl, the TA rate was 65.3% (Figure 3A). During half of the 34.7% of low glucose events that were missed, the CGM read between 71 and 83 mg/dl. With the low alert level set to 80 or 90 mg/dl, to detect a BG target level less than or equal to 80 or 90 mg/dl, TA rates were 80.4% and 77.2%, respectively. More than half of all low alerts sounded prior to the beginning of the low glucose event, i.e., more than half of low alerts were predictive. Across alert levels, the median time to detection varied between –6.6 and –6.8 minutes (Figure 4A).
If low alerts were set 10 or 20 mg/dl above the low BG threshold to detect a low BG of 70 mg/dl, the TA rate and alert predictive ability improved (Figure 3A). With the low alert level set to 80 mg/dl, the TA rate was 77.6%, and during half of the events missed, the CGM read between 81 and 85 mg/dl. The TA rate with the low alert level set to 90 mg/dl was 91.8%. At either alert level, more than 75% of TAs predicted the corresponding low glucose event. For the 80- and 90-mg/dl alert levels, median time to detection was –12.5 and –21.8 minutes, respectively (Figure 4A).
There were 57 high glucose events greater than or equal to 180 mg/dl across 55 sensors, 39 subjects, and 58 high glucose events greater than or equal to 160 and 140 mg/dl across 54 sensors, 38 subjects (Table 3).
With the high alert level set at the high BG target level, TA rates for alerts of 140, 160, and 180 mg/dl were between 87.7 and 87.9% (Figure 3B). Median time to detection varied between –3.3 and +2.9 minutes (Figure 4B). If, to detect a high BG target of 180 mg/dl, the high alert was set 20 or 40 mg/dl below the target level, TA rates and time to detection improved. With the high alert level set to 160 or 140 mg/dl, TA rates were 93.0 and 96.5% (Figure 3B), and median time to detection was –9.8 and –16.4 minutes, respectively (Figure 4B).
The increase in the TA rate that occurred with alert levels of 80 or 90 mg/dl, compared with 70 mg/dl, is accompanied by an increase in the FA rate (Table 4). With the definition of FA as all BG values proximate to the low alert above 90 mg/dl, the false alert rate increased from 8.0% (of 50 alerts) with the alert set to 70 mg/dl to 14.8% (of 54 alerts) and 25.0% (of 60 alerts) with the alerts set to 80 or 90 mg/dl, respectively. Similarly, at high glucose, with the definition of FA as BG never rising above 140 mg/dl within the ± 30-minute window from the alert, the FA rate increased from 0.0% (of 52 alerts) with the alert set to 180 mg/dl to 1.9% (of 54 alerts) and 17.3% (of 52 alerts) with the alerts set to 160 or 140 mg/dl, respectively (Table 4).
There is a trade-off in setting CGM alerts between increased safety with higher TA rates (lower MA rates) and increased nuisance with higher FA rates. This trade-off is illustrated in the ROC curve (Figure 5). The ideal alert setting would achieve a 100% TA rate and a 0% FA rate and would fall on the upper left extreme of the ROC grid. As actual alert settings are adjusted to increase the TA rate (the vertical position on the grid), the FA rate (the horizontal distance from the left extreme of the grid) also increases (Figure 5).
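The trade-off can be made concrete with the low alert results reported above for a 70 mg/dl target. The sketch below only tabulates those published rates; it does not reproduce Figure 5, and the variable and function names are illustrative.

```python
# (alert level in mg/dl, TA rate %, FA rate %) for a low BG target of 70 mg/dl,
# taken from the results reported in the text.
operating_points = [
    (70, 65.3, 8.0),
    (80, 77.6, 14.8),
    (90, 91.8, 25.0),
]


def missed_alert_rate(ta_rate_pct: float) -> float:
    """MA rate is the complement of the TA rate in the event frame."""
    return round(100.0 - ta_rate_pct, 1)


def rates_rise_together(points) -> bool:
    """True if both TA and FA rates are non-decreasing along the list."""
    ta = [p[1] for p in points]
    fa = [p[2] for p in points]
    return ta == sorted(ta) and fa == sorted(fa)


for level, ta, fa in operating_points:
    print(f"alert {level} mg/dl: TA {ta}%, MA {missed_alert_rate(ta)}%, FA {fa}%")

print("TA and FA increase together:", rates_rise_together(operating_points))
```

Raising the alert level from 70 to 90 mg/dl moves the operating point up (fewer missed events) and to the right (more false alerts) on the ROC grid.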
This evaluation of CGM alerts as time series events provides information that can assist patients in setting their alerts. These results show that for a low BG target level of 70 mg/dl, a low alert set to 90 mg/dl is capable of detecting 91.8% of low BG events. Seventy-five percent of these alerts will occur 7 minutes or more before the low glucose excursion, and 50% of these alerts will occur 21 minutes or more before the excursion. However, with the low alert set to 90 mg/dl, it can be expected that 30.0% of low alerts will correspond to BG levels between 70 and 90 mg/dl and that 25.0% will correspond to BG levels >90 mg/dl.
A useful analogy for CGM alerts is an alarm clock. For people to arrive at their office by 8:00 am, they typically set their alarm clock up to several hours ahead of the desired arrival time. If they set their clock for 8:00 am, they would be late. Factors involved in managing glucose levels and setting CGM alerts are of course more complex than those involved in setting an alarm clock, but perhaps similar considerations apply. Continuous data from a CGM allow the patient to account for the “turnaround time” needed to minimize or avoid an undesirable excursion. In a low alert scenario, the threshold alert is set at a level above the undesirable low glucose zone. Because glucose moving downward follows strict laws of physics and physiology, once it passes a threshold while declining it will continue to go in the direction it was originally going. In order to turn around and go in the other direction, it has to decelerate and then begin to accelerate in the other direction.13,14 Setting the alert at a level above the undesirable low glucose zone gives the patient time to take corrective action.
The trade-off between safety and nuisance in the setting of CGM alerts can also be illustrated with an alarm clock. The alarm clock can be set to account for known time requirements and for potential unexpected delays, such as weather or traffic problems. The earlier the alarm clock is set, the higher the likelihood of timely arrival. However, this benefit is offset by the nuisance of an early alarm. When a low alert level is set above a low BG target, an increase in the TA rate (decrease in the MA rate) and an improvement of the time to detection, which reduce risk of hypoglycemia, are accompanied by an increase in obtrusive and potentially annoying FAs (Figure 6).
Although the perception and tolerance of FAs may be subjective, the evaluation of FAs should attempt to answer the question: when an alert sounds, what is the likelihood that BG levels are in the normal zone and do not require immediate attention? A binary evaluation that categorizes alerts as either true or false may not reflect how alerts are actually perceived. The method of this investigation defines a gray or “benign” zone in which alerts could be perceived as tolerable (Figure 2B). If BG levels fall in this zone, an alert that sounds cannot be considered a TA, but may not be perceived as a nuisance either. To detect BG levels less than or equal to 70 mg/dl, if the low alert is set to 90 mg/dl, alerts that occur when BG levels fall below 90 mg/dl but not below 70 mg/dl may be perceived as tolerable or benign (Figure 6A). A binary definition in which any alert that sounds when BG levels do not fall below 70 mg/dl is considered false would result in a 55.0% FA rate. The perception of a high likelihood of FAs may discourage patients from setting the low alert to 90 mg/dl. However, in 30.0% of alerts that occurred at this setting, BG levels fell below the 90-mg/dl alert level. Given an alert setting at 90 mg/dl, these alerts can be expected to occur. In only 25.0% of alerts did BG levels not fall below the 90-mg/dl alert level. In these 25.0%, the alerts may have been an unnecessary intrusion. The setting of alerts requires each patient to make trade-off decisions between the timeliness of excursion detection and the frequency of FAs that depend on their individual tolerance for risk and nuisance.
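The relationship between the binary and three-way definitions can be worked through with the reported figures (60 alerts at the 90-mg/dl setting, 25.0% FA, 30.0% benign). The alert counts below are derived from those percentages rather than taken from the article's tables, and the function name is illustrative.

```python
def alert_breakdown(n_alerts: int, fa_pct: float, ba_pct: float) -> dict:
    """Split alerts into FA, benign (BA), and below-target counts."""
    fa = round(n_alerts * fa_pct / 100)   # BG never fell to the alert level
    ba = round(n_alerts * ba_pct / 100)   # BG fell to the alert level, not the target
    below_target = n_alerts - fa - ba     # BG fell to the target level
    return {
        "false": fa,
        "benign": ba,
        "below_target": below_target,
        # Binary definition: every alert without BG at or below target is false.
        "binary_fa_pct": 100 * (fa + ba) / n_alerts,
    }


print(alert_breakdown(n_alerts=60, fa_pct=25.0, ba_pct=30.0))
# {'false': 15, 'benign': 18, 'below_target': 27, 'binary_fa_pct': 55.0}
```

Separating the 18 benign alerts from the 15 false ones is what lowers the apparent nuisance rate from 55.0% to 25.0%.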
This evaluation relied on reference BG values sampled every 15 minutes to define the beginning of each low and high BG event. If reference values had been sampled more frequently, e.g., every minute, or if interpolation between values had been used to simulate a more frequent sample rate, the beginning of each event would occur 1 to 14 minutes earlier than was measured. On average, the beginning of the event would occur between 6 and 9 minutes earlier than as measured at a 15-minute sample rate. The time-to-detection results would accordingly shift toward less timely detection. If reference values had been sampled at a less frequent rate to more closely simulate actual patient practice, e.g., every 4 hours,13 the relative detection capabilities of CGM vs BG measurements would be higher than suggested by this analysis. Results presented here should therefore be interpreted as a comparison of the detection capabilities of CGM, at its intrinsic sample rate, and BG measurements if taken every 15 minutes.
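The effect of the 15-minute sample rate can be illustrated with linear interpolation between two consecutive reference values. Interpolation was not used in this analysis; the sketch below only shows how an interpolated onset would precede the sampled one. The function name and the sample values are hypothetical.

```python
def interpolated_onset(t0: float, bg0: float,
                       t1: float, bg1: float,
                       threshold: float) -> float:
    """Estimate when BG crossed down through threshold between two samples.

    Assumes BG changes linearly between (t0, bg0) and (t1, bg1).
    """
    if not (bg0 > threshold >= bg1):
        raise ValueError("samples must bracket a downward crossing")
    fraction = (bg0 - threshold) / (bg0 - bg1)
    return t0 + fraction * (t1 - t0)


# Hypothetical reference samples 15 minutes apart (mg/dl).
sampled_onset = 15.0  # first sample at or below 70 mg/dl
estimated_onset = interpolated_onset(0.0, 80.0, 15.0, 60.0, threshold=70.0)

print(estimated_onset)                  # 7.5
print(sampled_onset - estimated_onset)  # 7.5 minutes earlier than sampled
```

Shifting the event onset earlier by this amount would shift the time to detection by the same amount toward less timely detection.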
Because of time lags between glucose in interstitial fluid (ISF) and blood, it has been speculated that CGMs can be assumed to read high when glucose levels are falling.16 As reported previously with clinical data from the DexCom SEVEN, when glucose levels were falling faster than 1 mg/dl/min, the CGM read lower than BG by an average of 5.5% (relative difference).17 Previous theoretical suggestions that sensors should have a high bias when glucose was falling16 did not take into account (1) that sensors do not measure native ISF, they measure wound fluid, which is in more rapid equilibrium with blood than ISF; (2) calibration algorithms, which can take time lag into account; and (3) error from the calibration process, which is the greatest contributor to sensor error.14,17
This investigation focused on the evaluation of CGM alerts as they occur in real time and the predictive capability of alerts to help patients avoid undesirable excursions. The alerts evaluated in this investigation are threshold alerts that sound when glucose readings cross a patient-selected level. These alerts are distinct from alerts designed explicitly to predict glucose excursions, called “projected” alerts.15,18 With projected alerts, the trend of glucose readings is extrapolated up to 30 minutes into the future, and if the extrapolated glucose readings cross a threshold, an alert will sound. Like threshold alerts, the projected alerts are time series events. The first available real-time projected alert was the GlucoWatch G2 Biographer “down alert,” which extrapolated the glucose trend forward 20 minutes.18 In a study intended to evaluate the effectiveness of the “down alert” at predicting overnight hypoglycemia, although the sensitivity to detect hypoglycemia was improved from 8% with the threshold alert to 77% with the down alert, 65% of down alerts were false (BG >70 mg/dl within ± 30 minutes from the alert) compared to 16% for the threshold alert. More recently, McGarraugh and colleagues15 reported on the performance of Abbott Navigator projected alerts. With a threshold alert setting of 85 mg/dl, 47.7% of TAs occurred after the low glucose excursion (less than 70 mg/dl) had begun. The addition of projected alerts was intended to increase warning time, but to avoid the occurrence of FAs, the projected alerts were only activated when glucose was falling rapidly. As a result, 27.7% of (true) projected alerts occurred after the low glucose excursion had begun. To best utilize projected alerts, it may be helpful to ask the question: what is the likelihood that, prior to a low glucose excursion, the projected alert will sound? And if it sounds, how many minutes prior to the low excursion will the alert sound?
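The difference between a threshold alert and a projected alert can be sketched with a simple linear extrapolation. This is a hypothetical illustration only: it is not the algorithm of the GlucoWatch, the Navigator, or any other device, and it estimates the trend from the last two readings without the smoothing a real system would likely need.

```python
from typing import Sequence


def projected_low_alert(times: Sequence[float],
                        cgm: Sequence[float],
                        threshold: float,
                        horizon_min: float = 30.0) -> bool:
    """True if the extrapolated reading reaches threshold within the horizon.

    Assumption: the glucose trend is the slope between the last two readings.
    """
    if len(cgm) < 2:
        return False
    dt = times[-1] - times[-2]
    if dt <= 0:
        raise ValueError("times must be strictly increasing")
    slope = (cgm[-1] - cgm[-2]) / dt              # mg/dl per minute
    projected = cgm[-1] + slope * horizon_min
    # Only project while the current reading is still above the threshold;
    # at or below it, the ordinary threshold alert applies instead.
    return cgm[-1] > threshold and projected <= threshold


# Falling 1 mg/dl/min from 105 mg/dl: projected 75 mg/dl in 30 minutes.
print(projected_low_alert([0, 5], [110, 105], threshold=70))  # False
# Falling 2 mg/dl/min from 100 mg/dl: projected 40 mg/dl in 30 minutes.
print(projected_low_alert([0, 5], [110, 100], threshold=70))  # True
```

A longer horizon or a less restrictive trend requirement makes such an alert sound earlier, at the cost of more alerts for excursions that never occur.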
An approach to predictive alerts that may avoid the nuisance cost of auditory FAs is the one applied by Buckingham and associates19 to prevent nocturnal hypoglycemia by suspending insulin delivery. Because of the duration of action of basal insulin, the prediction horizon needed to prevent hypoglycemia could be 45 minutes or longer. The cost of FAs in this application is not the sounding of a nuisance alert, but rather rebound hyperglycemia from reduced basal insulin. In the 21 subjects studied following a 90-minute suspension of insulin delivery, rebound hyperglycemia was not seen. If effective utilization of predictive alerts requires prediction horizons beyond 30 minutes, a horizon likely to result in high FA rates, then an important application, one intended to be unobtrusive to patients, may be the automatic control of insulin delivery.
Evaluation of CGM alerts as time series events better reflects their real-time performance and utilization than the paired point-based methods used in early analyses.
If set to appropriate levels, CGM alerts are predictive of low and high glucose excursions. However, when set to levels for increased predictive ability, the false alert rate also increases. A realistic analysis of the trade-off between predictive ability and FA rate of alerts will guide patients in the effective utilization of CGM.
The authors thank Drs. Timothy Bailey (AMCR Institute, Escondido CA), Howard Zisser (Sansum Diabetes Research Institute, Santa Barbara CA), and Anna Chang (John Muir Physician Network Clinical Research Center, Concord, CA) for their involvement in the clinical study used for this article.