Potential reduced exposure tobacco products (PREPs) may have promise in reducing tobacco-related morbidity or mortality or may promote greater harm to individuals or the population. Critical to determining the risks or benefits from these products are valid human clinical trial PREP assessment methods. Assessment involves determining the effects of these products on biomarkers of exposure and of effect, which serve as proxies for harm, and assessing the potential for consumer uptake and abuse of the product. This article raises the critical methodological issues associated with PREP assessment, reviews the methods that have been used to assess PREPs, and describes the strengths and limitations of these methods. Additionally, recommendations for clinical trials PREP assessment methods and future research directions in this area based on this review and on the deliberations from a National Cancer Institute sponsored Clinical Trials PREP Methods Workshop are provided.
In recent years, there has been an increasing need to evaluate tobacco products for human toxicity and disease risk because of tobacco company efforts to manufacture and market new products that purportedly decrease exposure to tobacco and tobacco smoke toxicants. Past attempts to manufacture safer cigarettes (e.g., “light” cigarettes) only led to false hopes of reduced health risk. With newer technologies and the tobacco industry's promotion of another generation of reduced toxicant exposure or reduced risk tobacco products, described as Potential Reduced Exposure Products (PREPs) by the Institute of Medicine (IOM), a science base is required to inform the current debates about whether PREPs present promise or harm (1–2). These newer products include novel combustible products, such as those that have new filter designs or are processed or cured in a way that reduces some toxicants. They also include cigarette-like devices that heat tobacco, oral tobacco products that may reduce risk by eliminating exposure to toxicants associated with the combustion of tobacco, and electronic cigarettes that purportedly deliver only nicotine. Some of these PREPs have been on the market with implicit or explicit health claims that appear to be unsubstantiated. It is recognized that misleading claims related to PREPs might undermine successful tobacco control by adversely affecting consumer perception, leading to continued use of tobacco products in potential quitters, inducing former smokers to resume smoking, or promoting initiation. The importance of substantiating claims also points to the necessity of scientifically evaluating the effects of these products.
In an effort to avoid the mistakes that were made in the marketing of light, ultra-light and mild cigarettes, a careful study and strategic approach to evaluating tobacco products have been considered (2–6). For the most part, these reports describe three essential components for tobacco product evaluation, which are illustrated in Figure 1: 1) preclinical evaluation, which involves assessment of type and amount of tobacco constituents and smoke emissions, in vitro studies (e.g., cell culture studies) and animal in vivo studies; 2) clinical evaluation in humans and epidemiology studies involving assessment of pattern of product use, extent of exposure to toxicants and biological effect, abuse potential and consumer perception of the product; and 3) population effect of the product involving post-marketing surveillance and population surveys. The goal of these evaluations is to ensure that the PREP does not worsen exposure and disease risk compared to conventional products and to assess the amount of risk above complete cessation. Today, given that there are no acceptable biomarkers for cancer risk (6) with the possible exception of 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and its glucuronides (total NNAL), a biomarker for 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) exposure (7–8), the best evaluation of a PREP is limited to assessing human exposure reduction. However, exposure reduction is distinct from risk reduction; for example, exposure reduction of one or several tobacco toxicants might not result in reduced disease risk for the individual. Furthermore, although risk reduction in an individual might be feasible, the overall tobacco toxicant exposure in the population might increase because of delayed quitting or resumption of the use of tobacco products due to the availability of the PREPs.
Thus, as shown in Figure 2, the assessment of PREPs in humans spans a spectrum of somewhat distinct outcomes: exposure reduction and risk reduction refer to individual effects, whereas harm reduction refers to population effects.
Although frameworks for studying PREPs were proposed by the IOM and other reports, the actual methodologies to be used were left for later evaluation and research by scientists. One critical component of tobacco product assessment that has not undergone extensive review is the set of methods and measures for conducting clinical trials to assess whether a new PREP results in exposure and risk reduction, and whether an effect on harm reduction can be inferred. The goal of this paper is to identify and discuss the challenges and questions associated with clinically assessing PREPs and to review the published human clinical trial literature on the methods and measures used in evaluating PREPs.
The choice of trial designs, methods and measures is complex, and many decisions are necessarily made in the absence of scientific data. To help address questions that are critical to trial designs, and for which there is insufficient data to answer them, a Clinical Trials Workshop of experts (see Appendix 1) was convened. This Workshop was held on June 9 and 10, 2008 and included presentations and discussions by experts within and outside of the tobacco research area using the review paper as a basis for discussion. The deliberations and recommendations from this Workshop are also described along with recommendations for future research directions. It is important to note that this paper is not intended to discuss the relative toxicant levels or risk for disease across the PREP products, which have been described in other reviews (3, 9–11), but to primarily focus on methods used to assess PREPs.
Several challenges are associated with assessing PREPs. These challenges and questions are described in Table 1.
As with any clinical trial, the first decision to be made is the goal(s) of the trial. Depending on the research question and the scientific discipline of the researchers, different approaches can be taken. The primary reason to assess PREPs in humans is to gain an understanding of changes in exposure and risk. This would happen via assessment of tobacco use behavior (e.g., topography), pattern of use and biomarkers of exposure and of effect. For cigarettes, topographical measures would include number of puffs, puff duration and volume, puff velocity, interpuff interval, and inhalation volume and duration. For smokeless tobacco (ST) users, the measures would include number of dips, size of dip, dip duration and interdip interval. Assessment of pattern of product use is also critical, as this could greatly affect individual exposure and risk. This can be assessed in a naturalistic environment by examining: 1) whether the subject uses the PREP solely or uses the PREP with usual brand tobacco products or other nicotine products; 2) the amount of PREP use; 3) time to compensation and stabilization of use; 4) the duration of use; and 5) the impact of PREP use on the use of conventional products (e.g., eventual cessation, return to usual brand at same rate or reduced rate, or switching to another tobacco product). Equally important for the PREP assessment would be to determine the extent of toxicant exposure and biological effect from using the product as compared to conventional or usual brand tobacco products, and also compared to either cessation or medicinal nicotine products. Different trial designs can address these main goals, but they generally involve switching a tobacco user from one product to another, e.g., in a randomized trial with a cross-over or parallel arm design.
These switching studies are particularly challenging because it might be difficult to decipher whether the changes in biomarkers of exposure and effect are due to the product itself, the way the person uses the product or individual differences in biological response to the product (6).
There is a balance between successful exposure reduction and consumer use. If a product substantially reduces exposure, but consumers do not buy the product, then there cannot be harm reduction. Conversely, if a product has a high level of consumer acceptance, but the exposure reduction is minimal, then there is no harm reduction. Therefore, another goal of a clinical trial would be to assess the potential for use of and addiction to the product. This goal would also ensure that products with increased abuse or addiction potential (even with reduced exposure) are not marketed, which might impact future policies such as reducing the addiction potential of all tobacco products in order to reduce initiation and facilitate cessation (12). These clinical assessments would include determining the pharmacokinetics and pharmacodynamics of a product, subjective responses to the use of the PREP, namely appeal of the product, satisfaction or liking of the product, withdrawal suppression, and improvement in mood. This assessment would provide initial clues to the extent to which the product would be used or abused.
PubMed was searched on April 14, 2008, with searches limited to humans and English language, using the following search terms: reduced exposure products and tobacco; specific PREP product names (e.g., Eclipse, Accord, Advance, Ariva, Snus, Stonewall, Quest, Next, Omni); denicotinized cigarettes and tobacco; light cigarettes and tobacco (limited to last 20 years); ultralight cigarettes and tobacco (limited to last 20 years); low tar and tobacco; low yield and tobacco, electrically heated cigarettes (ECHSS) and smokeless and tobacco. In addition, the references cited in each of the articles were searched for other relevant articles. Studies were selected if they assessed a PREP or low yield cigarettes and were human clinical trials. The data were compiled to examine: a) goals of the study, b) experimental designs that were used, c) measures that were used, d) subject recruitment method, content and inclusion criteria, and e) methods to determine compliance, particularly in studies that involved using the product outside a laboratory setting.
Studies of the most recent PREPs introduced into the marketplace, or studies of conventional products with applicability to PREPs, usually involve specifically examining potential for use of and addiction to the product (e.g., nicotine pharmacokinetics and product preferences), subjective and physiological responses and biomarkers of exposure. These can be divided into 5 groups of studies, namely abuse liability studies, in-laboratory clinical trials (subject uses the product once or a few times, but only in a laboratory setting), short-term clinical trials (<2 weeks of duration on a particular product at home or in a residential facility, and the products are used throughout the day), intermediate-term clinical trials (>2 weeks and ≤12 months, and continuous use) and cross-sectional studies. Studies examining the potential for addiction to the product, which can precede short and intermediate-term clinical trials, have generally been short in duration and have typically occurred in the laboratory, although a few studies have occurred outside the laboratory. Other in-laboratory clinical trials are limited to studying only exposure assessments for biomarkers with extremely short time to equilibration and half-lives (e.g., exhaled carbon monoxide (CO), carboxyhemoglobin and nicotine). They also can measure acute physiological and subjective responses that may provide insight into potential consumer use of or interest in the product. These studies are limited because it is unlikely that subjects adapt to the product, stabilize their pattern of use and compensate for differences in nicotine delivery. Short-term clinical trials less than 2 weeks long have similar limitations, although more biomarkers can be used because the time to equilibration and half-lives can be longer (but less than 2 weeks).
Intermediate-term studies provide more information about toxicant exposures where biomarkers with longer half-lives and physiological responses after the subject has adapted to the product can be assessed. Cross-sectional studies of tobacco-users in the population can provide information about longer periods of product use, larger numbers of subjects, outcome measures on persons who self-selected their use of a product, and data analysis by subgroups. However, cross-sectional studies are limited because they do not describe changes in use over time or information about delayed quitting and cessation and are subject to cohort effects.
The study designs, outcome measures, subject recruitment methods, subject characteristics and product compliance procedures are described for each of the 5 categories or types of studies. A summary of the results, Clinical Trials Workshop recommendations and future research questions with regards to study design are addressed under each of the study type subsections. The summary and Workshop recommendations are combined for the short-term and intermediate-term trials. Clinical Trials Workshop recommendations for issues that cross study types are reserved for discussion later in the paper.
The assessment of harm reduction potential of a PREP should include the assessment of its abuse liability, a term used interchangeably with abuse potential. Abuse liability traditionally refers to the likelihood of addiction to the product based on product characteristics (e.g., level and rate of free nicotine delivery, flavorants and method of use). However, in the broader sense of the term, abuse liability can also refer to the population effects of the product and involves the interaction between the product and the user as well as the social and environmental context for its use (e.g., peer uptake, product marketing and cost). For purposes of this review, the narrower assessments of abuse liability will be addressed. Historically, abuse liability studies have been conducted to examine the potential for abuse of prescription or over-the-counter medications (e.g., sedative-hypnotics, barbiturates, pain medications, nicotine replacement therapies [NRT]) or to examine the relative abuse potential of existing or emerging recreational drugs and medications. Several excellent supplemental journal issues have been written on methods to assess drug abuse liability (13–15). Fewer human clinical trial studies have been conducted to assess the abuse liability of PREPs.
One method to assess abuse liability is to measure the nicotine pharmacokinetics and pharmacodynamics of a product (16–19) (see Table 2); the faster the absorption of nicotine and the greater the amount of initial nicotine delivery, the greater the subjective response and potential for abuse (20). By this measure, cigarettes have the highest potential for abuse and nicotine patches the lowest (see Figure 3). Even within a product category, such as smokeless tobacco, significant variability in nicotine pharmacokinetics is observed (17, 21, 22 – see Figure 4). Typically, these studies use a within-subject, cross-over design in which subjects are assigned to all products. Some of these studies included medicinal nicotine as a comparison to the tobacco products (16–17, 21–22) or a non-nicotine tobacco-like product (e.g., 17). Subjects are required to be abstinent overnight and to report to the laboratory in the morning, where abstinence is verified (CO in the case of cigarette smoking or reduced cotinine in the case of smokeless tobacco) and a sample of the product is administered. Multiple blood samples are taken to determine time to maximum plasma nicotine concentration (Tmax), maximum plasma nicotine concentration (Cmax), area under the curve (AUC) as well as half-life (t1/2) and clearance (CL). Another biomarker that has been used is carboxyhemoglobin and in one study, the extracted dose of nicotine from the tobacco product was examined (22). Also during this time, vitals and/or skin temperature are assessed as well as subjective responses to a product.
Such subjective responses include withdrawal symptoms and craving; drug liking (whether they felt any good effects from the study product, how satisfying the product was, how much they liked the study product, how much they desired the study product, how strong the study product was) and drug effect (felt any bad effects from the study product, felt alert, felt relaxed, felt a head rush or high, felt a tremor in hands, arms or face, felt light-headed/dizzy, felt drowsy, felt energetic or stimulated, felt jittery); and product evaluation (strength, smoothness, flavor quality, satisfaction, comparison to usual cigarettes, estimate of nicotine yield). In the case of a study conducted by Benowitz et al., (15), in which a major focus was to determine compensatory smoking behavior in cigarettes that differed in nicotine content but not tar yield, the measures included the ratio of nicotine intake/content and ratio of nicotine intake/Federal Trade Commission (FTC) nicotine yield.
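To make the pharmacokinetic outcome measures above concrete, the following is a minimal sketch of how Cmax, Tmax and AUC might be derived from one subject's serial plasma nicotine samples. It assumes the linear trapezoidal rule for AUC, which is a common noncompartmental convention but is not stated in the reviewed studies; the function name `pk_summary` and the sample values are hypothetical and are not data from any cited study. Half-life and clearance are not estimated here.

```python
def pk_summary(times_min, conc_ng_ml):
    """Summarize one session's plasma nicotine curve.

    times_min   -- sampling times in minutes, in increasing order
    conc_ng_ml  -- plasma nicotine concentrations (ng/mL) at those times
    Returns Cmax, Tmax and AUC over the sampled interval.
    AUC uses the linear trapezoidal rule (an assumption of this sketch).
    """
    times = list(times_min)
    conc = list(conc_ng_ml)
    if len(times) != len(conc) or len(times) < 2:
        raise ValueError("need at least two paired time/concentration samples")
    if any(t1 <= t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("sampling times must be strictly increasing")

    cmax = max(conc)
    tmax = times[conc.index(cmax)]  # time of the first occurrence of the peak
    auc = sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:])
    )
    return {"cmax": cmax, "tmax": tmax, "auc": auc}


# Hypothetical samples: baseline after overnight abstinence, then post-administration.
example = pk_summary([0, 5, 10, 30, 60], [2.0, 12.0, 10.0, 6.0, 4.0])
```

In a within-subject, cross-over design, a summary of this kind would be computed once per subject per product so that products can be compared on rate (Tmax) and extent (Cmax, AUC) of nicotine delivery.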
Another method for assessing abuse liability is to examine the preference for the PREP relative to other products or another PREP using a forced-choice paradigm (see Table 3). These studies typically involve sampling each of the products and then forcing the subject to choose one product over another (23; see below). The product sampling phase is conducted within a session (24) or over the course of several days and weeks (e.g., 23). Another option is to give the subject concurrent access to the products and allow the subject to choose any of them over the course of this choice phase or even throughout the course of the study, with the number of choices made for each product then calculated. Subjective responses of withdrawal, product liking, strength of effect and product evaluation of other characteristics are also assessed. PREPs can be compared with conventional tobacco products (e.g., either high or low nicotine yield delivery), medicinal nicotine or other PREPs. Therefore, these models can provide clues regarding the abuse potential of or preference for a PREP compared to conventional highly addictive products, to medicinal nicotine (safer) products and to another PREP. To date, few studies on PREPs have used this type of experimental paradigm.
Another paradigm that has been used to examine PREPs involves examining the extent to which a person would work to obtain a product and the extent to which a particular product substitutes for another product and at what cost (price-elasticity). As an example, in one outpatient laboratory study, cigarette-deprived dependent smokers worked for standardized cigarette puffs by pulling on a plunger on a progressive ratio schedule (increasing number of pulls for each puff) for either nicotinized or denicotinized cigarettes. These cigarettes were provided alone, or concurrently with the opportunity to earn money (25). Another variation included using the same paradigm in which subjects earned standardized puffs on both types of cigarettes when provided alone, except in another phase the subjects chose between the two cigarette types (26). The two cigarettes in both studies were compared on such measures as the breakpoints (the ratio at which subjects no longer worked for a puff), number of puffs earned per session, peak response rates, ratio producing peak response rates and the demand elasticity for cigarette puffs across a range of prices (number of required plunger pulls). In addition, cigarettes were rated on subjective measures (e.g., taste, drug effect, smoothness, enjoyment, the amount subjects would pay per pack of each type of cigarette). In another study, nicotine containing cigarettes were available at increasing unit price (increasing number of plunger pulls) with nicotine gum, denicotinized cigarettes or both concurrently available at a fixed price (e.g., fixed number of plunger pulls; 27). The outcome measure was cross-price elasticity (point at which smokers switched to the alternative product) for each alternative that was offered at a fixed price. In this paradigm, the preference between two concurrently offered products can be determined as well as the reinforcing value of the alternative compared to usual brand cigarettes.
In addition, withdrawal and smoking urges also are measured to determine if the products relieve withdrawal and urges to smoke, and if these variables would impact behavioral responses. Similar to the forced-choice paradigm, only a few studies have used this type of experimental design to examine PREPs.
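The progressive ratio outcomes described above can be illustrated with a short sketch. It assumes that the breakpoint is scored as the first ratio requirement at which no puffs were earned, and that demand elasticity is scored as the slope of log consumption against log price between adjacent price points; both are common behavioral-economic conventions rather than the scoring rules reported by the cited studies. The function names and the session values are hypothetical.

```python
import math


def pr_breakpoint(ratios, puffs_earned):
    """Return the first ratio (plunger pulls per puff) at which no puffs
    were earned, or None if the subject responded at every ratio.
    Scoring rule is an assumption of this sketch."""
    for ratio, puffs in zip(ratios, puffs_earned):
        if puffs == 0:
            return ratio
    return None


def point_elasticities(prices, consumption):
    """Own-price elasticity between adjacent price points, computed as
    d log(consumption) / d log(price). Returns None for a pair where
    consumption is zero at either price (the logarithm is undefined)."""
    results = []
    for i in range(len(prices) - 1):
        q0, q1 = consumption[i], consumption[i + 1]
        if q0 <= 0 or q1 <= 0:
            results.append(None)
            continue
        slope = (math.log(q1) - math.log(q0)) / (
            math.log(prices[i + 1]) - math.log(prices[i])
        )
        results.append(slope)
    return results


# Hypothetical session: unit price doubles across ratios; puffs earned falls.
ratios = [8, 16, 32, 64, 128]
puffs = [10, 10, 5, 0, 0]
bp = pr_breakpoint(ratios, puffs)
elasticity = point_elasticities(ratios, puffs)
```

An elasticity near 0 indicates that consumption is defended as price rises, while values below -1 indicate that consumption falls proportionally faster than price increases. Comparing these curves between nicotinized and denicotinized cigarettes is one way to express their relative reinforcing value.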
Other components for the assessment of abuse liability of a product include withdrawal relief from a product as a result of switching from usual brand to a PREP, withdrawal effects from the product, how much the product is used, occurrence of compensatory tobacco use behavior or dose escalation of the product over time and dependence on the product. These studies can be conducted within a laboratory (as described below) or in short or intermediate-term studies.
Several groups have published in-laboratory clinical studies where subjects are tested using products in the laboratory to assess acute subjective and physiological responses to a product and biomarkers of exposure (28–43) (see Table 4). These studies vary in the number of PREPs tested, whether PREPs are used ad libitum or in a controlled manner, whether one or more products are provided within a laboratory session or across several sessions, and duration of the session. Eissenberg and his colleagues have conducted several of these laboratory studies. In their experimental designs, laboratory sessions are typically held after overnight abstinence and subjects participate in a within-subject, cross-over design involving a 2.5 hour session. Subjects were asked to complete an 8-puff smoking bout every thirty minutes, with each session involving one of three to four different products (28–29, 35). A similar laboratory design (e.g., four 30-minute episodes of oral tobacco product use over a 4.5 hour session) has been used with smokeless tobacco users (44). Other within-subject laboratory studies have varied the way in which cigarettes are smoked. For example, studies have asked smokers to smoke two cigarettes with different nicotine yields either rapidly (up to nine cigarette puffs every 6 seconds) or at a normal pace (36), ad libitum during a 5 hour session (41), ad libitum at 30, 60 and 240 minutes during a 240 minute session (37) or every 30 minutes over 2 hours (34), or to smoke one PREP consecutively in a standard way (e.g., taking large puffs every 30 seconds and inhaling as deeply as possible, with the subsequent PREP smoked 45 minutes later; 33). Other studies have had participants smoke two or three different cigarettes (e.g., denicotinized vs. nicotinized cigarettes) during independent sessions controlling for no other variables (30) or smoke one cigarette ad libitum during independent sessions varying the length of abstinence prior to the session (30, 39, 43). Laboratory studies rarely provide a trial period for the product prior to the laboratory session; however, one study allowed two weeks of acclimation to the cigarette prior to laboratory testing (32). When the effects of product use instructions on outcome measures are compared across studies examining similar products, the direction of results tends to be the same whether the subjects smoked ad libitum or at a fixed rate (28, 34) (35–36, 39). However, because this observation is made across studies, no quantitative comparisons could be made.
Another unique study asked subjects to smoke different combinations of five denicotinized and nicotine cigarettes (i.e., 0, 1, 2, 3, 4 or 5 denicotinized cigarettes out of 5 total cigarettes) during each study day (40). Other non-cigarette studies have examined the effects of tobacco products given in increasing doses every 90 minutes (e.g., 1 Ariva, 2 Arivas 90 minutes later, followed by 3 Arivas 90 minutes later; 45). Only one study was conducted with adolescents where subjects were asked to smoke one of two cigarettes differing in nicotine yields in a between-subject study design (46).
A variety of measures are typically used in these studies. Subjective measures have included: 1) nicotine withdrawal typically using a modified Minnesota Nicotine Withdrawal Scale (28–30, 34–36, 39, 45) or Shiffman-Jarvik Scale (31, 36); 2) smoking urges or desire to smoke using the Questionnaire of Smoking Urges or other measures (28–30, 34–37, 39); 3) subjective responses to the product using such scales as the Nicotine Effects Visual Analogue Scale (nausea, clammy skin, dizziness, light headed, burning throat, tingling sensations, and heart racing) (VAS; Houtsmuller and Stitzer 1999 cited in study by 36), the cigarette effect questionnaire (pleasant, unpleasant, like taste, dislike taste, smoke versus air [anchored with mostly smoke to mostly air], harsh, strength, high in nicotine, like drug effect, dislike drug effect, satisfying, more awake, more calm, easier to concentrate, and less irritable) (Gross et al., 1997 cited in study by 36), or Cigarette Evaluation Scale (satisfaction, psychological reward, nausea or dizziness, craving relief, and enjoyment or airway sensation; 31); 4) sensory questionnaire (estimated nicotine delivery, similarity to usual brand, perceived strength on the tongue, nose, back of mouth and throat, windpipe and chest; 31); or 5) other scales that measure variables such as strength, mildness, taste, satisfaction, pleasantness, harshness, heat, smell, ease of draw, similarity to own brand of cigarettes, good effects and bad effects (30, 38–42, 46).
Smoking topography assessments included measures such as puff volume, duration, interpuff interval, maximum flow rate velocity (28–29, 36, 46) and number of puffs in those studies that did not control for this variable (31, 34–35, 38–39, 46) or signs of vent blocking (31).
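As an illustration of how the topography measures listed above relate to one another, the sketch below derives them from per-puff records. It assumes each puff is recorded as a start time, end time and volume, and it defines the interpuff interval as the time from the end of one puff to the start of the next; recording devices and definitions differ across studies, so these are assumptions of the sketch. It reports mean flow (volume divided by duration) rather than the maximum flow rate, which would require the within-puff flow trace. The function name and values are hypothetical.

```python
from statistics import mean


def topography_summary(puffs):
    """Summarize one smoking bout.

    puffs -- list of (start_s, end_s, volume_ml) tuples ordered by start time
    Interpuff interval is taken as end of one puff to start of the next
    (an assumption of this sketch).
    """
    if not puffs:
        raise ValueError("at least one puff is required")
    durations = [end - start for start, end, _ in puffs]
    if any(d <= 0 for d in durations):
        raise ValueError("each puff must end after it starts")
    volumes = [volume for _, _, volume in puffs]
    gaps = [puffs[i + 1][0] - puffs[i][1] for i in range(len(puffs) - 1)]
    return {
        "n_puffs": len(puffs),
        "mean_puff_duration_s": mean(durations),
        "mean_puff_volume_ml": mean(volumes),
        "total_puff_volume_ml": sum(volumes),
        "mean_interpuff_interval_s": mean(gaps) if gaps else None,
        "mean_flow_ml_per_s": mean(v / d for v, d in zip(volumes, durations)),
    }


# Hypothetical three-puff bout.
bout = topography_summary([(0.0, 2.0, 50.0), (30.0, 32.0, 40.0), (62.0, 63.5, 45.0)])
```

In studies that fix the number of puffs per bout, `n_puffs` is set by the protocol, so only the remaining measures can reveal compensatory changes in how a product is smoked.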
The majority of the abuse liability studies recruited subjects who were physically and mentally healthy, currently not taking psychiatric medications or taking medications or products that would interact with the product tested, and who were not dependent on or misusing other substances of abuse. Some studies stated that they excluded pregnant smokers (19) or smokers who had plans to quit smoking (27). The number of subjects for these studies typically ranged from 8 to 12, although one study had a number as high as 39 (20). The subject characteristics varied across studies from relatively young, less dependent population (e.g., reference 16) to relatively young but more dependent population (27).
For in-laboratory clinical studies, the inclusion criteria for adult subjects included being in general good health and, for some studies, a specified age range (18 to up to 65) (28–29, 31, 34–36, 38, 45), a specified number of cigarettes smoked, ranging from at least 10 or 15 cigarettes per day (CPD) (28–29, 31, 34–36, 43, 45) or at least 100 lifetime cigarettes (38), a specific type of cigarette smoked, such as non-menthol, light or ultralight cigarettes depending on the study product being examined (28–29, 35, 38), a specified FTC nicotine yield such as at least 0.5 mg (31) or a specific cut-off for CO of at least 7 to 15 ppm or greater (28–29, 34–36, 45). Subjects were excluded in some studies if they had previous experience with the product being tested (28–29, 34–35), were pregnant or breast feeding (28–29, 34–36, 45), were engaged in current attempts at smoking cessation or reduction (28–29, 35, 38, 43) or had intentions to quit in the next 6 months (31). Other criteria required for participation included being in good mental health or having no chronic mental condition requiring medication, and having no active drug abuse (36, 39) or excessive alcohol use (43). While the rationale might be apparent in some cases, some of the above inclusion criteria were arbitrary and may affect research results; however, this has not been studied.
The subject numbers in these in-laboratory clinical studies also tended to be small, typically under 20, with a range up to 32 subjects (35) and 50 subjects (43). Subjects in Eissenberg's studies tended to be younger (age ranges from 22 to 33 years), smoked fewer cigarettes (range from 15 to 22 cigarettes per day) and showed less dependence on the FTND/FTQ (scores from 4.0 to 5.6) than those in the other studies. Similarly, O'Connor et al. (38) recruited only college students with a low rate of smoking (11.5 cigarettes per day). With the exception of the study conducted with adolescents (46), the other studies tended to have smokers over the age of 30 (ranging from 34 to 45 years) who smoked more than 20 cigarettes per day (21 to 31 cigarettes per day) and had higher FTND scores (5.5 to 8.4). Likewise, the Eissenberg research group enrolled smokers who had smoked for fewer years (4 to 7 years; 29, 35, 45), while in other studies, smokers had smoked for at least 18 years (30–31, 33, 42–43). Most of the population was white, although a few studies had 50% or more minority subjects (31, 37). Most studies were evenly split between males and females, with some studies having all males (30, 33). Some of the studies reported FTC determined nicotine yield of the cigarettes (30–31, 37, 39–40, 43) and body mass index (BMI) (28, 35, 43) or weight (40). How PREP use might differ, and affect outcomes including biomarkers, physiological response, delay in quitting, etc., has not been studied for different age, gender and racial groups.
Compliance with product use was not a significant issue because most of these studies were conducted in the laboratory.
Abuse liability and in-laboratory clinical studies can be valuable in providing information on nicotine delivery of a product, acute toxicant exposure using biomarkers with very short half-lives, acute physiological and subjective responses to the product, and the potential for use or abuse of the product. The best methods to measure these outcomes, and whether the responses observed in the laboratory generalize to actual product use and risk, are unclear. For example, short-term in-laboratory studies differed in the method by which products were tested. Some involved standardized methods of product administration (i.e., established number of puffs) whereas other studies allowed ad libitum use. While the use of controlled smoking conditions might be important to determine how products compare against each other in exposure and subjective responses to the product, the extent to which these values are similar to those observed when smokers are allowed to use the product ad libitum, or how well either of these methods reflects how the product will be used or the extent of exposure in the real world, is unknown. Furthermore, many studies involved first time exposure to the product in the laboratory without time for adaptation. Whether or not responses to a product change when the subject is allowed to adapt to the product is also unknown. Additionally, the number and duration of product uses required for adaptation are unclear and may depend on the product and the individual.
For the assessment of abuse liability of PREPs, the Clinical Trials Workshop participants emphasized the importance of examining the weight of evidence based on the results from multiple studies. Furthermore, in interpreting the data from these multiple studies, the emphasis to place on each component of a comprehensive battery to assess abuse potential (e.g., reinforcing effects vs. withdrawal relief vs. dependence) must be carefully considered. Workshop participants also raised the issue of comparison products and suggested a subject's own or preferred brand should be used to anchor the high end of the abuse potential continuum and nicotine replacement product to anchor the low end.
The critical questions regarding these types of studies include the following: 1) how valid are these methods in predicting abuse liability and adverse impact of tobacco products; 2) what types of studies are required to determine weight of evidence and how much valence would be assigned to each type of study; 3) how do responses to a tobacco product differ when an individual has had some exposure to the product compared to no exposure prior to the laboratory session; 4) how do responses differ when subjects are asked to use a product ad libitum compared to when they are asked to use the product in a prescribed manner?
Short-term clinical trials, that is, product use of less than 2 weeks where subjects use the product throughout the day, have been conducted in the natural environment and in the residential unit (See Table 5). Most of the studies that have been conducted on the PREPs have been focused on examining the toxicant exposure and biological effect of the product when compared to usual brand cigarettes and in some cases, to medicinal nicotine products or cessation, reference cigarettes or to marketed “low tar” cigarettes.
Most of the short-term studies conducted in the natural environment used cross-over designs such as: 1) assessments during use of usual brand cigarettes for 1 week, use of PREP for 1 week and then use of usual brand cigarettes for 1 week (47–48); 2) own brand for 5 days, 1 or 2 PREP(s) for 5 days each, no smoking for 5 days, with laboratory sessions on the first and last day of each condition (44, 49–50); and 3) usual brand for 1 day, another PREP for 1 day, and abstinent for 1 day then a reversal in order (51). In a study examining the effects of progressive reduction of nicotine content of cigarettes, participants were required to smoke each of the different yields of cigarettes for 1 week, after which time subjects could quit or return to smoking (52). Other natural environment studies have used a between subject design where subjects are typically assessed while smoking usual brand cigarettes and then are randomly assigned to intervention conditions or continued use of usual brand for 1 week or less (53–55).
Some short-term clinical trials have controlled subjects' diets, where food is provided in a cafeteria during the day, or given as take-home meals for weekends (56). This procedure aids in reducing dietary confounders for biomarker analysis. Compliance with the diet, however, is difficult to verify.
Residential studies have also been conducted. The advantage of a residential setting is the strict control over diet and assurance of compliance with the use of the assigned products. One residential study involved a forced-switching, parallel group design where subjects were randomized to 1 of 5 treatment conditions for a period of 8 days after a 2 day acclimation phase: own brand (Marlboro Light), one of 2 electrically heated cigarette smoking systems (EHCSS), Marlboro ultralight or no smoking condition (57–58). A similarly designed residential study randomized subjects to 1 of 5 conditions except that instead of the 2 different EHCSSs, this study examined EHCSS under controlled vs. uncontrolled smoking conditions (59). Another inpatient study compared three conditions over a 10 day period of time: no smoking, denicotinized cigarettes, nicotinized cigarettes (60). Sarkar and colleagues (61) conducted 2 inpatient studies that involved randomized, controlled, open-labeled, parallel group, switching design. Following an acclimation and baseline day where the subjects smoked conventional cigarettes, they were randomized to continue smoking conventional cigarettes (6 mg tar for one study and 11 mg tar for the other study), test cigarettes (containing carbon filters) or stopping smoking for 8 days in a confined clinic setting. Cigarettes were smoked in a controlled fashion. Subjects in the conventional or test cigarette groups could participate in a long-term study with these products following the short-term inpatient study.
As described above, PREPs were compared to own brand cigarettes (47–49, 52, 55, 57–59, 62), nicotinized cigarettes that were not the subject's own brand (60), no smoking (49, 51, 53, 57–60), another PREP (50), ultralight cigarettes (57–59) or reduced smoking (53). Most studies allowed ad libitum smoking of the product, although some studies required smoking a specified number of cigarettes by either maintaining baseline frequency of cigarettes smoked per day (53), smoking no more than baseline frequency of cigarettes or no more than 20% above baseline smoking (57–59), or smoking at predetermined times (58–59).
The measures in short-term continuous-use studies varied, and included: 1) amount of product use and pattern of product use; 2) smoking topography (e.g., determined in the laboratory or with a portable device, measuring number of puffs per cigarette, puff volume, puff duration, interpuff interval); 3) nicotine and its metabolites, alveolar carbon monoxide and carboxyhemoglobin (COHb); 4) nicotine and COHb boost; 5) carcinogen biomarkers of exposure, e.g., 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and its glucuronides (total NNAL) for 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) exposure, 1-hydroxypyrene (1-HOP) for pyrene exposure, 3-hydroxypropylmercapturic acid (3-HPMA) for acrolein exposure, S-phenylmercapturic acid (S-PMA) for benzene exposure, monohydroxybutenyl mercapturic acid (MHBMA) for 1,3-butadiene exposure, trans,trans-muconic acid, 3-methyladenine, 3-ethyladenine, 8-hydroxy-2'-deoxyguanosine, thioethers, urine mutagenicity; 6) biomarkers of effect (inflammatory response, endothelial function, platelet activation, C-reactive protein, fibrinogen, interleukin-8, sICAM, p-selectin); 7) weight, skin temperature and vitals (blood pressure, heart rate); 8) physical activity or diet; 9) subjective responses such as withdrawal symptoms and craving, desire to smoke; 10) product evaluation or product acceptance (e.g., strength/mildness of product, smoothness/harshness, quality of flavor, overall cigarette quality, satisfaction) or comparison of nicotine yield of study tobacco products to conventional products on the market; 11) moods (e.g., Profile of Mood States, 63; Positive and Negative Affect Scale, 64), depression (CESD 65), subjective well being; 12) self-efficacy for resisting smoking usual brand cigarettes in high risk situations; 13) dependence (e.g., FTND 66); 14) performance tasks (Stroop task 67); and 15) sleep quality.
One study described collecting information on exposure to occupational and environmental substances, medications, vitamins or anti-oxidants and exercise levels to assess for factors that might affect outcome measures (48).
Subjects were recruited by advertisements (47, 51–54, 60–61), random-dial telephone survey (55) or the method of recruitment was not reported (48–49, 57–58). Many of the studies, including the studies that did not report recruitment methods, did not describe the content of the advertisements (47, 51–52, 61).
Most studies that described inclusion and exclusion criteria indicated subjects needed to be in good mental and physical health and not pregnant or breast-feeding. Other inclusion criteria included limitations on the amount smoked (e.g., ranging from smoking at least 5 to 25 cigarettes per day or no more than 30 cigarettes per day; 48, 51, 54–55, 58–61), specific CO levels (≥15 ppm; 49, 50), specific nicotine or tar yields of cigarettes or type of cigarette smoked such as light or ultralights (e.g., 49–50, 51, 53–55, 58–59, 61), and not planning on quitting (49, 52, 55, 60). Other studies excluded smokers who are using or have used the product being tested or using other tobacco products (48, 50) or using other nicotine containing products (53). Some studies excluded subjects who were currently taking medications (53), who reported at least 15 days of past-month marijuana use (50), or were drug dependent (60). Another study excluded women who were in active menopause (50). Other studies did not report inclusion criteria (47, 57).
Study sample sizes ranged from 12 to 110 with generally 10 to 20 subjects in each condition, with higher sample sizes in tobacco industry conducted studies (57–59). Most of the subjects who participated in the study tended to be young (mean age range from 24 to 38 years). The samples were typically evenly divided between males and females, although some studies were either all females (51) or were predominantly female (49) or male (50). The range of mean cigarette intake was from 16 to 28 cigarettes per day. Some studies indicated the nicotine yield of cigarettes (47, 50–51) or the FTND scores (47, 49, 60) or intentions to quit (55).
Compliance with product use was maximized in the following ways: providing free products (all cited studies); payment contingent on compliance (although no biochemical verification was obtained for product use; 51, 53); payment contingent on verification of abstinence conditions via biochemical verification (49–50); or use of a bogus pipeline (60). Other studies did not address the issues of compliance (48, 52, 55, 62). Some studies were conducted on a residential unit where use of the products was monitored (57–59).
Typically, subjects are requested to not use other tobacco products. However, for PREPs, which are tobacco products, this cannot be verified. For persons who switch to NRT, this can be assessed by measuring urinary anatabine, which is a tobacco alkaloid (68) or total NNAL (69). For persons who report cessation, then urinary or serum cotinine levels can be measured.
Intermediate-term trials are defined as being conducted for longer than 2 weeks but no longer than 52 weeks in the natural environment (See Table 6). Product is used throughout the day and, in most studies, ad libitum. These studies used either a between-subject design comparing different products (56, 61, 70–78), a within-subject design taking assessments during usual brand cigarette smoking and after switching to a product (79–81) or a within-subject cross-over design with different products (23, 82–85). Subjects were required to use a specific product (61, 70, 72–83, 85–89), or were given a choice after sampling the products (56, 71). Some of these studies included additional or different experimental design features. The study conducted by Mendoza-Baumgart et al. (23) required use of one of the two assigned products for two weeks and then a cross-over to the other product, during which time biomarkers for exposure were assessed, and a choice of products for the final week, where no biomarker assessments were made. Another study involved smoking the product for 2 weeks and then a test session after this 2-week period as well as during usual brand use (80). One study provided smokers of “medium tar yield” cigarettes with a commonly smoked “medium tar yield” brand of cigarettes in unmarked boxes. Subjects were then unknowingly switched to “low tar” cigarettes or continued on “medium tar” cigarettes (84), therefore blinding the subject to the switch in cigarettes. One study allowed a 2 week acclimatization time with higher nicotine yield cigarettes compared to usual brand cigarettes and then made biomarker assessments after another two weeks of use. The values for thiocyanate and carbon monoxide were higher after the two week acclimatization period, indicating a period of time to stabilize to product use may be warranted for some products (70).
Unique designs have been used in studies conducted by a tobacco company. One study involved having all subjects switch to the reference conventional cigarette, then random assignment to continued use of the reference cigarette or switching to an EHCSS. Subjects were assessed in a residential clinic for 36 hours during baseline smoking and at the end of the first week of using the randomly assigned product (EHCSS vs. reference cigarette), and then during continued use of the assigned product over the course of 12 weeks (74). In another study, as previously described, subjects underwent a short-term residential phase (similar to reference 55), and then continued with the assigned product for 24 weeks in the natural environment (73). In one longer-term study, all subjects underwent a two week trial period of a PREP (EHCSS) prior to enrolling in a 12 month long study (75). Baseline and visits at 2 weeks and monthly thereafter were conducted in a controlled, confined clinic setting from 07:00 to 07:00 the next day.
Published intermediate-term studies have control arms such as medicinal nicotine (23, 56, 72, 82–83), nonsmokers (86), smokers using their usual brand of cigarettes (79–81, 87) or a control group with conventional cigarettes similar to usual brand (61, 70, 74–76, 78, 84, 89). Some of these studies examined the effects of cigarettes with different nicotine or tar yields (70–71, 73, 76–78, 84–85, 87–89) or different amounts of product use (83). The duration in trials ranged from 2 weeks to 13 months.
Three intermediate-term studies allowed concurrent use of their usual brand cigarettes with the PREP (56, 82–83). In the studies conducted by Fagerstrom et al. (56, 82), subjects were instructed to smoke as few cigarettes of their own brand as possible without discomfort and instead use as much of the treatment product (nicotine inhaler or Eclipse) as needed. In the study conducted by Hughes and Keely (83), subjects were asked to use a specified number of Accord cigarettes per day (5, 10 or 15). Other studies allowed ad libitum product use and typically, no other nicotine containing products (73–75, 79, 81, 86). Another study allowed ad libitum product use, and excluded subjects if more than 5% of their total daily cigarettes smoked were nonstudy cigarettes (61). Hatsukami et al. (72) and Mendoza-Baumgart et al. (23) required a specified amount of use (the same amount as usual brand or use every 2 hours, respectively).
The measures across the published intermediate-term studies have included: 1) amount of product used and in some studies, when the products were used; 2) extent of compensatory smoking (as measured by smoking topography, cotinine and/or CO); 3) the number of usual brand cigarettes per day and use of other tobacco products; 4) biomarkers of exposure such as carbon monoxide, total nicotine equivalents or cotinine, thiocyanate, total NNAL, 1-HOP, 3-HPMA, MHBMA, S-PMA and 4-aminobiphenyl hemoglobin (4-ABP-Hb) adducts reflecting exposure to aromatic amines; 5) biomarkers of effect such as pulmonary function tests, measures of lower respiratory tract or airway inflammation, goblet cell metaplasia, peripheral blood measures, 99m-technetium diethylenetriaminepentaacetic acid (DTPA) clearance, blood leukocyte activation, reactive oxygen species, white blood count and hemoglobin levels, and respiratory symptoms; 6) cardiovascular risk factors such as hemoglobin, hematocrit, red blood cell count, white blood cell count, fibrinogen, lipoproteins, triglycerides, high-sensitivity C-reactive protein, bilirubin, von Willebrand Factor, 11-dehydrothromboxane B2, 8-epi-prostaglandin F2α and microalbumin; 7) weight and vitals (blood pressure, heart rate); 8) subjective measures of withdrawal and craving; 9) drug effects and liking, sensory ratings and product evaluation (odor, strength, draw resistance, taste, embarrassment regarding use, and liking); and 10) intention or motivation to quit.
The advertisements used to recruit subjects varied in content. Some advertisements described the study as one that was testing new products that may reduce the risk of smoking or may be safer (56, 81) and/or with no second-hand smoke exposure (56, 83). Other advertisements called for smokers or tobacco users who were interested in participating in studies that compared new tobacco products with nicotine replacements (72). One study sent potential subjects a questionnaire asking for details of their smoking habits to determine their eligibility and willingness to participate in a trial requiring subjects to switch to “low tar” cigarettes (71). In another study, subjects were recruited among a group of “acceptors” of a new cigarette (Eclipse) that was in test marketing (86). These acceptors had smoked at least 75% of two cartons of Eclipse cigarettes and expressed future purchase intent. Most of these studies appeared to recruit subjects that were interested in trying a new product.
The inclusion criteria in intermediate-term studies generally specified that subjects had to be in good current physical and/or good mental health with no clinically significant diseases (56, 72–75, 79, 82, 86, 89), with some studies reporting a specified level of lung function (79) or a history absent of any respiratory and/or cardiovascular disease (76, 86, 89) or diabetes mellitus and hypertension (89); no regular medications (76); a specified age range (e.g., ages 18, 20, 21 or 25 to 50 or 65 years old; 23, 56, 61, 72, 74–75, 82, 86) or at least 18 years old with no upper limit on age (81, 83); a minimum number of cigarettes smoked per day, ranging from at least 5 cigarettes to 20 per day (56, 61, 72–74, 76, 81–83, 85) or even as high as 40 cigarettes per day (79); smoking a specified type of cigarettes (e.g., full flavor, light or ultralight) or a tar range for cigarettes (e.g., light or ultralight; 73–74, 83) or a minimum tar yield (77, 86–87) or minimum nicotine yield; not using smoking cessation or reduction methods (72) or intending to quit (79) or to reduce cigarette use (89); wanting to switch to low tar/nicotine cigarettes and perhaps quit (76); and no experience with a PREP that is similar to the study product (83) or use of any other non-cigarette or nicotine product (72, 83, 86) or any product other than the reference conventional cigarette which was smoked 4 weeks preceding the start of the study (74). Most studies reported excluding pregnant or breast-feeding women.
Some studies had small sample sizes per product condition (N=8–15, e.g., 56, 79, 81, 86, 88–89) or moderate sample sizes per condition (e.g., N=25–75; 72–74, 80, 87). One study had about 145 subjects per product condition (71). Not all studies reported the subject characteristics of the sample that was recruited. In general, the study population tended to be older in age (mean of 35 to 48 years old; 23, 73–75, 81, 82, 83); over 50% of the population was female (23, 74–75, 81–83, 87) with the exception of a few studies where subjects were predominantly male (73, 76–77, 85–86); predominantly White (23, 61, 72–75, 81); tended to be heavy smokers (mean of 20 to 29 cigarettes per day; 23, 72–76, 81, 82, 83, 87, 89) and heavily dependent on tobacco (mean FTND of 5.4 to 6.8; 23, 73, 81, 82, 83). Studies that did not specifically recruit for cigarette type observed that most smokers smoked “low tar” cigarettes (54%; 23) or were evenly split between “light” and “regular” cigarettes (39% each; 72). Two studies described the stages of change, with one study describing all of its subjects as pre-contemplators (83) and the other describing 53% as pre-contemplators and 47% as contemplators (81).
The majority of the intermediate-term studies that described compensation for participating in the study reported that subjects were paid for their time (56, 61, 74, 81–82, 86). Two studies reported monetary compensation for time and for compliance with assigned product use (23, 72). In one study, subjects purchased cigarettes for reduced prices (77). Another study required subjects to deposit money at the start of the study and a portion of the money was returned each week for attendance and product compliance (76). Compliance to product use could only be determined by measuring abstinence from smoking using CO if smokers were assigned to noncombustible tobacco containing PREPs (oral tobacco), using CO and a tobacco alkaloid such as anatabine or total NNAL if smokers are assigned to nicotine replacements (e.g., 23, 72), or cotinine if one of the conditions involved no smoking and no product use. One study examined spot urines to be analyzed for total NNAL as a compliance check for use of tobacco products other than the PREP with levels above a specified level considered to be an indication of noncompliance (74). Otherwise, compliance was determined by self-report (89), comparing the amount of dispensed products with return of unused products or packaging (70–71, 73–74, 76, 81, 83–85, 90), or returning used products (77, 83). Self-reported number of cigarettes was determined either by written daily diaries (61, 76) at the weekly clinic visit or in two studies by calling an answering machine every night prior to bed time (83) or by recording date, time and brand of each cigarette smoked using an electronic diary (74). Interestingly, the results from the Frost-Pineda et al. (74) study showed that the daily diaries underestimated report of product use compared to the pharmacy logs.
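The product-accounting check described above (dispensed product versus returned unused product, compared against diary self-report) can be sketched in a few lines. This is only a minimal illustration: the record fields, the function names and the numbers are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class VisitRecord:
    """Hypothetical per-visit product accounting for one subject."""
    dispensed: int        # units handed out at the previous visit
    returned_unused: int  # unused units brought back at this visit
    diary_reported: int   # units the subject recorded in the diary


def accountability_use(rec: VisitRecord) -> int:
    """Units presumed used according to the dispensing log."""
    return rec.dispensed - rec.returned_unused


def diary_discrepancy(rec: VisitRecord) -> int:
    """Diary count minus log-based count; negative means the diary under-reports."""
    return rec.diary_reported - accountability_use(rec)


# Illustrative values only.
rec = VisitRecord(dispensed=140, returned_unused=20, diary_reported=105)
print(accountability_use(rec))  # 120
print(diary_discrepancy(rec))   # -15
```

A negative discrepancy corresponds to the pattern reported above, where diaries gave lower counts than the pharmacy logs. Neither source verifies that the missing units were actually used by the subject.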
Short-term clinical trials allow assessment of a broader range of exposure biomarkers and some adaptation to product use than laboratory studies. These trials may have the advantage of potentially greater compliance and fewer drop-outs than longer trials. Furthermore, short-term trials would allow studies to be conducted on a residential unit, which allows greater control over product use, protocol compliance and confounding factors that may affect exposure biomarkers, such as diet. However, these residential studies, which are typically 1 week in duration, may be too short for using biomarkers with longer half-lives, which may make it necessary to adjust for residual effects for these biomarkers (59). Another drawback of these studies is the unnatural environment, which may affect the pattern of PREP use and which may not reflect product use in a more naturalistic setting. For example, in one study, subjects underwent a short-term residential phase involving controlled smoking of assigned cigarettes. This phase was followed by a 24-week period of unrestricted smoking of the assigned products in the natural environment. The results from the residential phase showed similar decreases in biomarkers as during the 24-week follow-up phase; however, the extent of reduction was greater during the follow-up phase. The authors attributed this difference to more restrictions on smoking in the subject's natural environment (73). On the other hand, in another study, subjects were confined for 36 hours during baseline and 1 week after assigned cigarette use, with instructions for unrestricted product use. The biomarker results showed greater change during confinement than observed at the end of a 12 week phase of product use in the natural environment (74). Thus, although both studies suggest that short-term residential studies show similar trends in results as the longer non-residential studies, the extent of change in exposure may differ.
The strengths of the intermediate-term studies include greater time for stabilization of use, examination of more naturalistic use and the ability to use biomarkers with longer half-lives. The limitations depend on the goal of the study. If the focus of the study is to determine toxicant exposure during long-term use, compliance with use is a major concern. However, if the goal is to examine naturalistic pattern of use over time, then compliance is not a significant issue. One concern over some of the long-term studies is the high drop-out rate, which can be as high as 60% (73).
Several short-term and intermediate-term studies had design features that are useful in the examination of effects of products on exposure biomarkers. For example, as described previously, some studies have combined a short-term laboratory or residential phase with a longer-term phase conducted in the natural environment (74). The advantages of these studies include assessing products while under greater control, yet assessing products for a longer period of time in a naturalistic setting. As another example, although the within-subject cross-over design studies tended to be smaller in subject size than between-subject design studies, the within-subject designs minimize the effects of intersubject variability for biomarkers of exposure (e.g., how a subject metabolizes nicotine or carcinogens) and other biological responses to a product. These types of studies, however, also tend to be shorter-term and may not be conducive to longer-term evaluation of a tobacco product because subject retention may be an issue. Also, because of the small sample size, generalizability of results may be a concern. The shorter duration trials may be conducive to examining dose response curves to determine if the effects on measures are an actual result of the product. For example, in the Hughes & Keely study (83) subjects were required to use 5, 10 and 15 Accord devices per day and then 4 mg gum for 2 weeks each. In between sessions, smokers were required to resume using solely their usual brand cigarette. Although the focus of this study was to determine how the amount and type of product use affects the amount of usual brand cigarettes smoked, this study design can be easily adapted to determine the dose-response effect of sole use of a product.
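The statistical reason that within-subject cross-over designs reduce the influence of intersubject variability can be shown with a short calculation. The sketch below assumes a simple additive model (each biomarker value is a product mean plus a stable subject effect plus independent measurement noise) with no carry-over or period effects; the standard deviations and sample sizes are hypothetical and are not drawn from the reviewed studies.

```python
import math


def se_between_subject(sd_between: float, sd_within: float, n_per_arm: int) -> float:
    """Standard error of the difference in means for two independent arms.

    Each arm's mean carries both subject-to-subject and within-subject variance.
    """
    return math.sqrt(2 * (sd_between**2 + sd_within**2) / n_per_arm)


def se_within_subject(sd_within: float, n_subjects: int) -> float:
    """Standard error of the mean paired difference in a cross-over design.

    The stable subject effect cancels when each subject serves as their own control.
    """
    return math.sqrt(2 * sd_within**2 / n_subjects)


# Hypothetical values: subject-to-subject SD three times the within-subject SD.
print(se_between_subject(sd_between=3.0, sd_within=1.0, n_per_arm=20))  # 1.0
print(round(se_within_subject(sd_within=1.0, n_subjects=20), 3))        # 0.316
```

Under these assumptions the cross-over design gives a much smaller standard error for the same number of subjects per condition. The advantage shrinks if carry-over between conditions is present, which is why biomarker half-life matters for this design.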
Differences existed in instructions for product use across studies. In some studies, the amount of PREP use was controlled while other studies allowed ad libitum use. Requiring smokers to use a specified amount of the product may provide some insight into the toxicant exposure and effect from the product, although it would be difficult to control for topography of product use. On the other hand, ad libitum use might more accurately reflect how the product might be used. Few studies have compared the results between these two different instructions for product use. In the residential study conducted by Roethig et al. (52), one group of smokers was randomized to an EHCSS in which smoking opportunities were given every 32 minutes between 07:00 and 23:00 and use did not exceed rates observed during the acclimation period of usual brand use. The other group of smokers was allowed to smoke EHCSS at any time between 07:00 to 23:00 with a cap of 60 cigarettes. With the exception of greater number of EHCSS smoked per day but less puff volume observed in the unrestricted smoking condition compared to the controlled smoking condition, no significant differences were found in any of the biomarkers of exposure. It should be noted that the EHCSS only allows for a maximum of 8 puffs per cigarette. The impact of these different instructions for product use outside of a residential setting is unknown.
Short-term studies generally require subjects to use only the study product, and disallow dual use (e.g., use of concurrent conventional tobacco products), although compliance cannot be verified. Some intermediate-term studies also encourage subjects to only use the PREP, whereas other studies allowed use of the subjects' usual brand of cigarettes. Both study designs are valuable—the former to determine the toxicity of the products per se and the latter studies to determine real world pattern of use and resultant toxicant level. One study required sole use of the study product and then undertook a post-hoc analysis that examined subjects who reported only using the assigned product versus subjects who reported using both the assigned product and conventional cigarettes, with higher exposures observed in the analysis among dual product users (74).
Studies vary on either requiring subjects to use a specified amount of product (which may not be reflective of how much subjects use in non-experimental conditions) or to use the product ad libitum, which may not reflect the actual toxicity of the product. Perhaps using both instructions for use would be valuable and research needs to be conducted to compare these two different instructions for product use.
The measures in both the intermediate-term and short-term studies were somewhat similar, with some variations based on the intent of the study. However, most studies, especially intermediate-term ones, did not consider potential confounding factors that may influence the biomarker results, such as diet, other environmental exposures and alcohol use. Studies on PREPs would benefit if similar measures were used across studies and a broader panel of biomarkers within and across disease states were used.
Short- and intermediate-term studies use control groups, but types of control conditions varied. It would appear that the most valuable comparison groups would include abstinence, with or without use of medicinal nicotine, in order to compare the product to an intervention with known reductions in health risks. Another option would be to use ultra light cigarettes (i.e., <1 mg tar yield), as a comparison to marketed products considered to have relatively low toxicant yield levels. In one intermediate-term study, subjects were required to have smoked an ultra light cigarette, as the reference conventional cigarette, for at least 4 weeks prior to the start of the study (74).
The Clinical Trials Workshop participants recommended that a study design should include multiple arms and include both negative and positive comparators while varying the level of adherence to the product. For example, these arms would include: 1) smokers who use usual brand; 2) smokers who decrease intake of usual brand (which can include different levels of decreased use based on claims); 3) controlled, that is, fixed-amount, use of the PREP only; 4) ad libitum use of the PREP only; and 5) ad libitum use of the PREP plus concurrent use of usual brand. Furthermore, nonsmokers, but particularly smokers who quit, were considered to be a valuable comparison group depending on the specific aims of the study. A no smoking group would compare PREP effects to the ideal case (cessation) while also demonstrating the sensitivity of the design and outcome measures.
Other critical features of the study design would include being of sufficient duration to achieve stabilization of tobacco use behavior. Workshop participants reported that in prior studies, this duration has been less than 4 weeks for conventional cigarettes. Additionally, the length of study should be sufficient to achieve “steady state” of the biomarker and when using within-subject study design, the half-life of a biomarker must be taken into consideration to ensure no carry-over effects. This duration depends on the type of biomarker (e.g., exposure or harm) and the pharmacokinetics of this biomarker. If disease outcome is to be assessed in the study, the study length has to be long enough (e.g., in some cases, months or years) to measure changes in disease occurrence. Cost will be a factor, but not a scientific consideration.
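The dependence of study length and carry-over on biomarker half-life can be made concrete with a small calculation. The sketch below assumes simple first-order (single-exponential) elimination, which is a simplification, and the half-lives used are hypothetical placeholders rather than values reported in the reviewed studies.

```python
import math


def fraction_remaining(days: float, half_life_days: float) -> float:
    """Fraction of the prior-condition biomarker level still present after `days`."""
    return 0.5 ** (days / half_life_days)


def days_to_residual(target_fraction: float, half_life_days: float) -> float:
    """Days needed for the prior-condition contribution to fall to `target_fraction`."""
    return half_life_days * math.log2(1.0 / target_fraction)


# Hypothetical short half-life biomarker (1 day) after a 7-day condition:
print(fraction_remaining(7, 1))             # 0.0078125
# Hypothetical long half-life biomarker (14 days) after the same 7 days:
print(round(fraction_remaining(7, 14), 3))  # 0.707
# Days for the long half-life biomarker to fall to 5% of the prior level:
print(round(days_to_residual(0.05, 14), 1)) # 60.5
```

Under these assumptions, a one-week condition is ample for a biomarker with a half-life of about a day, but leaves most of the prior exposure signal in place for one with a half-life of weeks, so either a longer condition or an adjustment for residual levels would be needed.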
The following critical questions need to be addressed for these types of trials: 1) how do responses differ across instructions for product use (ad libitum use vs. use of specific amounts, concurrent product use with usual tobacco products vs. product use only); 2) how do we determine the length of time it takes for product use to stabilize and is stabilization under clinical trial conditions different from that under natural use conditions; 3) how do we determine if the exposure is due to the product, the way the product is used, or characteristics of the individual; and 4) how do results from switchers differ from those who have already chosen to use the products.
Cross-sectional studies have been conducted to compare self-selected product users who use different brands or types of products (See Table 7). For example, past studies have recruited smokers with differing nicotine or tar yields to determine biomarker levels of cotinine, thiocyanate, carbon monoxide, tobacco specific nitrosamines or cardiovascular or lung cancer risk factors across the different types of cigarettes (e.g., 91, 92–99). One study investigated concentrations of urinary biomarkers in relation to concentrations of selected toxicants in mainstream cigarette smoke as determined by machine smoking of cigarettes in a manner that mimics an individual's smoking behavior (100). In cross-sectional studies, subjects have been recruited from general population surveys (92–93, 96, 98), smoking cessation clinics, treatment trials or experimental studies (94–95, 99), or recruited specifically for the study (91, 97, 100). The restrictiveness of the inclusion and exclusion criteria has varied across cross-sectional studies, ranging from just being an active smoker (91) to smokers who smoke within a specific range of number of cigarettes, use no other nicotine containing products, are in good general health with no current mental health problems nor use of psychotropic medications (100). Very stringent inclusion criteria occur when recruiting from treatment studies (90). Other studies have also examined the association with disease status (or risk) across different yields of cigarettes using epidemiological data (see 101 for reviews, 102, 103).
Cross-sectional studies have also compared different types of smokeless tobacco products on carcinogen biomarkers (see Table 5) (e.g., 104, 105) and snuff or snus users versus smokers on cardiovascular or diabetes risk factors using population based samples (e.g., 106, 107–108), on carcinogen exposure biomarkers (109), on the enzyme aldehyde dehydrogenase (110) using a clinical sample, or on differences in the prevalence of actual disease (111–117). One study recruited relatively healthy smokers, smokeless tobacco users, nicotine replacement therapy users and healthy controls to measure serum immunoglobulin levels (118). Subjects have been recruited either specifically for the type of tobacco user of interest, as above, through another study sample (104, 109) or by random sampling from a population (106–108). Some of these studies had specific criteria for recruitment such as gender, age and ethnic/racial group (107), exclusion of specific medical conditions because biomarkers of effect were related to those conditions (107, 118) or specific criteria that would classify the sample as current users of moist snuff or cigarettes, ex-smokers who used nicotine replacement, or a control group of ex-tobacco users or never users (118). The advantage of the cross-sectional study is that these individuals have chosen to use the product and therefore the results reflect values in a population most interested in using the product. Population-based sampling might provide the additional benefit of obtaining a sample somewhat representative of the general population of users for that product. In addition, some of the subjects in a population-based sample have used the product for a sufficiently long duration that the pattern of use has stabilized.
The disadvantages are that a sufficient population base of users is necessary before the product can be evaluated, and that other factors associated with the product, such as product design, content and marketing, may change over time.
Clinical Trials Workshop participants suggested that cohort studies be conducted to compare new, experienced and long-term switchers to PREPs and to describe differences in subject characteristics and outcomes among these populations.
While Workshop recommendations for study design have been addressed in each of the study type subsections, the following provides a summary and recommendations on issues that are relevant to all types of clinical trial studies.
One issue that cuts across all study designs involves recruitment (how should subjects be recruited, what should they be told in advertising and consents, and who should be recruited). Studies differ in how they advertise for subjects, with some studies recruiting specifically for smokers interested in using new tobacco products that may be “safer” or with reduced tobacco toxins. How and what is conveyed to the subject may have a major impact on how the subject perceives and uses the products.
In addition, most studies have very restrictive inclusion and exclusion criteria that are not unlike the criteria used in pharmacological trials. The looming question, which also is relevant to pharmaceutical trials, is whether the products are being tested in a population that is representative of the typical user and if not, can the public afford to determine the major effects of the product during post-marketing surveillance rather than during the clinical trials. For example, many of the laboratory studies tended to study populations that were young, healthy and with lower levels of dependence, which may be quite unlike the population interested in using PREPs.
While these criteria may not be problematic in the initial testing of the products, studies need to be conducted that use populations of smokers similar to the ones that are likely to use the products. Just as it is important to carefully examine who is being recruited, it is also important to characterize who drops out of these studies and why, and who persists in using these products. Only some of the studies captured this information (e.g., 23, 72, 79, 82, 83).
The discussions and recommendations made by the Clinical Trials Workshop participants included the following. Advertising and recruitment information was considered to be important in determining a subject's likelihood of participating and subsequent behavior in the study. Information and instructions should be communicated to the subject in a way that would avoid subject response biases due to subject expectations. To avoid bias, it was considered best if the wording was non-directive, such as advertising for “smokers interested in testing a new tobacco product” or for a “new or reduced exposure tobacco product” or specifically for “smokers not ready to quit smoking but interested in testing a product that may or may not reduce exposure to cancer-causing chemicals” depending on the targeted population. Regardless of the content of the advertisement, it was recommended that the consent form provide toxicity information to the potential study subject.
The Workshop participants thought newspaper or noticeboard advertisements were the most useful and accessible strategy for subject recruitment. Another potential recruitment source, which has been untapped by the majority of studies, is participants in national surveys. Standardized questions regarding PREP knowledge, interest or use could be included in these surveys. Users of PREPs or those tobacco users meeting specific study criteria, such as an interest in trying a PREP, could be invited to participate in subsequent research for a monetary incentive.
The criteria for inclusion in the study or subject characteristics were also discussed. For abuse potential studies, the characteristics of the study participants would be dependent on the specific study design. Regular smokers are appropriate for acute effects and short-term self-administration studies, whereas heavier smokers would be more appropriate for cross-dependence and compensatory smoking studies. Tobacco users interested in quitting would be most appropriate for cessation studies and to gauge the impact of the product on quitting. In general, however, the Workshop participants believed that people who enter trials are unique, and enter into the trial for a variety of reasons, ranging from interest in the product to financial incentives. A group of persons enrolled in a study may or may not be representative of persons who would naturally select a product. However, the representativeness of the sample cannot be easily assessed. The issue is only important if there are some selection criteria or other factors that result in different biological outcomes. That is, generalizability of the results will only be affected if biologically or mechanistically the association between use of a PREP and outcome (e.g., preference, withdrawal, biomarker) is modified by a factor such as sex, age, ethnicity and/or genetic make-up. Therefore, examining the effects of these factors on outcomes may be necessary. Short of this examination, some Workshop participants recommended that the study population should be as representative as possible of the population of smokers or tobacco users with respect to: 1) gender; 2) ethnicity; 3) age; 4) SES; and 5) in larger, non-laboratory trials, type of tobacco user (inveterate smoker, those interested in quitting, co-morbid smoker).
It is important to keep in mind that clinical trials are not intended to simulate population effects, and cannot do so; they are therefore not valid for extrapolating results beyond the assessment of clinical effects (e.g., biomarker, physiological and behavioral responses). Similarly, there can be no inferences about why a person stays in a trial or how this might relate to general population acceptance of the product, because people staying in trials might do so because of the implied contract the subject has with the study. It should be recognized that switching trials are only one tool to understand the effects of a PREP on smokers. Other designs are needed such as cross-sectional and prospective studies or post-marketing surveillance.
What is clear is that some studies do not describe recruitment methods, content for recruitment and inclusion and exclusion criteria, and few studies describe the characteristics of subjects who dropped out of the study. Uniform reporting of these study design features across all studies is important. Critical questions to address on this topic include: 1) what factors moderate the outcome variables (e.g., biomarkers, see Predictors of Response section); and 2) do different methods of recruitment and types of studies (lab vs. intermediate clinical outcome trials) attract different types of smokers, and do these differences affect outcome?
Another major issue that is relevant to the different types of clinical trials is compliance with product use, particularly if the outcome criteria are effects of a product on biomarkers of exposure and health risks. It is a challenge to determine whether subjects are dually using the test product and their usual products. Short of having smokers stay in a residential setting for 3 to 6 months, this issue may never be resolved unless a biomarker can be developed to determine exposure solely to the PREP and no other tobacco products. In prior studies, subjects have been asked to keep daily diaries of tobacco product use and to return used and unused tobacco products, and have been paid for complying with use of only PREPs or, conversely, not penalized for noncompliance. Additionally, studies have emphasized to the subject the importance of accurate reporting of use of both assigned and non-assigned products.
In many studies subjects were paid for their participation, and although payment is critical for retaining subjects, it biases the population towards those who may be primarily interested in the money rather than in using the product. This issue cannot be avoided in laboratory studies but may be particularly important in the long-term clinical trials. On the other hand, it would be difficult to not pay subjects for all the testing that is required of them for the study. No specific recommendations were made by the Workshop participants on the issue of compliance and payment, other than being sensitive to these issues and employing methods to maximize honest reporting.
Very few studies examined what predicts a subject's response to products, e.g., the amount of product use, drop-out, slipping back to usual brand use, compensatory tobacco use behavior, and extent of exposure as measured by biomarkers. This is a critical area of inquiry that is neglected and should include assessment of sex, age, ethnicity, dependence, duration of tobacco use, type of smoker, co-morbid psychiatric history and possibly genetic make-up. For example, females may respond differently to changes in sensory aspects of smoking or nicotine content of smoking compared to males (56, 119–121). African-Americans may metabolize nicotine (122–125) and toxicants such as carcinogens (126) differently than Whites. Expectations for quitting may also be related to outcome (56).
One particular area that was raised in the Clinical Trials Workshop was the role of consumer perception in the amount and pattern of tobacco use. In prior clinical trials, subjective effects of PREPs have been assessed which typically fall into consumer perception measures of liking, sensory effects (e.g., harshness, irritation, smoothness, aftertaste) and withdrawal suppression/craving reduction. These data may indicate which products are likely to catch on with consumers. Knowing such information from laboratory or short-term studies may guide selection of a PREP for a long-term clinical trial. However, in the context of a clinical trial, it is very difficult to directly assess consumer perception of a product prior to use and its effect on use, presumably because smokers are paid to participate and will use the PREP as part of an implied contract, regardless of perception. The best approach to examine the influence of consumer perception on use is to experimentally manipulate the information provided to the consumer (e.g., presence or absence of relative toxicant level, presence or absence of health warnings, use in situations where the consumer cannot smoke or to reduce exposure to toxicants). These manipulations can then be tied to use of the product. The research gaps that still need to be addressed are 1) how people's perceptions affect their behavior and use in a trial; and 2) how perception affects abuse liability, e.g., perceptions of addictiveness of the product.
The need for PREP assessments is rapidly growing due to increased marketing of such products by tobacco companies worldwide. The primary goal of PREP assessment is to determine the impact of PREPs on morbidity and mortality relative to not having PREPs on the market. The main way to make this determination is to conduct long-term epidemiological studies and intervention trials. However, because of the length of time required to conduct these studies, the post-marketing nature of this type of assessment and the large population base that is required, laboratory, short- and intermediate-term trials are the most expeditious ways to infer potential harm or benefit from PREPs.
In order to move the science forward in PREP assessment, we need valid methods of product evaluation. Towards this end, we need to develop a battery of valid measures (i.e., subjective, behavioral, biomarkers) which would uniformly be used across clinical trials. Furthermore, all clinical trials need to describe methods for recruitment, inclusion and exclusion criteria, characteristics of subjects calling into a trial, enrolling in a trial, dropping out and completing the trial. Finally, we need a systematic examination of the critical methodological questions that were raised in this review, with the primary intent of determining the generalizability of our results in helping us understand both individual and population effects. Based on the review of the literature and the deliberations of the Workshop, the following is a summary of the recommendations that were made:
Financial Support: NCI Contract N01-PC-64402
Disclosure of Potential Conflicts of Interest: D. Hatsukami: commercial research grant with Nabi Biopharmaceuticals