Int J Prev Med. 2013 November; 4(11): 1304–1311.
PMCID: PMC3883256

An Algorithm of Smoking Stages Assessment in Adolescents: A Validation Study Using the Latent Class Analysis Model



Despite the importance of evaluating smoking stages in adolescents, no appropriate instrument exists for measuring them. This study aims to introduce such an instrument and to examine its validity using a latent class analysis (LCA) model.


We designed an algorithm to measure smoking stages. The relevancy and clarity of the algorithm were examined by experts and lay experts. We assessed the reliability of the algorithm using the test-retest method. Moreover, using LCA, we studied the validity of the stages measured by the algorithm in 4903 students (ages 14-19) who were randomly selected from grade 10 high school students in Tabriz (North-West of Iran).


Content validity assessment showed high relevancy and clarity percentages. An intra-class correlation of 0.929 was found when the reliability of the smoking stages (9 stages) was assessed in 154 students over a two-week interval. The LCA model revealed nine interpretable classes (G2 = 0.051, df = 1, P = 0.821) for the measurement of smoking stages. For the smoking cessation stages, examined in a sample of 218 students in the cessation stage, the model yielded five interpretable classes (G2 = 0.001, df = 1, P = 0.975).


The results suggested that this algorithm is clear, valid, and reliable.

Keywords: Adolescence, latent class analysis, reliability, smoking stages, validity


The age of smoking onset is decreasing in both developed and developing countries,[1] and most developed countries have implemented policies to reduce smoking rates, particularly among adolescents. This is because the age at initiation of cigarette smoking is an important determinant of the likelihood of tobacco dependence, the likelihood of smoking cessation, and the risk of adverse health consequences.[2,3,4,5] Many researchers have described smoking behavior in adolescents as a transition through a series of stages. Although smoking onset and continuation are connected processes, several studies have attempted to divide this process into stages so that both primary and secondary prevention can be addressed. Leventhal and Cleary (1980) suggested that smoking involves a complex continuum and that a person must pass through different stages to become a smoker.[6] Subsequently, Flay et al. (1983)[7] and Stern et al. (1987)[8] accepted the multistage nature of smoking and proposed more elaborate stage models. Flay then shortened and described the initial stages in 1993.[9] Drawing on previous studies, Mayhew et al. (2000)[10] described the following six stages: precontemplation, contemplation or preparatory stage, tried or initiation, experimenter, regular, and established or daily smoker. Kremers et al. (2001 and 2004)[11,12] showed that nonsmokers in the precontemplation stage are not homogeneous and can be divided into three groups: committers, immotives, and progressives. Committers are sure that they will not smoke in the future. Immotives do not plan to start smoking in the future (they have no definite plan to do so). Progressives want to smoke, but not within the next six months. In addition, Pallonen et al. (1998)[13] proposed an algorithm for measuring smoking acquisition and cessation stages, which is commonly used in studies of smoking stages in adolescents. However, it is not a comprehensive method for measuring smoking stages.

Understanding the process of adolescent smoking and identifying the predictors of progression through smoking stages are of great importance for policy makers developing effective strategies to prevent cigarette smoking. Despite the importance of evaluating smoking stages in adolescents, no appropriate instrument exists for measuring them. This study aimed to design a suitable tool for this purpose and to assess its validity with a latent class analysis (LCA) model, which is an appropriate technique for evaluating the validity of staging algorithms.[14,15,16] By examining response patterns, LCA tests the hypothesis that individual responses to the algorithm questions can be explained by smoking stage (the latent variable). The content validity and reliability of the algorithm were also studied.


Content validity

Based on the algorithm proposed by Pallonen et al.[13] for measuring smoking acquisition and cessation stages in adolescents, the smoking stages suggested by Mayhew et al.,[10] and the categorization of nonsmoking adolescents by Kremers et al.,[12] a questionnaire was developed. To evaluate its content validity, the questionnaire, together with an answer sheet for opinions on the relevancy and clarity of the algorithm, was sent to the following people: 5 experts in the field of adolescent smoking (content experts), 6 experts in questionnaire design and methodology (methodology experts), 4 teachers and education experts in the field of student consulting and training, and 5 alert students (lay experts). To assess the relevancy of the questionnaire, individuals were asked to answer the following question: “How much is this question associated with the measured parameter? In other words, how appropriate is this question?” The answer options were ranked as follows: 1-not suitable, 2-moderately suitable, 3-suitable and 4-completely suitable. To assess the clarity of the questionnaire, they were asked to answer the following question: “How clear is the meaning of this question?” The answer options were: 1-ambiguous, 2-moderately ambiguous, 3-clear and 4-completely clear. Answers 3 and 4 were considered favorable and answers 1 and 2 unfavorable; the relevancy and clarity percentages were then calculated separately for the two groups (experts and lay experts).
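The scoring rule described above (answers 3 and 4 favorable, 1 and 2 unfavorable) can be sketched as follows. This is a minimal illustration only: the function name and the ratings are hypothetical, and it assumes that the percentage is simply the share of favorable ratings within a group, which the text does not spell out.

```python
def favorable_percentage(ratings):
    """Return the percentage of ratings that are favorable (3 or 4 on the 1-4 scale)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    favorable = sum(1 for r in ratings if r >= 3)
    return 100.0 * favorable / len(ratings)


# Made-up ratings, not data from the study.
expert_clarity = [4, 3, 2, 4, 3, 1, 4, 3]
lay_clarity = [4, 4, 3, 3, 4]

print(favorable_percentage(expert_clarity))  # 75.0
print(favorable_percentage(lay_clarity))     # 100.0
```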

Reliability in the repeatability dimension

After the answer sheets had been gathered and the modifications made, the questionnaire was administered twice, two weeks apart, to 154 grade 10 high school students so that its test-retest reliability (repeatability) could be determined using the intra-class correlation coefficient (ICC).
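As a rough illustration of a test-retest intra-class correlation, the sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement), from a subjects-by-occasions table. The paper does not state which ICC form or numeric coding of the stages was used, so the choice of ICC(2,1), the function name, and the example scores are all assumptions.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one per subject, each holding k repeated measurements.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 equal-length occasions")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-occasions mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical stage codes (1-9) at test and retest for six students.
test_retest = [[1, 1], [1, 1], [2, 2], [5, 4], [9, 9], [3, 3]]
print(round(icc_2_1(test_retest), 3))
```

Values close to 1 indicate that students were placed in nearly the same stage on both occasions.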

Construct validity using latent class analysis model

To evaluate the validity of the stages measured by the designed algorithm, the questionnaire was distributed to 4903 grade 10 students with a mean age of 15.7 ± 0.73 years (range: 14-19) in Tabriz (a large city in North-West Iran). They were sampled using a stratified method according to the number of students in each region, the number of students in each school, and the type of school, followed by cluster sampling (each class was one cluster). The sample consisted of 2799 females and 2104 males.

To enhance the validity of the students' self-reports, they were assured that their responses would be kept strictly confidential and that they could not be identified from their answers. They were also informed about the voluntary nature of their participation in the study and their right to refuse or skip any question.

The Ethics Committee of Tabriz University of Medical Sciences and Research Committee of the East Azarbaijan Province Education Organization approved the questionnaire. Verbal informed consent was obtained from students before completing the questionnaire.

LCA was applied to evaluate the construct validity of the algorithm. LCA is a model for categorical latent variables. The latent variable is not measured directly; instead, it is measured indirectly by means of two or more observed variables. By analyzing the answers given to the categorical observed variables, LCA groups individuals into homogeneous classes. It assumes that, apart from measurement error, the associations among the observed variables are explained by the categories of the latent variable. The input data in LCA are the response patterns and their frequencies. By fitting models with different numbers of latent classes and comparing the observed response pattern frequencies with the expected ones, LCA identifies the best model and calculates a statistic similar to χ2, called G2. The distribution of G2 is similar to the distribution of χ2 if the degrees of freedom are less than 60. A significant G2 indicates a large difference between the observed and expected response pattern frequencies, that is, poor model fit. In other words, it suggests that the chosen number of latent classes does not fit the observed response patterns.[17]
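In symbols, the model and the fit statistic described above can be written as follows. The notation (γ for class prevalences, ρ for item-response probabilities) follows common LCA usage and is an assumption here, since the text does not fix any symbols.

```latex
% LCA with C latent classes and J categorical indicators.
% \gamma_c: prevalence of class c.
% \rho_{j, y_j \mid c}: probability of giving response y_j to indicator j within class c.
P(\mathbf{Y} = \mathbf{y}) = \sum_{c=1}^{C} \gamma_c \prod_{j=1}^{J} \rho_{j,\, y_j \mid c}

% Likelihood-ratio statistic, summed over all response patterns y,
% with observed frequency f_y and expected frequency \hat{f}_y = N \, P(\mathbf{Y} = \mathbf{y}).
G^2 = 2 \sum_{\mathbf{y}} f_{\mathbf{y}} \ln\!\left( \frac{f_{\mathbf{y}}}{\hat{f}_{\mathbf{y}}} \right)
```

The product over indicators expresses the local independence assumption: within a class, responses to different indicators are unrelated.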

To perform LCA, four observed variables (i.e., indicators) were extracted from the questionnaire so that smoking stage could be studied as a latent variable. These indicators were: smoking status (five categories), tendency to smoke in the future (five categories), smoking in the last month (two categories), and smoking in the last week (two categories). Furthermore, to examine the validity of the smoking cessation stages, four observed variables were extracted from the questionnaire: intention to quit smoking in the next 6 months (two categories), thinking about quitting in the next month (two categories), time of the last quit attempt (three categories), and smoking cessation status (three categories). WinLTA (version 3.1) was used to perform LCA.[14]
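The indicator structure above determines how many response patterns exist and how many parameters a latent class model has to estimate. The sketch below applies the standard count for an unrestricted model (one prevalence per class minus one, plus one free probability per non-reference category per indicator per class). Whether the authors imposed any parameter restrictions is not stated, so this is an assumption; the function name is hypothetical. Under that assumption the count agrees with the df = 1 reported for both the nine-class and the five-class models.

```python
from math import prod


def lca_degrees_of_freedom(categories, n_classes):
    """df = (number of response patterns) - (free parameters) - 1 for an unrestricted LCA."""
    n_patterns = prod(categories)
    prevalence_params = n_classes - 1
    item_params = n_classes * sum(c - 1 for c in categories)
    return n_patterns - (prevalence_params + item_params) - 1


# Acquisition indicators: 5, 5, 2 and 2 categories; nine latent classes.
acquisition_df = lca_degrees_of_freedom([5, 5, 2, 2], n_classes=9)
# Cessation indicators: 2, 2, 3 and 3 categories; five latent classes.
cessation_df = lca_degrees_of_freedom([2, 2, 3, 3], n_classes=5)

print(acquisition_df, cessation_df)  # 1 1
```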



The final algorithm is presented in Figure 1. First, the adolescents chose one of the six main options on the left. Then, they completed the second part of that option. The algorithm was constructed on the basis of the work of Pallonen et al.,[13] Mayhew et al.,[10] and Kremers et al.[12] The smoking stages defined in these three studies can easily be derived from this algorithm.

Figure 1
Algorithm for the acquisition and cessation stages

Content validity

In the examination of the content validity of the algorithm, the relevancy percentage of the questionnaire was 100% for both groups. The clarity percentages for the experts and lay experts were 75% and 100%, respectively. In the assessment of the reliability of the smoking stages (9 stages, Table 1), performed in 154 students over a two-week interval, the mean intra-class correlation coefficient (ICC) of the questions was 0.929 (95% CI: 0.903-0.948).

Table 1
Probability of endorsing particular responses to the acquisition staging algorithm conditional upon stage membership

Construct validity of algorithm

To identify the smoking stages, the validity of the algorithm was studied using LCA, with two additional questions (smoking in the last month and in the last week), in 4834 students (69 subjects were excluded because of missing data on any of the 5 observable variables). The LCA model for the whole sample yielded nine interpretable classes (G2 = 0.051, df = 1, P = 0.821). The nine-class solution was also valid in females (n = 2796; G2 = 0.026, df = 1, P = 0.803) and in males (n = 2038; G2 = 0.003, df = 1, P = 0.956). These findings indicate that the model fits the data very well. In other words, as shown in Table 1, both male and female students could be categorized into 9 classes (9 smoking stages) on the basis of the observed response patterns.

To determine the smoking cessation stages, the validity of the algorithm was examined in 218 students. The LCA model yielded five interpretable classes (G2 = 0.001, df = 1, P = 0.975). The results are shown in Table 2. These results likewise indicate an excellent fit of the model. That is, on the basis of the observed response patterns for the 4 observable variables related to smoking cessation, students could be categorized into 5 classes (5 stages of smoking cessation). Table 3 shows the labels and descriptions of the smoking acquisition and cessation stages and how each is measured by the algorithm.
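The reported P values can be checked approximately by treating G2 as a χ2 variable with 1 degree of freedom, as the Methods describe. The sketch below uses the identity that, for df = 1, the upper-tail probability equals erfc(sqrt(x / 2)). The function name is hypothetical, and this is a check under the χ2 approximation, not a reproduction of the WinLTA output.

```python
from math import erfc, sqrt


def chi2_sf_df1(g2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if g2 < 0:
        raise ValueError("G2 must be non-negative")
    return erfc(sqrt(g2 / 2.0))


reported = [
    ("acquisition, whole sample", 0.051),
    ("acquisition, males", 0.003),
    ("cessation", 0.001),
]
for label, g2 in reported:
    print(label, round(chi2_sf_df1(g2), 3))
# acquisition, whole sample 0.821
# acquisition, males 0.956
# cessation 0.975
```

These three values match the reported P values. Applying the same function to the female-sample statistic (G2 = 0.026) gives roughly 0.872 rather than the reported 0.803, so the printed G2 or P for that subgroup may contain a transcription difference; the result is non-significant either way.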

Table 2
Probability of endorsing particular responses to the cessation staging algorithm conditional upon stage membership (n=218)
Table 3
Names, definitions and measurements of smoking acquisition stages and cessation stages


This study designed an algorithm with high content validity and clarity. The reliability of the algorithm was also found to be high over a two-week interval. By analyzing two samples, Aveyard et al.[18] showed that the algorithm proposed by Pallonen et al.[13] for smoking stages has moderate reliability. It should be noted that most of their subjects, as in previous studies of smoking stages in adolescents, were in the precontemplation stage.[16,19] In addition, in contrast to our study, they did not divide students in the precontemplation stage into the three groups proposed by Kremers et al.[12] Placing the majority of the sample in one group and merging three groups into one tends to increase reliability. In our study, although we divided individuals in the precontemplation stage into three separate groups, 65 percent of the sample was in the committer stage, which could have increased the reliability.

The large sample size, the use of the test-retest method to assess reliability, and the possibility of a change in adolescents' smoking stage within a two-week interval were strengths of this study in assessing the reliability of the algorithm.

We used LCA to examine whether adolescents in the smoking and cessation stages could be classified on the basis of their responses to the stage algorithm. The results suggested that response patterns to the algorithm corresponded closely to 9 smoking stages and 5 smoking cessation stages. As reported in Table 1, all of the smoking stages (9 stages) could be interpreted from the item-response probabilities. For example, individuals in the first stage (committer) have never smoked with a probability of 0.983, are confident that they will never start smoking in the future (0.997), have not smoked in the last month (0.997), and have not smoked in the last week (0.999). Another interesting example is individuals in stage 5 (preparator), who have never smoked (0.963), plan to start smoking in the next month with a probability of 0.985, and have smoked neither in the last month nor in the last week (probability of 1).
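To make the interpretation of item-response probabilities concrete, the sketch below multiplies the values quoted in the text for two stages, reading them as probabilities. Under the local independence assumption of LCA, this product is the probability that a member of the class gives the complete expected response pattern. The function name is hypothetical.

```python
from math import prod


def modal_pattern_probability(item_probs):
    """Probability of the full expected response pattern given class membership,
    assuming local independence between indicators."""
    if any(not 0.0 <= p <= 1.0 for p in item_probs):
        raise ValueError("probabilities must lie between 0 and 1")
    return prod(item_probs)


# Values quoted in the text: never smoked, future intention, last month, last week.
committer = [0.983, 0.997, 0.997, 0.999]
preparator = [0.963, 0.985, 1.0, 1.0]

print(round(modal_pattern_probability(committer), 3))   # 0.976
print(round(modal_pattern_probability(preparator), 3))  # 0.949
```

In other words, about 98% of committers and about 95% of preparators would be expected to give exactly the response pattern that defines their stage.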

Table 2 shows that the smoking cessation stages can likewise be interpreted from the item-response probabilities. For example, individuals in the first stage (precontemplation) do not think about quitting in the next 6 months (0.834) and, with a probability of 1, do not intend to quit smoking in the next month. They have never tried to quit smoking (0.756) and, with a probability of 0.991, have never stopped smoking completely. Another example is individuals in the second stage (contemplation), who intend to quit smoking in the next 6 months with a probability of 1, do not think about quitting in the next month with a probability of 0.906, and have never given up smoking completely with a probability of 0.927.

The validity of stage-measurement questionnaires, and of the stage definitions themselves, has been questioned.[20,21] It has been shown that the algorithm suggested by Pallonen et al.[13] for measuring smoking acquisition and cessation stages has theoretical and methodological problems.[20] Although the questions of that algorithm assess current behavior, quit attempts, intention to change, and time since quitting, none of these is measured completely. More recently, Guo et al.[16] raised further limitations; specifically, they demonstrated that transition across the stages is not sequential. In terms of stage assessment, even though our algorithm is more comprehensive than the previous one, it does not measure behaviors and intentions comprehensively. Moreover, because of the cross-sectional nature of the present study, no claim can be made about the sequential nature of transition across the stages. Also, our method was confirmatory, not exploratory: we did not set out to study how individuals could be classified on the basis of their responses to the algorithm, but rather whether response patterns to the algorithm correspond to our stages (9 stages for smoking and 5 stages for smoking cessation). The non-significant result of the LCA model test is a positive answer to this question. It should be mentioned, however, that although all of the stages could be interpreted from the item-response probabilities, the large number of stages contributes to the non-significant result of the test.


Our findings showed that this algorithm is clear, valid, and reliable, and that it can therefore be used to assess smoking stages in adolescents. However, reliability was studied only for the smoking stages and not for the cessation stages. Further studies are therefore required to evaluate the reliability of the cessation stages. In addition, a longitudinal study is needed to assess transition across the stages and its sequential nature.


This article is part of a PhD thesis supported by Tehran University of Medical Sciences. We would like to thank the Deputy of Research and Technology of Tehran University of Medical Sciences and the Deputy of Research of Tabriz University of Medical Sciences for their financial support of this study. We also wish to thank all of the students, teachers, and headmasters of Tabriz high schools for their valuable collaboration in this study.


Source of Support: Nil

Conflict of Interest: All authors declare that they have no conflicts of interest which could inappropriately influence the manuscript.


1. Huang M, Hollis J, Polen M, Lapidus J, Austin D. Stages of smoking acquisition versus susceptibility as predictors of smoking initiation in adolescents in primary care. Addict Behav. 2005;30:1183–94.
2. Breslau N, Peterson EL. Smoking cessation in young adults: Age at initiation of cigarette smoking and other suspected influences. Am J Public Health. 1996;86:214–20.
3. Pierce JP, Gilpin E. How long will today's new adolescent smoker be addicted to cigarettes? Am J Public Health. 1996;86:253–6.
4. Taioli E, Wynder EL. Effect of the age at which smoking begins on frequency of smoking in adulthood. N Engl J Med. 1991;325:968–9.
5. Stanton WR. DSM-III-R tobacco dependence and quitting during late adolescence. Addict Behav. 1995;20:595–603.
6. Leventhal H, Cleary PD. The smoking problem: A review of the research and theory in behavioral risk modification. Psychol Bull. 1980;88:370–405.
7. Flay BR, d’Avernas JR, Best JA, Kersell MW, Ryan KB. Cigarette smoking: Why young people do it and ways of preventing it. In: McGrath P, Firestone P, editors. Pediatric and Adolescent Behavioral Medicine. New York: Springer; 1983.
8. Stern RA, Prochaska JO, Velicer WF, Elder JP. Stages of adolescent cigarette smoking acquisition: Measurement and sample profiles. Addict Behav. 1987;12:319–29.
9. Flay BR. Youth tobacco use: Risk patterns and control. In: Slade J, Orleans CT, editors. Nicotine Addiction: Principles and Management. New York: Oxford University Press; 1993. pp. 653–61.
10. Mayhew KP, Flay BR, Mott JA. Stages in the development of adolescent smoking. Drug Alcohol Depend. 2000;59(Suppl 1):S61–81.
11. Kremers SP, Mudde AN, de Vries H. Subtypes within the precontemplation stage of adolescent smoking acquisition. Addict Behav. 2001;26:237–51.
12. Kremers SP, de Vries H, Mudde AN, Candel M. Motivational stages of adolescent smoking initiation: Predictive validity and predictors of transitions. Addict Behav. 2004;29:781–9.
13. Pallonen UE, Prochaska JO, Velicer WF, Prokhorov AV, Smith NF. Stages of acquisition and cessation for adolescent smoking: An empirical integration. Addict Behav. 1998;23:303–24.
14. Collins LM, Lanza ST, Schafer JL, Flaherty BP. WinLTA user's guide. 2002.
15. Magidson J, Vermunt JK. Latent class factor and cluster models, bi-plots, and related graphical displays. Sociol Methodol. 2001;31:223–64.
16. Guo B, Aveyard P, Fielding A, Sutton S. Using latent class and latent transition analysis to examine the transtheoretical model staging algorithm and sequential stage transition in adolescent smoking. Subst Use Misuse. 2009;44:2028–42.
17. Collins LM, Lanza ST. Latent class and latent transition analysis for the social, behavioral, and health sciences. New York: Wiley; 2010.
18. Aveyard P, Lancashire E, Almond J, Cheng KK. Can the stages of change for smoking acquisition be measured reliably in adolescents? Prev Med. 2002;35:407–14.
19. Velicer WF, Redding CA, Anatchkova MD, Fava JL, Prochaska JO. Identifying cluster subtypes for the prevention of adolescent smoking acquisition. Addict Behav. 2007;32:228–47.
20. Etter JF, Sutton S. Assessing ‘stage of change’ in current and former smokers. Addiction. 2002;97:1171–82.
21. West R. Time for a change: Putting the Transtheoretical (Stages of Change) Model to rest. Addiction. 2005;100:1036–9.

Articles from International Journal of Preventive Medicine are provided here courtesy of Wolters Kluwer -- Medknow Publications