Two split-form surveys were conducted online among 6–17-year-olds (n = 1,200 each) to inform recommendations for cyberbullying measurement.
Measures that use the word ‘bully’ yield prevalence rates similar to one another whether or not a definition is included; measures that do not use the word ‘bully’ likewise yield rates similar to one another whether or not a definition is included. A behavioral list of bullying experiences without either a definition or the word ‘bully’ results in higher prevalence rates and likely measures experiences beyond the definition of ‘bullying’. Follow-up questions querying differential power, repetition, and bullying over time were used to examine misclassification. The measure using a definition but not the word ‘bully’ appeared to have the highest rate of false positives and, therefore, the highest rate of misclassification. Across the two studies, an average of 25% reported being bullied at least monthly in person, compared with an average of 10% bullied online, 7% via telephone (cell or landline), and 8% via text messaging.
Measures of bullying among English-speaking samples in the US should include the word ‘bully’ when possible. The definition may be a useful tool for researchers, but results suggest that it does not necessarily yield a more rigorous measure of bullying victimization. Directly measuring aspects of bullying (i.e., differential power, repetition, over time) reduces misclassification. To prevent double counting across categories, we conceptualize cyberbullying as bullying communicated through the online mode; type (e.g., verbal, relational), and environment (e.g., school, home) are additional domains of bullying.
Cyberbullying and harassment victimization are significant adolescent health issues, as they are associated with concurrent psychosocial problems including depressive symptomatology, social and behavior problems, and substance use [1–4]. Depending on the definition, measure, and methodology used, prevalence rates range between 9% and 72%.
Definitions of cyberbullying vary widely, contributing to the inconsistency in findings across studies. A lack of consensus complicates cross-study comparisons and, thus, limits research progress. Some researchers treat “cyber” as a type of bullying, equivalent to physical and relational bullying; others treat it as an environment, equivalent to school. If cyberbullying is treated as a type of bullying (e.g., cyber versus relational bullying) or as an environment (e.g., online versus at school), measures are vulnerable to double counting (e.g., relational bullying online; being bullied online while physically located at school). If treated as a communication mode (i.e., in-person, text messaging, voice [landline or cell phone], or online), however, cyberbullying becomes a distinct and meaningful category.
Even when defining cyberbullying the same way, researchers operationalize it differently across studies. Some measure it using a simple question (e.g., ‘have you been cyberbullied?’) [10, 11]. Others use a definition (i.e., ‘we say bullying is…’) [1–4, 8], a list of behavioral experiences [5, 12–16], or both [6, 17–20]. Drawbacks exist for each approach. A definition-based measure may challenge respondents whose experiences differ from the definition. It also assumes that the definition will be read and understood. Using the word ‘bully’ necessitates that the participant adopt the label of ‘being bullied’. It also presumes that the respondent and researcher share the same meaning of ‘bully.’ Behavioral lists provide concrete examples of bullying, but rapid changes in technology leave them vulnerable to constant revision unless lists are constrained to experiences that are universal across environments. Also, unless coupled with a definition or follow-up questions, behavioral lists likely measure general aggression rather than repetitive bullying between actors of differential power.
To avoid the word ‘bully’, some employ synonyms (e.g., ‘mean things’) [1, 2, 4, 6] or omit it from the definition entirely. In a 14-country comparison of 67 words and phrases used to describe ‘bullying’, Smith and colleagues report that the terms “bullying” and “picking on” cluster together, whereas the words “harassment”, “intimidation” and “tormenting” relate to each other in a different cluster. Thus, the use of synonyms may not always connote bullying.
To better understand how variations in definition and operationalization affect prevalence rates, and to identify the best method for measuring cyberbullying, we report the results of two related studies. Study 1 examines the relative impact of the word “bullying” and an adapted version of Olweus’ definition on prevalence rates. It questions whether the likelihood of youth admission of being bullied, and adoption of that label, varies by the appearance of the word ‘bully’, or a definition, in the survey question. Study 2 examines how well reported rates of bullying align with Olweus’ three main definitional characteristics of bullying: differential power, repetitiveness, and over time. We identify which measure results in the highest percentage of accurate self-classification (i.e., those who say their experience occurred over time, repeatedly, and by someone with more power than they). Given the use of other terms to approximate the word bullying (e.g., ‘mean things’ and ‘harassment’), we also examine the impact of one word, ‘harassment’, on prevalence rates.
Olweus’ definition of bullying has found wide acceptance in the research literature. Based upon the number of researchers who are using an adapted version of this definition [2, 3, 8, 17–19], this seems true of cyberbullying as well. We conceptualize cyberbullying under the larger rubric of ‘bullying’. We propose three mutually exclusive components: 1) type (e.g., physical, relational), 2) mode of communication through which bullying occurs (e.g., in-person, online), and 3) environment (e.g., school). Not all components need be included, but to avoid unintentional double counting of victimization rates, they should not be combined even when space or budget is limited. To our knowledge, this is the first study to propose that measurement be constrained to these three distinct domains. The recommendation arises from extensive consultations among the authors about the conundrum attributable to counting victimization across the spaces in which youth engage.
Two separate “mini-surveys” were conducted online: January 2010 (Study 1) and May 2010 (Study 2).1 The protocol was reviewed and approved by Chesapeake IRB. C&R Research administered the surveys.
Respondents were randomly selected from a 30,000-member online panel. For each survey, 1,200 youth between the ages of 6 and 17 were recruited. Younger youth (6–9 years) completed the survey with their parent; older youth completed the survey alone. A waiver of parental consent was obtained for the studies. Youth assent was required to participate: 95% assented in Study 1 and 97% in Study 2.
The survey research firm routinely conducts monthly omnibuses (i.e., surveys). Participants who took part in an omnibus in the past three months were excluded from the current month’s recruitment pool. The sample was purposefully balanced on biological sex (50% female) and age groups (300 youth in each group: 6–8 years; 9–11 years; 12–14 years; and 15–17 years). No other eligibility criteria were applied. As shown in Table 1, participants were an average of 12 years old (M = 11.9 years, SD = 3.5 years). About 70% were White (weighted data).
The response rate (i.e., the number of people who clicked on the survey invitation link in the email divided by the total number of survey invitation emails sent) was 32% for Study 1 and 39% for Study 2. Survey completion rates among those who started the survey were about 93% for each study. Rates are within the expected range of well-conducted online surveys [24, 25].
To examine relative differences in prevalence rates based upon different measures fielded within one sample, a split-form methodology was used: a random sub-sample was assigned to one of several possible ‘forms’ of the measure. Internal validity is crucial for valid split-form studies; external validity is less so. Thus, an online panel is as acceptable as other sampling procedures.
All questions used a 5-point scale and referred to the ‘past year’. Youth were categorized into one of three groups: 1) never, 2) less frequently than monthly (i.e., once or a few times), and 3) monthly or more often (i.e., a few times a month, a few times a week, every day, or almost every day). This grouping was determined a priori to reflect those who are bullied repetitively and over time (at least monthly) as opposed to those aggressed upon less frequently.
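As a minimal sketch of this recoding, the mapping from the 5-point frequency scale to the three analysis groups might look like the following. The exact scale labels and names are our own illustrative assumptions; the paper reports the groupings but not the verbatim response options.

```python
# Hypothetical recoding of a 5-point past-year frequency scale into the
# three analysis groups described in the text. Scale labels are assumed.
SCALE_TO_GROUP = {
    "never": "never",
    "once or a few times": "less frequently than monthly",
    "a few times a month": "monthly or more often",
    "a few times a week": "monthly or more often",
    "every day or almost every day": "monthly or more often",
}

def categorize(response: str) -> str:
    """Collapse a raw frequency response into one of the three groups."""
    return SCALE_TO_GROUP[response]

print(categorize("once or a few times"))  # less frequently than monthly
print(categorize("a few times a week"))   # monthly or more often
```

The collapse is deliberately coarse: the ‘monthly or more often’ group is the one taken to approximate repetitive victimization over time.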
In Study 1, youth were randomly assigned to one of four different forms of the survey question:
The Definition + word ‘bully’ form read:
We say a young person is being bullied when someone repeatedly says or does mean or nasty things to them. Examples include being teased repeatedly or having nasty or cruel things said; being hit, kicked or pushed around; being excluded or left out; or having rumors spread. We are not talking about times when two young people of about the same strength fight or tease each other. We are asking about things that:
(These things can happen anywhere like at school, online, via text messaging, at home, or other places young people hang out.) In the last 12 months, how often have others bullied you by doing or saying the following things to you?
The Definition-only form was the same as above, but with a modified first sentence: “Sometimes people your age repeatedly say or do mean or nasty things to each other.” It also had a modified question: “In the last 12 months, how often have others done or said the following things to you?”
The ‘bully’-only form read:
In the last 12 months, how often have others bullied you by doing or saying the following things to you? (These things can happen anywhere like at school, online, via text messaging, at home, or other places young people hang out.)
The final form presented neither the definition nor the word:
In the last 12 months, how often have others done or said the following things to you? (These things can happen anywhere like at school, online, via text messaging, at home, or other places young people hang out.)
Note that the definition provided did not differentiate between mode, environment, and type; participants did not need to make these distinctions themselves. Instead, participants were primed to think about bullying experiences broadly, and the item response options then forced the differentiation (i.e., modes were queried separately from types of bullying).
All youth were then presented the same behavioral list of experiences: 1) hit, kicked, pushed, or shoved you around, 2) someone made threatening or aggressive comments to you, 3) you were called mean names, 4) you were made fun of, or teased in a nasty way, 5) you weren’t let in or you were left out of a group because someone was mad at you or was trying to hurt you, 6) someone spread rumors about you, whether they were true or not, and 7) some other way.
Youth who said that at least one of these experiences had occurred to them in the past year were asked the mode of communication through which it occurred: in-person, by phone call (cell or land line), by text message, or online (e.g., email, social network site, or instant messenger). Response options were captured with the same 5-point frequency scale.
Budget limitation prevented the inclusion of the third bullying context, environment, in Study 1.
In Study 2, youth were randomly assigned to one of four different forms: 1) Definition + the word ‘bully’, 2) the Definition-only, 3) the word ‘bully’-only, and 4) Definition + the words ‘bully’ and ‘harassment’. Based upon findings from Study 1 (presented below), a form that included neither the Definition nor the word ‘bully’ was not included. The definition was modified slightly from Study 1: to improve readability, “more than once” and “more than just one day” were used instead of “are repeated” and “happen over time.”
Across all four forms, youth who indicated they had been bullied through at least one mode were asked three follow-up questions: 1) Was it by someone who had more power or strength than you? This could be because the person was bigger than you, had more friends, was more popular, or had more power than you in another way., 2) Was it repeated, so that it happened again and again?, and 3) Did it happen over a long period of time? We mean more than a week or so.
Due to budget limitations, victimization type and environment were not queried in Study 2.
Data were weighted to match the US online population of youth on biological sex, age, race/ethnicity, household income, geography, and county size. The design-based Pearson’s Chi-Square statistic was used to determine differences between categorical responses while taking survey weighting into account. In Study 2, we calculated the positive predictive value (PPV) to estimate each measure’s rate of misclassification. PPV requires a ‘gold standard’ against which a measure is compared. We defined the gold standard as being consistent with Olweus’ components of bullying: differential power, with repetition, and over time. PPV is calculated as True Positives / (True Positives + False Positives).
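The PPV formula above can be sketched directly. The function name and the counts in the usage line are illustrative only, not study data.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): the proportion of youth a measure classifies
    as bullied who also meet the gold-standard criteria (differential
    power, repetition, and occurrence over time)."""
    return true_positives / (true_positives + false_positives)

# Illustrative counts only: of 100 youth endorsing a measure, suppose 65
# also endorsed all three gold-standard follow-up questions.
print(positive_predictive_value(65, 35))  # 0.65
```

A higher PPV indicates fewer false positives, i.e., less misclassification by that question form.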
Confirmatory factor analysis suggested that the behavioral items measuring the different types of bullying loaded on one factor, ‘bullied’ (Table 2). Loadings were similar for boys and girls, and for younger (e.g., 6–8-year-olds) and older (e.g., 15–17-year-olds) youth (data available upon request). The behavioral items were combined to contrast youth who reported that at least one of the seven bullying experiences had occurred with those who reported that none had occurred. This variable, “behavioral list”, was used in subsequent analyses.
As shown in Table 3, reported rates of victimization were highest in the split-form measure that used neither the Definition nor the word ‘bully’, and lowest for the two forms that used the word ‘bully’ in the introduction to the survey question. Rates for the Definition-only form were statistically indistinguishable from the form that used neither the Definition nor the word ‘bully’. Reliable differences were noted for communication mode between the ‘bully’-only form and the form that used neither the Definition nor the word ‘bully’, as well as between the Definition + ‘bully’ form and the form that used neither, for bullying via phone. The Definition + ‘bully’ form differed reliably from the Definition-only form for bullying via text messaging.
Twelve-month prevalence rates for bullying using the Definition + ‘bully’ and ‘bully’-only forms were similar to those reported in Study 1. Rates observed for the Definition + ‘bully’ + ‘harassment’ form were similar to those observed for the Definition + ‘bully’ form, suggesting that the word ‘harassment’ does not denote additional context beyond ‘bully’.
In Study 2, youth who endorsed any type of bullying experience were asked follow-up questions to determine whether the experiences included 1) differential power, 2) repetition, and 3) occurrence over time. In order to calculate the positive predictive value (PPV), endorsement of all three was defined as the ‘gold standard’, consistent with Olweus’ definition of bullying. As shown in Table 4, 9–11% reported being bullied monthly or more often and also met the three criteria. Between 0% and 3% met all three criteria while also reporting that it occurred less often than monthly. Rates of false positives, and therefore the positive predictive values, were similar across all four question forms.
We also examined a briefer follow-up series. Youth who reported being bullied at least monthly implicitly meet the criteria for being victimized both over time and repeatedly. Then, the only additional follow-up question needed to determine the ‘gold standard’ is one about differential power. When the ‘gold standard’ is re-defined as: endorsing the question about differential power + reporting a frequency of ‘monthly or more often’, PPV ranges from 47% (Definition-only) to 65% (‘bully’-only; see Table 4).
Across the two studies, an average of 25% reported being bullied at least monthly in-person. An average of 10% reported being bullied at least monthly online, 7% via telephone (cell or landline), and 8% via text messaging.
Across two split-form online surveys of youth 6–17 years old, the introductory text appears to affect endorsement of bully victimization experiences. Measures that use the word ‘bully’ yield rates similar to one another whether or not a definition is included; measures that do not use the word ‘bully’ likewise yield rates similar to one another whether or not a definition is included. This suggests either that participants are not reading the definition, or that it is not personally meaningful. The definition may be a useful tool for researchers, but these results suggest that it does not yield a more rigorous measure of bullying victimization. If reading burden were an issue in the current survey, it is likely more so an issue in other surveys that use longer definitions. Furthermore, a behavioral list of bullying experiences without either a definition or the word ‘bully’ results in higher prevalence rates and likely measures experiences beyond the definition of ‘bullying’.
Victimization rates are strikingly similar across measurement forms when the three Olweus criteria-based follow-up questions are applied. All forms, therefore, seem to be equally able to identify ‘true positives’ (i.e., those who say they have been bullied and endorse all three follow-up questions). Thus, adding these three follow-up questions seems to neutralize differences in the question forms.
The benefit of using all three follow-up questions is the ability to identify the few participants who report repetitive bullying that occurs over a relatively short period of time (i.e., less frequently than monthly). In the current studies, this reflects 0–3% of the respondents. The drawback of using three questions is participant burden. Instead of using separate questions to query repetitiveness and over time, one could classify participants who report being bullied monthly or more frequently as, by definition, meeting the criteria of ‘repetitiveness’ and ‘over time’. In this case, only one follow-up question to query differential power is needed. Again, the drawback is missing the respondents who have intense, but shorter-term experiences. When this gold standard is applied, the PPV, that is, the ability to accurately identify victimization cases, suggests that using the definition without the word ‘bully’ may result in the highest rate of false positives (i.e., those who say they have been bullied but do not endorse the differential power follow-up question). In contrast, the word ‘bully’ may elicit responses that lead to the lowest rates of misclassification. Researchers should consider including at least this one follow-up question in the same way that follow-ups for functional impairment are now included for many DSM-IV measures.
We used the same behavioral lists across all communication modes. When lists are used, we advocate for universal lists across communication mode and environment to allow for comparisons of youth experience online and offline. This also makes the measure transcendent of technology. Adolescent uses of technology will continue to change at a rapid pace, but this should not affect our definition of bullying or the associated behavioral lists.
Similar to pan-European data, twice as many youth in our studies report bullying in person compared to each of the other communication modes. Despite the rapid uptake of technologies, ‘traditional’ face-to-face communication still is the dominant mode of bullying. Conceptualizing mode as a component of bullying permits us to count rates of online experiences separately from those occurring in person, allowing for direct comparisons. Some may be concerned that components of the Olweus-based definition of bullying may not translate well to the online context. For example, the concept of repetition online may be different, as it is possible to have a picture posted or rumor written once, yet shared with others over and over again. While different in potential magnitude, this seems similar to a rumor scrawled once on a bathroom wall for many people to see repetitively. Traditionally, we would not say that this meets the definition of ‘repetition’ offline, and it is not clear why it should online. Similarly, anonymity has been pointed to as something unique to the online world. This assumes, however, that all bullying offline is done by known people. In fact, 12% of youth who report being bullied at school say they do not ‘know’ who their bully is. While lower than the 46% who report not knowing who their online bully is, the issue of anonymity applies beyond online spaces.
Findings are specific to English-speaking youth in the United States. Not all languages have the word ‘bullying’ [20, 22]. Different findings would likely emerge if a similar study were conducted among non-English samples. Beyond an examination of internal consistency (i.e., Cronbach’s alpha and confirmatory factor analysis) of the behavioral items, possible variation by age was not examined. It is possible that the definition and the word ‘bully’ have different influences on an 8-year-old compared to an 18-year-old. Also, the calculation of positive predictive value requires a ‘gold standard’ to measure against; a stronger standard would have been observer report. Finally, some may wonder why other, more sophisticated analyses were not employed. For example, the application of item response theory (IRT) is currently being debated in bullying research. The discussion centers on whether or not being bullied reflects an underlying trait; if it does not, then IRT is inappropriate. This question is beyond the scope of this paper.
Findings from two mini-surveys of youth sampled nationally suggest that measures for English-speaking studies should include the word ‘bully’ when possible. The definition seems less critical in affecting prevalence rates. A behavioral list of bullying experiences without either a definition or the word ‘bully’ results in higher prevalence rates and likely measures experiences beyond the definition of ‘bullying’. The word ‘harassment’, while meaningfully different from bullying to researchers, does not seem to connote additional context for youth beyond ‘bully’. Its inclusion to help youth conceptualize bullying, or to give them a different term with which to identify, does not affect bullying rates. Furthermore, directly measuring differential power, repetition, and occurrence over time through follow-up questions reduces the misclassification of youth. Finally, we propose three mutually exclusive components of bullying: 1) type (e.g., physical), 2) communication mode (e.g., online), and 3) environment (e.g., home). Not all components must be included, but they should not be merged, to prevent double counting.
The project described was supported by Award Number R01-HD057191 from the National Institute of Child Health and Human Development (NICHD). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NICHD. We would like to thank Dr. Kimberly Mitchell, Ms. Amanda Lenhart, and Dr. Donna Cross for their input into study design; Ms. Tonya Prescott for her tireless help with formatting and proofreading; and the study participants for their participation in this study.
1A third survey examining the impact of question order on prevalence rates of bullying also was conducted. Differences were not statistically significant. Due to space limitations in the journal, results are excluded here, but available upon request from the authors.
Michele Ybarra, Internet Solutions for Kids, Inc.
danah boyd, Microsoft Research.
Josephine Korchmaros, Internet Solutions for Kids, Inc.
Jay (Koby) Oppenheim, Graduate Center, City University of New York.