Observational measures of parent and child behaviours have a long history in child psychiatric and psychological intervention research, including the field of autism and developmental disability. We describe the development of the Standardised Observational Analogue Procedure (SOAP) for the assessment of parent–child behaviour before and after a structured parent training program for children with pervasive developmental disorders (PDD). We report on the use of this procedure in a pilot study of 12 participants with PDD.
Inter-rater reliability across behaviours coded ranged from 75 to 100% agreement. Blindly scored observations of behaviour showed medium effect sizes for changes in inappropriate child behaviour. Analyses of baseline scores revealed a moderate positive correlation between inappropriate child behaviours as measured in all four SOAP conditions and parent ratings of child noncompliance (rs = .66, p < .05). By contrast, the correlation of SOAP scores with parent ratings of irritability was lower (rs = .40, p > .05).
As our treatment targeted compliance, these preliminary results suggest that the SOAP provides a valid measure of noncompliant behaviour in children with PDD and is sensitive to treatment effects on inappropriate child behaviours.
Observational measures of parent and child behaviours and interactions have a long history of use in child psychopathology and intervention research (Bell, 1964; Roberts, 2001). Such measures may be conducted in naturalistic settings or in laboratory situations. A number of different laboratory paradigms have been used, including free play, parent-directed play, and compliance analogues. Investigators have used observations during analogue parent–child interactions to examine variables such as child aggression and compliance, as well as parental use of commands and praise. Many prior studies have tended to focus upon the use of analogue observations to discriminate between controls and children with disorders such as attention deficit hyperactivity disorder (Campbell, Breaux, Ewing, & Szumowski, 1986; Campbell, Breaux, Ewing, Szumowski, & Pierce, 1986), oppositional defiant disorder (Roberts, Joe, & Rowe-Hallbert, 1992), and conduct disorders (Webster-Stratton, 1996). Also, parent–child behaviours and interactions have been found useful in evaluating efficacy of both pharmacologic and psychosocial treatments (Bahl, Spaulding, & McNeil, 1999; Barkley & Cunningham, 1979).
Direct observational methods have also been used to uncover mechanisms of family interaction, such as coercive parenting and escape conditioning, that influence the emergence of disruptive behaviour in childhood (Patterson, 1982; Roberts & Powers, 1988), and to identify parenting skills that can be taught to the parents in order to reduce child aggression and noncompliance (Wahler, Winkel, Peterson, & Morrison, 1965; Wiltz & Patterson, 1974). Currently, parent training interventions for child disruptive behaviour are among the best-studied and efficacious psychosocial treatments (Kazdin, 2005). Several structured observational methods have been used extensively in clinical trials with children with conduct disorder to evaluate change in parent and child behaviour (McMahon & Forehand, 2003; Robinson & Eyberg, 1981; Schuhmann, Foote, Eyberg, Boggs, & Algina, 1998; Webster-Stratton & Hammond, 1997).
The assessment of analogue parent and child interactions has also been used in the field of autism and developmental disorders. As with other childhood psychiatric disorders, observations of parent–child interactions have been used to examine differences between children with autism and children who are developing typically (e.g., Doussard-Roosevelt, Joe, Bazhenova, & Porges, 2003). Analogue observations of the behaviours of children with autism and/or intellectual disability have also been employed in functional analysis studies to identify environmental variables contributing to a range of disruptive and harmful behaviours. Functional analysis involves the manipulation of variables to assess their direct effect on the rate of target behaviours. Accordingly, analogue situations typically include free play (a control condition), demand condition, attention condition, and tangible restriction condition to assess the functional relationship between different environmental variables and behaviour (Hanley, Iwata, & McCord, 2003; Iwata, Dorsey, Slifer, Bauman, & Richman, 1982). For example, during the attention condition of a functional analysis, a child who engages in headbanging is provided with adult attention whenever this behaviour is observed. Similarly, in the demand condition, demands are immediately discontinued following the occurrence of self-injury (Iwata et al., 1982). Although functional analyses are often used to derive rational treatment interventions, they are not necessarily designed to evaluate treatment effects.
Parent–child observations have been used in a small number of studies to evaluate treatment effects among children with autism and other developmental disorders. Videotaped observations of parent–child interactions during unstructured dinner time were used in a randomised study comparing pivotal response training to the teaching of target behaviour in 17 children with PDD (Koegel, Bimbela, & Schreibman, 1996). Parent interactions were rated as neutral, negative, or positive across four areas (happiness, interest, stress, communication style). Handen, Feldman, Lurier, and Murray (1999) observed reduction of noncompliant and disruptive behaviour during an analogue observation task in a double-blind, placebo-controlled trial of methylphenidate in 11 preschoolers with developmental disability and ADHD. In a study with contrasting findings, Kolmen, Feldman, Handen, and Janosky (1995) used mother–child interactions to assess the effects of naltrexone in a double-blind, placebo-controlled study in a group of 13 young children with autism. In this trial, no significant improvements in child or parent behaviour were noted during the compliance task, clean-up task, or during the analogue in which the child was required to sit and wait while the mother talked with a clinician (simulating an at-home situation when a mother might desire to talk on the telephone). More recently, observational ratings of both child engagement and parent responsiveness during mother–child play were found to be sensitive to treatment effects in a study of relationship-focused intervention in 20 preschool children with PDD (Mahoney & Perales, 2005).
The appeal of observational measures is that they are presumed to be less vulnerable to potential biases from informants, such as parents or teachers. In addition, observational measures permit more individualised assessment that may be lost with the use of rating scales. On the other hand, a limitation is that the observation samples only a short period of time, which may not accurately reflect what is happening the rest of the time. Further, a clinic analogue protocol is an artificial situation that may not reflect what happens in the child's usual environment. However, the major challenge in the application of observational measures in intervention research is collection of valid data on parent–child interaction and the related cost. The coding of analogue observations poses significant costs for large clinical trials. Observations within naturalistic settings, often preferred by behavioural researchers, are even more burdensome.
A wide range of observational measures and coding systems have been developed to investigate various aspects of parent and child behaviour (see Gardner, 2000, and Roberts, 2001, for comprehensive reviews). However, we needed an observational measure that could be used in a large multi-site clinical trial and that would be appropriate across the wide ranges of age, cognitive functioning, and adaptive functioning found among children with PDD. Therefore, drawing on the literature of other observation systems, we developed a Standardised Observational Analogue Procedure (SOAP) that allows for uniform assessment across sites and children. The SOAP was used to evaluate treatment effects of a structured parent training (PT) program for families of children with PDD and disruptive behaviour (Johnson et al., 2007). Briefly, the PT program consisted of 11 core, three optional, and three booster sessions, as well as an initial and follow-up home visit. The first six core sessions provide families with instruction in basic behavioural analysis techniques designed to prevent or decrease negative or noncompliant behaviours and to promote positive, pro-social behaviours. Topics include antecedent management techniques, positive reinforcement procedures, extinction, and compliance training. A subsequent set of five core sessions focuses on functional communication, teaching skills and techniques to promote generalisation and maintenance. A comprehensive battery of parent and clinician rated measures was assembled to evaluate PT in a recently completed multi-site trial, and observational data were collected to complement these ratings (Scahill et al., in press).
The current paper serves three purposes. First, we report inter-rater reliability for the categories of behaviour coded for the four SOAP conditions. Second, to evaluate convergent and divergent validity of the observational measure, we report the association between the observed behaviours and parent-rated measures. Finally, we report the effects of parent training using the analogue observational measures collected pre- and post-treatment.
This study was approved by the authors' respective universities' Institutional Review Boards (IRB). The informed consent process followed the procedures set by each local IRB, with an in-person explanation and signing of the informed consent document. All families were provided a copy of the informed consent document. Seventeen participants (14 boys and 3 girls; mean age 7.7 years; range 4–13 years) participated in a feasibility study of PT in preparation for a large-scale, randomised trial designed to compare medication alone versus medication plus PT for children with PDD (Johnson et al., 2007; Research Units on Pediatric Psychopharmacology [RUPP] Autism Network, 2007; Scahill et al., in press). Of these 17 participants, 12 (mean age = 7.6 years; range = 4.1–11.7 years) had complete SOAP data pre- and post-treatment. The other five participants had incomplete data due to study drop-out (n = 2) and equipment failure leading to incomplete or low quality tapings (n = 3). The 12 participants with complete data ranged in IQ from 29 to 78, with an average IQ score of 49. Nine (75%) were Caucasian, 1 (8.3%) Asian, and 2 (16.7%) biracial. To be eligible for the feasibility PT study, participants had to be between the ages of 4 and 13 years, meet criteria for a PDD diagnosis (Autistic disorder, Asperger disorder, or Pervasive Developmental Disorder, not otherwise specified [PDD-NOS]), be on stable medication (at least 4 weeks) with no planned changes for six months, and have mild to moderate behaviour problems. The diagnosis of PDD was based on clinical assessment with corroboration from the Autism Diagnostic Interview: Revised (Lord, Rutter, & Le Couteur, 1994). The presence of mild to moderate disruptive behavioural problems was established by a baseline Clinical Global Impressions-Severity (CGI-S) score of at least 3 (mild impairment), but not greater than 5 (marked impairment) (Guy, 1976).
This inclusion criterion acknowledged that participants were receiving some benefit from medication, but behavioural concerns remained.
Behavioural concerns included aggression, tantrums, self-injury, and noncompliance. Children with greater than moderate behavioural problems were excluded on the presumption that these children would need more intensive treatment (including pharmacological adjustment). Finally, all participants had to be functioning at a mental age (MA) level of at least 18 months, measured by a standardised intelligence or developmental test obtained at baseline (e.g., Leiter International Performance Scale-Revised, Roid, 1997; Mullen Scales of Early Learning, Mullen, 1995; or Slosson Intelligence Test for Children and Adults, Slosson, 2001).
The SOAP was conducted and recorded on video at pre- and post-treatment. The procedure was developed based on prior work in parent–child observations and brief functional analysis procedures (Handen et al., 1999; Hanley et al., 2003; Kennedy, Meyer, Knowles, & Shukla, 2000; Repp & Horner, 1999). The SOAP comprised four 10-minute observation sessions: Free Play, Social Attention, High Demand, and Tangible Restriction.
The primary difference between our instructions and more typical functional analogue procedures is that the parent was instructed to address the behaviours as they typically would. Hence, consequences were not experimentally manipulated; rather, the analogue sessions yielded a naturalistic observation of the consequences that parents delivered contingently. The sessions were conducted in sparsely furnished clinic rooms of approximately 12 m², each with an adjacent observation room where the video recording took place.
Operational definitions were developed for coding parent and child behaviours of interest. Child behaviours included (a) inappropriate behaviour, and (b) compliance to demand (only in the High Demand session). Initially, specific inappropriate behaviours were coded separately: disruptive behaviours (e.g., climbing on furniture, running out of the room, tantrums), aggression (e.g., hitting others, biting), and self-injury (e.g., self-biting and headbanging). Over time, however, it became clear that this approach was impractical and would make data reduction more difficult. Moreover, the goal of the parent training program was to decrease inappropriate behaviours broadly speaking. Thus, we merged all inappropriate behaviours into a single code. Compliance was defined as the child cooperating with a parental demand within 60 seconds of the demand being given. The 60-second time frame was applied because many demands involved tasks that could not be completed within a 10–20 second period. Coded parent behaviours included (a) verbal reprimands, (b) positive reinforcement, (c) demands (coded only in the High Demand session), and (d) contingent reinforcement of compliance within 20 seconds of child compliance (coded only in the High Demand session). These parent behaviours were of interest because they were observable and were expected to change as a result of the parent training program. Table 1 provides the complete operational definitions and examples of the coded child and parent behaviours. Using a whole 10-second interval, the occurrence or nonoccurrence of each behaviour was coded for a total of 5 minutes (minutes 3 through 7) from the video recordings using ProcoderDV software (Tapp, 2003).
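The interval coding and the 60-second compliance rule described above can be illustrated with a minimal sketch. It is not the ProcoderDV workflow. It assumes that each coded behaviour is available as a list of onset times in seconds from the start of the session, that "minutes 3 through 7" corresponds to seconds 120 to 420, and that an interval is scored as an occurrence when at least one onset falls inside it. The function names and constants are hypothetical.

```python
INTERVAL_S = 10        # length of one coding interval, from the text
WINDOW_START_S = 120   # assumed start of minute 3
WINDOW_END_S = 420     # assumed end of minute 7


def code_intervals(event_times_s):
    """Return one occurrence/nonoccurrence flag per 10-second interval."""
    n_intervals = (WINDOW_END_S - WINDOW_START_S) // INTERVAL_S
    coded = [False] * n_intervals
    for t in event_times_s:
        # Events outside the coded 5-minute window are ignored.
        if WINDOW_START_S <= t < WINDOW_END_S:
            coded[int((t - WINDOW_START_S) // INTERVAL_S)] = True
    return coded


def is_compliant(demand_time_s, compliance_time_s, limit_s=60):
    """Apply the 60-second rule; None means the child never complied."""
    if compliance_time_s is None:
        return False
    return 0 <= compliance_time_s - demand_time_s <= limit_s


if __name__ == "__main__":
    flags = code_intervals([5, 120, 125, 419.9, 420])
    print(len(flags), sum(flags))
    print(is_compliant(130, 175), is_compliant(130, 200))
```

Under these assumptions the 5-minute window yields 30 intervals per session, and the per-session score for a behaviour is the number of intervals flagged as occurrences.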
Two coders were trained to reliability with three training tapes. One of the coders was a project coordinator with over 10 years of experience conducting behavioural research. This coder trained the second coder, an undergraduate psychology student, in the operational definitions of coded behaviours and the use of the coding software. Reliability was determined by calculating percent agreement (total occurrence agreements for each coded variable for each session divided by the total occurrence agreements plus total occurrence disagreements) for 25% of randomly selected sessions. Mean percent agreement represents the average total percent occurrence agreements of coded variables.
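The percent agreement calculation can be sketched as follows. The sketch assumes that each coder's record for one variable in one session is a list of interval flags of equal length, that an occurrence agreement is an interval both coders scored as an occurrence, and that a disagreement is an interval scored by only one coder. The function name and the example data are hypothetical.

```python
def percent_occurrence_agreement(coder_a, coder_b):
    """Occurrence agreements / (agreements + disagreements), as a percentage."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must score the same number of intervals")
    agreements = sum(1 for a, b in zip(coder_a, coder_b) if a and b)
    disagreements = sum(1 for a, b in zip(coder_a, coder_b) if a != b)
    scored = agreements + disagreements
    if scored == 0:
        # Neither coder recorded the behaviour, so the index is undefined.
        return None
    return 100.0 * agreements / scored


if __name__ == "__main__":
    # Hypothetical interval records for one variable in one session.
    a = [True, True, True, False, False, True]
    b = [True, True, True, False, True, False]
    print(percent_occurrence_agreement(a, b))
```

Intervals in which neither coder recorded the behaviour do not enter the calculation, so this index is more conservative than total interval agreement when a behaviour is rare.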
The Home Situations Questionnaire (HSQ) asks the parent to consider the child's noncompliant behaviour in several real-life situations such as when playing with other children, when visitors are in the home, and at bedtime (Barkley & Murphy, 1998). Questions answered affirmatively are then rated on a 9-point Likert scale (1 to 9), with higher scores indicating more severe noncompliance. Thus, the HSQ yields two scores: a count of “yes” responses (0 to 25) and a severity score (total of 1 to 9 ratings on “yes” items; range = 0–225). This severity score is typically expressed as the per-item mean score. The measure was developed to be used with a clinically diagnosed population and does not use clinical cut-off scores. The instrument was modified for this study by adding some items that reflected situations that often pose challenges for children with PDD. Examples include situations in which there is an unexpected change in the routine or a change in the arrangement of a familiar setting (e.g., bedroom or classroom). This slightly modified version of the HSQ was used in our PT feasibility study (RUPP Autism Network, 2007).
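The two HSQ scores can be illustrated with a short sketch. It assumes that each item is stored as None for a "no" response or as an integer from 1 to 9 for a "yes" response, and that the per-item mean divides the severity total by the total number of items rather than by the number of "yes" items. The text does not specify the denominator, so that choice is an assumption, and the function name is hypothetical.

```python
def score_hsq(responses):
    """Return (yes_count, severity_total, per_item_mean) for one questionnaire."""
    for r in responses:
        if r is not None and not 1 <= r <= 9:
            raise ValueError("severity ratings must be between 1 and 9")
    ratings = [r for r in responses if r is not None]
    yes_count = len(ratings)
    severity_total = sum(ratings)
    # Assumption: the mean is taken over all items, so "no" items count as 0.
    per_item_mean = severity_total / len(responses)
    return yes_count, severity_total, per_item_mean


if __name__ == "__main__":
    # Hypothetical 25-item form with five situations endorsed.
    form = [3, 5, 9, 1, 2] + [None] * 20
    print(score_hsq(form))
```

With 25 items the severity total ranges from 0 to 225, which matches the range given in the text; a modified form with added items would simply pass a longer list.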
The Aberrant Behavior Checklist (ABC; Aman et al., 1985a, 1985b) is a 58-item informant-based scale comprising 5 subscales. The 15-item Irritability subscale was used in the current study because of its relevance to the intervention. Example items include “aggressive towards others,” “temper tantrums,” “irritable,” and “cries over minor annoyances and hurts.” Items are rated on a 4-point scale; higher scores indicate more severe problem behaviour. As with the HSQ, this measure does not have a clinical cut-off score.
The distributions of SOAP scores for inappropriate behaviour were slightly skewed positively in most conditions at baseline and follow-up; skewness values ranged from 0.06 to 1.36 and kurtosis values ranged from 0.44 to 1.73. This distribution was to be expected, in that the measure was designed to identify inappropriate behaviours among children with PDD where the incidence of behaviour problems is expected to be higher than normal limits (Lecavalier, 2006). Appropriate statistical procedures were used, such as the Spearman correlation coefficient, when necessary, to accommodate the lack of normally distributed scores.
Interobserver agreement was calculated for 25% of the total sessions. Mean percent agreement for the coded child and parent behaviours was as follows: (a) inappropriate behaviours, 85% (range 72–98%); (b) child compliance, 96% (range 93–100%); (c) parent demands, 96% (range 93–100%); (d) contingent and positive reinforcement, 89% (range 74–100%); and (e) parent verbal reprimands, 97% (range 90–100%).
The relationship between total inappropriate child behaviours during the four SOAP conditions and parent-reported child characteristics provided preliminary information about the validity of the SOAP as an assessment of child behaviour. Correlation analyses were conducted to examine the association between inappropriate child behaviours (as observed in all four SOAP conditions), parent-reported child noncompliance (measured by the HSQ), and irritability (as measured by the ABC-Irritability subscale). The mean, standard deviation, and range of the HSQ and ABC for this sample are provided in Table 2. The baseline scores represent meaningful clinical elevations and the reduction in the endpoint scores represents a meaningful reduction in symptoms. Inappropriate child behaviours were selected for further analyses because this observation category was the only child-related behaviour code that was consistently coded across all 4 SOAP conditions. Due to the small sample size and non-normal distribution of the categorical data, Spearman correlation coefficients were calculated. Inappropriate child behaviours observed during the SOAP conditions were moderately correlated with scores on the HSQ, representing child noncompliance (rs = .66, p < .05). By contrast, the correlation with the ABC-Irritability scores was positive, but not significant (rs = .40, p > .05).
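For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation computed on ranks, with tied values given their average rank. The following is a minimal standard-library sketch of that definition, not the analysis code used in the study. The function names and the data in the usage example are hypothetical, and significance testing is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical observed interval counts and parent-rated scores.
    observed = [12, 30, 7, 22, 15, 40]
    parent_rated = [2.1, 4.0, 1.5, 4.4, 2.0, 5.2]
    print(round(spearman_rs(observed, parent_rated), 3))
```

Because only the ordering of scores matters, the coefficient is unaffected by the positive skew reported for the SOAP scores.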
Results from baseline to follow-up are provided in Table 3. Mean frequencies of child and parent variables are provided along with t-values for matched pairs, as well as corresponding significance levels and effect sizes. Inappropriate child behaviours were reduced significantly in the social attention (t = 2.24, p < .05, d = 0.49), demand (t = 3.58, p < .004, d = 0.63), and restriction (t = 3.22, p < .008, d = 0.84) sessions from baseline to endpoint. Parent verbal reprimands were also significantly decreased in the demand session (t = 3.19, p < .001, d = 0.71). Child compliance did not increase significantly between baseline and endpoint. Although contingent reinforcement increased, it did not show a significant change from baseline to endpoint. In the tangible restriction session, parental use of positive reinforcement increased significantly from pre- to post-treatment (t = −2.57, p < .05, d = 1.16). Although there was a decrease in the number of verbal reprimands, this did not reach statistical significance.
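The matched-pairs t statistic and an effect size of the kind reported above can be sketched as follows. The text does not state which formula was used for d; this sketch assumes the mean difference divided by the pooled standard deviation of the baseline and endpoint scores, which is one common convention. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """t statistic for matched pairs, computed on (pre - post) differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same participants")
    diffs = [a - b for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def cohens_d(pre, post):
    """Assumed convention: mean difference over the pooled SD of both time points."""
    pooled_sd = sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(pre) - mean(post)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical interval counts for four children at baseline and endpoint.
    baseline = [10, 12, 14, 16]
    endpoint = [8, 11, 11, 14]
    print(round(paired_t(baseline, endpoint), 3))
    print(round(cohens_d(baseline, endpoint), 3))
```

With differences taken as baseline minus endpoint, a reduction in a behaviour gives a positive t and an increase gives a negative t, which is the sign pattern in the values reported above.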
The SOAP was developed for use as an outcome measure for a feasibility study of PT in preparation for a larger randomised controlled trial of medication alone versus medication plus PT in children with PDD. The procedure was conducted consistently across four sites and was accomplished in about 40 minutes. The coding procedure took approximately 60 minutes per subject, and it was reliable when conducted by trained raters. As expected, there were overall high rates of inappropriate child behaviours in most SOAP conditions. The coded ratings for inappropriate child behaviours showed a stronger correlation with parent-rated measures of noncompliance than with parent-rated measures of irritability, suggesting convergent and divergent validity. Finally, results from the feasibility PT study suggest that the SOAP ratings are sensitive to change in a treatment focused on parent–child interactions. Decreases of 50% and more in inappropriate child behaviours were observed in three of the four conditions. Additionally, the change on the SOAP ratings of inappropriate child behaviours was significantly correlated with the change score on the HSQ.
Given that the Free Play session is a low demand situation and given the mild to moderate level of behavioural problems in this pilot study, the low rate of child inappropriate behaviour observed in this session pre- and post-treatment was not surprising. Similarly, low rates of verbal reprimands and positive reinforcement by parents in the Free Play session were not unexpected.
Following 6 months of PT, parent behaviours showed a total 29% decrease in the use of verbal reprimands and a 58% increase in the use of reinforcement across the 4 sessions. This increase in the delivery of reinforcement is particularly clinically relevant. Improvement in child rates of noncompliance in the demand session was not observed at post-training. However, this was due in large part to relatively high rates of compliance (77%) at baseline. Rather than select a standardised set of “compliance requests,” it may be more beneficial to develop child-specific compliance tasks, based upon an interview with parents. This would help to ensure higher rates of noncompliance during pre-treatment assessments.
The goal in developing the SOAP was to bridge the gap between reliance on rating scales and clinician measures typically used in randomised trials, and direct observational methods that are commonplace in the assessment of behavioural interventions for individuals with PDD and intellectual disability. We took on the formidable challenge of developing a direct observational measure that could be implemented in a large-scale multi-site randomised clinical trial. The results of this pilot study suggest that the SOAP may provide a complementary measure of child noncompliant behaviour and changes in parental effectiveness.
The SOAP was not used as a traditional functional analysis procedure, which could be confusing to researchers and clinicians more familiar with comprehensive behaviour analytic assessment methodologies. The four SOAP conditions were developed to elicit different parent and child behaviours, not to determine the function or motivation for those behaviours, as would be the case in a traditional functional analysis procedure.
The SOAP poses many of the same challenges inherent with other behaviour observation procedures, including technical problems that may arise in making the recording, and scheduling problems when trying to coordinate the family, clinician, and recording facility. Indeed, in this study, we had missing data for 5 of 17 participants due to scheduling problems or equipment difficulties. Moreover, in a multi-site study, it requires a substantial commitment of resources to train raters to reliability and code what may turn out to be hundreds of recordings. Although we designed the SOAP to be administered in less than an hour and developed an abbreviated coding scheme, it is still time-consuming. Our use of the SOAP in a multi-site study presents other challenges as well. For example, it was necessary to develop standardised procedures and instructions for making the recordings to ensure uniformity across sites. In addition, the behavioural coding categories had to be relevant across all participants, who may represent a wide range of cognitive and adaptive functioning. This challenge stands in stark contrast to the application of behavioural observation procedures in single subject studies, which permit more individualised assessments of behaviour. However, as suggested earlier, it may be possible to individualise some of the prompts and demands a bit more within the group standardisation.
The current exploratory study suggests that the SOAP may be useful in detecting change in parent and child behaviours following a parent training intervention; the SOAP has subsequently been used as a measure in a large, randomised controlled study (Scahill et al., in press). The SOAP is a relatively brief, structured, and standardised measure that is easy to administer and may be applied as a direct assessment of change in pharmacological and psychosocial treatment studies, perhaps complementing the parent and clinician ratings in the current literature. It might be made even briefer by dropping the Free Play condition, which was not as sensitive as the other three conditions in this pilot study. However, this initial study is limited by the small number of participants, which makes it difficult to draw broad conclusions about the utility of the SOAP. Still, the numbers were adequate to describe the SOAP development as a complement to the more subjective parent and clinician reports abundant in the current treatment literature. Moreover, it is encouraging that, although the small sample size limited the power of our statistical analyses, we were nonetheless able to demonstrate statistically significant changes. This pilot study provides a starting point in the development of the SOAP and an extension of the ongoing discussion about direct observational measures in clinical trials. More evidence on the incremental validity of the SOAP from the current randomised controlled trial is necessary before we could recommend the widespread adoption of it or other behaviour observation procedures in other large-scale clinical trials.
We wish to acknowledge Catherine A. Belasco for her valuable assistance in the coding of direct observation data and preliminary data analysis. We thank Ann Wagner, PhD, Chief, Neurodevelopmental Disorders Branch, Division of Pediatric Translational Research and Treatment Development, NIMH, for her valuable reviews of earlier versions of this manuscript. Finally, we thank all the families who participated in this pilot study.
This research was supported by the following cooperative agreement grants from the National Institute of Mental Health (NIMH): U10MH66768 (P. I.: M. Aman), U10MH66766 (P. I.: C. McDougle), and U10MH66764 (P. I.: L. Scahill).
*This manuscript was accepted under the Editorship of Roger J. Stancliffe.