We recruited participants from a regional multiple sclerosis clinic and by referral from specialists. Our eligibility criteria were spasticity and at least a moderate increase in tone (score ≥ 3 points on the modified Ashworth scale⁵ at the elbow, hip or knee). Participants were allowed to continue other treatments for spasticity, with the exception of benzodiazepines, if they had been taking stable doses for three months or longer. Participants could continue disease-modifying therapy (e.g., interferon β-1a, interferon β-1b, glatiramer) if they had been on a stable regimen for at least six months. We prohibited any changes to medications that were expected to affect spasticity scores during the trial. Participants could be cannabis-naive or cannabis-exposed; we asked those who had previously been exposed to cannabis to refrain from smoking cannabis for one month before screening and during the trial.
We excluded patients with a history of major psychiatric disorder (other than depression) or substance abuse; substantial neurologic disease other than multiple sclerosis (e.g., epilepsy, head trauma); severe or unstable medical illnesses; or known pulmonary disorders (tuberculosis, asthma). We also excluded patients who used benzodiazepines to control spasticity or high doses of narcotic medications for pain, and women who were pregnant or breastfeeding.
Our study was approved by the Human Research Protections Program at the University of California, San Diego, the Research Advisory Panel of California, the Drug Enforcement Administration, the US Food and Drug Administration and the National Institute on Drug Abuse. Our study was monitored by an independent data safety monitoring board through the University of California Center for Medicinal Cannabis Research.
We used a randomized, double-blind, placebo-controlled crossover design. We evaluated participants during eight visits over a period of two weeks. Visit 1 was a screening visit during which the participants gave their informed consent. At this time, we took medical/medication histories, screened participants for substance abuse (using urine toxicology) and psychiatric disorders, and determined spasticity using the modified Ashworth scale.⁵ Participants with a positive toxicological screening result (e.g., presence of delta-9 tetrahydrocannabinol, amphetamines, benzodiazepines, cocaine and/or benzoylecgonine) were excluded.
A second screening visit took place within seven days of the first. At this time, we completed the expanded disability status scale, determined spasticity again using the modified Ashworth scale and conducted a battery of cognitive tests to reduce practice effects. During this second visit, participants were given a “practice session” with a placebo cigarette, although they were not told that it was a placebo.
Participants were randomized to placebo or smoked cannabis, and treatment began within seven days of the second screening visit. Phase 1 was followed by an 11-day washout period, after which participants crossed over to the opposite treatment group for phase 2. We assessed each patient before and after treatment for three consecutive days during each phase. The examiner was blinded to the treatment group to which each patient was assigned. We assessed patients using the modified Ashworth scale, a visual analogue scale for pain, a timed walk and cognitive tests such as the Paced Auditory Serial Addition Test (PASAT). We assessed treatment-emergent effects about 45 minutes after treatment. We collected urine for toxicological screening at the beginning (baseline) of each phase.
We assessed participants at the same time of day to regulate food, medication and time of cannabis intake. Participants smoked either a placebo or a cannabis cigarette, using the Foltin uniform puff procedure (inhalation for 5 s, followed by a 10-s breath-hold and exhalation, with a 45-s wait between puffs),⁶ under supervision in a ventilated room. Participants completed an average of four puffs per cigarette.
Prerolled cannabis and placebo cigarettes with identical appearances and weight (about 800 mg) were provided by the National Institute on Drug Abuse. Cannabis cigarettes contained about 4% delta-9-tetrahydrocannabinol (delta-9-THC) by weight; placebo cigarettes had the same base material but with the delta-9-THC removed. We chose to use the 4% delta-9-THC cigarette available from the National Institute on Drug Abuse because it most closely resembled the strength of cigarettes available in the community at the time of the study (typically between 5% and 6%).⁷
We assessed safety and adverse effects by monitoring participants’ vital signs and by participant self-report.
Our primary outcome was change in spasticity as measured by patient score on the modified Ashworth scale. The modified Ashworth scale⁵ is an ordinal scale (0–5 points) ranking the intensity of muscle tone as follows: 0, no increase in muscle tone; 1, slight increase manifested by a catch and release or by minimal resistance at the end of the range of motion when the affected part(s) is flexed or extended; 2, slight increase manifested by a catch, followed by minimal resistance throughout the remaining (less than half) range of motion; 3, more marked increase through most of the range of motion, but affected part(s) easily moved; 4, considerable increase in tone, and passive movement is difficult; 5, affected part(s) rigid in flexion and extension. We combined ratings for both elbows, hips and knees for a total possible score of 30 points. We assessed participants using this scale before and about 45 minutes after treatment (cannabis or placebo) at each visit.
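The combined score described above can be sketched as a sum of six per-joint ratings. This is only an illustration: the joint names, the function names and the example ratings are hypothetical, and reading the tone criterion as "at least one joint rated 3 or higher" is an assumption about how the eligibility rule was applied.

```python
# Six ratings per assessment: both elbows, hips and knees.
# The joint labels below are illustrative names, not taken from the study.
JOINTS = (
    "left_elbow", "right_elbow",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
)

def total_ashworth(ratings):
    """Sum six per-joint ratings (each 0-5) into a total of 0-30 points."""
    missing = [joint for joint in JOINTS if joint not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for joint in JOINTS:
        score = ratings[joint]
        if not isinstance(score, int) or not 0 <= score <= 5:
            raise ValueError(f"{joint}: rating must be an integer from 0 to 5")
    return sum(ratings[joint] for joint in JOINTS)

def meets_tone_criterion(ratings):
    """Assumed reading of eligibility: any one joint rated 3 or higher."""
    return any(ratings[joint] >= 3 for joint in JOINTS)

# Hypothetical ratings for one assessment (not study data).
example_ratings = {
    "left_elbow": 2, "right_elbow": 1,
    "left_hip": 3, "right_hip": 2,
    "left_knee": 1, "right_knee": 0,
}
example_total = total_ashworth(example_ratings)
example_eligible = meets_tone_criterion(example_ratings)
```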
This measure has been validated and correlates with motor function.⁸ Although the minimal clinically important difference is not available in the literature, trials using a rating scale of 0–10 for spasticity have established a threshold of 18%.⁹ Given this threshold and the mean baseline score of 9 among our participants, a difference of two or more points would be considered clinically meaningful.
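The arithmetic behind the two-point figure can be checked with a short sketch. It assumes that the 18% threshold is applied directly to the mean baseline score and that the result is rounded up to a whole point because the scale is scored in whole points; the text does not state the rounding rule, so that step is an assumption.

```python
import math

RELATIVE_THRESHOLD = 0.18  # threshold reported for 0-10 spasticity rating scales
MEAN_BASELINE_SCORE = 9    # mean baseline modified Ashworth score in this trial

# 18% of the mean baseline score, about 1.62 points.
raw_threshold = RELATIVE_THRESHOLD * MEAN_BASELINE_SCORE

# Assumption: round up to the next whole point on the ordinal scale.
whole_point_threshold = math.ceil(raw_threshold)
```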
We assessed patients daily for pain (using a visual analogue scale), physical performance (using a timed walk) and cognitive function (using the PASAT). We administered these tests before and about 45 minutes after treatment at each visit.
We assessed patients for symptoms using the Brief Symptom Inventory (BSI), for perceived deficits using the Perceived Deficits Questionnaire (PDQ) and for fatigue using the modified Fatigue Impact Scale (mFIS). We did these assessments before treatment on day 1 and after treatment on day 3.
In addition, at the end of each visit, we asked patients to assess their feeling of “highness” after treatment, according to question 1 from the Subjective Ratings of High and Sedation Questionnaire (SRHS–R), and to guess which treatment they were receiving (placebo or cannabis).
We calculated mean scores (and 95% confidence intervals [CIs]) on the modified Ashworth scale during each visit of each phase, at each assessment time (before and after treatment). We calculated bootstrap-based, bias-corrected, accelerated CIs for extra precision around each mean. We calculated four overall mean scores on the modified Ashworth scale (before and after smoking during both phases). We compared the difference in scores before and after smoking for each of the two phases using paired t tests. We then compared the change in this difference (after to before) in the two phases (placebo and active) using a paired t test. We used the same analysis to examine scores on the visual analogue scale for pain, the timed walk and the PASAT.
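The comparison of change scores between phases can be illustrated with a minimal sketch of the paired t statistic. The function name and the example values are hypothetical and are not study data. The sketch returns only the statistic and its degrees of freedom; a p value needs the t distribution, which the Python standard library does not provide (SciPy's `scipy.stats.ttest_rel` is one option).

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second):
    """Return (t, df) for a paired t test on two equal-length sequences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two pairs")
    standard_error = stdev(differences) / sqrt(n)  # sample SD uses n - 1
    return mean(differences) / standard_error, n - 1

# Hypothetical per-participant changes (after minus before treatment).
cannabis_change = [-3, -2, -4, -3, -2, -3]
placebo_change = [0, -1, 0, 1, -1, 0]

# Paired comparison of the change under cannabis and under placebo.
t_value, degrees_of_freedom = paired_t_statistic(cannabis_change, placebo_change)
```

The same function covers the within-phase comparison by passing the scores after and before smoking as the two sequences.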
We analyzed secondary variables according to the schedule of measurements. We calculated the means and bootstrap-based CIs for patient scores on the BSI, the PDQ, and the mFIS for each day on which these measures were assessed (day 1 before smoking, day 3 after smoking). We assessed the overall differences for before and after treatment with placebo and cannabis using a paired t test. We calculated the means and bootstrap-based CIs for the answer to question 1 of the SRHS–R questionnaire. We used a paired t test to compare the overall difference in “highness” between treatment and placebo.
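For readers unfamiliar with bias-corrected, accelerated (BCa) bootstrap intervals, the following is a minimal standard-library sketch of the usual construction, with the bias correction taken from the bootstrap distribution and the acceleration from a jackknife. It is not the authors' software; the number of resamples, the seed, the function name and the example scores are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def bca_interval(data, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """Approximate BCa bootstrap CI for stat(data); data must be a list."""
    rng = random.Random(seed)
    normal = NormalDist()
    n = len(data)
    estimate = stat(data)

    # Bootstrap distribution of the statistic (resampling with replacement).
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction: share of bootstrap values below the estimate,
    # clamped away from 0 and 1 so the inverse CDF stays finite.
    share = sum(b < estimate for b in boots) / n_boot
    share = min(max(share, 1 / n_boot), 1 - 1 / n_boot)
    z0 = normal.inv_cdf(share)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    numerator = sum((jack_mean - j) ** 3 for j in jack)
    denominator = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = numerator / denominator if denominator else 0.0

    def adjusted_quantile(q):
        z = normal.inv_cdf(q)
        return normal.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))

    def pick(q):
        index = int(adjusted_quantile(q) * n_boot)
        return boots[min(n_boot - 1, max(0, index))]

    return pick(alpha / 2), pick(1 - alpha / 2)

# Hypothetical questionnaire scores (not study data).
example_scores = [9, 7, 11, 8, 10, 12, 6, 9, 10, 8, 13, 7]
ci_lower, ci_upper = bca_interval(example_scores)
```

In practice a maintained implementation such as SciPy's `scipy.stats.bootstrap` with `method="BCa"` would usually be preferable to hand-written code.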
We performed power calculations before the beginning of the study, and these were reviewed and approved by an external scientific advisory board and regulatory agencies. A priori, we identified as “clinically important” any departure from zero in the hypothesized direction that is greater than one standard deviation (SD) of paired differences. We determined that a sample size of 30 would yield better than 80% power (α = 0.05) to detect such an effect size.
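As a rough plausibility check on the power statement, the sketch below uses a normal approximation for a paired (one-sample) test on the differences. The text does not say which method was used or whether the test was one- or two-sided, so the approximation, the two-sided default and the function name are all assumptions.

```python
from math import sqrt
from statistics import NormalDist

def paired_power_normal_approx(effect_size_sd, n, alpha=0.05, two_sided=True):
    """Approximate power to detect a mean paired difference.

    effect_size_sd is the mean difference in units of the SD of the
    paired differences. The far tail of a two-sided test is ignored.
    """
    normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_critical = normal.inv_cdf(1 - tail)
    return normal.cdf(effect_size_sd * sqrt(n) - z_critical)

# Effect of one SD of paired differences, 30 participants, alpha = 0.05.
power_two_sided = paired_power_normal_approx(1.0, 30)
power_one_sided = paired_power_normal_approx(1.0, 30, two_sided=False)
```

Under these assumptions the approximate power at n = 30 is well above 80%, which is consistent with the statement in the text. An exact calculation based on the noncentral t distribution would give a somewhat lower value at small sample sizes.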