Demographic items included self-reported sex, race and ethnicity, age, marital status, employment status, residential zip code, and sheltering-in-place status given the COVID-19 pandemic. The Alcohol Use Disorders Identification Test-Concise (AUDIT-C), a widely used 3-item self-report measure based on the 10-item original AUDIT, assessed hazardous or harmful alcohol consumption in the past 3 months. A score of 4+ for men and 3+ for women indicated significant problems with alcohol consumption. The AUDIT-C has been found to be a valid screening test for heavy drinking and/or active alcohol abuse or dependence. The Drug Abuse Screening Test-10 (DAST-10), a 10-item self-report measure adapted from the 28-item DAST, assessed consequences related to drug abuse, excluding alcohol and tobacco, in the past 3 months. The last item of the DAST-10, regarding medical problems resulting from drug use, was not reassessed because it was an exclusion criterion in the study screener; hence, the total possible range for the sample was 0-9, not 0-10. Total scores of 3+ indicated significant problems related to drug abuse. The DAST-10 has moderate test-retest reliability, sensitivity, and specificity. For the AUDIT-C and DAST-10 measures at post treatment, the reference period was the past 2 months, to reflect the period of intervention. Craving was assessed with a single item asking, “In the past 7 days, how much were you bothered by cravings or urges to drink alcohol or use drugs?”, with response options of not at all, a little bit, moderately, quite a bit, and extremely. The Brief Situational Confidence Questionnaire, a state-dependent measure, assessed self-confidence to resist the urge “right now” to drink heavily or use drugs in different situations, reported on visual analog scales anchored from 0% “not at all confident” to 100% “totally confident.”
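The screening cutoffs described above can be expressed as a short sketch. This is an illustration only, not the study's scoring code: the function names, the "male"/"female" labels, and the assumption that each DAST-10 item has already been scored 0 or 1 are hypothetical choices made for the example.

```python
from typing import Sequence


def audit_c_positive(total_score: int, sex: str) -> bool:
    """Apply the AUDIT-C cutoffs stated in the text: 4+ for men, 3+ for women.

    The sex labels "male" and "female" are an assumption of this sketch.
    """
    if not 0 <= total_score <= 12:
        # Three items scored 0-4 each give a total between 0 and 12.
        raise ValueError("AUDIT-C totals range from 0 to 12")
    if sex not in ("male", "female"):
        raise ValueError("this sketch only encodes the two cutoffs given in the text")
    cutoff = 4 if sex == "male" else 3
    return total_score >= cutoff


def dast10_positive(scored_items: Sequence[int]) -> bool:
    """Apply the 3+ cutoff to the 9 DAST-10 items administered in this sample.

    Each entry is assumed to be an already-scored item value of 0 or 1,
    so the total ranges from 0 to 9 as described in the text.
    """
    if len(scored_items) != 9 or any(v not in (0, 1) for v in scored_items):
        raise ValueError("expected 9 item scores, each 0 or 1")
    return sum(scored_items) >= 3
```

For example, a total AUDIT-C score of 3 would be flagged for a woman but not for a man under these cutoffs.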
The Patient Health Questionnaire-8 item (PHQ-8), an 8-item scale, assessed depressive symptoms, and the Generalized Anxiety Disorder-7 item (GAD-7), a 7-item scale, assessed symptoms of generalized anxiety disorder. Both the PHQ-8 and GAD-7 have good internal consistency and demonstrated convergent validity with measures of depression, stress, and anxiety. A total of 2 items assessed the history of therapy for mental health or substance use concerns. Lifetime psychiatric diagnoses were assessed using 10 items plus a write-in option for others. A single item assessed whether participants were currently taking prescribed medications for a psychiatric diagnosis. The treatment feasibility and acceptability of W-SUDs were assessed post treatment using the Usage Rating Profile-Intervention (URP-I) Feasibility and Acceptability scales, the 8-item Client Satisfaction Questionnaire-8 (CSQ-8), and the 12-item Working Alliance Inventory-Short Revised (WAI-SR). The URP-I item response options ranged from strongly disagree to strongly agree; the items were summed for a total score within each scale, with one feasibility item reverse coded. The CSQ-8 items have 4-point rating scales with response descriptors that vary. Internal consistency exceeds 0.90, and the total sum score ranges from 8 to 32, with higher total scores indicating higher satisfaction. The WAI-SR has three 4-item subscales, with 5-point rating scales, that reflect development of an affective bond in treatment and level of agreement with treatment goals and treatment tasks. Serious adverse events occurring in the 8 weeks after the start of the study were assessed for hospitalization related to substance use, suicide attempt, alcohol or drug overdose, and severe withdrawal. Positive endorsements were followed up with questions about the timing, diagnosis, and resolution. If additional details were needed to determine whether the event was study related, a team member reached out to the participant.
Serious adverse events were reported to the study’s Data Safety Monitoring Board within 72 hours of the team learning of the event. Participants’ W-SUDs app use, including days of app use, number of check-ins, and number of messages sent, was collected via the Woebot app, as were module completion rates, lesson acceptability ratings indicated on a binary scale, and mood impact after tool use. In addition, on a daily basis, the W-SUDs app assessed mood, cravings or urges to use, and pain. In-the-moment emotional state was reported through emoji selection with a default menu of 19 total moods, including options for negative, positive, and average mood, with an additional ability to type in free-text emotion words and/or self-selected emoji expressions. Cravings were assessed as not at all, a little bit, moderately, quite a bit, or extremely.
Physical pain was rated on a scale of 0 to 10.
Descriptive statistics were used to describe the sample and examine the ratings of program feasibility and acceptability. Paired samples t tests and McNemar nonparametric tests examined within-subject changes from pre- to post treatment on measures of substance use, confidence, cravings, mood, and pain. Change scores were calculated, and bivariate correlations were used to examine associations between changes in AUDIT-C and DAST-10 scores and changes in use occasions, confidence, and depression and anxiety scores. t tests were conducted to examine changes from pre- to post treatment in substance use, confidence, mood, and pain by whether participants were currently in therapy or taking psychiatric medications.
Post treatment survey completion was 50.5%, with better retention among those with a higher CAGE-AID score at screening. Retention was lowest among those with a CAGE-AID score of 2 and higher for those scoring 3 or 4. Retention was unrelated to participant demographic characteristics, previous use of Woebot, psychiatric diagnoses, primary problematic substance, depressive symptoms, pain, cravings, confidence, substance use occasions, AUDIT-C scores, or DAST-10 scores. Missing data on individual survey items were minimal. In a single instance, a participant’s average score values were imputed when missing 1 item on the PHQ-8. Participants were prompted to report craving and pain ratings within the W-SUDs app on a daily basis. The data were aggregated so that if participants provided multiple ratings within a day, the scores were averaged. To examine changes over time, generalized estimating equation linear models were run with week entered as a factor, setting week 1 as the reference category. A total of 1571 mood ratings were entered into the W-SUDs app by 90 of the 101 participants, with each participant entering on average 17.5 mood ratings or 2.2 per week.
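The two within-subject tests named in the analysis plan can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the McNemar test is shown in its exact binomial form (the text does not say which variant was used), and any values passed in would be toy data rather than study data.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return the paired-samples t statistic and its degrees of freedom.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the post-minus-pre
    differences for participants with both measurements.
    """
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p value from the discordant pair counts.

    b = participants who changed from positive to negative,
    c = participants who changed from negative to positive.
    Under the null hypothesis each discordant pair is equally likely
    to fall in either direction, so the smaller count is Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A negative t statistic here means scores fell from pre- to post treatment; converting it to a p value requires the t distribution with the returned degrees of freedom, which a statistics package would normally supply.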
A total of 1399 craving and 1403 pain ratings were entered into the W-SUDs app by 87 of the 101 participants, with each participant providing an average of 16.1 ratings for cravings and 16.1 ratings for pain. Table 2 shows the number of participants providing craving ratings for each week and summarizes the generalized estimating equation model analyzing craving ratings over time. Compared with week 1, craving ratings were significantly lower at weeks 4 through 9. By weeks 8 and 9, craving ratings were reduced by approximately half of the sample’s mean rating at week 1.
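The daily aggregation rule and the per-participant averages reported above can be checked with a few lines. The record layout (participant ID, day, score) and the function name are assumptions made for this sketch; only the rating totals and participant counts come from the text.

```python
from collections import defaultdict
from statistics import mean


def daily_means(ratings):
    """Average multiple ratings given by one participant on the same day.

    `ratings` is an iterable of (participant_id, day, score) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    buckets = defaultdict(list)
    for participant_id, day, score in ratings:
        buckets[(participant_id, day)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


# Per-participant averages implied by the reported totals.
mood_per_participant = round(1571 / 90, 1)     # mood ratings, 90 raters
craving_per_participant = round(1399 / 87, 1)  # craving ratings, 87 raters
pain_per_participant = round(1403 / 87, 1)     # pain ratings, 87 raters
```

Dividing the totals by the number of participants who provided ratings reproduces the reported averages of 17.5 for mood and 16.1 for both cravings and pain.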
In contrast, pain ratings did not differ significantly by week and over the 9 weeks averaged, on a scale of 0 to 10.
W-SUDs, an automated conversational agent, was feasible to deliver, engaging, and acceptable and was associated with significant improvements pre- to post treatment in self-reported measures of substance use, confidence, craving, depression, and anxiety and in-app measures of craving. The W-SUDs app registration rate among those who completed the baseline survey was 78.9%, comparable with other successful mobile health interventions. As expected, the use of the W-SUDs app was highest early in treatment and declined over the 8 weeks. Study of engagement with digital health apps has been growing, with no consensus yet on ideal construct definitions. Simply reporting the number of messages or minutes spent on an app over time may undermine clarity and genuine understanding of the type and manifestation of app utilization related to clinical outcomes of interest. Further research in this area is warranted. The observed reductions from pre- to post treatment measures of depression and anxiety symptoms were consistent with a previous evaluation of Woebot conducted with college students self-identified as having symptoms of anxiety and depression. Furthermore, in this study, treatment-related reductions in depression and anxiety symptoms were associated with declines in problematic substance use. Declines in depressive symptoms observed from pre- to post treatment were greater among the participants in therapy. This study also examined working alliance, proposed to mediate clinical outcomes in traditional therapeutic settings. Traditionally, working alliance has been characterized as the cooperation and collaboration in the therapeutic relationship between the patient and the therapist.
The role of working alliance in relationally based systems and digital therapeutics has been previously considered; the potential of alliance to mediate outcomes in Woebot should be further validated in future studies adequately powered to examine mediators of change. Measures of physical pain did not change with the use of W-SUDs as reported in pre- and post treatment measures or within the app; however, the sample’s baseline ratings of pain intensity and pain interference were low. Although not a direct intervention target, pain was measured due to the potential for use of substances to self-treat physical pain and the possibility that pain may worsen if substance use was reduced, which was not observed here. Within-app lesson completion and content acceptability were high for the overall sample, although there was a wide range of use patterns. Most participants used all facets of the W-SUDs app: they tracked their mood, cravings, and pain; completed on average over 7 psychoeducational lessons; and used tools in the W-SUDs app. Only about half of the sample completed the post treatment assessment, with better retention among those screening higher on the CAGE-AID. That is, those with more severe substance use problems at the start of the study, and hence in greater need of the intervention, were more likely to complete the post treatment evaluation. None of the other measured variables distinguished those who did and did not complete the post treatment evaluation. This level of attrition is commensurate with attrition rates in other trials of digital mental health solutions. By addressing problematic substance use, including but not limited to alcohol, the W-SUDs intervention supports and extends a growing body of literature on the use of automated conversational agents and other mobile apps to support behavioral health.
A systematic review of mobile and web-based interventions targeting the reduction of problematic substance use found that most web-based interventions produced significant short-term improvements in at least one measure of problematic substance use. Mobile apps were less common than web-based interventions, with weaker evidence of efficacy and some indication of causing harm. However, mobile interventions can be efficacious. Electronic screening and brief intervention programs, which use mobile tools to screen for excessive alcohol use and deliver personalized feedback, have been found to effectively reduce alcohol consumption and alcohol-related problems. However, rigorous evaluation trials of digital interventions targeting non-alcohol substance use are limited. Furthermore, although a systematic review concluded that conversational agents showed preliminary efficacy in reducing psychological distress among adults with mental health concerns compared with inactive control conditions, this is the first published study of a conversational agent adapted for substance use. Study strengths include study enrollment being double the initial recruitment goal, reflecting interest in W-SUDs. Most participants reported lifetime psychiatric diagnoses, and approximately half of the participants endorsed current moderate-to-severe levels of depression or anxiety. From pre- to post treatment with W-SUDs, participants reported significant improvements in multiple measures of substance use and mood. The delivery modality of W-SUDs offered easy, immediate, and stigma-free access to emotional support and substance use recovery information, particularly relevant during a time of global physical distancing and sheltering in place. More time spent at home, coupled with reduced access to in-person mental health care, may have increased enrollment and engagement with the app.
Although further data on recruitment and enrollment are warranted, these early findings suggest that individuals with SUDs are indeed interested in obtaining support for this condition from a fully digitalized conversational agent. This study had a single-group design, and the outcomes were short term and limited to post treatment, thus limiting the strength of inferences that can be drawn. The sample was predominately female and identified as non-Hispanic White, and the majority were employed full-time. Non-Hispanic White participants reported higher program acceptability on 2 of the 4 measures compared with participants from other racial or ethnic groups. Future research on W-SUDs will use a randomized design, with longer follow-up, and focus on recruitment of a more diverse population to better inform racial or ethnic cultural programmatic tailoring, using quotas to ensure racial or ethnic diversity in sampling.