Zhuohao Chen, Nikolaos Flemotomos, Karan Singla, Torrey A. Creed, David C. Atkins, and Shrikanth Narayanan. An automated quality evaluation framework of psychotherapy conversations with local quality estimates. Computer Speech & Language, 75:101380, 2022.

Abstract

Text-based computational approaches for assessing the quality of psychotherapy are being developed to support quality assurance and clinical training. However, due to the long durations of typical conversation-based therapy sessions, and due to limited annotated modeling resources, computational methods largely rely on frequency-based lexical features or dialogue acts to assess the overall session-level characteristics. In this work, we propose a hierarchical framework to automatically evaluate the quality of transcribed Cognitive Behavioral Therapy (CBT) interactions. Given the richly dynamic nature of the spoken dialog within a talk therapy session, we propose to evaluate the overall session-level quality by modeling it as a function of local variations across the interaction. To implement that empirically, we divide each psychotherapy session into conversation segments and initialize the segment-level qualities with the session-level scores. First, we produce segment embeddings by fine-tuning a BERT-based model, and predict segment-level (local) quality scores. These embeddings are used as the lower-level input to a Bidirectional LSTM-based neural network to predict the session-level (global) quality estimates. In particular, we model the global quality as a linear function of the local quality scores, which allows us to update the segment-level quality estimates based on the session-level quality prediction. These newly estimated segment-level scores benefit the BERT fine-tuning process, which in turn results in better segment embeddings. We evaluate the proposed framework on automatically derived transcriptions from real-world CBT clinical recordings to predict session-level behavior codes. The results indicate that our approach leads to improved evaluation accuracy for most codes when used for both regression and classification tasks.
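
The relationship between local and global scores described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform weights, and in particular the constant-shift update rule are assumptions made for illustration only, since the abstract states only that segment labels start from the session score and that the global quality is a linear function of the local scores.

```python
from typing import List


def init_segment_scores(session_score: float, num_segments: int) -> List[float]:
    """Give every segment the session-level score as its starting label."""
    return [session_score] * num_segments


def global_from_local(local_scores: List[float], weights: List[float],
                      bias: float = 0.0) -> float:
    """Session-level quality as a linear function of segment-level scores."""
    return sum(w * s for w, s in zip(weights, local_scores)) + bias


def shift_to_match(local_scores: List[float], weights: List[float],
                   target: float, bias: float = 0.0) -> List[float]:
    """Hypothetical update rule (an assumption, not taken from the paper):
    add one constant to every segment score so that the linear aggregate
    equals the target session-level score. Requires sum(weights) != 0."""
    gap = target - global_from_local(local_scores, weights, bias)
    total = sum(weights)
    return [s + gap / total for s in local_scores]


# Illustrative numbers only; they do not come from the paper.
labels = init_segment_scores(40.0, 4)      # every segment starts at the session score
weights = [0.25, 0.25, 0.25, 0.25]         # uniform weights, assumed
segment_preds = [30.0, 50.0, 45.0, 35.0]   # stand-in for model outputs
updated = shift_to_match(segment_preds, weights, target=44.0)
```

With uniform weights that sum to one, the linear function reduces to the mean of the segment scores, so segments initialized to the session score aggregate back to that same score.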

BibTeX Entry

@article{CHEN2022101380,
title = {An automated quality evaluation framework of psychotherapy conversations with local quality estimates},
journal = {Computer Speech \& Language},
volume = {75},
pages = {101380},
year = {2022},
issn = {0885-2308},
doi = {https://doi.org/10.1016/j.csl.2022.101380},
link = {http://sail.usc.edu/publications/files/Chen-CSL2022.pdf},
url = {https://www.sciencedirect.com/science/article/pii/S0885230822000213},
author = {Zhuohao Chen and Nikolaos Flemotomos and Karan Singla and Torrey A. Creed and David C. Atkins and Shrikanth Narayanan},
keywords = {Cognitive behavioral therapy, Computational linguistics, Hierarchical framework, Local quality estimates},
abstract = {Text-based computational approaches for assessing the quality of psychotherapy are being developed to support quality assurance and clinical training. However, due to the long durations of typical conversation based therapy sessions, and due to limited annotated modeling resources, computational methods largely rely on frequency-based lexical features or dialogue acts to assess the overall session level characteristics. In this work, we propose a hierarchical framework to automatically evaluate the quality of transcribed Cognitive Behavioral Therapy (CBT) interactions. Given the richly dynamic nature of the spoken dialog within a talk therapy session, to evaluate the overall session level quality, we propose to consider modeling it as a function of local variations across the interaction. To implement that empirically, we divide each psychotherapy session into conversation segments and initialize the segment-level qualities with the session-level scores. First, we produce segment embeddings by fine-tuning a BERT-based model, and predict segment-level (local) quality scores. These embeddings are used as the lower-level input to a Bidirectional LSTM-based neural network to predict the session-level (global) quality estimates. In particular, we model the global quality as a linear function of the local quality scores, which allows us to update the segment-level quality estimates based on the session-level quality prediction. These newly estimated segment-level scores benefit the BERT fine-tuning process, which in turn results in better segment embeddings. We evaluate the proposed framework on automatically derived transcriptions from real-world CBT clinical recordings to predict session-level behavior codes. The results indicate that our approach leads to improved evaluation accuracy for most codes when used for both regression and classification tasks.}
}
