Why is it important?

Psychotherapy clients do not know the root cause of their adversity, and instead rely on telling stories related to their problems. Some of these stories, usually the ones stemming from negative events, will stand out as more significant than others and can ultimately shape one’s identity[1]. Professional therapists are trained to collaborate with their clients to understand what is of interest to them, to build rapport and empathy, and to raise awareness of particular traits and characteristics that may have caused the client’s current afflictions. Our current understanding of how therapy works suggests that these actions lead the client to better therapeutic outcomes[2,3,4,5].

But how does a skilled therapist do so? We hypothesize that good therapists have the skill to deliver their own narratives in such a way that the therapist’s stories are not a word-by-word retelling of the client’s story, but are similar enough for the client to relate to. We believe that quantifying the similarity between the therapist’s and the client’s stories, by understanding the narrative elements of these stories, might provide novel insights into how therapy works.

What did we do?

We propose computational narrative models, based on natural language processing techniques, to model therapy outcome as a function of the similarity between the therapist’s and the client’s stories. We argue that similarity between stories can be measured by judging the role that each character plays in each participant’s story. To identify character roles, we leverage an unsupervised model that groups characters into roles with similar traits, behaviors, and motivations; that is, it identifies character personae[6] (also known as character archetypes). We then train machine learning models on the roles that characters play in the therapist’s and the client’s stories to predict therapy outcome.

How do we get therapy data?

We collaborated with a US counseling center to obtain 1,235 recorded sessions of conversations between a therapist and a client. In total, the data includes 386 clients and 40 therapists. The average session duration is \(50.71\pm10.32\) minutes. Before the start of each session, clients self-report therapeutic alliance, that is, the perception of a shared bond and the agreement on the focus of the therapy treatment.

How do we transcribe sessions automatically?

We used our in-house speech processing pipeline[7], which is built on state-of-the-art models offered by Kaldi[8]. It consists of four steps:

  1. Voice Activity Detection (VAD): a two-layer feed-forward network with a softmax inference layer, applied at the frame level.
  2. Speaker Diarization: based on the x-vector/PLDA paradigm[9].
  3. Automatic Speech Recognition (ASR): a time-delay neural network[10] and a tri-gram language model, trained on more than 4,000 hours of data from publicly available speech corpora augmented with noise and reverberation, and adapted using in-domain psychotherapy data.
  4. Role assignment: a method that assigns the two diarized speaker clusters to either the therapist or the client role[11].

Figure 1. Personae model for narratives, as proposed by Bamman, O’Connor & Smith (2013).

Identifying therapist and client’s personae

First, we identify when therapists (or clients) are narrating a story. From each story, we select the characters by extracting the proper names and pronouns. To assign roles to these characters, we rely on the assumption that, in the context of stories, when the therapist says you, they refer to the client’s character. Moreover, this is the same character that the client refers to when saying I.
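The pronoun assumption above can be illustrated with a minimal sketch. The pronoun sets, the function names `resolve_character` and `characters_in`, and the choice to map the therapist's own "I" to a therapist character are all assumptions made for this illustration; proper names, which the text also extracts, are left out for brevity.

```python
import re

# Hypothetical pronoun sets; the exact word lists are not given in the text.
FIRST_PERSON = {"i", "me", "my", "myself"}
SECOND_PERSON = {"you", "your", "yourself"}


def resolve_character(token, speaker):
    """Map a pronoun to the character it refers to, given who is speaking.

    speaker is either "therapist" or "client". Tokens that are not first- or
    second-person pronouns are returned as None (left unresolved here).
    """
    listener = "client" if speaker == "therapist" else "therapist"
    word = token.lower()
    if word in FIRST_PERSON:
        return speaker   # "I" said by the client -> client's character
    if word in SECOND_PERSON:
        return listener  # "you" said by the therapist -> client's character
    return None


def characters_in(utterance, speaker):
    """Return the resolved characters of an utterance, in order of first mention."""
    seen = []
    for token in re.findall(r"[A-Za-z']+", utterance):
        character = resolve_character(token, speaker)
        if character is not None and character not in seen:
            seen.append(character)
    return seen


print(characters_in("You told me you felt stuck", "therapist"))
```

Under this mapping, the therapist's "you" and the client's "I" resolve to the same character label, which is what allows roles to be compared across the two participants' stories.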

Once we have identified the therapist’s (client’s) self- and other-characters, we have to assign each a persona. To do this, we employ an unsupervised archetype modeling technique[12] (see Figure 1) to learn data-driven archetype distributions from the topics and words used in the stories; this is similar to what LDA does for topic modeling[13]. Given a set of therapy sessions, this model learns \(P\) archetypes from a group of \(K\) topics. In each story, we assign the therapist’s self- and other-characters to the archetypes with maximum posterior probability.
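The final assignment step amounts to an arg-max over the posterior for each character. The sketch below assumes the posteriors have already been produced by the archetype model; the numbers, the value \(P = 4\), and the name `assign_archetype` are made up for illustration.

```python
def assign_archetype(posterior):
    """Return the index of the archetype with the highest posterior probability."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Hypothetical posteriors over P = 4 archetypes for the two characters of one story.
story_posteriors = {
    "self": [0.10, 0.65, 0.20, 0.05],
    "other": [0.40, 0.15, 0.15, 0.30],
}

assigned = {role: assign_archetype(p) for role, p in story_posteriors.items()}
print(assigned)  # {'self': 1, 'other': 0}
```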

Figure 2. Predicting therapy outcome from therapist and client’s archetypes.

Can archetypes predict therapy outcome?

We train a machine learning model (a support vector regressor, SVR) to predict the therapy outcome measurement based on the therapist and client archetypes, as shown in Figure 2. To measure how well this model can predict therapy outcome, we compare the estimated therapeutic alliance to the true value using mean squared error (MSE). To get a better estimate of the error, we measure it using cross-validation in a leave-one-therapist-out fashion. We compare the performance of our model to SVRs trained using uni- and bi-gram language models from either participant’s speech, as well as SVRs trained on their joint text[14].
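As a rough sketch of this evaluation protocol, the snippet below runs leave-one-therapist-out cross-validation with scikit-learn's `SVR` and `LeaveOneGroupOut`. The data is synthetic, and the one-hot archetype features, default SVR hyperparameters, and the name `leave_one_therapist_out_mse` are assumptions for illustration rather than the configuration used in the study.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVR


def leave_one_therapist_out_mse(X, y, therapist_ids):
    """Mean held-out MSE, holding out all sessions of one therapist per fold."""
    fold_errors = []
    splitter = LeaveOneGroupOut()
    for train_idx, test_idx in splitter.split(X, y, groups=therapist_ids):
        model = SVR()  # default RBF kernel; actual hyperparameters are not stated
        model.fit(X[train_idx], y[train_idx])
        predictions = model.predict(X[test_idx])
        fold_errors.append(mean_squared_error(y[test_idx], predictions))
    return float(np.mean(fold_errors)), len(fold_errors)


# Synthetic stand-in data: 6 therapists with 10 sessions each, 5 archetypes.
rng = np.random.default_rng(0)
n_therapists, sessions_each, n_archetypes = 6, 10, 5
therapist_ids = np.repeat(np.arange(n_therapists), sessions_each)
n_sessions = therapist_ids.size

# One-hot encode the therapist and client archetypes and concatenate them.
therapist_arch = rng.integers(0, n_archetypes, size=n_sessions)
client_arch = rng.integers(0, n_archetypes, size=n_sessions)
eye = np.eye(n_archetypes)
X = np.hstack([eye[therapist_arch], eye[client_arch]])

# Random alliance scores; a real run would use the clients' self-reports.
y = rng.normal(loc=5.0, scale=1.0, size=n_sessions)

mse, n_folds = leave_one_therapist_out_mse(X, y, therapist_ids)
print(f"folds: {n_folds}, mean held-out MSE: {mse:.3f}")
```

Grouping the folds by therapist keeps every session of the held-out therapist out of training, so the error reflects how the model generalizes to therapists it has never seen.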

Figure 3. Models’ performance on estimating working alliance.

Results of the therapeutic alliance estimation are presented in Figure 3. Our first insight is that focusing on an individual’s language alone is not enough to predict therapeutic alliance. By adding the notion of character types, our models significantly improve over the therapist and client linguistic models (t-test, \(t(60) = 3.94, p < 0.001\) and t-test, \(t(60) = 9.13, p < 0.001\), respectively). Moreover, the unsupervised archetypes model also performs better than the supervised model that considers both therapist and client text (t-test, \(t(60) = 7.01, p < 0.01\)).

Figure 4. Average number of therapist personae with respect to the number of clients.

What can we learn from these archetypes?

Back to the original question: what are skilled therapists doing? As the number of clients increases, so does the number of personae shown by the therapists (see Figure 4); that is, there is a linear relation between the number of clients and the number of personae that a therapist portrays. So it appears that skilled therapists have a repertoire of personae (with an average of 9 per therapist) from which they draw one carefully selected to match each particular client’s needs.
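One way to examine such a relation is to count the distinct clients and distinct personae for each therapist and fit a least-squares line through those counts. The sketch below uses toy session records; the tuple layout and the names `personae_per_therapist` and `least_squares_line` are assumptions for illustration, not the analysis code behind Figure 4.

```python
from collections import defaultdict


def personae_per_therapist(sessions):
    """Count distinct clients and distinct personae for each therapist.

    sessions is an iterable of (therapist_id, client_id, persona_id) tuples.
    Returns {therapist_id: (n_clients, n_personae)}.
    """
    clients, personae = defaultdict(set), defaultdict(set)
    for therapist, client, persona in sessions:
        clients[therapist].add(client)
        personae[therapist].add(persona)
    return {t: (len(clients[t]), len(personae[t])) for t in clients}


def least_squares_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy records: each therapist shows one persona per client.
sessions = [
    ("A", "c1", 0),
    ("B", "c2", 0), ("B", "c3", 1),
    ("C", "c4", 0), ("C", "c5", 1), ("C", "c6", 2),
]

counts = personae_per_therapist(sessions)
slope, intercept = least_squares_line(list(counts.values()))
print(counts)
print(slope, intercept)
```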

What does this all mean?

We obtained better predictions of the relationship between clients and therapists (a predictor of therapy outcome) by quantifying story similarity through the identification of similar character roles. We also showed that skilled therapists adapt their own narratives to their clients’ needs by changing the role their self-character portrays in each story. Taken together, this suggests that the stories a therapist tells are more effective when the characters have roles similar to the ones in the client’s story. As a takeaway example, if a client considers their actions heroic, the therapist should avoid narratives where the client’s role is the anti-hero, as this would only lead to reduced rapport and conflict in their working relationship.

  1. Martinez, V. R., Flemotomos, N., Ardulov, V., Somandepalli, K., Goldberg, S. B., Imel, Z. E., … & Narayanan, S. (2019). Identifying Therapist and Client Personae for Therapeutic Alliance Estimation. In INTERSPEECH (pp. 1901-1905).


  1. Morgan A. What is narrative therapy? An easy-to-read introduction. 1st ed. Dulwich Centre Publications; 2000.
  2. Goldberg SB, Flemotomos N, Martinez VR, Tanana MJ, Kuo PB, Pace BT, et al. Machine learning and natural language processing in psychotherapy research: Alliance as example use case. Journal of Counseling Psychology 2020;67(4):438–48.
  3. Lambert MJ, Barley DE. Research summary on the therapeutic relationship and psychotherapy outcome. Psychotherapy: Theory, Research, Practice, Training 2001;38(4).
  4. Thompson MN, Goldberg SB, Nielsen SL. Patient financial distress and treatment outcomes in naturalistic psychotherapy. Journal of Counseling Psychology 2018;65(4).
  5. Baldwin SA, Wampold BE, Imel ZE. Untangling the alliance-outcome correlation: Exploring the relative importance of therapist and patient variability in the alliance. Journal of Consulting and Clinical Psychology 2007;75(6).
  6. Jung CG. Two essays on analytical psychology. Routledge; 1943.
  7. Povey D, Ghoshal A, Boulianne G, Burget L, Glembek O, Goel N, et al. The Kaldi Speech Recognition Toolkit. In: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society; 2011.
  8. Sell G, Snyder D, McCree A, Garcia-Romero D, Villalba J, Maciejewski M, et al. Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge. In: Proc. INTERSPEECH. 2018.
  9. Peddinti V, Chen G, Manohar V, Ko T, Povey D, Khudanpur S. JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). IEEE; 2015.
  10. Flemotomos N, Martinez V, Gibson J, Atkins D, Creed T, Narayanan S. Language Features for Automated Evaluation of Cognitive Behavior Psychotherapy Sessions. In: Proc. INTERSPEECH. 2018.
  11. Bamman D, O’Connor B, Smith NA. Learning Latent Personas of Film Characters. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Sofia, Bulgaria: Association for Computational Linguistics; 2013.
  12. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. Journal of Machine Learning Research 2003;3(Jan).
  13. Gibson J, Can D, Georgiou P, Atkins D, Narayanan S. Attention Networks for Modeling Behavior in Addiction Counseling. In: Proc. INTERSPEECH. 2017.