What this is
- PREACT-digital is a longitudinal, observational study focused on predicting non-response to cognitive behavioural therapy (CBT) for internalising disorders (ID).
- It combines ecological momentary assessment (EMA) and passive sensing to gather real-time data on patients' emotional and physical states.
- The study aims to identify dynamic markers that can inform personalized treatment approaches and improve therapy outcomes.
Essence
- PREACT-digital seeks to enhance the prediction of non-response to CBT in patients with internalising disorders by utilizing real-time data collection methods. The study integrates ecological momentary assessment (EMA) and passive sensing to capture daily fluctuations in emotional and behavioral states, aiming to inform personalized therapy.
Key takeaways
- The study will involve approximately 468 patients, with 350 expected to complete a long-term version of the study. Participants will provide data through a smartwatch and a customized app, enabling continuous monitoring of their emotional and physical conditions.
- The central goal is to achieve a predictive accuracy greater than 75% for identifying non-response to therapy after 20 sessions. This will be assessed using a combination of data and features collected during a 14-day assessment phase.
- Findings from this study aim to contribute to the development of more personalized and effective CBT approaches, potentially allowing for real-time interventions based on patients' data collected through wearable technology.
Caveats
- The absence of a healthy control group limits the ability to distinguish digital patterns specific to clinical populations from those present in non-clinical individuals. Future studies may need to incorporate control groups for better specificity.
- The extensive assessment schedule may burden participants, potentially affecting adherence and retention in the study. Managing participant engagement will be crucial to ensure the robustness of the data collected.
Definitions
- Ecological Momentary Assessment (EMA): A method involving real-time data collection through repeated self-reports, capturing individuals' experiences in their natural environments.
- Passive Sensing: Continuous data collection on behavioral and physiological metrics using wearable devices or smartphones, without active input from the user.
Introduction
The following study protocol describes a subproject from the Research Unit FOR 5187 ‘Towards Precision Psychotherapy for Non-Respondent Patients: From Signatures to Predictions to Clinical Utility (PREACT)’ that was formed as a collaboration across four universities and outpatient clinics in Berlin to identify predictors of non-response to cognitive behavioural therapy (CBT) in Internalising Disorders (ID). The diagnostic and experimental focus of PREACT lies on emotion regulation (ER) as a putative key mechanism of CBT and response to CBT.
Due to the comprehensive design of the Research Unit, the general study protocol provides an overview across all subprojects.1 The present protocol entails a more granular description of methods and objectives relevant to this subproject. This level of detail is of utmost importance to ensure that reporting recommendations are met2 and that researchers can assess the comparability with other digital phenotyping studies. Associated materials and descriptions can be found in our Open Science Framework (OSF) directory (https://osf.io/253nb) and on GitHub (https://github.com/leona-ha/preact_digital).
ID encompass a broad range of psychopathology, including symptoms of depressive disorder, obsessive-compulsive disorder or generalised anxiety disorder.3 They represent the most common mental health conditions worldwide.4 CBT is considered the first-line treatment for ID but was shown to produce insufficient response rates.5 In addition, urgently needed breakthroughs in psychotherapy research remain elusive: studies adapting or enhancing CBT or comparing it to other forms of psychotherapy for ID, like behavioural activation therapy or interpersonal therapy, often do not find significant differences in outcomes.6
A great hope lies within the field of machine learning, which has found its way into psychotherapy research in recent years.7 Here, a prominent approach is to predict outcomes before treatment starts using a large number of variables from various sources.8 9 The idea is to change or adapt a treatment if the risk of non-response is high, thereby avoiding frustrating experiences for patients as well as unnecessary societal and economic burdens. Thus far, however, results are unsatisfactory: depending on the input features, predictive accuracies often only slightly exceed chance level (eg, 62% for neuroimaging markers10) and/or fail to generalise to external samples.11
Most existing prediction studies relied on cross-sectional data (ie, pre-treatment self-reports). However, these momentary snapshots cannot capture the intraindividual heterogeneity of symptoms over time.12 13 Beyond that, the applicability to outpatient care—as an ultimate goal of precision psychotherapy—is neglected when relying solely on time and cost-intensive forms of phenotyping like clinical interviews or neurophysiological markers.14
Luckily, recent advances in wearable devices and smartphone technology and the broad integration of smartphones into people’s everyday life are paving the way for more accessible, ecologically valid data collection methods. In particular, these tools offer a scalable alternative that can reduce reliance on labour-intensive procedures. Ecological momentary assessment (EMA) involves short, repeated self-reports on the smartphone. Passive sensing refers to the continuous collection of biobehavioural data (ie, physical activity and heart rate) using smartphone or wearable sensors. The combination of active (ie, EMA-based) and passive (ie, wearable-based) assessments is also referred to as ‘Digital phenotyping’.15 Digital phenotyping offers the opportunity to better understand individual factors and mechanisms leading to (non-) response by (a) opening the black box of between-session processes and gathering naturalistic information from everyday settings, (b) getting long-term subjective and objective impressions of symptom dynamics in (c) real time and without recall biases inherent to cross-sectional assessments.
Systematic reviews on digital phenotyping in different ID concluded that, although in its infancy, it can identify behavioural patterns associated with ID.16 17 These include, for example, increased time at home in depression18 19 or a positive association between heart rate variability (HRV) and anxiety.20 At the same time, only a few studies have applied digital phenotyping (DP) in patients undergoing psychotherapy for ID to depict (absent) changes in symptoms and/or predict treatment response. De Angel et al21 collected DP data from 66 patients starting psychological treatment for depression. They only published results on the feasibility of data collection, showing that data availability varied strongly depending on data source (ie, smartphone vs wearable) and treatment stage (ie, pretreatment vs in-treatment vs posttreatment). Müller-Bardoff et al22 conducted a randomised controlled trial (RCT) to find sensor-based and EMA-based predictors of response to CBT in 150 patients suffering from anxiety disorders. To date, they have not published results related to DP-based outcome prediction. In psychiatric settings, Zou et al23 collected passive sensor data (ie, phone usage, app usage and sleep) from 245 patients with major depressive disorder to predict response to psychopharmacological outpatient treatment. They achieved sufficient predictive accuracy around 10 weeks before treatment ended.
In summary, there remains a need for studies investigating digital phenotyping as both predictors and markers of treatment response in ID. Our subproject, PREACT-digital, addresses this gap. We implement EMA and passive sensing via state-of-the-art wearable devices to continuously monitor patients receiving CBT for ID in their daily lives. Building on the PREACT consortium’s overarching focus on ER, we incorporate ER-related constructs such as affect and social/situational context into our EMA measures. Ultimately, our goal is to identify ecologically valid, readily implementable markers of treatment response—or the absence thereof—in CBT for ID.
Our research objectives encompass predictive and exploratory hypotheses. First, in line with PREACT, we aim to find out if we can predict non-response (NR) at T20 (ie, after 20 therapy sessions) and TPost (ie, after therapy end, or after 365 days if therapy lasts longer) with sufficient predictive accuracy (ie, >75%) using a combination of EMA and passive sensing features collected during the first, 14-day assessment phase (T0). We expect that EMA-based features can better capture ER-related dynamics and are thus of greatest importance, but that passive features will provide significant incremental value. We further hypothesise that EMA and sensing features both have incremental predictive value beyond more elaborate (neuro)physiological markers and thus represent an ecologically valid substitute for implementation. Second, we aim to find out if symptom changes during CBT can be modelled using EMA and passive sensing, and how they relate to cross-sectional self-report and physiological information. We hypothesise that data actively and passively generated by personal electronic devices such as smartphones and wearables can be linked to neuroimaging-based markers of ER assessed in other subprojects of the research group.
Method and analysis
Sample selection
Participants within the PREACT study are informed about PREACT-digital via flyers and information brochures within each of the participating outpatient clinics. Interested patients are invited to an ‘onboarding’ meeting to review inclusion and exclusion criteria and provide written informed consent. The inclusion criteria of the PREACT study are as follows: (1) age of 18 years or older, (2) a primary diagnosis based on the DSM-5 criteria for either social anxiety disorder, panic disorder, agoraphobia, generalised anxiety disorder, obsessive-compulsive disorder, posttraumatic stress disorder or unipolar depressive disorder (major depression or dysthymia), (3) an indication for outpatient CBT treatment, (4) a minimum treatment plan of 12 sessions, (5) a symptom severity indicated by a General Severity Index of the Brief Symptom Inventory (BSI-GSI) greater than 0.56 (19, 20), (6) a Clinical Global Impression–Severity Scale (CGI-S) score of 3 or higher, indicating at least mild illness, and (7) owning an appropriate smartphone (ie, Android version >9.0 or iOS version >15.0).
Exclusion criteria encompass contraindications for outpatient treatment, including: (1) a current secondary diagnosis of moderate to severe substance use disorder (including regular use of benzodiazepines), (2) current psychosis, (3) current bipolar disorder, (4) more than moderate suicidality, (5) medical conditions that contraindicate CBT according to the medical consultation report (ie, severe cognitive impairment or dementia). Patients who require inpatient treatment during the course of the outpatient treatment will be excluded.
Study procedures
The PREACT-digital study is currently running and will end in June 2026. Interested patients can choose if they want to take part in PREACT-digital (ie, therapy-accompanying digital phenotyping). Individuals who decide to participate in PREACT-digital can choose between two study versions. Option 1 (‘the short version’) consists of a 14-day EMA assessment phase paralleled by passive data collection prior to therapy start (T0) only. Option 2 (‘the long version’) consists of one additional EMA phase after 20 therapy sessions (T20) and one after the end of therapy (if the duration of therapy exceeds 1 year, data are collected 365 days after therapy start; TPost). Passive data are collected in parallel throughout the entire time. Participants enrolled in the short version have the option to switch to the long version after they have completed the first assessment phase. The design of our subproject is depicted in figure 1.
Once enrolled in PREACT-digital, participating patients go through an onboarding process with trained research assistants. Together, they install the required apps and connect the smartwatch. For passive data collection in PREACT-digital, patients receive a state-of-the-art smartwatch (Withings Scanwatch Light), the associated Withings app and a customised study app developed by a German tech start-up, the TIKI app. The TIKI app (1) serves as an interface to the Withings API, allowing access to the smartwatch data, (2) sends out the EMA questionnaires and (3) collects GPS data.
Participants receive information and materials for study participation and provide additional electronic informed consent via the study app. During the first measurement burst, patients receive an onboarding call from the study team, where they get the option to ask questions and report technical problems. In addition, the acceptability and feasibility of EMA and passive sensing are assessed as part of the onboarding call (see interview guideline on https://osf.io/253nb). Information gathered during the onboarding call is entered into a respective REDCap sheet. Between the active assessment phases, patients receive regular update e-mails with information about their last EMA completion rate and subsequent study procedures.
To reduce interruptions in data transmission leading to high rates of missing data, we implement weekly data monitoring. We run a script to identify participants who have not provided data for more than 7 days. If GPS and/or passive sensing data are absent for more than 7 days, the respective participants receive an e-mail containing instructions on how to restore data transmission. If data transmission is not restored within the following week, participants are contacted by phone. Reasons for interruptions (eg, app settings or a participant forgetting to charge the phone) are noted when available. In addition, all active patients receive a notification on Sunday evenings asking them to open the TIKI-App and check settings to avoid automatic suppression of data collection.
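As an illustration, the weekly monitoring rule described above could be implemented along these lines; the upload log, participant IDs and dates are hypothetical, and the actual monitoring script may differ:

```python
from datetime import date

def flag_stale_participants(last_upload, today, threshold_days=7):
    """Split participants with stale data into e-mail reminders (more than
    7 silent days) and phone calls (still silent after another week)."""
    email, phone = [], []
    for pid, last in last_upload.items():
        gap = (today - last).days
        if gap > 2 * threshold_days:
            phone.append(pid)    # e-mail week has passed without new data
        elif gap > threshold_days:
            email.append(pid)    # first reminder via e-mail
    return sorted(email), sorted(phone)

# Hypothetical per-participant dates of the most recent data upload
uploads = {
    "p01": date(2024, 11, 1),    # 3 days ago: fine
    "p02": date(2024, 10, 25),   # 10 days ago: e-mail reminder
    "p03": date(2024, 10, 15),   # 20 days ago: phone call
}
email, phone = flag_stale_participants(uploads, today=date(2024, 11, 4))
```

In a deployment, the two lists would feed the e-mail dispatch and the phone-call worklist, respectively.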
Participants can decide to stop participation in PREACT-digital at any time point. In case of dropout, participants are contacted via mail to assess reasons for dropout.
Figure 1. Study procedures. EMA, ecological momentary assessment.
EMA assessment schedule
During the 14 days of active assessments, patients receive notifications (‘beeps’) in the TIKI-App on their smartphones eight times a day at quasirandom intervals of 90±30 min. Depending on their individual sleep-wake rhythm, participants can choose to receive beeps between 07.30 and 21.30 or between 09.30 and 22.30. The respective questionnaires expire after 30 min. Each EMA assessment contains 30–35 items in total, depending on assessment phase and timing. The order of items was pseudorandomised.
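A quasirandom schedule of this kind could be sketched as follows; the protocol specifies eight beeps at 90±30 min intervals within the chosen window (here the 07.30–21.30 option), while the uniform jitter and the handling of the first beep are illustrative assumptions:

```python
import random

def beep_schedule(seed=None, n_beeps=8, start_min=450, end_min=1290,
                  mean_gap=90, jitter=30):
    """One day's quasirandom beep times in minutes after midnight
    (450 = 07.30, 1290 = 21.30): successive gaps are drawn uniformly from
    mean_gap +/- jitter; beeps past the evening cut-off are dropped."""
    rng = random.Random(seed)
    times = [start_min + rng.randint(0, jitter)]   # first beep after window opens
    while len(times) < n_beeps:
        times.append(times[-1] + rng.randint(mean_gap - jitter, mean_gap + jitter))
    return [t for t in times if t <= end_min]

day = beep_schedule(seed=42)
```

Every generated gap lies between 60 and 120 min, and no beep falls outside the assessment window.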
We employ graded reimbursement, as it has been shown to increase compliance: participants receive at least 20€ per assessment phase, 35€ if they complete at least 90 beeps and 50€ if they complete more than 100 beeps. Thus, participants receive up to 150€ for the EMA assessments, paid out as a gift voucher. Moreover, individuals participating in the long version of the study are allowed to keep the smartwatch.
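The graded tiers translate directly into a small lookup; the per-phase completion counts in the example are hypothetical:

```python
def phase_reimbursement(completed_beeps):
    """Graded voucher amount (EUR) for one 14-day assessment phase
    (112 scheduled beeps: 8 per day x 14 days)."""
    if completed_beeps > 100:
        return 50
    if completed_beeps >= 90:
        return 35
    return 20

# Three assessment phases in the long version (T0, T20, TPost)
total = sum(phase_reimbursement(n) for n in (105, 95, 80))
```

Completing all three phases at the top tier yields the protocol's 150€ maximum.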
EMA measures
Table 1 contains an overview of all measures. The whole set of items, including response options and assessment schedules, is provided online. Items are presented in German, but example items were translated to English here to facilitate comprehension for non-German-speaking readers. The following constructs are assessed:
| Domain | Construct | Description | Example feature | N items | Timing, Sampling |
|---|---|---|---|---|---|
| EMA | Positive and Negative Affect | PANAS-X + item on loneliness | How anxious do you feel right now? | 17 | Baseline, T20, TPost, all beeps |
| EMA | Emotion Regulation | RESS–EMA scale + item on acceptance | In reaction to the negative feeling I… tried to breathe deeply | 8 | Baseline, T20, TPost, all beeps |
| EMA | Situational Context | Self-constructed | How did you spend the time since the last beep? | 1 | Baseline, T20, TPost, all beeps |
| EMA | Social Context | Self-constructed | Did you have social contact since the last beep? | 3 | Baseline, T20, TPost, all beeps |
| EMA | Significant Events | Self-constructed | How did you perceive the most significant moment since the last beep? | 1 | Baseline, T20, TPost, all beeps |
| EMA | Physical Health | Self-constructed | How physically healthy did you feel today? | 1 | Baseline, T20, TPost, last beep |
| EMA | Therapeutic Agency | Constructed based on TAI | Today I have… implemented ideas or tasks from therapy | 4 | T20, TPost, last beep |
| EMA | ECG Control | Self-constructed | Did you consume coffee in the last 30 min? | 1 | Baseline, T20, TPost, third and fifth beep |
| Sensing | GPS | Raw; TIKI-App | Time spent at home | – | Continuous; event-based |
| Sensing | Steps | Interval sum; Scanwatch | N steps | – | Continuous; event-based |
| Sensing | Activity | Scanwatch | Time spent cycling, calories burned | – | Continuous; event-based |
| Sensing | Heart rate | PPG; interval mean; Scanwatch | Average heart rate | – | Continuous; regular 30 s samples |
| Sensing | ECG | Resting-state ECG; Scanwatch | Heart rate variability | – | Baseline, T20, TPost; 2 x daily; 30 s |
| Sensing | Sleep | Aggregate scores; Scanwatch | Time in bed | – | Continuous |
Passive sensing data
We use a Withings Scanwatch to collect passive data on sleep, physical activity and heart rate (see table 1 for an overview). The following sensor data are available: for sleep, we have SleepDeepBinary, SleepLightBinary, SleepREMBinary, SleepStateBinary, SleepBinary, SleepInBedBinary and SleepAwakeBinary. They indicate whether a user was, for a given time period, in deep sleep, light sleep, REM sleep, any sleep state, asleep overall, in bed (whether asleep or awake) or awake. For physical activity, we have ActivityType, ActiveBinary, RunBinary, BikeBinary, WalkBinary, FloorsClimbed, ElevationGain, ActiveBurnedCalories and Steps. ActivityType contains information on a detected activity in a given time period (eg, resting or walking). The remaining features indicate whether, for a given time period, a person was generally physically active, running, biking or walking, how many floors were climbed, the elevation gain in metres, the calories burned and the number of steps. Heart rate is collected passively using photoplethysmography (PPG). When no physical activity is detected, the Scanwatch collects heart rate data every 10 min for 30 s and provides the average heart rate for that sampling interval. As soon as physical activity is detected, the Scanwatch collects PPG-based heart rate continuously until the activity stops.
The Withings Scanwatch also allows users to record 30-s resting-state ECGs (300 Hz) with medical-device quality. During active assessment phases, patients are asked to conduct an ECG on their Scanwatch twice a day. The resting-state ECG data are available in raw format. Withings also calculates the Root Mean Square of Successive Differences (RMSSD) for resting-state ECGs as an indicator of HRV.
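RMSSD follows the standard time-domain definition, sketched here on illustrative R-R intervals (the real input would be the beat-to-beat intervals extracted from the raw ECG):

```python
import math

def rmssd(rr_intervals_ms):
    """Root Mean Square of Successive Differences of R-R intervals (ms),
    the time-domain HRV index reported for the resting-state ECGs."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [800, 810, 790, 805, 795]   # illustrative R-R intervals from a 30 s recording
value = rmssd(rr)
```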
The TIKI app collects GPS data directly from the mobile phone’s sensor. To increase patients’ privacy and data security, GPS data are anonymised locally before being saved on the study servers by rotating the latitude–longitude tuples by a random angle. This way, only mobility patterns (eg, distance travelled) but not exact locations can be inferred.
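The rotation idea can be sketched as follows; the protocol only specifies rotation by a random angle, so anchoring the rotation at the track centroid (and treating latitude/longitude as a flat plane) is an illustrative simplification:

```python
import math
import random

def rotate_track(points, seed=None):
    """Rotate a list of (lat, lon) points by one random angle around their
    centroid: pairwise distances (mobility patterns) survive, true locations
    do not."""
    rng = random.Random(seed)
    theta = rng.uniform(0, 2 * math.pi)
    lat0 = sum(p[0] for p in points) / len(points)
    lon0 = sum(p[1] for p in points) / len(points)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for lat, lon in points:
        dlat, dlon = lat - lat0, lon - lon0
        rotated.append((lat0 + dlat * cos_t - dlon * sin_t,
                        lon0 + dlat * sin_t + dlon * cos_t))
    return rotated

track = [(52.520, 13.405), (52.530, 13.415), (52.510, 13.395)]  # hypothetical fixes
anon = rotate_track(track, seed=1)
```

Because rotation is an isometry, distance-based mobility features computed on the anonymised track equal those on the original.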
Sample size calculation
As described in the main study protocol,1 the required sample size for the PREACT study was estimated (a) using simulation studies with fivefold cross-validation to yield a prediction accuracy of 75–80% and (b) using effect size calculation for group-based linear multiple regression analyses, resulting in a required sample of 585 patients. Of those, it was expected that around 80% would take part in PREACT-digital, resulting in an estimated sample size of 468 patients for T0. Based on previous studies investigating the feasibility of digital phenotyping in clinical samples,28 it was further expected that 75% of the sample would opt for the long version of the study, resulting in 350 patients providing data at T20 and TPost.
Outcomes
We use different machine learning techniques to predict treatment response based on DP data. To evaluate treatment response, we implement two different criteria. First, we follow the definition of the PREACT research group to achieve comparability with other subprojects. It follows the concept of clinically significant change as introduced by Jacobson & Truax.29 The shift from a dysfunctional to a functional mental condition is operationalised by achieving a Brief Symptom Inventory-General Severity Index (BSI–GSI) score of less than 0.56 at T20/posttreatment assessment. Additionally, the change observed from baseline to T20/posttreatment must be reliable, as defined by the Reliable Change Index (RCI). The BSI–GSI is a self-report measure assessed cross-sectionally at T20 and TPost as part of PREACT.
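The Jacobson–Truax criterion described above can be made concrete in a few lines; the cutoff of 0.56 is taken from the protocol, while the baseline SD and scale reliability used to form the Reliable Change Index are illustrative sample statistics:

```python
import math

def is_responder(gsi_pre, gsi_post, sd_baseline, reliability, cutoff=0.56):
    """Jacobson-Truax criterion: endpoint BSI-GSI below the functional cutoff
    AND a reliable improvement via the Reliable Change Index (RCI < -1.96)."""
    se = sd_baseline * math.sqrt(1 - reliability)   # standard error of measurement
    s_diff = math.sqrt(2) * se                      # SD of the difference score
    rci = (gsi_post - gsi_pre) / s_diff
    return gsi_post < cutoff and rci < -1.96

# sd_baseline and reliability are illustrative, not PREACT sample values
responder = is_responder(gsi_pre=1.4, gsi_post=0.4, sd_baseline=0.6, reliability=0.9)
```

Note that both conditions must hold: a patient ending below the cutoff without a reliable change is still classified as a non-responder.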
Second, we will use aggregates of EMA assessments on positive and negative affect (PANAS-X; 21) collected during the measurement bursts at T20 and TPost as additional indicators of treatment response, since EMA-based measures have been shown to capture different aspects of symptom load (28).
To account for the complexity of ID, PREACT further included disorder-specific measures as secondary outcomes: The Hamilton Anxiety Rating Scale (HAM-A; 29), the Montgomery–Åsberg Depression Rating Scale (MADRS; 30), the Yale-Brown Obsessive Compulsive Scale (Y-BOCS; 31, 32) and the Clinician-Administered PTSD Scale for DSM-5 (CAPS-5; 33, 34).
Finally, for exploratory analyses, outcomes will depend on the respective research question, that is, single-beep scores of EMA constructs or cross-sectional self-reports of ER capacities.
Data preprocessing and feature engineering
In its raw format, passive sensing data are usually not interpretable. Thus, depending on the data source, data format and research question, different data preprocessing and aggregation steps will be necessary.
Depending on the research question and granularity of data aggregation, a certain compliance rate has to be met to be included in the analysis (ie, at least 50% of completed EMA beeps). Missing data imputation will also depend on the type of data and applied analyses.
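A minimal version of such a compliance filter, assuming the 112 scheduled beeps of one 14-day phase and the 50% example threshold (participant IDs and counts are hypothetical):

```python
def meets_compliance(completed, scheduled=112, min_rate=0.5):
    """Analysis inclusion check: at least min_rate of the scheduled EMA beeps
    completed (112 = 8 beeps x 14 days); the threshold is analysis-specific."""
    return completed / scheduled >= min_rate

completed_beeps = {"p01": 90, "p02": 40}   # hypothetical completion counts
included = [pid for pid, n in completed_beeps.items() if meets_compliance(n)]
```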
Analysis
The analysis procedure and code will be provided on OSF and on GitHub (https://github.com/leona-ha/preact_digital). Data preparation, exploration and modelling will mainly be done using Python and R in the JupyterLab environment, conducted on the computing facility of Charité – Universitätsmedizin Berlin.
Data freeze, holdout and model training
To allow for exploratory analyses without risking data leakage for predictive modelling, all data collected up to November 2024 were frozen. PREACT then drew an a priori, participant-level hold-out sample comprising 20% of the cohort, stratified by age, sex, primary diagnosis and data modality (EEG, MRI, digital-phenotyping short vs long). The remaining 80% constitute the development data set.
Participants enrolled after November 2024 (ie, after the data freeze) are assigned to development or hold-out according to the same 80:20 stratified rule so that these proportions are preserved when recruitment closes in June 2025.
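A stratified participant-level split of this kind could be sketched as follows; the participant table and stratifier levels are placeholders, and the real split additionally stratifies by age and data modality:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical participant table with two of the protocol's stratifiers
df = pd.DataFrame({
    "pid": range(100),
    "sex": ["f", "m"] * 50,
    "diagnosis": ["depression", "panic", "ocd", "ptsd"] * 25,
})
strata = df["sex"] + "_" + df["diagnosis"]   # combined stratification label
dev, holdout = train_test_split(df, test_size=0.20, stratify=strata,
                                random_state=0)
```

Combining the stratifiers into one label is a common way to stratify on several variables at once, provided each combination is sufficiently populated.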
Model development
Within the development set, we apply nested, repeated, stratified fivefold cross-validation. In each outer fold, 80% of development data form the training subset and 20% the validation subset for hyperparameter tuning and model selection. Performance metrics are averaged across folds and repetitions to yield optimism-corrected estimates. The optimised pipeline is then evaluated on the untouched 20% hold-out test set.
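The nested scheme can be sketched with scikit-learn; the simulated data, estimator and hyperparameter grid are placeholders for the pipelines described above:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     cross_val_score)

# Placeholder development-set features and non-response labels
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

inner = RepeatedStratifiedKFold(n_splits=5, n_repeats=1, random_state=0)  # tuning
outer = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)  # estimate
model = GridSearchCV(LogisticRegression(max_iter=1000),
                     param_grid={"C": [0.1, 1.0, 10.0]},
                     cv=inner, scoring="roc_auc")
scores = cross_val_score(model, X, y, cv=outer, scoring="roc_auc")
mean_auc = scores.mean()
```

Because tuning happens only inside each outer training fold, the averaged outer scores are optimism-corrected in the sense described above.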
Exploratory analyses that do not involve model building are conducted exclusively within the development set to safeguard the integrity of the hold-out sample.
Predictive analyses
We will implement both basic and advanced models and compare subsets of data and different timeframes to balance interpretability and predictive performance. With a sample size of at least 350 participants, we have sufficient data to explore a range of methods.
There are various ways to aggregate and preprocess EMA data to include them as predictors of non-response, ranging from simple aggregates to advanced modelling techniques. We will implement dynamic structural equation models (DSEM), an advanced extension of multilevel modelling that integrates time-series dynamics with latent variable modelling.30 Multilevel models are considered the gold standard for longitudinal data analysis in psychology,31 and DSEM builds on this foundation to allow unequally spaced observations and modelling of time-lagged dependencies, making it well suited for EMA data. The Bayesian estimation approach allows for missing data under the missing at random assumption (eg, when participants skip measurements). Beyond that, DSEM models are extendable to model change in dynamic parameters across measurement bursts. Further, passive data can be flexibly integrated into DSEM to analyse cross-lagged relationships between EMA (ie, affect) and passive sensing features (ie, sleep and physical activity). The model-based framework allows us to derive dynamic parameters (eg, individual autocorrelations and cross-lagged relationships) that we will use as features for the predictive analyses.
For the prediction of NR, basic algorithms include logistic regression, support vector machines, random forests and gradient boosting, depending on the research question and predictors used. Tree-based approaches are known for their effectiveness on tabular data.32 Regression-based approaches are highly interpretable. All of the selected algorithms require fewer computational resources than deep learning models and have been frequently applied in psychotherapy research, increasing comparability.33,35
Advanced modelling approaches will be selected to incorporate the hierarchical, sequential and multimodal structure of our data. These include deep learning architectures like LSTM, Transformers or ensemble learning approaches. To explore how the predictive capacities depend on the time lag, we compare the accuracies of models using subsets of available data (eg, first measurement burst vs second measurement burst) and outcome timepoints (ie, T20 vs TPost).
As outlined above, we are interested in the incremental value of DP and EMA data to less burdensome, cross-sectional self-reports. As a first step, we will construct a baseline model that relies exclusively on the cross-sectional information collected at enrolment (T0).
Predictors will mirror the variables listed in the main protocol1: demographics, clinical routine variables (ie, primary diagnosis, comorbidity, medication and previous CBT) and standard intake questionnaires. Separate models will be trained for our two prespecified outcomes (T20, TPost). The baseline cross-sectional models will be compared against four progressively richer model sets: (a) EMA-only models, (b) passive-sensor-only models, (c) EMA+passive-sensor models and (d) all data sources combined (T0+EMA+passive). Paired differences in area under the curve (with 95% CIs, DeLong test) will quantify improvements in discrimination of (non-)response.
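To illustrate the paired comparison of model sets, the sketch below substitutes a paired bootstrap for DeLong's test (which has no standard scipy/scikit-learn implementation); the simulated predictions stand in for the baseline and enriched models:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def paired_auc_bootstrap(y, p_base, p_rich, n_boot=2000, seed=0):
    """Bootstrap mean and 95% CI of the AUC difference between two models
    scored on the same participants (a resampling stand-in for DeLong's test)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample participants with replacement
        if len(np.unique(y[idx])) < 2:       # AUC undefined without both classes
            continue
        diffs.append(roc_auc_score(y[idx], p_rich[idx]) -
                     roc_auc_score(y[idx], p_base[idx]))
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return float(np.mean(diffs)), (float(lo), float(hi))

# Simulated hold-out predictions where the richer model separates classes better
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
p_base = 0.5 * y + rng.normal(0, 0.6, 200)   # baseline (cross-sectional) model
p_rich = 1.0 * y + rng.normal(0, 0.4, 200)   # model with EMA + passive features
mean_diff, ci = paired_auc_bootstrap(y, p_base, p_rich)
```

Resampling participants (rather than predictions) keeps the two models' scores paired, which is what makes the comparison valid on a shared test set.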
To assess feature importance and increase the interpretability of our models, we will implement model-agnostic SHapley Additive exPlanations (SHAP) values.
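In practice a dedicated SHAP library would be used; the sketch below only illustrates the underlying idea for the special case of a linear model with independent features, where exact Shapley values have a closed form (weights and data are hypothetical):

```python
import numpy as np

def linear_shap(coefs, X, x):
    """Exact SHAP values for a linear model with independent features:
    phi_j = w_j * (x_j - E[x_j]); contributions sum to f(x) - E[f(X)]."""
    baseline = X.mean(axis=0)                # expected feature values
    return coefs * (x - baseline)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                # hypothetical feature matrix
w = np.array([2.0, -1.0, 0.5])               # hypothetical model weights
x = X[0]                                     # participant to explain
phi = linear_shap(w, X, x)
```

The additivity property (contributions summing to the deviation of the prediction from the average prediction) is what SHAP generalises to arbitrary models.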
Exploratory analyses
Our exploratory analyses encompass statistical and inferential investigations, where models are built to explore selected mechanisms and relationships between different variables, data domains and assessment times. We further aim to validate our EMA-based therapeutic agency (TA) scale using Multilevel Confirmatory Factor Analysis (MCFA). To explore trajectories of change during therapy, we will make use of clustering techniques (eg, k-means clustering) or latent class growth analysis (LCGA). Derived clusters or classes may serve as additional features for the prediction of treatment NR.
Patient and public involvement
During the first week of study participation, we conduct a standardised interview with participants where we assess the feasibility and acceptability of our ambulatory assessments. Here, patients are also invited to suggest improvements. In addition, as part of subproject 5,1 patients report, among other things, the burden and time required for participating in our subproject. All results regarding feasibility and acceptability will be considered when working on the follow-up proposal to this project. To elaborate strategies for dissemination of study results to participants and wider patient communities, we will conduct a congress open to the public in 2025.
Ethics and dissemination
The study has been approved by the Institutional Ethics Committee of the Department of Psychology at Humboldt-Universität zu Berlin (approval no. 2021–01) and the Ethics Committee of Charité – Universitätsmedizin Berlin (approval no. EA1/186/22). All participants provide written informed consent after being fully informed about the study’s aims, procedures, potential risks and benefits, and they may withdraw at any time without consequence. To meet the requirements of the EU General Data Protection Regulation (“Datenschutz-Grundverordnung,” DSGVO), we have implemented a data-protection concept for prospective clinical studies and a Joint Controller Agreement covering all participating institutions, both approved by their respective data-protection officers. Participant data are pseudonymised and stored on encrypted servers with role-based access rights, audit trails, standard operating procedures and staff training; no identifiable information is shared with third parties. Data integrity, participant safety and adverse events are continuously monitored. One year after the last patient out, the dataset will be fully anonymised to support open-science practices while safeguarding privacy. De-identified data and analysis scripts will be available to researchers within the RU via a secure server and may be shared with external investigators on reasonable request under GDPR-compliant agreements and in line with participant consent. Results will be disseminated through peer-reviewed journals, presentations at national and international conferences, and lay summaries posted on institutional websites or presented at public events.
Discussion
Within the research endeavour of the PREACT study to find predictors of non-response to CBT for ID, PREACT-digital implements therapy-accompanying digital phenotyping. DP complements the other subprojects by contributing dense behavioural and physiological information from patients’ everyday lives, providing an ecologically valid and innovative add-on to more traditional sources of data. By integrating EMA with wearable-based passive sensing, we capture detailed, real-time information that goes beyond traditional cross-sectional snapshots. Our predictive analyses aim to determine whether advanced but computationally demanding modelling approaches improve accuracy over simpler methods, while exploratory analyses focus on uncovering potential mechanisms and subgroups that may inform tailored interventions.
A key strength of our study lies in its ecological validity: participants provide data in naturalistic settings, resulting in rich, high-frequency observations. We further benefit from a relatively large sample size and the opportunity to capture change processes over multiple time points. The integration of our project into the Research Unit allows us to compare our digital phenotypes with other data modalities and acquire a rich, comprehensive impression of participating patients. Although a few studies have integrated DP into outpatient therapy settings, this combination of different data modalities is unique to our study.
Despite these strengths, there are also key limitations that demand consideration. First, the study includes no healthy control group. As a result, we cannot directly distinguish digital patterns that are specific to clinical presentations from those also present in non-clinical populations. Future work will need case–control or population-based designs to clarify the specificity of any digital signatures identified here. Second, the extensive assessment schedule may place a burden on participants. Although we tried to keep the length of our active assessments reasonable (ie, around 3–5 min per assessment), answering eight daily questionnaires across 14 days is non-negligible. Participants are also asked to wear their smartwatch as continuously as possible, including during nighttime. This combination of frequent questionnaires and continuous sensing may reduce adherence over time and, in the worst case, could lead some participants to withdraw from the study.
We hope to identify both robust predictors of therapy response and clinically meaningful insights that can guide personalised treatment adaptations. As the field of DP in mental health is in its infancy, our study will contribute relevant evidence regarding its feasibility in psychotherapy contexts and its informational content regarding treatment response.
Ultimately, this research has the potential to inform the development of more dynamic, precise, and patient-centred therapy approaches for ID. Early identification of individuals at risk of non-response to CBT could enable not only improved treatment allocation but also the implementation of adaptive interventions. For example, predicted non-responders may benefit from intensified care (eg, stepped care) or evidence-based augmentation strategies (eg, pharmacological or mindfulness-based enhancements). DP data, in particular, may also allow the identification of symptom worsening in real time and thus trigger a suitable intervention on the smartphone (just-in-time adaptive interventions) or inform the respective therapist (blended treatment) about symptom triggers and contexts. Thus, predictive models in general, and predictive DP models in particular, could serve as a foundation for tailoring treatment pathways in a more responsive and individualised manner.
Acknowledgements
The present work was derived from the Research Unit FOR5187 ('Towards precision psychotherapy for non-respondent patients: from signatures to predictions to clinical utility', www.forschungsgruppe5187.de) funded by Deutsche Forschungsgemeinschaft (project number 442075332). It is based on data from a naturalistic observational clinical study (clinical trial registration: DRKS00030915). We would like to thank the following individuals for their help: Ulrike Lueken, Lydia Fehm, Norbert Kathmann, Babette Renneberg, Frank Jacobi, Till Langhammer, Andrea Ertle, Björn Elsner, Anne Trösken, Lars Schulze, Ricarda Evens, Chantal Unterfeld, Lena Fliedner, Alexandra Künstler, Torsten Sauder, Leandra Fien, Paul Eichler, Jonathan Torbecke, Lea Sophie Roediger, Freya Uhrlauf, Vera Yuseva, Isabelle Habedank, Nina Richter, Louise Förster, Sophie Meska, Jana Samland, Caroline Nitz, Helen Mahlke, Christoph Geiger, Jasmin Ghalib, Gesa Bimüller, Benjamin Gas, Dorothea Neumann, Johanna Suchy.