Protocol Example

This example illustrates how to work with the Protocol API in the Protocols module. The classes in this module represent (psychological) protocols. One instance of such a protocol class represents a particular study in which the protocol was conducted, together with the data collected during that study.

The Protocol API further offers functions to process different kinds of data collected during such a study without the need to set up “common” processing pipelines each time (see the ECG_Analysis_Example.ipynb notebook for further information on data processing pipelines).

As input, this notebook uses the heart rate processing outputs generated by the ECG processing pipeline (as shown in the ECG_Processing_Example.ipynb notebook), as well as saliva data.

Setup and Helper Functions

[1]:
from pathlib import Path

import re

import pandas as pd
import numpy as np

from fau_colors import cmaps
import biopsykit as bp
from biopsykit.protocols import BaseProtocol, MIST, TSST

import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
%load_ext autoreload
%autoreload 2
[2]:
plt.close("all")

palette = sns.color_palette(cmaps.faculties)
sns.set_theme(context="notebook", style="ticks", font="sans-serif", palette=palette)

plt.rcParams["figure.figsize"] = (8, 4)
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["mathtext.default"] = "regular"

palette
[2]:
[3]:
def display_dict_structure(dict_to_display):
    _display_dict_recursive(dict_to_display)


def _display_dict_recursive(dict_to_display):
    if isinstance(dict_to_display, dict):
        display(dict_to_display.keys())
        _display_dict_recursive(list(dict_to_display.values())[0])
    else:
        display("Dataframe shape: {}".format(dict_to_display.shape))
        display(dict_to_display.head())

Overview

The Protocol API of BioPsyKit offers classes that represent various psychological protocols.

The classes allow adding time-series data (such as heart rate) as well as tabular data (such as saliva) to an instance of a Protocol object. Further, they offer functions to process the data and aggregate the results on a study level (such as computing the mean and standard error over all subjects within the different phases of the protocol).

For this, it uses functions from the biopsykit.utils.data_processing module to split data, compute aggregated results, etc. While these functions can also be used stand-alone, without the Protocol API, it is recommended to use the Protocol API whenever possible since you then don’t have to call each function manually, and in the right order.

See the ECG_Analysis_Example.ipynb notebook for further examples.

General Usage – BaseProtocol

The base class for all protocols is BaseProtocol(). It provides the general functionality. Besides that, there exist a couple of predefined subclasses that represent standardized psychological protocols, such as MIST (Montreal Imaging Stress Task) and TSST (Trier Social Stress Test).

Each protocol is expected to have a certain structure which can be specified by passing a dictionary describing the protocol structure when creating the Protocol object. The structure can be nested, up to three nested structure levels are supported:

  • 1st level: study part: These can be different, disjoint parts of the complete study, such as: “Preface”, “Test”, and “Questionnaires”

  • 2nd level: phase: One study part of the psychological protocol can consist of different phases, such as: “Preparation”, “Stress”, “Recovery”

  • 3rd level: subphase: One phase can consist of different subphases, such as: “Baseline”, “Arithmetic Task”, “Feedback”

The following examples illustrate different possibilities of defining the protocol structure:

[4]:
# Example 1: study with three parts, no finer division into phases
structure = {"Preface": None, "Test": None, "Questionnaires": None}
BaseProtocol(name="Base", structure=structure)
[4]:
Base
        Structure: {'Preface': None, 'Test': None, 'Questionnaires': None}
[5]:
# Example 2: study with three parts, all parts have different phases with specific durations
structure = {
    "Preface": {"Questionnaires": 240, "Baseline": 60},
    "Test": {"Preparation": 120, "Test": 240, "Recovery": 120},
    "Recovery": {"Part1": 240, "Part2": 240},
}
BaseProtocol(name="Base", structure=structure)
[5]:
Base
        Structure: {'Preface': {'Questionnaires': 240, 'Baseline': 60}, 'Test': {'Preparation': 120, 'Test': 240, 'Recovery': 120}, 'Recovery': {'Part1': 240, 'Part2': 240}}
[6]:
# Example 3: only certain study parts have different phases (example: TSST)
structure = {"Before": None, "TSST": {"Preparation": 300, "Talk": 300, "Math": 300}, "After": None}
BaseProtocol(name="Base", structure=structure)
[6]:
Base
        Structure: {'Before': None, 'TSST': {'Preparation': 300, 'Talk': 300, 'Math': 300}, 'After': None}
[7]:
# Example 4: study with phases and subphases, only certain study parts have different phases (example: MIST)
structure = {
    "Before": None,
    "MIST": {
        "MIST1": {"BL": 60, "AT": 240, "FB": 120},
        "MIST2": {"BL": 60, "AT": 240, "FB": 120},
        "MIST3": {"BL": 60, "AT": 240, "FB": 120},
    },
    "After": None,
}
BaseProtocol(name="Base", structure=structure)
[7]:
Base
        Structure: {'Before': None, 'MIST': {'MIST1': {'BL': 60, 'AT': 240, 'FB': 120}, 'MIST2': {'BL': 60, 'AT': 240, 'FB': 120}, 'MIST3': {'BL': 60, 'AT': 240, 'FB': 120}}, 'After': None}

Load Data

Heart Rate and R Peak Data

Load processed heart rate time-series data of all subjects and concatenate into one dict.

[8]:
subject_data_dict_hr = {}
# or use your own path
ecg_path = bp.example_data.get_ecg_processing_results_path_example()

for file in sorted(ecg_path.glob("hr_result_*.xlsx")):
    subject_id = re.findall(r"hr_result_(Vp\w+)\.xlsx", file.name)[0]
    subject_data_dict_hr[subject_id] = pd.read_excel(file, sheet_name=None, index_col="time")

subject_data_dict_rpeaks = {}
for file in sorted(ecg_path.glob("rpeaks_result_*.xlsx")):
    subject_id = re.findall(r"rpeaks_result_(Vp\w+)\.xlsx", file.name)[0]
    subject_data_dict_rpeaks[subject_id] = pd.read_excel(file, sheet_name=None, index_col="time")

Structure of the SubjectDataDict:

[9]:
display_dict_structure(subject_data_dict_hr)
dict_keys(['Vp01', 'Vp02'])
dict_keys(['Baseline', 'Intervention', 'Stress', 'Recovery'])
'Dataframe shape: (91, 1)'
Heart_Rate
time
2019-10-23 12:32:05.797 80.842105
2019-10-23 12:32:07.223 81.269841
2019-10-23 12:32:07.961 78.769231
2019-10-23 12:32:08.723 79.175258
2019-10-23 12:32:09.480 83.934426

Subject Conditions

(will be needed later)

Note: This is only an example, thus, the condition list is manually constructed. Usually, you should import a condition list from a file.
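
In practice, such a condition list is typically stored as a tabular file with one row per subject. A minimal sketch of loading it with plain pandas (the CSV content is inlined via StringIO here only to keep the example self-contained; with a real file you would pass its path instead):

```python
import io

import pandas as pd

# Stand-in for a real file, e.g.:
# condition_list = pd.read_csv("condition_list.csv", index_col="subject")
csv_content = io.StringIO("subject,condition\nVp01,Control\nVp02,Intervention\n")
condition_list = pd.read_csv(csv_content, index_col="subject")

print(condition_list["condition"].to_dict())
# {'Vp01': 'Control', 'Vp02': 'Intervention'}
```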

[10]:
condition_list = pd.DataFrame(
    ["Control", "Intervention"], columns=["condition"], index=pd.Index(["Vp01", "Vp02"], name="subject")
)
condition_list
[10]:
condition
subject
Vp01 Control
Vp02 Intervention

Create Protocol instance

Let’s assume our example protocol consists of only one study part, so we can ignore this level. The study consists of four different phases: “Baseline”, “Intervention”, “Stress”, and “Recovery”. Each of these phases has, in turn, three subphases: “Start”, “Middle”, and “End”, with durations of 20, 30, and 10 seconds, respectively. Our structure dict then looks like this:

[11]:
subphases = {"Start": 20, "Middle": 30, "End": 10}
structure = {"Baseline": subphases, "Intervention": subphases, "Stress": subphases, "Recovery": subphases}
[12]:
base_protocol = BaseProtocol(name="Base", structure=structure)

Add Heart Rate Data

The data to be added to the Protocol object is expected to be a SubjectDataDict, a special nested dictionary structure that contains processed data of all subjects. The dictionaries subject_data_dict_hr and subject_data_dict_rpeaks are already in this structure, so they can be added to the Protocol object.
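
Conceptually, a SubjectDataDict is just a nested dictionary mapping subject IDs to per-phase dictionaries of time-series dataframes. A minimal sketch with synthetic heart rate data (the subject and phase names here are made up, and the index is simplified):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)


def make_hr_df(n_samples: int) -> pd.DataFrame:
    # synthetic heart rate time series, sampled at 1 Hz
    index = pd.date_range("2019-10-23 12:32:00", periods=n_samples, freq="1s", name="time")
    return pd.DataFrame({"Heart_Rate": 80 + rng.normal(0, 2, n_samples)}, index=index)


# subject -> phase -> dataframe
subject_data_dict = {
    "Vp01": {"Baseline": make_hr_df(60), "Stress": make_hr_df(120)},
    "Vp02": {"Baseline": make_hr_df(60), "Stress": make_hr_df(120)},
}

print(list(subject_data_dict["Vp01"].keys()))
# ['Baseline', 'Stress']
```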

Since we only have one study part, we can add this data directly, without specifying a study_part parameter; internally, the data is then stored under a study part named “Study”. The added data can be accessed via the hr_data property.

Note: See BaseProtocol.add_hr_data() and BaseProtocol.hr_data for further information!

[13]:
base_protocol.add_hr_data(hr_data=subject_data_dict_hr, rpeak_data=subject_data_dict_rpeaks)
[14]:
display_dict_structure(base_protocol.hr_data)
dict_keys(['Study'])
dict_keys(['Vp01', 'Vp02'])
dict_keys(['Baseline', 'Intervention', 'Stress', 'Recovery'])
'Dataframe shape: (91, 1)'
Heart_Rate
time
2019-10-23 12:32:05.797 80.842105
2019-10-23 12:32:07.223 81.269841
2019-10-23 12:32:07.961 78.769231
2019-10-23 12:32:08.723 79.175258
2019-10-23 12:32:09.480 83.934426

Compute Aggregated Results

Next, we can compute Aggregated Results. For this, we use the method BaseProtocol.compute_hr_results(), which allows us to enable/disable the specific processing steps and provide different parameters to the individual steps (see the ECG_Analysis_Example.ipynb notebook for further explanations of the different processing steps).

Since different kinds of aggregated results can be computed (e.g., aggregating only over phases vs. aggregating over subphases, normalization of heart rate data vs. no normalization, …), we can specify an identifier for these results via the result_id parameter.

BaseProtocol.compute_hr_results() can perform the following steps (True/False indicates which steps are enabled by default – also have a look at the function documentation!):

  • Resample (resample_sec): True

  • Normalize (normalize_to): False

  • Select Phases (select_phases): False

  • Split into Subphases (split_into_subphases): False

  • Aggregate per Subject (mean_per_subject): True

  • Add Conditions (add_conditions): False

Aggregate over Phases (Without Normalization)

[15]:
base_protocol.compute_hr_results("hr_phases")

Aggregate over Phases, Add Subject Conditions

[16]:
base_protocol.compute_hr_results("hr_phases_cond", add_conditions=True, params={"add_conditions": condition_list})

Aggregate over Phases (With Normalization)

If, for example, we want to normalize the heart rate data to the “Baseline” phase, we can enable Normalization (normalize_to). Since all data were normalized to the “Baseline” phase, we can drop this phase from later analysis by enabling the select_phases step. The parameters for these steps are supplied via the params dict:

[17]:
base_protocol.compute_hr_results(
    "hr_phases_norm",
    normalize_to=True,
    select_phases=True,
    params={"normalize_to": "Baseline", "select_phases": ["Intervention", "Stress", "Recovery"]},
)

Aggregate over Subphases (With Normalization)

[18]:
base_protocol.compute_hr_results(
    "hr_subphases_norm",
    normalize_to=True,
    select_phases=True,
    split_into_subphases=True,
    params={
        "normalize_to": "Baseline",
        "select_phases": ["Intervention", "Stress", "Recovery"],
        "split_into_subphases": subphases,
    },
)

Access Results

The results can then be accessed via the hr_results property:

[19]:
base_protocol.hr_results["hr_phases"].head()
[19]:
Heart_Rate
subject phase
Vp01 Baseline 85.579756
Intervention 77.021302
Stress 88.263152
Recovery 85.855722
Vp02 Baseline 86.877952
[20]:
base_protocol.hr_results["hr_phases_cond"].head()
[20]:
Heart_Rate
condition subject phase
Control Vp01 Baseline 85.579756
Intervention 77.021302
Stress 88.263152
Recovery 85.855722
Intervention Vp02 Baseline 86.877952
[21]:
base_protocol.hr_results["hr_phases_norm"].head()
[21]:
Heart_Rate
subject phase
Vp01 Intervention -10.000559
Stress 3.135550
Recovery 0.322467
Vp02 Intervention 5.185647
Stress 23.107396
[22]:
base_protocol.hr_results["hr_subphases_norm"].head()
[22]:
Heart_Rate
subject phase subphase
Vp01 Intervention Start -8.418687
Middle -10.009403
End -10.612060
Stress Start 14.146760
Middle 7.265783

Compute Ensemble Time-Series

Next, we can compute Ensemble Time-Series. For this, we use the method BaseProtocol.compute_hr_ensemble(), which allows us to enable/disable the specific processing steps and provide different parameters to the individual steps (see the ECG_Analysis_Example.ipynb notebook for further explanations of the different processing steps).

Since different kinds of ensemble time-series can be computed (e.g., normalization of heart rate data vs. no normalization, …), we can specify an identifier for these results via the ensemble_id parameter.

BaseProtocol.compute_hr_ensemble() can perform the following steps (True/False indicates which steps are enabled by default – also have a look at the function documentation!):

  • Resample (resample_sec): True

  • Normalize (normalize_to): True

  • Select Phases (select_phases): False

  • Cut Phases to Shortest Duration (cut_phases): True

  • Merge Dictionary (merge_dict): True

  • Add Conditions (add_conditions): False

[23]:
base_protocol.compute_hr_ensemble("hr_ensemble", params={"normalize_to": "Baseline"})
[24]:
display_dict_structure(base_protocol.hr_ensemble["hr_ensemble"])
dict_keys(['Baseline', 'Intervention', 'Stress', 'Recovery'])
'Dataframe shape: (60, 2)'
subject Vp01 Vp02
time
1.0 -5.185450 -1.907328
2.0 -7.308778 2.764077
3.0 -6.940040 6.703796
4.0 -0.950875 0.947209
5.0 -5.986024 -14.090102

Compute HRV Parameters

Next, we can compute HRV Parameters. For this, we use the method BaseProtocol.compute_hrv_results(), which allows us to enable/disable the specific processing steps and provide different parameters to the individual steps (see the ECG_Analysis_Example.ipynb notebook for further explanations of the different processing steps).

Since different kinds of HRV results can be computed (e.g., aggregating only over phases vs. aggregating over subphases, …), we can specify an identifier for these results via the result_id parameter.

BaseProtocol.compute_hrv_results() can perform the following steps (True/False indicates which steps are enabled by default – also have a look at the function documentation!):

  • Select Phases (select_phases): False

  • Split into Subphases (split_into_subphases): False

  • Add Conditions (add_conditions): False

Additionally, the function has arguments to further specify HRV processing:

  • dict_levels: The names of the dictionary levels, corresponding to the index level names of the resulting, concatenated dataframe. Default: None, which corresponds to the levels [“subject”, “phase”] (or [“subject”, “phase”, “subphase”] when split_into_subphases is True)

  • hrv_params: Dictionary with parameters to configure HRV parameter computation. Parameters are passed to EcgProcessor.hrv_process(). Examples:

    • hrv_types: list of HRV types to be computed. Subset of [“hrv_time”, “hrv_nonlinear”, “hrv_frequency”].

    • correct_rpeaks: Whether to apply R peak correction before computing HRV parameters or not

Compute HRV Parameters over Phases

[25]:
base_protocol.compute_hrv_results("hrv_phases")
/home/docs/checkouts/readthedocs.org/user_builds/biopsykit/envs/latest/lib/python3.9/site-packages/neurokit2/hrv/hrv_nonlinear.py:474: NeuroKitWarning: DFA_alpha2 related indices will not be calculated. The maximum duration of the windows provided for the long-term correlation is smaller than the minimum duration of windows. Refer to the `scale` argument in `nk.fractal_dfa()` for more information.
  warn(
/home/docs/checkouts/readthedocs.org/user_builds/biopsykit/envs/latest/lib/python3.9/site-packages/neurokit2/hrv/hrv_nonlinear.py:474: NeuroKitWarning: DFA_alpha2 related indices will not be calculated. The maximum duration of the windows provided for the long-term correlation is smaller than the minimum duration of windows. Refer to the `scale` argument in `nk.fractal_dfa()` for more information.
  warn(
[26]:
base_protocol.hrv_results["hrv_phases"].head()
[26]:
HRV_MeanNN HRV_SDNN HRV_SDANN1 HRV_SDNNI1 HRV_SDANN2 HRV_SDNNI2 HRV_SDANN5 HRV_SDNNI5 HRV_RMSSD HRV_SDSD ... HRV_LZC HRV_DFA_alpha2 HRV_MFDFA_alpha2_Width HRV_MFDFA_alpha2_Peak HRV_MFDFA_alpha2_Mean HRV_MFDFA_alpha2_Max HRV_MFDFA_alpha2_Delta HRV_MFDFA_alpha2_Asymmetry HRV_MFDFA_alpha2_Fluctuation HRV_MFDFA_alpha2_Increment
subject phase
Vp01 Baseline 701.171875 44.086389 NaN NaN NaN NaN NaN NaN 36.399766 36.605573 ... 1.009844 NaN NaN NaN NaN NaN NaN NaN NaN NaN
Intervention 779.551630 48.772965 NaN NaN NaN NaN NaN NaN 53.253237 53.448631 ... 1.133243 1.108132 1.088136 1.327777 1.629693 -1.286655 -2.286655 -0.222539 0.001160 0.107944
Stress 680.145410 51.211148 25.720464 45.331424 NaN NaN NaN NaN 35.764047 35.806996 ... 0.791578 0.741699 0.265318 0.671193 0.629318 1.000000 0.664149 -0.657832 0.000091 0.006495
Recovery 699.218750 50.388794 NaN NaN NaN NaN NaN NaN 29.430528 29.506479 ... 0.908615 0.857308 0.294392 1.055181 1.004504 0.878651 -0.099155 -0.672143 0.000363 0.017242
Vp02 Baseline 691.029744 40.551896 NaN NaN NaN NaN NaN NaN 32.113406 32.304060 ... 0.998500 NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 81 columns

Compute HRV Parameters over Subphases

Assuming we only want to compute a certain type of HRV measures (in this example, only time-domain measures), we can specify this via the hrv_params argument (see EcgProcessor.hrv_process() for an overview of the available arguments).

[27]:
base_protocol.compute_hrv_results(
    "hrv_subphases",
    split_into_subphases=True,
    params={"split_into_subphases": subphases},
    hrv_params={"hrv_types": "hrv_time"},
)
[28]:
base_protocol.hrv_results["hrv_subphases"].head()
[28]:
HRV_MeanNN HRV_SDNN HRV_SDANN1 HRV_SDNNI1 HRV_SDANN2 HRV_SDNNI2 HRV_SDANN5 HRV_SDNNI5 HRV_RMSSD HRV_SDSD ... HRV_IQRNN HRV_SDRMSSD HRV_Prc20NN HRV_Prc80NN HRV_pNN50 HRV_pNN20 HRV_MinNN HRV_MaxNN HRV_HTI HRV_TINN
subject phase subphase
Vp01 Baseline Start 728.365385 38.515399 NaN NaN NaN NaN NaN NaN 38.121478 38.852299 ... 60.546875 1.010333 691.40625 761.71875 23.076923 61.538462 660.15625 789.06250 8.666667 109.3750
Middle 673.029119 31.651441 NaN NaN NaN NaN NaN NaN 32.801251 33.164488 ... 39.062500 0.964946 650.78125 704.68750 18.181818 43.181818 605.46875 765.62500 3.666667 54.6875
End 724.759615 32.667065 NaN NaN NaN NaN NaN NaN 38.356246 40.060349 ... 31.250000 0.851675 702.34375 752.34375 23.076923 69.230769 664.06250 769.53125 3.250000 62.5000
Intervention Start 762.812500 26.119350 NaN NaN NaN NaN NaN NaN 30.912484 31.418938 ... 39.062500 0.844945 741.40625 781.25000 8.000000 60.000000 707.03125 820.31250 6.250000 39.0625
Middle 779.399671 47.293117 NaN NaN NaN NaN NaN NaN 49.025151 49.689867 ... 62.500000 0.964670 739.84375 818.75000 42.105263 68.421053 671.87500 871.09375 9.500000 78.1250

5 rows × 25 columns

Add Precomputed Results

If processing results have already been computed (e.g., in another notebook, or because the data come from an earlier study), the precomputed results can be imported and directly added to the Protocol instance.

Ensemble Time-series

In this example, we have Ensemble Time-series data from a study with three phases (Phase1, Phase2, Phase3). This data is already in the correct format, so we can directly add it to a Protocol object (that also has the correct structure – the durations of the individual phases are not important here, so we just leave them empty).

[29]:
dict_merged_norm = bp.example_data.get_hr_ensemble_sample()
[30]:
base_protocol = BaseProtocol("TimeSeriesDemo", structure={"Phase1": None, "Phase2": None, "Phase3": None})
[31]:
base_protocol.add_hr_ensemble("hr_ensemble", dict_merged_norm)
[32]:
display_dict_structure(base_protocol.hr_ensemble["hr_ensemble"])
dict_keys(['Phase1', 'Phase2', 'Phase3'])
'Dataframe shape: (406, 28)'
Vp01 Vp02 Vp03 Vp04 Vp05 Vp06 Vp07 Vp08 Vp09 Vp10 ... Vp22 Vp24 Vp25 Vp26 Vp27 Vp28 Vp30 Vp31 Vp32 Vp33
time
1 -5.896011 21.372677 -13.380272 -0.952815 5.662782 -4.119631 0.594466 7.365068 2.589674 -4.873666 ... 8.312998 6.064452 -4.047215 -5.267747 -11.698333 -5.405734 -1.370884 3.346634 -5.795628 3.491703
2 -4.745362 21.213832 -7.508542 1.678500 4.286097 -5.328198 11.758334 6.751515 0.005992 3.383603 ... 5.708344 6.728987 -6.630518 -5.409718 -15.351490 -4.091093 -0.836948 -1.557466 -12.685620 8.637926
3 1.967010 14.120346 -4.631881 -4.720704 0.709777 1.238230 7.367597 2.901178 -2.304507 11.329636 ... 1.157688 -8.462778 -6.974983 -7.243909 -12.632172 -3.817170 0.881259 -2.667594 -10.346344 2.835992
4 1.454400 8.943284 -11.498010 -10.179091 -2.759789 0.394513 -8.177153 1.614913 0.885181 9.373362 ... -0.661564 -13.988615 -3.315441 -7.813189 -8.996902 -0.654702 -0.186492 -0.641279 0.950780 0.571289
5 -3.319305 -1.501998 -8.575398 -8.604960 -1.664693 -4.642004 -7.594194 0.708533 -2.946673 12.488329 ... 3.657499 -8.304159 2.546755 -3.969524 -9.432211 6.311569 1.601839 -0.189170 8.193988 -4.193230

5 rows × 28 columns

MIST

The MIST represents the Montreal Imaging Stress Task.

Typically, the MIST has the following structure:

  • “Before”: The study part before conducting the “actual” MIST. Here, subjects typically fill out questionnaires and provide a neutral baseline before stress induction

  • “MIST”: The “actual” MIST. The MIST protocol consists of three repetitions of the same subphases: Baseline (BL), Arithmetic Tasks (AT), Feedback (FB). Each subphase has a duration that is specified in seconds.

  • “After”: The study part after the MIST. Here, subjects typically fill out further questionnaires and remain in the examination room to provide further saliva samples.

In summary:

  • Study Parts: Before, MIST, After

  • Phases: MIST1, MIST2, MIST3

  • Subphases: BL, AT, FB

  • Subphase Durations: 60, 240, 0 (the Feedback interval has a duration of 0 because its length varies between the MIST phases and will later be inferred from the data)

For further information, please visit the documentation of protocols.MIST.

Hence, the structure dictionary would look like the following:

structure = {
    "Before": None,
    "MIST": {
        "MIST1": {"BL": 60, "AT": 240, "FB": 0},
        "MIST2": {"BL": 60, "AT": 240, "FB": 0},
        "MIST3": {"BL": 60, "AT": 240, "FB": 0}
    },
    "After": None
}
[33]:
structure = {
    "Before": None,
    "MIST": {
        "MIST1": ["BL", "AT", "FB"],
        "MIST2": ["BL", "AT", "FB"],
        "MIST3": ["BL", "AT", "FB"],
    },
    "After": None,
}
[34]:
mist = MIST(structure=structure)
mist
[34]:
MIST
        Structure: {'Before': None, 'MIST': {'MIST1': ['BL', 'AT', 'FB'], 'MIST2': ['BL', 'AT', 'FB'], 'MIST3': ['BL', 'AT', 'FB']}, 'After': None}

TSST

The TSST represents the Trier Social Stress Test.

Typically, the TSST has the following structure:

  • “Before”: The study part before conducting the “actual” TSST. Here, subjects typically fill out questionnaires and provide a neutral baseline before stress induction

  • “TSST”: The “actual” TSST. The TSST protocol consists of two phases: Talk and Math, both with a duration of 5 minutes (300 seconds).

  • “After”: The study part after the TSST. Here, subjects typically fill out further questionnaires and remain in the examination room to provide further saliva samples.

In summary:

  • Study Parts: Before, TSST, After

  • Phases: Talk, Math

  • Phase Durations: 300, 300

For further information, please visit the documentation of protocols.TSST.

Hence, the structure dictionary would look like the following:

structure = {
    "Before": None,
    "TSST": {
        "Talk": 300,
        "Math": 300,
    },
    "After": None
}
[35]:
structure = {
    "Before": None,
    "TSST": {
        "Talk": 300,
        "Math": 300,
    },
    "After": None,
}
[36]:
tsst = TSST(structure=structure)
tsst
[36]:
TSST
        Structure: {'Before': None, 'TSST': {'Talk': 300, 'Math': 300}, 'After': None}

Plotting

The Protocol API offers functions to directly visualize the processed study data of the Protocol instance. So far, the visualization of heart rate data (hr_mean_plot() and hr_ensemble_plot()) and saliva data (saliva_plot()) is supported.

Create Protocol object

Note: For this example we assume that data was collected during the MIST.

[37]:
mist = MIST()
mist
[37]:
MIST
        Structure: {'Part1': None, 'MIST': {'MIST1': {'BL': 60, 'AT': 240, 'FB': 0}, 'MIST2': {'BL': 60, 'AT': 240, 'FB': 0}, 'MIST3': {'BL': 60, 'AT': 240, 'FB': 0}}, 'Part2': None}

Aggregated Results

Load Example Aggregated Results Data

[38]:
hr_result = bp.example_data.get_hr_result_sample()
# Alternatively: Load your own aggregated results
# hr_result = bp.io.load_long_format_csv("<path-to-aggregated-results-in-long-format>")
[39]:
hr_result.head()
[39]:
HR
condition subject phase subphase
Intervention Vp01 Phase1 Start -1.233771
Middle -0.727680
End -0.719957
Phase2 Start -2.472756
Middle -2.044894

Add Aggregated Results Data

[40]:
mist.add_hr_results("hr_result", hr_result)

Plot Mean HR Plot

[41]:
fig, ax = plt.subplots()
mist.hr_mean_plot("hr_result", ax=ax);

Ensemble Time-series

Load Example Ensemble Data

[42]:
dict_hr_ensemble = bp.example_data.get_hr_ensemble_sample()

# Alternatively: Load your own ensemble time-series data
# dict_hr_ensemble = bp.io.load_pandas_dict_excel("<path-to-heart-rate-ensemble-data.xlsx>")

# Note: We rename the phase names (Phase1-3) of this example dataset to match
# it with the standard phase names of the MIST (MIST1-3)
dict_hr_ensemble = {key: value for key, value in zip(["MIST1", "MIST2", "MIST3"], dict_hr_ensemble.values())}
[43]:
display_dict_structure(dict_hr_ensemble)
dict_keys(['MIST1', 'MIST2', 'MIST3'])
'Dataframe shape: (406, 28)'
Vp01 Vp02 Vp03 Vp04 Vp05 Vp06 Vp07 Vp08 Vp09 Vp10 ... Vp22 Vp24 Vp25 Vp26 Vp27 Vp28 Vp30 Vp31 Vp32 Vp33
time
1 -5.896011 21.372677 -13.380272 -0.952815 5.662782 -4.119631 0.594466 7.365068 2.589674 -4.873666 ... 8.312998 6.064452 -4.047215 -5.267747 -11.698333 -5.405734 -1.370884 3.346634 -5.795628 3.491703
2 -4.745362 21.213832 -7.508542 1.678500 4.286097 -5.328198 11.758334 6.751515 0.005992 3.383603 ... 5.708344 6.728987 -6.630518 -5.409718 -15.351490 -4.091093 -0.836948 -1.557466 -12.685620 8.637926
3 1.967010 14.120346 -4.631881 -4.720704 0.709777 1.238230 7.367597 2.901178 -2.304507 11.329636 ... 1.157688 -8.462778 -6.974983 -7.243909 -12.632172 -3.817170 0.881259 -2.667594 -10.346344 2.835992
4 1.454400 8.943284 -11.498010 -10.179091 -2.759789 0.394513 -8.177153 1.614913 0.885181 9.373362 ... -0.661564 -13.988615 -3.315441 -7.813189 -8.996902 -0.654702 -0.186492 -0.641279 0.950780 0.571289
5 -3.319305 -1.501998 -8.575398 -8.604960 -1.664693 -4.642004 -7.594194 0.708533 -2.946673 12.488329 ... 3.657499 -8.304159 2.546755 -3.969524 -9.432211 6.311569 1.601839 -0.189170 8.193988 -4.193230

5 rows × 28 columns

Add Ensemble Data

[44]:
mist.add_hr_ensemble("hr_ensemble", dict_hr_ensemble)

Plot HR Ensemble Plot

Note: Since the MIST contains subphases, the MIST.hr_ensemble_plot() function makes use of the structure property of the MIST object and automatically highlights the subphases in the HR Ensemble Plot.

[45]:
fig, ax = plt.subplots()
mist.hr_ensemble_plot("hr_ensemble", ax=ax);

Saliva

Just like heart rate data, we can also add saliva data to the Protocol object. We just need to specify which type of data it is (e.g., “cortisol”, “amylase”, etc.) – see the Saliva_Example.ipynb notebook for further information.

Saliva data can be added to the Protocol object in two different formats:

  • Saliva data per subject as a SalivaRawDataFrame, a specialized dataframe containing “raw” (unaggregated) saliva data in a standardized format. BioPsyKit will then internally compute the mean and standard error.

  • Aggregated saliva data (mean and standard error) as a SalivaMeanSeDataFrame, a specialized dataframe containing the mean and standard error of saliva samples in a standardized format.
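
As a rough sketch of the first (“raw”) format: the dataframe is in long format with a multi-index over subject and sample (optionally also condition) and one column per saliva type. The values below are made up; see the BioPsyKit documentation of SalivaRawDataFrame for the exact format requirements:

```python
import pandas as pd

# long-format saliva data: one row per (subject, sample) pair
index = pd.MultiIndex.from_product(
    [["Vp01", "Vp02"], ["1", "2", "3"]], names=["subject", "sample"]
)
saliva_raw = pd.DataFrame({"cortisol": [7.0, 5.8, 5.3, 6.1, 6.5, 5.9]}, index=index)

print(saliva_raw.loc["Vp01"].shape)
# (3, 1)
```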

SalivaRawDataFrame

Load Example Cortisol Data in SalivaRawDataFrame format.

Note: The sample times are provided relative to the psychological protocol. The first sample is dropped for plotting.

[46]:
saliva_data = bp.example_data.get_saliva_example(sample_times=[-30, -1, 0, 10, 20, 30, 40])
# alternatively: load your own saliva data
# saliva_data = bp.io.saliva.load_saliva_wide_format("<path-to-saliva-data>")
# drop first saliva sample (not needed because it was only assessed to control for high baseline)
saliva_data = saliva_data.drop("0", level="sample")
saliva_data.head()
[46]:
cortisol time
condition subject sample
Intervention Vp01 1 7.03320 -1
2 5.77670 0
3 5.25790 10
4 5.00795 20
5 4.50045 30

Add Saliva Data to MIST object

[47]:
mist.add_saliva_data(saliva_data, "cortisol")
[48]:
fig, ax = mist.saliva_plot(saliva_type="cortisol")
fig.tight_layout()

SalivaMeanSeDataFrame

Create Protocol object – For this example, we’re assuming the data was collected during the TSST.

[49]:
structure = {
    "Before": None,
    "TSST": {
        "Talk": 300,
        "Math": 300,
    },
    "After": None,
}

tsst = TSST(structure=structure)
tsst
[49]:
TSST
        Structure: {'Before': None, 'TSST': {'Talk': 300, 'Math': 300}, 'After': None}

Load Example Saliva Data (cortisol, amylase, and IL-6) in SalivaMeanSeDataFrame format (or, more precisely, a dictionary of such dataframes).

[50]:
saliva_dict = bp.example_data.get_saliva_mean_se_example()
display_dict_structure(saliva_dict)
dict_keys(['cortisol', 'amylase', 'il6'])
'Dataframe shape: (14, 2)'
mean se
condition sample time
Control 0 -1 8.004 1.021
1 0 10.751 0.832
2 10 14.330 1.230
3 20 13.641 1.338
4 45 10.549 1.036

Add Saliva Data to TSST object. Since it’s a dictionary, we don’t have to specify the saliva_types – the names are automatically inferred from the dictionary keys (we can, however, also manually specify the saliva type if we want).

[51]:
tsst.add_saliva_data(saliva_dict)
display_dict_structure(tsst.saliva_data)
dict_keys(['cortisol', 'amylase', 'il6'])
'Dataframe shape: (14, 2)'
mean se
condition sample time
Control 0 -1 8.004 1.021
1 0 10.751 0.832
2 10 14.330 1.230
3 20 13.641 1.338
4 45 10.549 1.036

Plot Cortisol Data

Note: This function will automatically plot data from all conditions. If you want to change that behavior and plot only selected conditions (or just one), you have to manually call saliva_plot() from the biopsykit.protocols.plotting module (see below).

[52]:
fig, ax = tsst.saliva_plot(saliva_type="cortisol", legend_loc="upper right")
fig.tight_layout()

Plot Cortisol Data – Only “Intervention” Group

[53]:
fig, ax = plt.subplots()
bp.protocols.plotting.saliva_plot(
    data=tsst.saliva_data["cortisol"].xs("Intervention"),
    saliva_type="cortisol",
    test_times=tsst.test_times,
    test_title="TSST",
    legend_loc="upper right",
    ax=ax,
)
fig.tight_layout()

Plot Cortisol and IL-6 Data

[54]:
fig, ax = plt.subplots()
tsst.saliva_plot(
    saliva_type=["cortisol", "il6"], linestyle=["-", "--"], marker=["o", "P"], legend_loc="upper right", ax=ax
);