Saliva Example

This example illustrates how to import saliva data (cortisol, amylase, etc.), how to compute often used parameters and how to export it to perform futher analysis.

Setup and Helper Functions

[1]:
from pathlib import Path

import re

import pandas as pd
import numpy as np

from fau_colors import cmaps
import biopsykit as bp

import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
%load_ext autoreload
%autoreload 2
[2]:
plt.close("all")

palette = sns.color_palette(cmaps.faculties)
sns.set_theme(context="notebook", style="ticks", font="sans-serif", palette=palette)

plt.rcParams["figure.figsize"] = (8, 4)
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["mathtext.default"] = "regular"

palette
[2]:

Set Saliva Time Points

[3]:
sample_times = [-30, -1, 30, 40, 50, 60, 70]

Load Condition List

Load example data

[4]:
condition_list = bp.example_data.get_condition_list_example()
condition_list.head()
[4]:
condition
subject
Vp01 Intervention
Vp02 Intervention
Vp03 Control
Vp04 Intervention
Vp05 Intervention

Alternatively: Load your own data

[5]:
# condition_list = bp.io.load_subject_condition_list("<path-to-condition-list-file>")
# condition_list.head()

Load Data

Option 0: Load BioPsyKit example data

The example data will be imported and converted into a long-format dataframe:

[6]:
df_cort = bp.example_data.get_saliva_example()
df_cort.head()
[6]:
cortisol
condition subject sample
Intervention Vp01 0 6.98890
1 7.03320
2 5.77670
3 5.25790
4 5.00795

Compute Mean and Standard Error

Mean and standard error is computed in saliva.mean_se() for the two conditions and for each sample separately.

[7]:
bp.saliva.mean_se(df_cort, saliva_type="cortisol")
[7]:
mean se
condition sample
Control 0 4.569485 0.708467
1 5.367404 0.956169
2 5.805423 1.005129
3 7.669331 1.353230
4 7.737588 1.479832
5 6.964650 1.396808
6 5.916773 1.128819
Intervention 0 6.041300 0.788725
1 5.934754 0.867792
2 4.971362 0.696804
3 5.071823 0.752146
4 5.105092 0.917503
5 4.932746 0.914560
6 4.645919 0.869791

Plot Saliva Data

[8]:
bp.plotting.lineplot(data=df_cort, x="sample", y="cortisol", hue="condition", style="condition");
../../_images/examples__notebooks_Saliva_Example_20_0.svg

Example: Subject Exclusion

For this example, we assume we want to exclude the subjects ‘Vp01’ and ‘Vp02’ from both the condition list and the dataframe with cortisol samples using data_processing.exclude_subjects()

[9]:
dict_result = bp.utils.data_processing.exclude_subjects(
    ["Vp01", "Vp02"], condition_list=condition_list, cortisol=df_cort
)

dict_result
# uncomment to reassign the data with excluded subjects to the original variable names
# df_cort = dict_result['cortisol']
# condition_list = dict_result['condition_list']
[9]:
{'condition_list':             condition
 subject
 Vp03          Control
 Vp04     Intervention
 Vp05     Intervention
 Vp06          Control
 Vp07     Intervention
 Vp08          Control
 Vp09          Control
 Vp10     Intervention
 Vp11          Control
 Vp15     Intervention
 Vp16     Intervention
 Vp17          Control
 Vp18     Intervention
 Vp19          Control
 Vp20     Intervention
 Vp21          Control
 Vp22          Control
 Vp24          Control
 Vp25     Intervention
 Vp26     Intervention
 Vp27     Intervention
 Vp28     Intervention
 Vp30          Control
 Vp31          Control
 Vp32          Control
 Vp33          Control,
 'cortisol':                           cortisol
 condition subject sample
 Control   Vp03    0        5.39850
                   1       13.27400
                   2       10.10115
                   3       14.23000
                   4       13.98650
 ...                            ...
           Vp33    2        7.78115
                   3        8.74815
                   4        7.66505
                   5        7.75460
                   6        6.72890

 [168 rows x 1 columns]}

Option 1: Use BioPsyKit to load saliva data in ‘plate’ format

The data is converted into long-format (SalivaRawDataFrame) and returned.

Load saliva data into pandas dataframe (example data using biopsykit.example_data.get_saliva_example_plate_format() or own data in ‘plate’-format using biopsykit.io.saliva.load_saliva_plate()):

[10]:
df_cort = bp.example_data.get_saliva_example_plate_format()

# alternatively: load own data that is present in "plate" format
# df_cort = bp.io.saliva.load_saliva_plate(file_path="<path-to-saliva-data-in-plate-format>", saliva_type="cortisol")
[11]:
df_cort.head()
[11]:
cortisol
subject sample
Vp01 S0 6.98890
S1 7.03320
S2 5.77670
S3 5.25790
S4 5.00795

We can additionally directly pass a ‘condition list’ to the data loader function (for example data loader as well as for your own data). This allows us to directly assign subject conditions to the saliva samples:

[12]:
df_cort = bp.example_data.get_saliva_example_plate_format(condition_list=condition_list)
# alternatively: load own data in "plate" format and directly assign subject conditions
# df_cort = bp.io.saliva.load_saliva_plate(file_path="<path-to-saliva-data-in-plate-format>", saliva_type="cortisol", condition_list=condition_list)

df_cort.head()
[12]:
cortisol
condition subject sample
Intervention Vp01 S0 6.98890
S1 7.03320
S2 5.77670
S3 5.25790
S4 5.00795

Speficy your custom regular expression string to extract Subject ID and Saliva ID (see the documentation of biopsykit.io.saliva.load_saliva_plate() for further information).

For example, the regex string in regex_str will extract the subject IDs without the Vp prefix and sample IDs without the S prefix:

[13]:
regex_str = "Vp(\d+) S(\d)"
df_cort = bp.example_data.get_saliva_example_plate_format(regex_str=regex_str)
# works analogously for bp.io.saliva.load_saliva_plate()

df_cort.head()
[13]:
cortisol
subject sample
01 0 6.98890
1 7.03320
2 5.77670
3 5.25790
4 5.00795

Option 2: Use BioPsyKit to load saliva data that’s already in the “correct” wide format

During import, the data is converted into long-format (SalivaRawDataFrame) and returned.

[14]:
df_cort = bp.example_data.get_saliva_example()
# Alternatively, use your own data:
# df_cort = bp.io.saliva.load_saliva_wide_format(file_path="<path-to-cortisol-data-in-wide-format.csv>", saliva_type="cortisol")
[15]:
df_cort.head()
[15]:
cortisol
condition subject sample
Intervention Vp01 0 6.98890
1 7.03320
2 5.77670
3 5.25790
4 5.00795

Save Data

Save Dataframe as csv (in standardized format)

[16]:
# bp.io.saliva.save_saliva("<export-path.csv>", df_cort)

Saliva Data Processing

Saliva Data in SalivaRawDataFrame

[17]:
df_cort.head()
[17]:
cortisol
condition subject sample
Intervention Vp01 0 6.98890
1 7.03320
2 5.77670
3 5.25790
4 5.00795

Mean and Standard Error over all Subjects

Computing mean and standard error over all subjects returns a SalivaMeanSeDataFrame. This dataframe can directly be used for plotting functions from BioPsyKit, such as plotting.saliva_plot() (more on that later).

[18]:
df_cort_mean_se = bp.saliva.mean_se(df_cort)
df_cort_mean_se.head()
[18]:
mean se
condition sample
Control 0 4.569485 0.708467
1 5.367404 0.956169
2 5.805423 1.005129
3 7.669331 1.353230
4 7.737588 1.479832

Saliva Features

Standard Features

Compute a set of “Standard Features”, including:

  • argmax: location of maximum

  • mean: mean value

  • std: standard deviation

  • skew: skewness

  • kurt: kurtosis

using saliva.standard_features()

[19]:
bp.saliva.standard_features(df_cort).head()
[19]:
saliva_feature cortisol_argmax cortisol_mean cortisol_std cortisol_skew cortisol_kurt
condition subject
Control Vp03 3 11.433950 3.067825 -1.433263 2.214301
Vp06 5 10.483879 4.101966 -0.358008 -1.497109
Vp08 3 3.713750 1.103525 0.772725 -0.729211
Vp09 0 5.399236 1.864181 1.655006 2.695174
Vp11 4 6.400743 1.220714 -0.784137 1.954389

AUC

Area under the Curve (AUC), in different variations (according to Pruessner et al. 2003):

  • auc_g: Total Area under the Curve

  • auc_i: Area under the Curve with respect to increase

  • auc_i_post: (if compute_auc_post=True) Area under the Curve with respect to increase after the stressor: This is only relevant for an acute stress scenario where we collected saliva samples before and after the stressor. Per BioPsyKit convention, saliva samples collected after the stressor have saliva times \(t \geq 0\).

(see the documentation of saliva.auc() for further information).

Note: For these examples we neglect the first saliva sample (S0) by setting remove_s0 to True because this sample was only assessed to control for high cortisol baseline and is not relevant for feature computation. The feature computation is performed on the remaining saliva samples (S1-S6).

[20]:
# sample times must directly be supplied if they are not already part of the saliva dataframe
bp.saliva.auc(df_cort, remove_s0=True, sample_times=sample_times).head()
[20]:
saliva_feature cortisol_auc_g cortisol_auc_i
condition subject
Control Vp03 870.418075 -72.035925
Vp06 745.785975 305.071225
Vp08 261.263925 84.065675
Vp09 349.085125 -107.491025
Vp11 477.073050 56.433550

Maximum Increase

Absolute maximum increase (or the relative increase in percent if percent=True) between the first sample in the data and all others:

Note: see the documentation of saliva.max_increase() for further information)

[21]:
bp.saliva.max_increase(df_cort, remove_s0=True).head()
[21]:
saliva_feature cortisol_max_inc
condition subject
Control Vp03 0.95600
Vp06 9.44625
Vp08 3.01050
Vp09 -1.40395
Vp11 2.19135

Slope

Slope between two saliva samples (specified by sample_idx): Note: see the documentation of saliva.slope() for further information)

[22]:
bp.saliva.slope(df_cort, sample_idx=(1, 4), sample_times=sample_times).head()
[22]:
saliva_feature cortisol_slope14
condition subject
Control Vp03 0.013971
Vp06 0.141828
Vp08 0.045958
Vp09 -0.027528
Vp11 0.042968

Plotting

Using Seaborn (some very simple Examples)

[23]:
fig, ax = plt.subplots()
sns.lineplot(
    data=df_cort.reset_index(),
    x="sample",
    y="cortisol",
    hue="condition",
    hue_order=["Control", "Intervention"],
    ci=None,  # ci = None => no error bars
    ax=ax,
)
ax.set_ylabel("Cortisol [nmol/l]")
ax.set_xlabel("Sample Times")
fig.tight_layout()
/tmp/ipykernel_3349/3425057158.py:2: FutureWarning:

The `ci` parameter is deprecated. Use `errorbar=None` for the same effect.

  sns.lineplot(
../../_images/examples__notebooks_Saliva_Example_60_1.svg
[24]:
fig, ax = plt.subplots()
sns.lineplot(
    data=df_cort.reset_index(), x="sample", y="cortisol", hue="condition", hue_order=["Control", "Intervention"], ax=ax
)
ax.set_ylabel("Cortisol [nmol/l]")
ax.set_xlabel("Sample Times")
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_61_0.svg
[25]:
fig, ax = plt.subplots()
sns.boxplot(
    data=df_cort.reset_index(),
    x="sample",
    y="cortisol",
    hue="condition",
    hue_order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
    ax=ax,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_62_0.svg
[26]:
fig, ax = plt.subplots()
sns.boxplot(
    data=bp.saliva.max_increase(df_cort).reset_index(),
    x="condition",
    y="cortisol_max_inc",
    order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_63_0.svg
[27]:
fig, ax = plt.subplots()

data_features = bp.saliva.utils.saliva_feature_wide_to_long(bp.saliva.standard_features(df_cort), "cortisol")

sns.boxplot(
    data=data_features.reset_index(),
    x="saliva_feature",
    y="cortisol",
    hue="condition",
    hue_order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
    ax=ax,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_64_0.svg

Using functions from BioPsyKit

[28]:
df_cort_mean_se.T
[28]:
condition Control Intervention
sample 0 1 2 3 4 5 6 0 1 2 3 4 5 6
mean 4.569485 5.367404 5.805423 7.669331 7.737588 6.964650 5.916773 6.041300 5.934754 4.971362 5.071823 5.105092 4.932746 4.645919
se 0.708467 0.956169 1.005129 1.353230 1.479832 1.396808 1.128819 0.788725 0.867792 0.696804 0.752146 0.917503 0.914560 0.869791

Note: This example shows how to use the function plotting.saliva_plot(). If you recorded saliva data within a psychological protocol, however, it’s recommended to use the Protocol API and create a Protocol object, where you can add all data and use plotting functions in a more convenient way.

Saliva Plot

[29]:
# plot without first saliva sample
bp.protocols.plotting.saliva_plot(
    df_cort_mean_se.drop("0", level="sample"),
    saliva_type="cortisol",
    sample_times=sample_times[1:],
    sample_times_absolute=True,
    test_times=[0, 30],
    test_title="TEST",
);
../../_images/examples__notebooks_Saliva_Example_69_0.svg

Saliva Features

All features in one plot using saliva_feature_boxplot(), x variable separates the features, hue variable separates the conditions.

[30]:
fig, ax = plt.subplots()
bp.protocols.plotting.saliva_feature_boxplot(
    data=data_features,
    x="saliva_feature",
    saliva_type="cortisol",
    hue="condition",
    hue_order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
    ax=ax,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_72_0.svg

One subplot per feature using saliva_multi_feature_boxplot(), x variable separates the conditions. features is provided a list of the features to plot.

[31]:
bp.protocols.plotting.saliva_multi_feature_boxplot(
    data=data_features,
    saliva_type="cortisol",
    features=["argmax", "kurt", "std", "mean"],
    hue="condition",
    hue_order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_74_0.svg

Grouping of features per subplot, features is provided a dictionary that specifies the mapping.

[32]:
bp.protocols.plotting.saliva_multi_feature_boxplot(
    data=data_features,
    saliva_type="cortisol",
    features={"mean": ["mean", "argmax"], "std": ["std", "skew", "kurt"]},
    hue="condition",
    hue_order=["Control", "Intervention"],
    palette=cmaps.faculties_light,
)
fig.tight_layout()
../../_images/examples__notebooks_Saliva_Example_76_0.svg
[ ]:

Download Notebook
(Right-Click -> Save Link As...)