biopsykit.io.saliva module

Module wrapping biopsykit.io.biomarker including only I/O functions for saliva data.

biopsykit.io.saliva.load_saliva_plate(file_path, saliva_type, sample_id_col=None, data_col=None, id_col_names=None, regex_str=None, sample_times=None, condition_list=None, **kwargs)

Read saliva data from an Excel sheet in ‘plate’ format.

This function automatically extracts identifiers such as subject, day, and sample IDs from the saliva sample names. To extract them, a regular expression string can be passed via regex_str.

Here are some examples of what sample identifiers might look like and what the corresponding regex_str would produce:

“Vp01 S1” => r"(Vp\d+) (S\d)" (this is the default pattern; you can also set regex_str to None) => data [Vp01, S1] in two columns: subject, sample (unless column names are explicitly specified in id_col_names)

“Vp01 T1 S1” … “Vp01 T1 S5” (only numeric characters in day/sample) => r"(Vp\d+) (T\d) (S\d)" => data [Vp01, T1, S1] in three columns: subject, day, sample (unless column names are explicitly specified in id_col_names)

“Vp01 T1 S1” … “Vp01 T1 SA” (also letter characters in day/sample) => r"(Vp\d+) (T\w) (S\w)" => data [Vp01, T1, S1] in three columns: subject, day, sample (unless column names are explicitly specified in id_col_names)

If you don’t want to extract the ‘S’ or ‘T’ prefixes of sample or day IDs, respectively, move them out of the capture group (round brackets) in regex_str, like this: (S\d) (would give S1, S2, …) => S(\d) (would give 1, 2, …)
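The patterns above can be tried out with Python's standard re module before passing them as regex_str. This is only a minimal sketch: the helper extract_ids and the sample name “Vp02 T2 S3” are hypothetical, and how the library applies the pattern internally is not shown here.

```python
import re

# Sample names paired with the patterns from the examples above.
# The last entry moves the 'T' and 'S' prefixes out of the capture groups.
patterns = {
    "Vp01 S1": r"(Vp\d+) (S\d)",
    "Vp01 T1 S5": r"(Vp\d+) (T\d) (S\d)",
    "Vp01 T1 SA": r"(Vp\d+) (T\w) (S\w)",
    "Vp02 T2 S3": r"(Vp\d+) T(\d) S(\d)",
}


def extract_ids(sample_name, regex_str):
    """Hypothetical helper: return the captured groups, or None if there is no match."""
    match = re.fullmatch(regex_str, sample_name)
    return list(match.groups()) if match else None


results = {name: extract_ids(name, regex) for name, regex in patterns.items()}
for name, ids in results.items():
    print(f"{name!r} -> {ids}")
```

Each capture group becomes one extracted ID, so the number of groups has to match the number of names given in id_col_names.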

Parameters

file_path (Path or str) – path to the Excel sheet in ‘plate’ format containing saliva data
saliva_type (str) – saliva type to load from file
sample_id_col (str, optional) – column name of the Excel sheet containing the sample ID. Default: “sample ID”
data_col (str, optional) – column name of the Excel sheet containing saliva data to be analyzed. Default: None, which selects the default column name based on saliva_type, e.g. cortisol => cortisol (nmol/l)
id_col_names (list of str, optional) – names of the extracted ID columns. None to use the default column names ([‘subject’, ‘day’, ‘sample’])
regex_str (str, optional) – regular expression to extract subject ID, day ID and sample ID from the sample identifier. None to use default regex string (r"(Vp\d+) (S\d)")
sample_times (list of int, optional) – times at which saliva samples were collected
condition_list (1d-array, optional) – list of conditions which subjects were assigned to
**kwargs – Additional parameters that are passed to pandas.read_excel()

Returns

data – saliva data in SalivaRawDataFrame format

Return type

SalivaRawDataFrame

Raises

FileExtensionError – if file is not an Excel file (.xls or .xlsx)
ValueError – if any saliva sample cannot be converted into a float (e.g. because there was text in one of the columns)
ValidationError – if imported data cannot be parsed to a SalivaRawDataFrame

biopsykit.io.saliva.save_saliva(file_path, data, saliva_type='cortisol', as_wide_format=False)

Save saliva data to a csv or Excel file.

Parameters

file_path (Path or str) – file path to export. Must be a csv or an Excel file
data (SalivaRawDataFrame) – saliva data in SalivaRawDataFrame format
saliva_type (str) – type of saliva data in the dataframe
as_wide_format (bool, optional) – True to save data in wide format (and flatten all index levels), False to save data in long format. Default: False

Raises

ValidationError – if data is not a SalivaRawDataFrame
FileExtensionError – if file_path is not a csv or Excel file
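To illustrate what the as_wide_format option refers to, the sketch below converts a small long-format table into wide format with plain pandas. It is not the library's implementation: the example values, the "cortisol_S1"-style column names, and the exact way index levels are flattened are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical long-format saliva data: one row per (subject, sample) pair
long_df = pd.DataFrame(
    {"cortisol": [5.1, 7.3, 4.8, 6.9]},
    index=pd.MultiIndex.from_product(
        [["Vp01", "Vp02"], ["S1", "S2"]], names=["subject", "sample"]
    ),
)

# Move the 'sample' index level into the columns: one row per subject
wide_df = long_df.unstack("sample")

# Flatten the two column levels into single names (assumed naming scheme)
wide_df.columns = ["_".join(col) for col in wide_df.columns]

# Turn the remaining index level into a regular column
wide_df = wide_df.reset_index()
print(wide_df)
```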

biopsykit.io.saliva.load_saliva_wide_format(file_path, saliva_type, subject_col=None, condition_col=None, additional_index_cols=None, sample_times=None, **kwargs)

Load saliva data in wide format from a csv or Excel file.

It will return a SalivaRawDataFrame, a long-format dataframe that complies with BioPsyKit’s naming convention, i.e., the subject ID index will be named subject, the sample index will be named sample, and the value column will be named after the saliva biomarker type.
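The reshaping described above can be pictured with a short pandas sketch. It only shows the general wide-to-long idea under assumed column names (S1, S2) and made-up values; it is not the library's implementation and skips the validation that a SalivaRawDataFrame requires.

```python
import pandas as pd

# Hypothetical wide-format table: one row per subject, one column per sample
wide_df = pd.DataFrame(
    {
        "subject": ["Vp01", "Vp02"],
        "S1": [5.1, 4.8],
        "S2": [7.3, 6.9],
    }
)

# Reshape to long format: one row per (subject, sample) pair,
# with the value column named after the saliva type
long_df = (
    wide_df.melt(id_vars="subject", var_name="sample", value_name="cortisol")
    .set_index(["subject", "sample"])
    .sort_index()
)
print(long_df)
```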

Parameters

file_path (Path or str) – path to file
saliva_type (str) – saliva type to load from file. Example: cortisol
subject_col (str, optional) – name of column containing subject IDs or None to use the default column name subject. According to BioPsyKit’s convention, the subject ID column is expected to have the name subject. If the subject ID column in the file has another name, the column will be renamed in the dataframe returned by this function. Default: None
condition_col (str, optional) – name of the column containing condition assignments or None if no conditions are present. According to BioPsyKit’s convention, the condition column is expected to have the name condition. If the condition column in the file has another name, the column will be renamed in the dataframe returned by this function. Default: None
additional_index_cols (str or list of str, optional) – additional index levels to be added to the dataframe, e.g., “day” index. Can either be a string or a list of strings to indicate column name(s) that should be used as index level(s), or None for no additional index levels. Default: None
sample_times (list of int, optional) – times at which saliva samples were collected or None if no sample times should be specified. Default: None
**kwargs – Additional parameters that are passed to pandas.read_csv() or pandas.read_excel()

Returns

data – saliva data in SalivaRawDataFrame format

Return type

SalivaRawDataFrame

Raises

FileExtensionError – if file is not a csv or Excel file
