biopsykit.io.saliva module
Module wrapping biopsykit.io.biomarker including only I/O functions for saliva data.
- biopsykit.io.saliva.load_saliva_plate(file_path, saliva_type, sample_id_col=None, data_col=None, id_col_names=None, regex_str=None, sample_times=None, condition_list=None, **kwargs)
Read saliva data from an Excel sheet in ‘plate’ format.
This function automatically extracts identifiers such as subject, day, and sample IDs from the saliva sample names. To extract them, a regular expression string can be passed via regex_str.
Here are some examples of what sample identifiers might look like and what the corresponding regex_str would output:
“Vp01 S1” => r"(Vp\d+) (S\d)" (this is the default pattern; you can also just set regex_str to None) => data [Vp01, S1] in two columns: subject, sample (unless column names are explicitly specified in id_col_names)
“Vp01 T1 S1” … “Vp01 T1 S5” (only numeric characters in day/sample) => r"(Vp\d+) (T\d) (S\d)" => data [Vp01, T1, S1] in three columns: subject, day, sample (unless column names are explicitly specified in id_col_names)
“Vp01 T1 S1” … “Vp01 T1 SA” (also letter characters in day/sample) => r"(Vp\d+) (T\w) (S\w)" => data [Vp01, T1, S1] in three columns: subject, day, sample (unless column names are explicitly specified in id_col_names)
If you don’t want to extract the ‘S’ or ‘T’ prefixes in sample or day IDs, respectively, you have to move them out of the capture group (round brackets) in regex_str, like this: (S\d) (would give S1, S2, …) => S(\d) (would give 1, 2, …)
- Parameters
file_path (Path or str) – path to the Excel sheet in ‘plate’ format containing saliva data
saliva_type (str) – saliva type to load from file
sample_id_col (str, optional) – column name of the Excel sheet containing the sample ID. Default: “sample ID”
data_col (str, optional) – column name of the Excel sheet containing the saliva data to be analyzed. Default: select the default column name based on saliva_type, e.g. cortisol => cortisol (nmol/l)
id_col_names (list of str, optional) – names of the extracted ID columns. None to use the default column names ([‘subject’, ‘day’, ‘sample’])
regex_str (str, optional) – regular expression to extract subject ID, day ID and sample ID from the sample identifier. None to use the default regex string (r"(Vp\d+) (S\d)")
sample_times (list of int, optional) – times at which saliva samples were collected
condition_list (1d-array, optional) – list of conditions which subjects were assigned to
**kwargs – additional parameters that are passed to pandas.read_excel()
- Returns
data – saliva data in SalivaRawDataFrame format
- Return type
SalivaRawDataFrame
- Raises
FileExtensionError – if the file is not an Excel file (.xls or .xlsx)
ValueError – if any saliva sample cannot be converted into a float (e.g. because there was text in one of the columns)
ValidationError – if the imported data cannot be parsed to a SalivaRawDataFrame
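The regular-expression examples given for regex_str can be tried out with Python's built-in re module. The following is only a minimal sketch of how capture groups turn a sample name into ID values; the helper extract_ids is hypothetical and is not part of BioPsyKit, and the use of re.fullmatch is an assumption about matching behaviour, not a description of the library's internals.

```python
import re

# Default pattern quoted in the documentation: subject ID and sample ID.
DEFAULT_PATTERN = r"(Vp\d+) (S\d)"


def extract_ids(sample_name, regex_str=None):
    """Hypothetical helper: return the capture groups matched in sample_name."""
    pattern = DEFAULT_PATTERN if regex_str is None else regex_str
    match = re.fullmatch(pattern, sample_name)
    if match is None:
        raise ValueError(f"{sample_name!r} does not match {pattern!r}")
    return list(match.groups())


# Two capture groups: subject, sample
print(extract_ids("Vp01 S1"))
# Three capture groups: subject, day, sample
print(extract_ids("Vp01 T1 S1", r"(Vp\d+) (T\d) (S\d)"))
# \w also accepts letters in the day/sample IDs
print(extract_ids("Vp01 T1 SA", r"(Vp\d+) (T\w) (S\w)"))
# Prefixes moved out of the capture groups: only the bare IDs remain
print(extract_ids("Vp01 T1 S1", r"(Vp\d+) T(\d) S(\d)"))
```

Each pair of round brackets contributes one extracted value, so the number of capture groups has to match the number of ID columns you expect.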
- biopsykit.io.saliva.load_saliva_wide_format(file_path, saliva_type, subject_col=None, condition_col=None, additional_index_cols=None, sample_times=None, **kwargs)
Load saliva data in wide format from a csv or Excel file.
It will return a SalivaRawDataFrame, a long-format dataframe that complies with BioPsyKit’s naming convention, i.e., the subject ID index will be named subject, the sample index will be named sample, and the value column will be named after the saliva biomarker type.
- Parameters
file_path (Path or str) – path to file
saliva_type (str) – saliva type to load from file. Example: cortisol
subject_col (str, optional) – name of the column containing subject IDs or None to use the default column name subject. According to BioPsyKit’s convention, the subject ID column is expected to have the name subject. If the subject ID column in the file has another name, the column will be renamed in the dataframe returned by this function. Default: None
condition_col (str, optional) – name of the column containing condition assignments or None if no conditions are present. According to BioPsyKit’s convention, the condition column is expected to have the name condition. If the condition column in the file has another name, the column will be renamed in the dataframe returned by this function. Default: None
additional_index_cols (str or list of str, optional) – additional index levels to be added to the dataframe, e.g., “day” index. Can either be a string or a list of strings to indicate column name(s) that should be used as index level(s), or None for no additional index levels. Default: None
sample_times (list of int, optional) – times at which saliva samples were collected or None if no sample times should be specified. Default: None
**kwargs – additional parameters that are passed to pandas.read_csv() or pandas.read_excel()
- Returns
data – saliva data in SalivaRawDataFrame format
- Return type
SalivaRawDataFrame
- Raises
FileExtensionError – if the file is not a csv or Excel file
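To illustrate the wide-to-long reshaping and the naming convention described for load_saliva_wide_format (subject, sample, and a value column named after the saliva type), here is a small sketch using only the standard library. It is not BioPsyKit's implementation, which returns a pandas-based SalivaRawDataFrame; the function wide_to_long, the sample column names S1 to S3, and the example values are all made up for illustration.

```python
import csv
import io

# Hypothetical wide-format content: one row per subject, one column per sample.
WIDE_CSV = """subject,S1,S2,S3
Vp01,4.2,5.1,3.9
Vp02,6.0,7.3,5.5
"""


def wide_to_long(text, saliva_type="cortisol", subject_col="subject"):
    """Hypothetical helper: turn wide-format csv text into long-format records."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # The subject column becomes an identifier named "subject".
        subject = record.pop(subject_col)
        # Every remaining column is one sample of that subject.
        for sample, value in record.items():
            rows.append(
                {"subject": subject, "sample": sample, saliva_type: float(value)}
            )
    return rows


long_rows = wide_to_long(WIDE_CSV)
for row in long_rows:
    print(row)
```

The result has one record per subject and sample, which mirrors the long format that the function description refers to.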
- biopsykit.io.saliva.save_saliva(file_path, data, saliva_type='cortisol', as_wide_format=False)
Save saliva data to a csv or Excel file.
- Parameters
file_path (Path or str) – file path to export. Must be a csv or an Excel file
data (SalivaRawDataFrame) – saliva data in SalivaRawDataFrame format
saliva_type (str) – type of saliva data in the dataframe
as_wide_format (bool, optional) – True to save data in wide format (and flatten all index levels), False to save data in long-format. Default: False
- Raises
ValidationError – if data is not a SalivaRawDataFrame
FileExtensionError – if file_path is not a csv or Excel file
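Because the functions in this module reject paths that are not csv or Excel files, it can be convenient to check a path before calling them. The sketch below shows one way to do that with pathlib. The helper has_supported_extension is hypothetical, and the exact set of accepted suffixes is an assumption based on the ‘.xls or .xlsx’ and ‘csv’ wording in this page, not taken from BioPsyKit's code.

```python
from pathlib import Path

# Assumed set of accepted suffixes; this page only says "csv or Excel file".
ALLOWED_SUFFIXES = {".csv", ".xls", ".xlsx"}


def has_supported_extension(file_path):
    """Hypothetical helper: True if file_path ends in a csv or Excel suffix."""
    return Path(file_path).suffix.lower() in ALLOWED_SUFFIXES


print(has_supported_extension("saliva_export.csv"))
print(has_supported_extension(Path("saliva_export.XLSX")))
print(has_supported_extension("saliva_export.txt"))
```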