biopsykit.signals.ecg.ecg module

Module for processing ECG data.

class biopsykit.signals.ecg.ecg.EcgProcessor(data, sampling_rate=None, time_intervals=None, include_start=False)[source]

Bases: biopsykit.signals._base._BaseProcessor

Initialize a new EcgProcessor instance.

To use this class, pass the data as a DataFrame (or a dict of DataFrames). If the data was recorded during a study that consists of multiple phases, the ECG data can be split into single phases by passing time information via the time_intervals parameter.

Each instance of EcgProcessor has the following attributes:

  • data: dict with raw ECG data, split into the specified phases. If the data was not split, the dictionary has only one entry, accessible by the key Data.

  • ecg_result: dict with ECG processing results from data. Each dataframe in the dict has the following columns:

    • ECG_Raw: Raw ECG signal

    • ECG_Clean: Cleaned (filtered) ECG signal

    • ECG_Quality: Quality indicator in the range of [0,1] for ECG signal quality

    • ECG_R_Peaks: 1.0 where R peak was detected in the ECG signal, 0.0 else

    • R_Peak_Outlier: 1.0 when a detected R peak was classified as outlier, 0.0 else

    • Heart_Rate: Computed Heart rate interpolated to the length of the raw ECG signal

  • heart_rate: dict with heart rate derived from data. Each dataframe in the dict has the following columns:

    • Heart_Rate: Computed heart rate for each detected R peak

  • rpeaks: dict with R peak location indices derived from data. Each dataframe in the dict has the following columns:

    • R_Peak_Quality: Quality indicator in the range of [0,1] for quality of the original ECG signal

    • R_Peak_Idx: Index of detected R peak in the raw ECG signal

    • RR_Interval: Interval between the current and the successive R peak in seconds

    • R_Peak_Outlier: 1.0 when a detected R peak was classified as outlier, 0.0 else
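
The relationship between the R_Peak_Idx, RR_Interval, and Heart_Rate columns can be illustrated with a minimal sketch. It assumes that heart rate in beats per minute is obtained as 60 divided by the RR interval in seconds; the function names are hypothetical and not part of BioPsyKit.

```python
def rr_intervals_from_peaks(r_peak_idx, sampling_rate):
    """RR interval (seconds) between each R peak and its successor."""
    return [(nxt - cur) / sampling_rate for cur, nxt in zip(r_peak_idx, r_peak_idx[1:])]


def heart_rate_from_rr(rr_intervals):
    """Heart rate in beats per minute for each RR interval (assumed 60 / RR)."""
    return [60.0 / rr for rr in rr_intervals]


# hypothetical R peak sample indices at a sampling rate of 256 Hz
r_peak_idx = [0, 256, 512, 704]
rr = rr_intervals_from_peaks(r_peak_idx, sampling_rate=256.0)
hr = heart_rate_from_rr(rr)
print(rr)  # [1.0, 1.0, 0.75]
print(hr)  # [60.0, 60.0, 80.0]
```

Note that the last R peak has no successor, so this sketch yields one RR interval fewer than there are R peaks.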

You can pass either a dictionary of dataframes with ECG data or a single dataframe with ECG data. For the latter, you can additionally supply time information via the time_intervals parameter to automatically split the data into single phases.

Parameters
  • data (EcgRawDataFrame or dict) – dataframe (or dict of such) with ECG data

  • sampling_rate (float, optional) – sampling rate of recorded data in Hz

  • time_intervals (dict or Series, optional) – time intervals indicating how data should be split. Can either be a Series with the start times of the single phases (the phase names are then derived from the index) or a dictionary with tuples indicating start and end times of phases (the phase names are then derived from the dict keys). Default: None (data is not split further)

  • include_start (bool, optional) – True to include the data from the beginning of the recording to the first time interval as the first phase (then named Start), False otherwise. Default: False
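
How time_intervals and include_start interact can be sketched in plain Python. This is an illustrative stand-in, not the BioPsyKit implementation: the function name is hypothetical, timestamps are zero-padded "HH:MM" strings, and each phase is assumed to cover the half-open range from start (inclusive) to end (exclusive).

```python
def split_into_phases(samples, time_intervals, include_start=False):
    """Group (timestamp, value) samples into named phases.

    Assumptions of this sketch: zero-padded "HH:MM" timestamps (so string
    comparison orders them correctly) and half-open phases [start, end).
    """
    phases = {}
    if include_start:
        # everything before the earliest phase start becomes the "Start" phase
        first_start = min(start for start, _ in time_intervals.values())
        phases["Start"] = [value for stamp, value in samples if stamp < first_start]
    for name, (start, end) in time_intervals.items():
        phases[name] = [value for stamp, value in samples if start <= stamp < end]
    return phases


samples = [("08:55", 0.1), ("09:00", 0.2), ("09:29", 0.3), ("09:30", 0.4), ("09:44", 0.5)]
intervals = {"Part1": ("09:00", "09:30"), "Part2": ("09:30", "09:45")}
phases = split_into_phases(samples, intervals, include_start=True)
print(phases)
```

With include_start=False, the sample recorded at 08:55 would not appear in any phase.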

Examples

>>> # Example using NilsPod Dataset
>>> from biopsykit.io.nilspod import load_dataset_nilspod
>>> from biopsykit.signals.ecg import EcgProcessor
>>>
>>> # path to file
>>> file_path = "./NilsPod_TestData.bin"
>>> # time zone of the recording (optional)
>>> timezone = "Europe/Berlin"
>>>
>>> # define time intervals of the different recording phases
>>> time_intervals = {"Part1": ("09:00", "09:30"), "Part2": ("09:30", "09:45"), "Part3": ("09:45", "10:00")}
>>>
>>> # load data from binary file
>>> data, sampling_rate = load_dataset_nilspod(file_path=file_path, datastreams=['ecg'], timezone=timezone)
>>> ecg_processor = EcgProcessor(data=data, sampling_rate=sampling_rate, time_intervals=time_intervals)
ecg_result: Dict[str, Union[biopsykit.utils.datatype_helper._EcgResultDataFrame, pandas.core.frame.DataFrame]]

Dictionary with ECG processing result dataframes, split into different phases.

Each dataframe is expected to be an EcgResultDataFrame.

See also

EcgResultDataFrame

dataframe format

heart_rate: Dict[str, Union[biopsykit.utils.datatype_helper._HeartRateDataFrame, pandas.core.frame.DataFrame]]

Dictionary with time-series heart rate data, split into different phases.

See also

HeartRatePhaseDict

dictionary format

rpeaks: Dict[str, Union[biopsykit.utils.datatype_helper._RPeakDataFrame, pandas.core.frame.DataFrame]]

Dictionary with R peak location indices, split into different phases.

See also

RPeakDataFrame

dataframe format

property ecg: Dict[str, pandas.core.frame.DataFrame]

Return ECG signal after filtering, split into different phases.

Returns

dictionary with filtered ECG signal per phase

Return type

dict

property hr_result: Dict[str, Union[biopsykit.utils.datatype_helper._HeartRateDataFrame, pandas.core.frame.DataFrame]]

Return heart rate result from ECG processing, split into different phases.

Returns

dictionary with time-series heart rate per phase

Return type

dict

ecg_process(outlier_correction='all', outlier_params=None, title=None, method=None, errors='raise')[source]

Process ECG signal.

The ECG processing pipeline consists of the following steps:

  • Filtering: Uses ecg_clean() to clean the ECG signal and prepare it for R peak detection

  • R-peak detection: Uses ecg_peaks() to find and extract R peaks.

  • Outlier correction (optional): Uses correct_outlier() to check detected R peaks for outliers and impute removed outliers by linear interpolation.

Parameters
  • outlier_correction (list, all or None, optional) – List containing outlier correction methods to be applied. Alternatively, pass all to apply all available outlier correction methods, or None to not apply any outlier correction. See outlier_corrections() to get a list of possible outlier correction methods. Default: all

  • outlier_params (dict) – Dictionary of outlier correction parameters or None for default parameters. See outlier_params_default() for the default parameters. Default: None

  • title (str, optional) – title of ECG processing progress bar in Jupyter Notebooks or None to leave empty. Default: None

  • method ({'neurokit', 'hamilton', 'pantompkins', 'elgendi', ... }, optional) – method used to clean ECG signal and perform R-peak detection as defined by the neurokit library (see ecg_clean() and ecg_peaks()) or None to use default method (neurokit).

  • errors ({'raise', 'warn', 'ignore'}, optional) –

    how to handle errors during ECG processing:

    • “raise” (default): raise an error

    • “warn”: issue a warning but still return a dictionary with empty results for this data

    • “ignore”: ignore errors and continue processing
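
The three error-handling modes follow a common pattern that can be sketched as below. This is not the BioPsyKit implementation: the function names are hypothetical, and storing None as the "empty result" for a failed phase (in both the “warn” and “ignore” modes) is an assumption of this sketch.

```python
import warnings


def process_all_phases(phases, process_fn, errors="raise"):
    """Apply process_fn to every phase, handling failures according to errors."""
    if errors not in ("raise", "warn", "ignore"):
        raise ValueError(f"Invalid value for errors: {errors!r}")
    results = {}
    for name, signal in phases.items():
        try:
            results[name] = process_fn(signal)
        except ValueError as exc:
            if errors == "raise":
                raise
            if errors == "warn":
                warnings.warn(f"Processing phase '{name}' failed: {exc}")
            # assumption: a failed phase is represented by an empty (None) result
            results[name] = None
    return results


def mean_or_fail(signal):
    """Toy processing step that fails on empty input."""
    if not signal:
        raise ValueError("empty signal")
    return sum(signal) / len(signal)


phases = {"Part1": [1.0, 3.0], "Part2": []}
ignored = process_all_phases(phases, mean_or_fail, errors="ignore")
print(ignored)  # {'Part1': 2.0, 'Part2': None}
```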

Return type

None

See also

correct_outlier()

function to perform R peak outlier correction

outlier_corrections()

list of all available outlier correction methods

outlier_params_default()

dictionary with default parameters for outlier correction

ecg_clean()

neurokit method to clean ECG signal

ecg_peaks()

neurokit method for R-peak detection

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # use default outlier correction pipeline
>>> ecg_processor.ecg_process()
>>> # don't apply any outlier correction
>>> ecg_processor.ecg_process(outlier_correction=None)
>>> # use custom outlier correction pipeline: only physiological and statistical RR interval outlier with custom thresholds
>>> methods = ["physiological", "statistical_rr"]
>>> params = {
...     'physiological': (50, 150),
...     'statistical_rr': 2.576
... }
>>> ecg_processor.ecg_process(outlier_correction=methods, outlier_params=params)
>>> # Print available results from ECG processing
>>> print(ecg_processor.ecg_result)
>>> print(ecg_processor.rpeaks)
>>> print(ecg_processor.heart_rate)
classmethod outlier_corrections()[source]

Return all possible outlier correction methods.

Currently available outlier correction methods are:

  • correlation: Computes cross-correlation coefficient between every single beat and the average of all detected beats. Marks beats as outlier if cross-correlation coefficient is below a certain threshold.

  • quality: Uses the ECG_Quality indicator from neurokit to assess signal quality. Marks beats as outlier if the quality indicator is below a certain threshold.

  • artifact: Artifact detection based on Berntson et al. (1990).

  • physiological: Physiological outlier removal. Marks beats as outlier if their heart rate is above or below a threshold that is very unlikely to be achieved physiologically.

  • statistical_rr: Statistical outlier removal based on RR intervals. Marks beats as outlier if the RR intervals are within the xx% highest or lowest values. Values are removed based on the z-score; e.g. 1.96 => 5% (2.5% highest, 2.5% lowest values); 2.576 => 1% (0.5% highest, 0.5% lowest values)

  • statistical_rr_diff: Statistical outlier removal based on successive differences of RR intervals. Marks beats as outlier if the differences of successive RR intervals are within the xx% highest or lowest values. Values are removed based on the z-score; e.g. 1.96 => 5% (2.5% highest, 2.5% lowest values); 2.576 => 1% (0.5% highest, 0.5% lowest values).
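
The mapping between z-score thresholds and removed percentages quoted for the statistical methods can be checked with a short sketch. It assumes normally distributed values; the function names are hypothetical, and the flagging function is a simplified stand-in for z-score based outlier marking, not the library's code.

```python
from statistics import NormalDist, mean, stdev


def two_sided_tail_fraction(z_threshold):
    """Fraction of a normal distribution lying beyond +/- z_threshold."""
    return 2.0 * (1.0 - NormalDist().cdf(z_threshold))


def zscore_outliers(values, z_threshold):
    """Flag values whose absolute z-score exceeds z_threshold."""
    mu = mean(values)
    sd = stdev(values)
    return [abs((value - mu) / sd) > z_threshold for value in values]


print(round(two_sided_tail_fraction(1.96) * 100, 1))   # 5.0
print(round(two_sided_tail_fraction(2.576) * 100, 1))  # 1.0

# hypothetical RR intervals in seconds; the last one is implausibly long
rr_intervals = [0.8] * 9 + [2.0]
flags = zscore_outliers(rr_intervals, z_threshold=2.576)
print(flags.index(True))  # 9
```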

See also

correct_outlier()

function to perform R peak outlier correction

outlier_params_default()

dictionary with default parameters for outlier correction

Returns

keys of all possible outlier correction methods

Return type

list

References

Berntson, G. G., Quigley, K. S., Jang, J. F., & Boysen, S. T. (1990). An Approach to Artifact Identification: Application to Heart Period Data. Psychophysiology, 27(5), 586-598. https://doi.org/10.1111/j.1469-8986.1990.tb01982.x

classmethod outlier_params_default()[source]

Return default parameters for all outlier correction methods.

Note

The outlier correction method artifact has no threshold, but 0.0 is the default parameter in order to provide a homogeneous interface.

See also

correct_outlier()

function to perform R peak outlier correction

outlier_corrections()

list with available outlier correction methods

Returns

default parameters for outlier correction methods

Return type

dict

classmethod correct_outlier(ecg_processor=None, key=None, ecg_signal=None, rpeaks=None, outlier_correction='all', outlier_params=None, imputation_type=None, sampling_rate=256.0)[source]

Perform outlier correction on the detected R peaks.

Different methods for outlier detection are available (see outlier_corrections() to get a list of possible outlier correction methods). All outlier methods work independently on the detected R peaks; the results are combined by a logical ‘or’.
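
Combining independent outlier masks by a logical ‘or’ can be sketched as follows. The two detector functions are simplified, hypothetical stand-ins for the physiological and quality methods, and the quality threshold of 0.4 is an arbitrary value chosen for illustration.

```python
def physiological_outliers(heart_rates, bounds):
    """Flag beats whose heart rate lies outside the (low, high) bounds."""
    low, high = bounds
    return [hr < low or hr > high for hr in heart_rates]


def quality_outliers(quality, threshold):
    """Flag beats whose signal quality is below the threshold."""
    return [q < threshold for q in quality]


def combine_masks(*masks):
    """A beat is an outlier if any single method flagged it (logical 'or')."""
    return [any(flags) for flags in zip(*masks)]


heart_rates = [62.0, 180.0, 64.0, 40.0, 66.0]
quality = [0.9, 0.8, 0.2, 0.9, 0.95]

phys = physiological_outliers(heart_rates, bounds=(50, 150))
qual = quality_outliers(quality, threshold=0.4)  # hypothetical threshold
combined = combine_masks(phys, qual)
print(combined)  # [False, True, True, True, False]
```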

RR intervals classified as outliers will be removed and imputed either using linear interpolation (setting imputation_type to linear) or by replacing them with the average value of the 10 preceding and 10 succeeding RR intervals (setting imputation_type to moving_average).

To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the two dataframes ecg_signal and rpeaks resulting from ecg_process().
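
The two imputation strategies can be compared with a minimal sketch. It is not the library's implementation: the function names are hypothetical, and both functions assume that other flagged values are excluded when choosing neighbours.

```python
def impute_linear(values, is_outlier):
    """Replace flagged values by interpolating between the nearest valid neighbours."""
    result = list(values)
    valid = [i for i, flag in enumerate(is_outlier) if not flag]
    for i, flag in enumerate(is_outlier):
        if not flag:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None and right is None:
            continue  # nothing to interpolate from
        if left is None:
            result[i] = values[right]
        elif right is None:
            result[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            result[i] = values[left] + weight * (values[right] - values[left])
    return result


def impute_moving_average(values, is_outlier, window=10):
    """Replace flagged values by the mean of up to `window` valid values on each side."""
    result = list(values)
    for i, flag in enumerate(is_outlier):
        if not flag:
            continue
        lower, upper = max(0, i - window), min(len(values), i + window + 1)
        neighbours = [values[j] for j in range(lower, upper) if j != i and not is_outlier[j]]
        if neighbours:
            result[i] = sum(neighbours) / len(neighbours)
    return result


rr_intervals = [0.8, 0.9, 2.0, 1.1, 1.0]
mask = [False, False, True, False, False]
linear = impute_linear(rr_intervals, mask)
averaged = impute_moving_average(rr_intervals, mask, window=10)
print(linear[2], averaged[2])
```

In this example the linear variant uses only the two direct neighbours (0.9 and 1.1), while the moving average uses all four remaining values because the series is shorter than the window.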

Parameters
  • ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well

  • key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument

  • ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process()

  • rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process()

  • outlier_correction (list, optional) – List containing the outlier correction methods to be applied. Pass None to not apply any outlier correction, all to apply all available outlier correction methods. See outlier_corrections() to get a list of possible outlier correction methods. Default: all

  • outlier_params (dict, optional) – Dict of parameters to be passed to the outlier correction methods or None to use default parameters (see outlier_params_default() for more information). Default: None

  • imputation_type (str, optional) – Method for outlier imputation: linear for linear interpolation between the RR intervals before and after the R peak outlier, or moving_average for the average value of the 10 preceding and 10 succeeding RR intervals. Default: None (corresponds to moving_average)

  • sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256

Returns

Return type

Tuple[Union[biopsykit.utils.datatype_helper._EcgResultDataFrame, pandas.core.frame.DataFrame], Union[biopsykit.utils.datatype_helper._RPeakDataFrame, pandas.core.frame.DataFrame]]

See also

ecg_process()

function for ECG signal processing

outlier_corrections()

list of all available outlier correction methods

outlier_params_default()

dictionary with default parameters for outlier correction

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # Option 1: Use default outlier correction pipeline
>>> ecg_signal, rpeaks = ecg_processor.correct_outlier(ecg_processor, key="Data")
>>> print(ecg_signal)
>>> print(rpeaks)
>>> # Option 2: Use custom outlier correction pipeline: only physiological and statistical
>>> # RR interval outlier with custom thresholds
>>> methods = ["physiological", "statistical_rr"]
>>> params = {
...     'physiological': (50, 150),
...     'statistical_rr': 2.576
... }
>>> ecg_signal, rpeaks = ecg_processor.correct_outlier(
...     ecg_processor, key="Data",
...     outlier_correction=methods,
...     outlier_params=params
... )
>>> print(ecg_signal)
>>> print(rpeaks)
classmethod correct_rpeaks(ecg_processor=None, key=None, rpeaks=None, sampling_rate=256.0)[source]

Perform R peak correction algorithms to get less noisy HRV parameters.

R peak correction comes from neurokit and is based on an algorithm by Lipponen et al. (2019).

To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the dataframe rpeaks resulting from ecg_process().

Warning

This algorithm might add additional R peaks or remove certain ones, so the results of this function might not match the R peaks of rpeaks(). Thus, R peaks resulting from this function should not be used in combination with ecg(), since the R peak indices won’t match.

Note

In BioPsyKit this function is not applied to the detected R peaks during ECG signal processing but only used right before passing R peaks to hrv_process().

Parameters
  • ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.

  • key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.

  • rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process().

  • sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256

Returns

dataframe containing corrected R peak indices

Return type

DataFrame

References

Lipponen, J. A., & Tarvainen, M. P. (2019). A robust algorithm for heart rate variability time series artefact correction using novel beat classification. Journal of Medical Engineering and Technology, 43(3), 173-181. https://doi.org/10.1080/03091902.2019.1640306

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # correct R peak locations
>>> rpeaks_corrected = ecg_processor.correct_rpeaks(ecg_processor, key="Data")
classmethod hrv_process(ecg_processor=None, key=None, rpeaks=None, hrv_types=None, correct_rpeaks=True, index=None, index_name=None, sampling_rate=256.0)[source]

Compute HRV parameters on the given data.

By default, it applies R peak correction (see correct_rpeaks()) before computing HRV parameters.

To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the dataframe rpeaks resulting from ecg_process().

Parameters
  • ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.

  • key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.

  • rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process().

  • hrv_types (str (or list of such), optional) – list of HRV types to be computed. Must be a subset of [“hrv_time”, “hrv_nonlinear”, “hrv_frequency”] or “all” to compute all types of HRV. Refer to neurokit2.hrv.hrv() for further information on the available HRV parameters. Default: None (equivalent to [“hrv_time”, “hrv_nonlinear”])

  • correct_rpeaks (bool, optional) – True to apply R peak correction (using correct_rpeaks()) before computing HRV parameters, False otherwise. Default: True

  • index (str, optional) – Index of the computed HRV parameters. Used to concatenate HRV processing results from multiple phases into one joint dataframe later on. Default: None

  • index_name (str, optional) – Index name of the output dataframe. Only used if index is also supplied. Default: None

  • sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256

Returns

dataframe with computed HRV parameters

Return type

DataFrame

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # HRV processing using default parameters (time and nonlinear), including R peak correction
>>> hrv_output = ecg_processor.hrv_process(ecg_processor, key="Data")
>>> # HRV processing using all types, and without R peak correction
>>> hrv_output = ecg_processor.hrv_process(ecg_processor, key="Data", hrv_types='all', correct_rpeaks=False)
hrv_batch_process(hrv_types=None)[source]

Compute HRV parameters over all phases.

This function computes HRV parameters over all phases using hrv_process().
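
The idea of batch processing, applying the same per-phase computation to every phase and collecting the results under the phase name, can be sketched as below. RMSSD (root mean square of successive RR interval differences) is used as a single stand-in HRV measure; the function names and the nested-dict result layout are assumptions of this sketch, not the output format of hrv_batch_process().

```python
from math import sqrt


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences of RR intervals (in ms)."""
    diffs = [nxt - cur for cur, nxt in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return sqrt(sum(diff * diff for diff in diffs) / len(diffs))


def batch_process(rr_per_phase):
    """Compute the stand-in HRV measure for every phase, keyed by phase name."""
    return {phase: {"RMSSD": rmssd(rr)} for phase, rr in rr_per_phase.items()}


# hypothetical RR intervals in milliseconds for two phases
rr_per_phase = {"Part1": [800, 810, 790, 800], "Part2": [700, 700, 700]}
hrv_per_phase = batch_process(rr_per_phase)
print(hrv_per_phase)
```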

Parameters

hrv_types (str (or list of such), optional) – list of HRV types to be computed. Must be a subset of [‘hrv_time’, ‘hrv_nonlinear’, ‘hrv_frequency’] or ‘all’ to compute all types of HRV. Refer to neurokit2.hrv.hrv() for further information on the available HRV parameters. Default: None (equivalent to [‘hrv_time’, ‘hrv_nonlinear’])

Returns

dataframe with HRV parameters over all phases

Return type

DataFrame

classmethod ecg_estimate_rsp(ecg_processor=None, key=None, ecg_signal=None, rpeaks=None, edr_type=None, sampling_rate=256)[source]

Estimate respiration signal from ECG (ECG-derived respiration, EDR).

To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the two dataframes ecg_signal and rpeaks resulting from ecg_process().

Parameters
  • ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.

  • key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.

  • ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process().

  • rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process()

  • edr_type ({'peak_trough_mean', 'peak_trough_diff', 'peak_peak_interval'}, optional) – Method to use for estimating EDR. Must be one of ‘peak_trough_mean’, ‘peak_trough_diff’, or ‘peak_peak_interval’. Default: ‘peak_trough_mean’

  • sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256

Returns

dataframe with estimated respiration signal

Return type

DataFrame

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # Extract respiration signal estimated from ECG using the 'peak_trough_diff' method
>>> rsp_signal = ecg_processor.ecg_estimate_rsp(ecg_processor, key="Data", edr_type='peak_trough_diff')
classmethod rsa_process(ecg_signal, rsp_signal, sampling_rate=256)[source]

Compute respiratory sinus arrhythmia (RSA) based on ECG and respiration signal.

RSA is computed both via the Peak-to-Trough (P2T) method and the Porges-Bohrer method using hrv_rsa().

Parameters
  • ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process().

  • rsp_signal (pd.DataFrame) – Dataframe with 1-D raw respiration signal. Can be a ‘true’ respiration signal (e.g. from bioimpedance or Radar) or an ‘estimated’ respiration signal (e.g. from ECG-derived respiration).

  • sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Default: 256

Returns

Dictionary containing computed RSA metrics.

Return type

dict

See also

hrv_rsa()

compute respiratory sinus arrhythmia

Examples

>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> ecg_signal = ecg_processor.ecg_result['Data']
>>> # Extract respiration signal estimated from ECG using the 'peak_trough_diff' method
>>> rsp_signal = ecg_processor.ecg_estimate_rsp(ecg_processor, key="Data", edr_type='peak_trough_diff')
>>> # Compute RSA from ECG and Respiration data
>>> rsa_output = ecg_processor.rsa_process(ecg_signal, rsp_signal)
sampling_rate: float

Sampling rate of recorded data.

data: Dict[str, pandas.core.frame.DataFrame]

Dictionary with raw data, split into different phases.

Each dataframe is expected to be a DataFrame.