biopsykit.signals.ecg.ecg module
Module for processing ECG data.
- class biopsykit.signals.ecg.ecg.EcgProcessor(data, sampling_rate=None, time_intervals=None, include_start=False)
Bases:
biopsykit.signals._base._BaseProcessor
Initialize a new EcgProcessor instance.
To use this class, pass data in the form of a DataFrame (or a dict of such). If the data was recorded during a study that consists of multiple phases, the ECG data can be split into single phases by passing time information via the time_intervals parameter.
Each instance of EcgProcessor has the following attributes:
data: dict with raw ECG data, split into the specified phases. If the data was not split, the dictionary has only one entry, accessible by the key Data.
ecg_result: dict with ECG processing results from data. Each dataframe in the dict has the following columns:
  ECG_Raw: Raw ECG signal
  ECG_Clean: Cleaned (filtered) ECG signal
  ECG_Quality: Quality indicator in the range of [0,1] for ECG signal quality
  ECG_R_Peaks: 1.0 where an R peak was detected in the ECG signal, 0.0 else
  R_Peak_Outlier: 1.0 when a detected R peak was classified as outlier, 0.0 else
  Heart_Rate: Computed heart rate interpolated to the length of the raw ECG signal
heart_rate: dict with heart rate derived from data. Each dataframe in the dict has the following columns:
  Heart_Rate: Computed heart rate for each detected R peak
rpeaks: dict with R peak location indices derived from data. Each dataframe in the dict has the following columns:
  R_Peak_Quality: Quality indicator in the range of [0,1] for quality of the original ECG signal
  R_Peak_Idx: Index of detected R peak in the raw ECG signal
  RR_Interval: Interval between the current and the successive R peak in seconds
  R_Peak_Outlier: 1.0 when a detected R peak was classified as outlier, 0.0 else
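The relationship between the R_Peak_Idx, RR_Interval and Heart_Rate columns described above can be illustrated with a minimal plain-Python sketch. It is not BioPsyKit's implementation; the function name `rr_and_heart_rate` and the sample indices are hypothetical and only show how an RR interval in seconds and a heart rate in beats per minute follow from R peak sample indices and the sampling rate.

```python
def rr_and_heart_rate(r_peak_idx, sampling_rate):
    """Return (rr_intervals_s, heart_rate_bpm) for consecutive R peaks.

    r_peak_idx: sample indices of detected R peaks (ascending).
    sampling_rate: sampling rate of the ECG recording in Hz.
    """
    # RR interval: distance between the current and the successive R peak, in seconds
    rr = [(nxt - cur) / sampling_rate for cur, nxt in zip(r_peak_idx, r_peak_idx[1:])]
    # Instantaneous heart rate in beats per minute
    hr = [60.0 / interval for interval in rr]
    return rr, hr


# Hypothetical R peak indices for a recording sampled at 256 Hz
r_peak_idx = [100, 356, 612, 900]
rr_s, hr_bpm = rr_and_heart_rate(r_peak_idx, sampling_rate=256.0)
print(rr_s)    # [1.0, 1.0, 1.125]
print(hr_bpm)
```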
You can either pass a data dictionary ‘data_dict’ containing ECG data or a dataframe containing ECG data. For the latter, you can additionally supply time information via the time_intervals parameter to automatically split the data into single phases.
- Parameters
data (EcgRawDataFrame or dict) – dataframe (or dict of such) with ECG data
sampling_rate (float, optional) – sampling rate of recorded data in Hz
time_intervals (dict or Series, optional) – time intervals indicating how data should be split. Can either be a Series with the start times of the single phases (the phase names are then derived from the index) or a dictionary with tuples indicating start and end times of phases (the phase names are then derived from the dict keys). Default: None (data is not split further)
include_start (bool, optional) – True to include the data from the beginning of the recording to the first time interval as the first phase (then named Start), False otherwise. Default: False
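To make the effect of time_intervals and include_start more concrete, here is a small plain-Python sketch of the splitting idea. It is an illustration under stated assumptions, not the library's code: BioPsyKit operates on pandas dataframes, whereas this sketch uses a list of (time, value) pairs, and both the helper name `split_into_phases` and the half-open interval convention (start included, end excluded) are assumptions.

```python
from datetime import time


def split_into_phases(samples, time_intervals, include_start=False):
    """Split (time, value) samples into named phases.

    samples: list of (datetime.time, value) tuples, sorted by time.
    time_intervals: dict mapping phase name -> ("HH:MM" start, "HH:MM" end).
    include_start: if True, data before the first interval becomes phase "Start".
    """
    def parse(hhmm):
        hours, minutes = hhmm.split(":")
        return time(int(hours), int(minutes))

    phases = {}
    if include_start:
        first_start = min(parse(start) for start, _ in time_intervals.values())
        phases["Start"] = [value for t, value in samples if t < first_start]
    for name, (start, end) in time_intervals.items():
        t0, t1 = parse(start), parse(end)
        # Assumption: start is inclusive, end is exclusive
        phases[name] = [value for t, value in samples if t0 <= t < t1]
    return phases


samples = [
    (time(8, 55), 0.1),
    (time(9, 0), 0.2),
    (time(9, 29), 0.3),
    (time(9, 30), 0.4),
    (time(9, 50), 0.5),
]
time_intervals = {"Part1": ("09:00", "09:30"), "Part2": ("09:30", "09:45"), "Part3": ("09:45", "10:00")}
phases = split_into_phases(samples, time_intervals, include_start=True)
print(phases)
```

With include_start=False the "Start" entry is simply absent, which mirrors the default behavior described for the parameter.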
Examples
>>> # Example using NilsPod Dataset
>>> from biopsykit.io.nilspod import load_dataset_nilspod
>>> from biopsykit.signals.ecg import EcgProcessor
>>>
>>> # path to file
>>> file_path = "./NilsPod_TestData.bin"
>>> # time zone of the recording (optional)
>>> timezone = "Europe/Berlin"
>>>
>>> # define time intervals of the different recording phases
>>> time_intervals = {"Part1": ("09:00", "09:30"), "Part2": ("09:30", "09:45"), "Part3": ("09:45", "10:00")}
>>>
>>> # load data from binary file
>>> data, sampling_rate = load_dataset_nilspod(file_path=file_path, datastreams=['ecg'], timezone=timezone)
>>> ecg_processor = EcgProcessor(data=data, sampling_rate=sampling_rate, time_intervals=time_intervals)
- ecg_result: Dict[str, Union[biopsykit.utils.datatype_helper._EcgResultDataFrame, pandas.core.frame.DataFrame]]
Dictionary with ECG processing result dataframes, split into different phases.
Each dataframe is expected to be an EcgResultDataFrame.
See also
EcgResultDataFrame
dataframe format
- heart_rate: Dict[str, Union[biopsykit.utils.datatype_helper._HeartRateDataFrame, pandas.core.frame.DataFrame]]
Dictionary with time-series heart rate data, split into different phases.
See also
HeartRatePhaseDict
dictionary format
- rpeaks: Dict[str, Union[biopsykit.utils.datatype_helper._RPeakDataFrame, pandas.core.frame.DataFrame]]
Dictionary with R peak location indices, split into different phases.
See also
RPeakDataFrame
dictionary format
- property ecg: Dict[str, pandas.core.frame.DataFrame]
Return ECG signal after filtering, split into different phases.
- Returns
dictionary with filtered ECG signal per phase
- Return type
- property hr_result: Dict[str, Union[biopsykit.utils.datatype_helper._HeartRateDataFrame, pandas.core.frame.DataFrame]]
Return heart rate result from ECG processing, split into different phases.
- Returns
dictionary with time-series heart rate per phase
- Return type
- ecg_process(outlier_correction='all', outlier_params=None, title=None, method=None, errors='raise')
Process ECG signal.
The ECG processing pipeline consists of the following steps:
Filtering: Uses ecg_clean() to clean the ECG signal and prepare it for R peak detection.
R-peak detection: Uses ecg_peaks() to find and extract R peaks.
Outlier correction (optional): Uses correct_outlier() to check detected R peaks for outliers and impute removed outliers by linear interpolation.
- Parameters
outlier_correction (list, all or None, optional) – List containing outlier correction methods to be applied. Alternatively, pass all to apply all available outlier correction methods, or None to not apply any outlier correction. See outlier_corrections() to get a list of possible outlier correction methods. Default: all
outlier_params (dict) – Dictionary of outlier correction parameters or None for default parameters. See outlier_params_default() for the default parameters. Default: None
title (str, optional) – title of ECG processing progress bar in Jupyter Notebooks or None to leave empty. Default: None
method ({'neurokit', 'hamilton', 'pantompkins', 'elgendi', ... }, optional) – method used to clean ECG signal and perform R-peak detection as defined by the neurokit library (see ecg_clean() and ecg_peaks()) or None to use default method (neurokit).
errors ({'raise', 'warn', 'ignore'}, optional) – how to handle errors during ECG processing:
  "raise" (default): raise an error
  "warn": issue a warning but still return a dictionary with empty results for this data
  "ignore": ignore errors and continue processing
- Return type
None
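The three settings of the errors parameter follow a common error-handling pattern that can be sketched in plain Python. The sketch below is hypothetical and does not reproduce BioPsyKit's internals: `process_phases`, `mean_signal` and the use of an empty list as the "empty result" are assumptions chosen for illustration, as is the decision to store an empty result in the "ignore" case.

```python
import warnings


def process_phases(data, process_fn, errors="raise"):
    """Apply process_fn to every phase, handling failures according to `errors`."""
    if errors not in ("raise", "warn", "ignore"):
        raise ValueError(f"Unknown value for 'errors': {errors!r}")
    results = {}
    for phase, signal in data.items():
        try:
            results[phase] = process_fn(signal)
        except Exception as exc:
            if errors == "raise":
                raise  # propagate the original error
            if errors == "warn":
                warnings.warn(f"Processing of phase '{phase}' failed: {exc}")
            # "warn" and "ignore": keep going and store an empty result (assumption)
            results[phase] = []
    return results


def mean_signal(signal):
    """Hypothetical processing step that fails on empty input."""
    if not signal:
        raise ValueError("empty signal")
    return [sum(signal) / len(signal)]


data = {"Part1": [1.0, 2.0, 3.0], "Part2": []}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = process_phases(data, mean_signal, errors="warn")
print(out)          # {'Part1': [2.0], 'Part2': []}
print(len(caught))  # 1
```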
See also
correct_outlier()
function to perform R peak outlier correction
outlier_corrections()
list of all available outlier correction methods
outlier_params_default()
dictionary with default parameters for outlier correction
ecg_clean()
neurokit method to clean ECG signal
ecg_peaks()
neurokit method for R-peak detection
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # use default outlier correction pipeline
>>> ecg_processor.ecg_process()
>>> # don't apply any outlier correction
>>> ecg_processor.ecg_process(outlier_correction=None)
>>> # use custom outlier correction pipeline: only physiological and statistical RR interval outlier with custom thresholds
>>> methods = ["physiological", "statistical_rr"]
>>> params = {
...     'physiological': (50, 150),
...     'statistical_rr': 2.576
... }
>>> ecg_processor.ecg_process(outlier_correction=methods, outlier_params=params)
>>> # Print available results from ECG processing
>>> print(ecg_processor.ecg_result)
>>> print(ecg_processor.rpeaks)
>>> print(ecg_processor.heart_rate)
- classmethod outlier_corrections()
Return all possible outlier correction methods.
Currently available outlier correction methods are:
correlation: Computes the cross-correlation coefficient between every single beat and the average of all detected beats. Marks beats as outliers if the cross-correlation coefficient is below a certain threshold.
quality: Uses the ECG_Quality indicator from neurokit to assess signal quality. Marks beats as outliers if the quality indicator is below a certain threshold.
artifact: Artifact detection based on Berntson et al. (1990).
physiological: Physiological outlier removal. Marks beats as outliers if their heart rate is above or below a threshold that is very unlikely to be achieved physiologically.
statistical_rr: Statistical outlier removal based on RR intervals. Marks beats as outliers if the RR intervals are within the xx% highest or lowest values. Values are removed based on the z-score; e.g. 1.96 => 5% (2.5% highest, 2.5% lowest values); 2.576 => 1% (0.5% highest, 0.5% lowest values).
statistical_rr_diff: Statistical outlier removal based on successive differences of RR intervals. Marks beats as outliers if the differences of successive RR intervals are within the xx% highest or lowest values. Values are removed based on the z-score; e.g. 1.96 => 5% (2.5% highest, 2.5% lowest values); 2.576 => 1% (0.5% highest, 0.5% lowest values).
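The z-score thresholds quoted for the statistical methods, and the idea behind the physiological method, can be checked with a short plain-Python sketch. The function names and the RR values are hypothetical, the heart rate limits (50, 150) are taken from the usage examples on this page rather than from the library defaults, and the sketch is not BioPsyKit's implementation.

```python
from statistics import NormalDist, mean, stdev


def tail_fraction(z_threshold):
    """Fraction of a normal distribution lying outside +/- z_threshold."""
    return 2.0 * (1.0 - NormalDist().cdf(z_threshold))


def statistical_rr_outliers(rr, z_threshold=1.96):
    """Mark RR intervals whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(rr), stdev(rr)
    return [abs((value - mu) / sigma) > z_threshold for value in rr]


def physiological_outliers(rr, hr_limits):
    """Mark RR intervals whose heart rate (bpm) lies outside the given limits."""
    lower, upper = hr_limits
    return [not (lower <= 60.0 / value <= upper) for value in rr]


# Hypothetical RR intervals in seconds; the last one is implausibly long
rr = [0.80, 0.82, 0.79, 0.81, 0.80, 0.83, 0.78, 0.81, 0.80, 2.40]

stat_mask = statistical_rr_outliers(rr, z_threshold=1.96)
phys_mask = physiological_outliers(rr, hr_limits=(50, 150))
# Independent methods are combined with a logical 'or'
combined = [a or b for a, b in zip(stat_mask, phys_mask)]

print(round(tail_fraction(1.96), 3))   # 0.05
print(round(tail_fraction(2.576), 3))  # 0.01
print(combined)
```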
See also
correct_outlier()
function to perform R peak outlier correction
outlier_params_default()
dictionary with default parameters for outlier correction
- Returns
keys of all possible outlier correction methods
- Return type
References
Berntson, G. G., Quigley, K. S., Jang, J. F., & Boysen, S. T. (1990). An Approach to Artifact Identification: Application to Heart Period Data. Psychophysiology, 27(5), 586-598. https://doi.org/10.1111/j.1469-8986.1990.tb01982.x
- classmethod outlier_params_default()
Return default parameters for all outlier correction methods.
Note
The outlier correction method artifact has no threshold, but 0.0 is the default parameter in order to provide a homogeneous interface.
See also
correct_outlier()
function to perform R peak outlier correction
outlier_corrections()
list with available outlier correction methods
- Returns
default parameters for outlier correction methods
- Return type
- classmethod correct_outlier(ecg_processor=None, key=None, ecg_signal=None, rpeaks=None, outlier_correction='all', outlier_params=None, imputation_type=None, sampling_rate=256.0)
Perform outlier correction on the detected R peaks.
Different methods for outlier detection are available (see outlier_corrections() to get a list of possible outlier correction methods). All outlier methods work independently on the detected R peaks; the results are combined by a logical 'or'.
RR intervals classified as outliers are removed and imputed either using linear interpolation (setting imputation_type to linear) or by replacing them with the average value of the 10 preceding and 10 succeeding RR intervals (setting imputation_type to moving_average).
To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the two dataframes ecg_signal and rpeaks resulting from ecg_process().
- Parameters
ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well
key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument
ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process()
rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process()
outlier_correction (list, optional) – List containing the outlier correction methods to be applied. Pass None to not apply any outlier correction, all to apply all available outlier correction methods. See outlier_corrections() to get a list of possible outlier correction methods. Default: all
outlier_params (dict, optional) – Dict of parameters to be passed to the outlier correction methods or None to use default parameters (see outlier_params_default() for more information). Default: None
imputation_type (str, optional) – Method for outlier imputation: linear for linear interpolation between the RR intervals before and after the R peak outlier, or moving_average for the average value of the 10 preceding and 10 succeeding RR intervals. Default: None (corresponds to moving_average)
sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256
- Returns
ecg_signal (EcgResultDataFrame) – processed ECG signal in standardized format
rpeaks (RPeakDataFrame) – extracted R peaks in standardized format
- Return type
Tuple[Union[biopsykit.utils.datatype_helper._EcgResultDataFrame, pandas.core.frame.DataFrame], Union[biopsykit.utils.datatype_helper._RPeakDataFrame, pandas.core.frame.DataFrame]]
See also
ecg_process()
function for ECG signal processing
outlier_corrections()
list of all available outlier correction methods
outlier_params_default()
dictionary with default parameters for outlier correction
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # Option 1: Use default outlier correction pipeline
>>> ecg_signal, rpeaks = ecg_processor.correct_outlier(ecg_processor, key="Data")
>>> print(ecg_signal)
>>> print(rpeaks)
>>> # Option 2: Use custom outlier correction pipeline: only physiological and statistical
>>> # RR interval outlier with custom thresholds
>>> methods = ["physiological", "statistical_rr"]
>>> params = {
...     'physiological': (50, 150),
...     'statistical_rr': 2.576
... }
>>> ecg_signal, rpeaks = ecg_processor.correct_outlier(
...     ecg_processor, key="Data",
...     outlier_correction=methods,
...     outlier_params=params
... )
>>> print(ecg_signal)
>>> print(rpeaks)
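The two imputation strategies named by imputation_type can be sketched in plain Python as follows. This is a simplified illustration, not the library's implementation: the function names are hypothetical, and excluding other flagged values from the moving average is an assumption made here to keep the sketch robust.

```python
def impute_linear(rr, is_outlier):
    """Replace flagged RR intervals by linear interpolation between the nearest valid neighbours."""
    valid = [i for i, flagged in enumerate(is_outlier) if not flagged]
    if not valid:
        raise ValueError("No valid RR intervals to interpolate from")
    result = list(rr)
    for i, flagged in enumerate(is_outlier):
        if not flagged:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:          # outlier at the very beginning
            result[i] = rr[right]
        elif right is None:       # outlier at the very end
            result[i] = rr[left]
        else:
            weight = (i - left) / (right - left)
            result[i] = rr[left] + weight * (rr[right] - rr[left])
    return result


def impute_moving_average(rr, is_outlier, window=10):
    """Replace flagged RR intervals by the mean of up to `window` preceding and succeeding values."""
    result = list(rr)
    for i, flagged in enumerate(is_outlier):
        if not flagged:
            continue
        start, stop = max(0, i - window), min(len(rr), i + window + 1)
        # Assumption: other flagged values are left out of the average
        neighbours = [rr[j] for j in range(start, stop) if j != i and not is_outlier[j]]
        if neighbours:
            result[i] = sum(neighbours) / len(neighbours)
    return result


# Hypothetical RR intervals in seconds with one flagged outlier
rr = [0.80, 0.82, 2.40, 0.86, 0.84]
mask = [False, False, True, False, False]
rr_linear = impute_linear(rr, mask)
rr_average = impute_moving_average(rr, mask)
print([round(v, 2) for v in rr_linear])   # [0.8, 0.82, 0.84, 0.86, 0.84]
print([round(v, 2) for v in rr_average])  # [0.8, 0.82, 0.83, 0.86, 0.84]
```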
- classmethod correct_rpeaks(ecg_processor=None, key=None, rpeaks=None, sampling_rate=256.0)
Perform R peak correction algorithms to get less noisy HRV parameters.
R peak correction comes from neurokit and is based on an algorithm by Lipponen and Tarvainen (2019).
To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the dataframe rpeaks which is a result from ecg_process().
Warning
This algorithm might add additional R peaks or remove certain ones, so results of this function might not match the R peaks of rpeaks(). Thus, R peaks resulting from this function should not be used in combination with ecg() since R peak indices won't match.
Note
In BioPsyKit this function is not applied to the detected R peaks during ECG signal processing but only used right before passing R peaks to hrv_process().
- Parameters
ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.
key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.
rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process().
sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256
- Returns
dataframe containing corrected R peak indices
- Return type
References
Lipponen, J. A., & Tarvainen, M. P. (2019). A robust algorithm for heart rate variability time series artefact correction using novel beat classification. Journal of Medical Engineering and Technology, 43(3), 173-181. https://doi.org/10.1080/03091902.2019.1640306
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # correct R peak locations
>>> rpeaks_corrected = ecg_processor.correct_rpeaks(ecg_processor, key="Data")
- classmethod hrv_process(ecg_processor=None, key=None, rpeaks=None, hrv_types=None, correct_rpeaks=True, index=None, index_name=None, sampling_rate=256.0)
Compute HRV parameters on the given data.
By default, it applies R peak correction (see correct_rpeaks()) before computing HRV parameters.
To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the dataframe rpeaks which is a result from ecg_process().
- Parameters
ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.
key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.
rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process().
hrv_types (str (or list of such), optional) – list of HRV types to be computed. Must be a subset of ["hrv_time", "hrv_nonlinear", "hrv_frequency"] or "all" to compute all types of HRV. Refer to neurokit2.hrv.hrv() for further information on the available HRV parameters. Default: None (equals to ["hrv_time", "hrv_nonlinear"])
correct_rpeaks (bool, optional) – True to apply R peak correction (using correct_rpeaks()) before computing HRV parameters, False otherwise. Default: True
index (str, optional) – Index of the computed HRV parameters. Used to concatenate HRV processing results from multiple phases into one joint dataframe later on. Default: None
index_name (str, optional) – Index name of the output dataframe. Only used if index is also supplied. Default: None
sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256
- Returns
dataframe with computed HRV parameters
- Return type
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # HRV processing using default parameters (time and nonlinear), including R peak correction
>>> hrv_output = ecg_processor.hrv_process(ecg_processor, key="Data")
>>> # HRV processing using all types, and without R peak correction
>>> hrv_output = ecg_processor.hrv_process(ecg_processor, key="Data", hrv_types='all', correct_rpeaks=False)
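For readers unfamiliar with time-domain HRV, the following plain-Python sketch computes three widely used measures (mean NN interval, SDNN and RMSSD) from a list of RR intervals. It only illustrates the kind of quantity covered by the "hrv_time" type; the function name, the dictionary keys and the input values are hypothetical, and the actual computation in BioPsyKit is delegated to neurokit.

```python
from math import sqrt
from statistics import mean, stdev


def hrv_time_domain(rr_ms):
    """Compute basic time-domain HRV measures from RR intervals in milliseconds."""
    successive_diffs = [nxt - cur for cur, nxt in zip(rr_ms, rr_ms[1:])]
    return {
        "MeanNN": mean(rr_ms),                                  # mean RR interval
        "SDNN": stdev(rr_ms),                                   # sample standard deviation of RR intervals
        "RMSSD": sqrt(mean(d * d for d in successive_diffs)),   # root mean square of successive differences
    }


# Hypothetical RR intervals in milliseconds
rr_ms = [800, 810, 790, 805, 795]
hrv = hrv_time_domain(rr_ms)
print({name: round(value, 2) for name, value in hrv.items()})
# {'MeanNN': 800, 'SDNN': 7.91, 'RMSSD': 14.36}
```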
- hrv_batch_process(hrv_types=None)
Compute HRV parameters over all phases.
This function computes HRV parameters over all phases using hrv_process().
- Parameters
hrv_types (str (or list of such), optional) – list of HRV types to be computed. Must be a subset of ['hrv_time', 'hrv_nonlinear', 'hrv_frequency'] or 'all' to compute all types of HRV. Refer to neurokit2.hrv.hrv() for further information on the available HRV parameters. Default: None (equals to ['hrv_time', 'hrv_nonlinear'])
- Returns
dataframe with HRV parameters over all phases
- Return type
- classmethod ecg_estimate_rsp(ecg_processor=None, key=None, ecg_signal=None, rpeaks=None, edr_type=None, sampling_rate=256)
Estimate respiration signal from ECG (ECG-derived respiration, EDR).
To use this function, either pass an EcgProcessor object together with a key indicating which phase should be processed, or pass the two dataframes ecg_signal and rpeaks resulting from ecg_process().
- Parameters
ecg_processor (EcgProcessor, optional) – EcgProcessor object. If this argument is supplied, the key argument needs to be supplied as well.
key (str, optional) – Dictionary key of the phase to process. Needed when ecg_processor is passed as argument.
ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process().
rpeaks (RPeakDataFrame, optional) – Dataframe with detected R peaks. Output from ecg_process().
edr_type ({'peak_trough_mean', 'peak_trough_diff', 'peak_peak_interval'}, optional) – Method to use for estimating EDR. Must be one of 'peak_trough_mean', 'peak_trough_diff', or 'peak_peak_interval'. Default: 'peak_trough_mean'
sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Not needed if ecg_processor is supplied as parameter. Default: 256
- Returns
dataframe with estimated respiration signal
- Return type
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> # Extract respiration signal estimated from ECG using the 'peak_trough_diff' method
>>> rsp_signal = ecg_processor.ecg_estimate_rsp(ecg_processor, key="Data", edr_type='peak_trough_diff')
- classmethod rsa_process(ecg_signal, rsp_signal, sampling_rate=256)
Compute respiratory sinus arrhythmia (RSA) based on ECG and respiration signal.
RSA is computed both via the Peak-to-Trough (P2T) and the Porges-Bohrer method using hrv_rsa().
- Parameters
ecg_signal (EcgResultDataFrame, optional) – Dataframe with processed ECG signal. Output from ecg_process().
rsp_signal (pd.DataFrame) – Dataframe with 1-D raw respiration signal. Can be a 'true' respiration signal (e.g. from bioimpedance or radar) or an 'estimated' respiration signal (e.g. from ECG-derived respiration).
sampling_rate (float, optional) – Sampling rate of recorded data in Hz. Default: 256
- Returns
Dictionary containing computed RSA metrics.
- Return type
See also
hrv_rsa()
compute respiratory sinus arrhythmia
Examples
>>> from biopsykit.signals.ecg import EcgProcessor
>>> # initialize EcgProcessor instance
>>> ecg_processor = EcgProcessor(...)
>>> ecg_signal = ecg_processor.ecg_result['Data']
>>> # Extract respiration signal estimated from ECG using the 'peak_trough_diff' method
>>> rsp_signal = ecg_processor.ecg_estimate_rsp(ecg_processor, key="Data", edr_type='peak_trough_diff')
>>> # Compute RSA from ECG and Respiration data
>>> rsa_output = ecg_processor.rsa_process(ecg_signal, rsp_signal)
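As background for the peak-to-trough idea, the sketch below shows one simplified formulation: within each breathing cycle, RSA is taken as the difference between the longest and the shortest RR interval. This is an illustrative assumption rather than the exact computation performed by hrv_rsa(); the function name, the breath onset times and the RR values are all hypothetical.

```python
def rsa_peak_to_trough(rr_times_s, rr_ms, breath_onsets_s):
    """Per-cycle peak-to-trough RSA (simplified).

    rr_times_s: time stamps (s) at which each RR interval ends.
    rr_ms: RR intervals in milliseconds, same length as rr_times_s.
    breath_onsets_s: onset times (s) of consecutive breathing cycles.
    """
    rsa_per_cycle = []
    for start, end in zip(breath_onsets_s, breath_onsets_s[1:]):
        in_cycle = [rr for t, rr in zip(rr_times_s, rr_ms) if start <= t < end]
        if len(in_cycle) >= 2:  # need at least two beats to form a difference
            rsa_per_cycle.append(max(in_cycle) - min(in_cycle))
    return rsa_per_cycle


# Hypothetical data: RR intervals lengthen in the second half of each 4 s breathing cycle
rr_times_s = [0.8, 1.6, 2.5, 3.4, 4.2, 5.0, 5.9, 6.8]
rr_ms = [800, 800, 900, 900, 800, 800, 900, 900]
breath_onsets_s = [0.0, 4.0, 8.0]

rsa = rsa_peak_to_trough(rr_times_s, rr_ms, breath_onsets_s)
print(rsa)                  # [100, 100]
print(sum(rsa) / len(rsa))  # 100.0
```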
- data: Dict[str, pandas.core.frame.DataFrame]
Dictionary with raw data, split into different phases.
Each dataframe is expected to be a DataFrame.