biopsykit.utils.array_handling module
Module providing various functions for low-level handling of array data.
- biopsykit.utils.array_handling.sanitize_input_1d(data)
Convert 1-d array-like data (DataFrame/Series) to a numpy array.
- biopsykit.utils.array_handling.sanitize_input_nd(data, ncols=None)
Convert n-d array-like data (DataFrame/Series) to a numpy array.
- Parameters
- Returns
data as n-d numpy array
- Return type
- biopsykit.utils.array_handling.sanitize_input_series(data, name)
Convert input data to a pandas Series.
- biopsykit.utils.array_handling.sanitize_input_dataframe_1d(data, column)
Convert input data to a pandas DataFrame with one column.
- biopsykit.utils.array_handling.find_extrema_in_radius(data, indices, radius, extrema_type='min')
Find extrema values (min or max) within a given radius around array indices.
- Parameters
data (array_like) – input data
indices (array_like) – array with indices for which to search for extrema values around
radius (int or tuple of int) – radius around indices to search for extrema:
  if radius is an int, search for extrema equally in both directions in the interval [index - radius, index + radius].
  if radius is a tuple, search for extrema in the interval [index - radius[0], index + radius[1]].
extrema_type ({'min', 'max'}, optional) – extrema type to be searched for. Default: ‘min’
- Returns
numpy array containing the indices of the found extrema values in the given radius around indices. Has the same length as indices.
- Return type
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from biopsykit.utils.array_handling import find_extrema_in_radius
>>> data = pd.read_csv("data.csv")
>>> indices = np.array([16, 25, 40, 57, 86, 100])
>>>
>>> radius = 4
>>> # search minima in 'data' in a 4-sample radius around each entry of 'indices'
>>> find_extrema_in_radius(data, indices, radius)
>>>
>>> radius = (5, 0)
>>> # search maxima in 'data' in the 5 samples before each entry of 'indices'
>>> find_extrema_in_radius(data, indices, radius, extrema_type='max')
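The interval semantics of radius can be sketched in plain NumPy. This is an illustrative re-implementation under stated assumptions, not BioPsyKit's actual code: the helper name `extrema_in_radius_sketch` is hypothetical, and clipping the search interval at the array borders is an assumption that the description above does not confirm.

```python
import numpy as np


def extrema_in_radius_sketch(data, indices, radius, extrema_type="min"):
    """Hypothetical helper illustrating the radius semantics (not the library code)."""
    data = np.asarray(data)
    indices = np.asarray(indices, dtype=int)
    # An int radius is symmetric; a tuple is (samples before, samples after).
    before, after = (radius, radius) if isinstance(radius, int) else radius
    find = np.argmin if extrema_type == "min" else np.argmax
    result = np.empty(len(indices), dtype=int)
    for i, idx in enumerate(indices):
        # Assumption: the interval is clipped to the valid index range.
        start = max(idx - before, 0)
        stop = min(idx + after + 1, len(data))
        # argmin/argmax are relative to the slice, so shift back by `start`.
        result[i] = start + find(data[start:stop])
    return result


signal = np.array([5, 3, 8, 1, 9, 7, 2, 6, 4, 0])
minima = extrema_in_radius_sketch(signal, [2, 6], 1)
maxima = extrema_in_radius_sketch(signal, [4, 8], (2, 0), extrema_type="max")
print(minima)  # indices of the minima within +/- 1 sample
print(maxima)  # indices of the maxima within the 2 preceding samples
```

The result has one entry per element of indices, matching the "same length as indices" statement above.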
- biopsykit.utils.array_handling.remove_outlier_and_interpolate(data, outlier_mask, x_old=None, desired_length=None)
Remove outliers, impute missing values and optionally interpolate data to desired length.
Detected outliers are removed from array and imputed by linear interpolation. Optionally, the output array can be linearly interpolated to a new length.
- Parameters
data (array_like) – input data
outlier_mask (ndarray) – boolean outlier mask. Has to be the same length as data. True entries indicate outliers. If outlier_mask is not a bool array, values will be cast to bool.
x_old (array_like, optional) – x values of the input data to be interpolated or None if output data should not be interpolated to new length. Default: None
desired_length (int, optional) – desired length of the output signal or None to keep length of input signal. Default: None
- Returns
data with removed and imputed outliers, optionally interpolated to desired length
- Return type
- Raises
ValueError – if data and outlier_mask don't have the same length or if x_old is None when desired_length is passed as parameter
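The two steps described for this function (imputing outliers by linear interpolation, then optionally resampling to a new length) can be sketched with `np.interp`. The helper name `impute_outliers_sketch` is hypothetical, and two details are assumptions rather than documented behaviour: sample positions default to `0..n-1` when x_old is None, and the resampled grid spans the first to the last value of x_old.

```python
import numpy as np


def impute_outliers_sketch(data, outlier_mask, x_old=None, desired_length=None):
    """Hypothetical sketch: impute masked samples, optionally resample linearly."""
    data = np.asarray(data, dtype=float)
    mask = np.asarray(outlier_mask).astype(bool)  # non-bool masks are cast to bool
    if len(data) != len(mask):
        raise ValueError("data and outlier_mask must have the same length")
    if desired_length is not None and x_old is None:
        raise ValueError("x_old is required when desired_length is given")
    # Assumption: without x_old, samples sit at positions 0..n-1.
    x = np.arange(len(data), dtype=float) if x_old is None else np.asarray(x_old, dtype=float)
    cleaned = data.copy()
    # Replace outliers by interpolating between the remaining valid samples.
    cleaned[mask] = np.interp(x[mask], x[~mask], data[~mask])
    if desired_length is None:
        return cleaned
    # Assumption: the new grid spans the original x range.
    x_new = np.linspace(x[0], x[-1], desired_length)
    return np.interp(x_new, x, cleaned)


raw = [1.0, 2.0, 100.0, 4.0, 5.0]
is_outlier = [0, 0, 1, 0, 0]
cleaned = impute_outliers_sketch(raw, is_outlier)
resampled = impute_outliers_sketch(raw, is_outlier, x_old=np.arange(5), desired_length=9)
```

Note that `np.interp` does not extrapolate: an outlier at the very start or end of the array takes the value of the nearest valid sample.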
- biopsykit.utils.array_handling.sliding_window(data, window_samples=None, window_sec=None, sampling_rate=None, overlap_samples=None, overlap_percent=None)
Create sliding windows from an input array.
The window size of sliding windows can either be specified in samples (window_samples) or in seconds (window_sec, together with sampling_rate).
The overlap of windows can either be specified in samples (overlap_samples) or in percent (overlap_percent).
Note
If data has more than one dimension, the sliding window view is applied to the first dimension. In the 2-d case this would correspond to applying windows along the rows.
- Parameters
data (array_like) – input data
window_samples (int, optional) – window size in samples or None if window size is specified in seconds + sampling rate. Default: None
window_sec (int, optional) – window size in seconds or None if window size is specified in samples. Default: None
sampling_rate (float, optional) – sampling rate of data in Hz. Only needed if window size is specified in seconds (window_sec parameter). Default: None
overlap_samples (int, optional) – overlap of windows in samples or None if window overlap is specified in percent. Default: None
overlap_percent (float, optional) – overlap of windows in percent or None if window overlap is specified in samples. Default: None
- Returns
sliding windows from input array.
- Return type
See also
sliding_window_view() – create sliding window of input array. Low-level function with fewer input parameter configuration possibilities
- biopsykit.utils.array_handling.sanitize_sliding_window_input(window_samples=None, window_sec=None, sampling_rate=None, overlap_samples=None, overlap_percent=None)
Sanitize input parameters for creating sliding windows from array data.
The window size of sliding windows can either be specified in samples (window_samples) or in seconds (window_sec, together with sampling_rate).
The overlap of windows can either be specified in samples (overlap_samples) or in percent (overlap_percent).
- Parameters
window_samples (int, optional) – window size in samples or None if window size is specified in seconds + sampling rate. Default: None
window_sec (int, optional) – window size in seconds or None if window size is specified in samples. Default: None
sampling_rate (float, optional) – sampling rate of data in Hz. Only needed if window size is specified in seconds (window_sec parameter). Default: None
overlap_samples (int, optional) – overlap of windows in samples or None if window overlap is specified in percent. Default: None
overlap_percent (float, optional) – overlap of windows in percent or None if window overlap is specified in samples. Default: None
- Returns
window (int) – window size in samples
overlap (int) – window overlap in samples
- Return type
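The conversion from the alternative parameter pairs to a (window, overlap) tuple in samples might look like the following sketch. It is not the library's implementation: the name `window_params_sketch` is hypothetical, overlap_percent is assumed to be a fraction between 0 and 1 (the description does not state the scale), and falling back to zero overlap when no overlap is given is likewise an assumption.

```python
def window_params_sketch(window_samples=None, window_sec=None, sampling_rate=None,
                         overlap_samples=None, overlap_percent=None):
    """Hypothetical sketch: resolve window size and overlap to sample counts."""
    if window_samples is not None:
        window = int(window_samples)
    elif window_sec is not None and sampling_rate is not None:
        # seconds * samples per second = samples
        window = int(window_sec * sampling_rate)
    else:
        raise ValueError("pass window_samples, or window_sec together with sampling_rate")

    if overlap_samples is not None:
        overlap = int(overlap_samples)
    elif overlap_percent is not None:
        # Assumption: overlap_percent is a fraction of the window (0.5 == 50 %).
        overlap = int(overlap_percent * window)
    else:
        # Assumption: no overlap unless one is requested.
        overlap = 0
    return window, overlap


from_seconds = window_params_sketch(window_sec=2, sampling_rate=256.0, overlap_percent=0.5)
from_samples = window_params_sketch(window_samples=100, overlap_samples=20)
```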
- biopsykit.utils.array_handling.sliding_window_view(array, window_length, overlap, nan_padding=False)
Create a sliding window view of an input array with given window length and overlap.
Warning
By default, this function returns a view onto your input array, so modifying values in the result will directly affect your input data, which might lead to unexpected behaviour! If padding is disabled (default), the last, incomplete window of the input may not be returned! However, if nan_padding is enabled, this function always returns a copy instead of a view of your input data, regardless of whether padding was actually performed!
- Parameters
array (ndarray with shape (n,) or (n, m)) – array on which sliding window action should be performed. Windowing will always be performed along axis 0.
window_length (int) – length of desired window (must be smaller than array length n)
overlap (int) – length of desired overlap (must be smaller than window_length)
nan_padding (bool) – select whether the last window should be nan-padded or discarded if it does not fit the input array length. If nan-padding is enabled, the returned array will always be a copy of the input array, regardless of whether padding was actually performed!
- Returns
windowed view (or copy if nan_padding is True) of input array as specified, last window might be nan-padded if necessary to match window size
- Return type
Examples
>>> data = np.arange(0, 10)
>>> windowed_view = sliding_window_view(array=data, window_length=5, overlap=3, nan_padding=True)
>>> windowed_view
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 2.,  3.,  4.,  5.,  6.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 6.,  7.,  8.,  9., nan]])
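To make the relationship between window_length, overlap and nan-padding concrete, here is a minimal NumPy sketch that reproduces the output of the example above. The helper name `windows_with_nan_padding_sketch` is hypothetical, and the sketch always copies the data via index arrays; it does not attempt to return a view, so it only mirrors the nan_padding=True case.

```python
import numpy as np


def windows_with_nan_padding_sketch(array, window_length, overlap):
    """Hypothetical sketch of nan-padded windowing along axis 0 (always a copy)."""
    array = np.asarray(array, dtype=float)  # float, so that nan can be stored
    step = window_length - overlap          # distance between window starts
    # Start indices; the last start is the first one whose window reaches the end.
    starts = np.arange(0, len(array) - overlap, step)
    # Pad with nan so that the last window is complete.
    padded_len = starts[-1] + window_length
    padded = np.full((padded_len,) + array.shape[1:], np.nan)
    padded[: len(array)] = array
    # Row i of idx holds the sample indices of window i.
    idx = starts[:, None] + np.arange(window_length)[None, :]
    return padded[idx]


windows = windows_with_nan_padding_sketch(np.arange(0, 10), window_length=5, overlap=3)
print(windows)
```

With window_length=5 and overlap=3 the windows start every 2 samples, which is why the fourth window begins at sample 6 and needs one nan to reach length 5.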
- biopsykit.utils.array_handling.downsample(data, fs_in, fs_out)
Downsample input signal to a new sampling rate.
If the output sampling rate is a divisor of the input sampling rate, the signal is downsampled using decimate(). Otherwise, data is first filtered using an anti-aliasing filter before it is downsampled using linear interpolation.
- biopsykit.utils.array_handling.bool_array_to_start_end_array(bool_array)
Find regions in bool array and convert those to start-end indices.
Note
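The end index is exclusive, i.e. it points one past the last element of a region, so it can be used directly as a slice bound (see the example)!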
- Parameters
bool_array (ndarray with shape (n,)) – boolean array with either 0/1, 0.0/1.0 or True/False elements
- Returns
array of [start, end] indices with shape (n,2)
- Return type
Examples
>>> example_array = np.array([0,0,1,1,0,0,1,1,1])
>>> start_end_list = bool_array_to_start_end_array(example_array)
>>> start_end_list
array([[2, 4],
       [6, 9]])
>>> example_array[start_end_list[0, 0]: start_end_list[0, 1]]
array([1, 1])
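One common way to find such regions is to look at the differences of the zero-padded array: a +1 marks a region start and a -1 marks the position one past its end. The sketch below uses that technique and reproduces the output of the example above; the helper name `start_end_sketch` is hypothetical and the library may use a different approach internally.

```python
import numpy as np


def start_end_sketch(bool_array):
    """Hypothetical sketch: [start, end) indices of consecutive True regions."""
    flags = np.asarray(bool_array).astype(bool).astype(int)
    # Pad with zeros so that regions touching either border still produce an edge.
    edges = np.diff(np.concatenate(([0], flags, [0])))
    starts = np.flatnonzero(edges == 1)   # 0 -> 1 transitions
    ends = np.flatnonzero(edges == -1)    # 1 -> 0 transitions (one past the region)
    return np.column_stack((starts, ends))


example_array = np.array([0, 0, 1, 1, 0, 0, 1, 1, 1])
regions = start_end_sketch(example_array)
print(regions)
```

Because the end positions are one past the last element, `example_array[start:end]` selects exactly the elements of a region.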
- biopsykit.utils.array_handling.split_array_equally(data, n_splits)
Generate indices to split array into parts with equal lengths.
- biopsykit.utils.array_handling.accumulate_array(data, fs_in, fs_out)
Accumulate 1-d array by summing over windows.
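Summing over windows can be sketched by reshaping the array into rows of equal length and summing each row. This is only an illustration under assumptions the description does not confirm: fs_in and fs_out are read as input and output sampling rates, the window length is taken as their integer ratio, and trailing samples that do not fill a complete window are dropped. The helper name `accumulate_sketch` is hypothetical.

```python
import numpy as np


def accumulate_sketch(data, fs_in, fs_out):
    """Hypothetical sketch: sum non-overlapping windows of fs_in / fs_out samples."""
    data = np.asarray(data)
    # Assumption: number of input samples that make up one output sample.
    factor = int(fs_in // fs_out)
    if factor < 1:
        raise ValueError("fs_out must not be larger than fs_in")
    # Assumption: drop trailing samples that do not fill a complete window.
    n_windows = len(data) // factor
    return data[: n_windows * factor].reshape(n_windows, factor).sum(axis=1)


halved = accumulate_sketch(np.arange(10), fs_in=4, fs_out=2)
thirds = accumulate_sketch(np.arange(7), fs_in=3, fs_out=1)
```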