biopsykit.utils.array_handling module
Module providing various functions for low-level handling of array data.
- biopsykit.utils.array_handling.sanitize_input_1d(data)
Convert 1-d array-like data (DataFrame/Series) to a numpy array.
- biopsykit.utils.array_handling.sanitize_input_nd(data, ncols=None)
Convert n-d array-like data (DataFrame/Series) to a numpy array.
- Parameters
data (array_like) – input data
ncols (int or tuple of ints) – number of columns (2nd dimension) the data is expected to have, a list of such if data can have a set of possible column numbers, or None to allow any number of columns. Default: None
- Returns
data as n-d numpy array
- Return type
ndarray
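As a rough illustration of the column check described above, the following sketch validates the number of columns with plain NumPy. The helper name `to_nd_array` and its handling of 1-d input are assumptions made for this example; it is not BioPsyKit's implementation.

```python
import numpy as np


def to_nd_array(data, ncols=None):
    """Hypothetical stand-in for the behaviour described above."""
    arr = np.asarray(data)  # pandas objects can also be converted this way
    if arr.ndim == 1:
        # assumption: treat 1-d input as a single column
        arr = arr.reshape(-1, 1)
    if ncols is not None:
        # accept a single int or a collection of allowed column counts
        allowed = (ncols,) if isinstance(ncols, int) else tuple(ncols)
        if arr.shape[1] not in allowed:
            raise ValueError(f"expected {allowed} columns, got {arr.shape[1]}")
    return arr


arr = to_nd_array([[1, 2, 3], [4, 5, 6]], ncols=(2, 3))
```

Passing `ncols=None` skips the check entirely, which mirrors the documented default.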
- biopsykit.utils.array_handling.find_extrema_in_radius(data, indices, radius, extrema_type='min')
Find extrema values (min or max) within a given radius around array indices.
- Parameters
data (array_like) – input data
indices (array_like) – array with indices for which to search for extrema values around
radius (int or tuple of int) – radius around indices to search for extrema:
if radius is an int, search for extrema equally in both directions in the interval [index - radius, index + radius].
if radius is a tuple, search for extrema in the interval [index - radius[0], index + radius[1]].
extrema_type ({'min', 'max'}, optional) – extrema type to be searched for. Default: ‘min’
- Returns
numpy array containing the indices of the found extrema values in the given radius around indices. Has the same length as indices.
- Return type
ndarray
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from biopsykit.utils.array_handling import find_extrema_in_radius
>>> data = pd.read_csv("data.csv")
>>> indices = np.array([16, 25, 40, 57, 86, 100])
>>>
>>> radius = 4
>>> # search minima in 'data' in a 4-sample 'radius' around each entry of 'indices'
>>> find_extrema_in_radius(data, indices, radius)
>>>
>>> radius = (5, 0)
>>> # search maxima in 'data' in the 5 samples before each entry of 'indices'
>>> find_extrema_in_radius(data, indices, radius, extrema_type='max')
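To make the two radius forms concrete, here is a minimal NumPy sketch of the search on a small array. The function name `extrema_in_radius` and the clipping of windows at the array boundaries are assumptions for illustration, not the library's code.

```python
import numpy as np


def extrema_in_radius(data, indices, radius, extrema_type="min"):
    """Illustrative re-implementation of the search described above."""
    data = np.asarray(data, dtype=float)
    # an int radius is symmetric; a tuple is (samples before, samples after)
    before, after = (radius, radius) if isinstance(radius, int) else radius
    pick = np.argmin if extrema_type == "min" else np.argmax
    result = []
    for idx in indices:
        lo = max(idx - before, 0)             # assumption: clip at array start
        hi = min(idx + after, len(data) - 1)  # assumption: clip at array end
        # the interval [lo, hi] is inclusive on both sides
        result.append(lo + pick(data[lo:hi + 1]))
    return np.array(result)


signal = np.array([5, 3, 8, 1, 9, 7, 2, 6, 4, 0])
minima = extrema_in_radius(signal, [2, 6], radius=1)
maxima = extrema_in_radius(signal, [4, 8], radius=(2, 0), extrema_type="max")
print(minima)  # [3 6]
print(maxima)  # [4 7]
```

The output has one entry per input index, matching the documented return length.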
- biopsykit.utils.array_handling.remove_outlier_and_interpolate(data, outlier_mask, x_old=None, desired_length=None)
Remove outliers, impute missing values and optionally interpolate data to desired length.
Detected outliers are removed from array and imputed by linear interpolation. Optionally, the output array can be linearly interpolated to a new length.
- Parameters
data (array_like) – input data
outlier_mask (ndarray) – boolean outlier mask. Has to be the same length as data. True entries indicate outliers. If outlier_mask is not a bool array, values will be cast to bool
x_old (array_like, optional) – x values of the input data to be interpolated or None if output data should not be interpolated to new length. Default: None
desired_length (int, optional) – desired length of the output signal or None to keep length of input signal. Default: None
- Returns
data with removed and imputed outliers, optionally interpolated to desired length
- Return type
- Raises
ValueError – if data and outlier_mask don’t have the same length or if x_old is None when desired_length is passed as parameter
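The two steps described for this function (imputing outliers, then optionally resampling) can be sketched with `np.interp`. The name `impute_and_resample` and the choice of an evenly spaced target grid are assumptions for this example; the library may differ in detail.

```python
import numpy as np


def impute_and_resample(data, outlier_mask, x_old=None, desired_length=None):
    """Sketch of the described behaviour, not BioPsyKit's implementation."""
    data = np.asarray(data, dtype=float)
    mask = np.asarray(outlier_mask).astype(bool)  # non-bool masks are cast
    if len(data) != len(mask):
        raise ValueError("data and outlier_mask must have the same length")
    pos = np.arange(len(data))
    # replace outliers by linear interpolation between the remaining samples
    clean = np.interp(pos, pos[~mask], data[~mask])
    if desired_length is None:
        return clean
    if x_old is None:
        raise ValueError("x_old is required when desired_length is given")
    x_old = np.asarray(x_old, dtype=float)
    # assumption: resample onto an evenly spaced grid spanning x_old
    x_new = np.linspace(x_old[0], x_old[-1], desired_length)
    return np.interp(x_new, x_old, clean)


clean = impute_and_resample([1, 2, 100, 4, 5], [0, 0, 1, 0, 0])
longer = impute_and_resample([1, 2, 100, 4, 5], [0, 0, 1, 0, 0],
                             x_old=np.arange(5), desired_length=9)
```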
- biopsykit.utils.array_handling.sliding_window(data, window_samples=None, window_sec=None, sampling_rate=None, overlap_samples=None, overlap_percent=None)
Create sliding windows from an input array.
The window size of sliding windows can either be specified in samples (window_samples) or in seconds (window_sec, together with sampling_rate).
The overlap of windows can either be specified in samples (overlap_samples) or in percent (overlap_percent).
Note
If data has more than one dimension the sliding window view is applied to the first dimension. In the 2-d case this would correspond to applying windows along the rows.
- Parameters
data (array_like) – input data
window_samples (int, optional) – window size in samples or None if window size is specified in seconds + sampling rate. Default: None
window_sec (int, optional) – window size in seconds or None if window size is specified in samples. Default: None
sampling_rate (float, optional) – sampling rate of data in Hz. Only needed if window size is specified in seconds (window_sec parameter). Default: None
overlap_samples (int, optional) – overlap of windows in samples or None if window overlap is specified in percent. Default: None
overlap_percent (float, optional) – overlap of windows in percent or None if window overlap is specified in samples. Default: None
- Returns
sliding windows from input array.
- Return type
See also
sliding_window_view()
create a sliding window view of an input array. Low-level function with fewer configuration options
- biopsykit.utils.array_handling.sanitize_sliding_window_input(window_samples=None, window_sec=None, sampling_rate=None, overlap_samples=None, overlap_percent=None)
Sanitize input parameters for creating sliding windows from array data.
The window size of sliding windows can either be specified in samples (window_samples) or in seconds (window_sec, together with sampling_rate).
The overlap of windows can either be specified in samples (overlap_samples) or in percent (overlap_percent).
- Parameters
window_samples (int, optional) – window size in samples or None if window size is specified in seconds + sampling rate. Default: None
window_sec (int, optional) – window size in seconds or None if window size is specified in samples. Default: None
sampling_rate (float, optional) – sampling rate of data in Hz. Only needed if window size is specified in seconds (window_sec parameter). Default: None
overlap_samples (int, optional) – overlap of windows in samples or None if window overlap is specified in percent. Default: None
overlap_percent (float, optional) – overlap of windows in percent or None if window overlap is specified in samples. Default: None
- Returns
window (int) – window size in samples
overlap (int) – window overlap in samples
- Return type
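The conversion from seconds and percent into samples could look roughly like the sketch below. The name `resolve_window`, the interpretation of overlap_percent as a fraction between 0 and 1, and the zero-overlap fallback are all assumptions for illustration; check the library source for the actual conventions.

```python
def resolve_window(window_samples=None, window_sec=None, sampling_rate=None,
                   overlap_samples=None, overlap_percent=None):
    """Sketch of the parameter resolution described above (hypothetical helper)."""
    if (window_samples is None) == (window_sec is None):
        raise ValueError("specify exactly one of window_samples or window_sec")
    if window_sec is not None:
        if sampling_rate is None:
            raise ValueError("sampling_rate is required together with window_sec")
        window = int(sampling_rate * window_sec)  # seconds -> samples
    else:
        window = int(window_samples)
    if overlap_samples is not None and overlap_percent is not None:
        raise ValueError("specify at most one of overlap_samples or overlap_percent")
    if overlap_percent is not None:
        # assumption: overlap_percent is a fraction in [0, 1]
        overlap = int(overlap_percent * window)
    elif overlap_samples is not None:
        overlap = int(overlap_samples)
    else:
        overlap = 0  # assumption: no overlap when nothing is specified
    return window, overlap


window, overlap = resolve_window(window_sec=2, sampling_rate=256.0,
                                 overlap_percent=0.5)
```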
- biopsykit.utils.array_handling.sliding_window_view(array, window_length, overlap, nan_padding=False)
Create a sliding window view of an input array with given window length and overlap.
Warning
By default, this function returns a view onto your input array, so modifying values in the result directly affects your input data, which might lead to unexpected behaviour! If padding is disabled (default), the last fraction of the input that does not fill a whole window may not be returned! However, if nan_padding is enabled, this function always returns a copy instead of a view of your input data, regardless of whether padding was actually performed.
- Parameters
array (ndarray with shape (n,) or (n, m)) – array on which the sliding window should be applied. Windowing will always be performed along axis 0.
window_length (int) – length of desired window (must be smaller than array length n)
overlap (int) – length of desired overlap (must be smaller than window_length)
nan_padding (bool) – select whether the last window should be nan-padded or discarded if it does not fit the input array length. If nan-padding is enabled, the returned array will always be a copy of the input array, regardless of whether padding was actually performed.
- Returns
windowed view (or copy if nan_padding is True) of input array as specified, last window might be nan-padded if necessary to match window size
- Return type
ndarray
Examples
>>> import numpy as np
>>> from biopsykit.utils.array_handling import sliding_window_view
>>> data = np.arange(0, 10)
>>> windowed_view = sliding_window_view(array=data, window_length=5, overlap=3, nan_padding=True)
>>> windowed_view
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 2.,  3.,  4.,  5.,  6.],
       [ 4.,  5.,  6.,  7.,  8.],
       [ 6.,  7.,  8.,  9., nan]])
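The view-versus-copy warning above can be illustrated with NumPy's own `numpy.lib.stride_tricks.sliding_window_view` (available since NumPy 1.20). This is a separate NumPy helper, used here only as an analogy; unlike the behaviour described above, NumPy's view is read-only by default, so the sketch changes the input and observes the effect on the windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as np_sliding_window_view

data = np.arange(10)
window_length, overlap = 5, 3
step = window_length - overlap  # consecutive windows start `step` samples apart

# build all windows with step 1, then keep every `step`-th one; still a view
windows = np_sliding_window_view(data, window_length)[::step]
snapshot = windows.copy()  # an independent copy of the same values

data[4] = 99  # change the input after windowing

print(np.shares_memory(windows, data))  # True
print(windows[0, 4], snapshot[0, 4])    # 99 4
```

Without padding, only three full windows fit into ten samples, so the trailing samples that do not fill a whole window are dropped, which corresponds to the unpadded case described in the warning.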
- biopsykit.utils.array_handling.downsample(data, fs_in, fs_out)
Downsample input signal to a new sampling rate.
If the output sampling rate is a divisor of the input sampling rate, the signal is downsampled using decimate(). Otherwise, data is first filtered using an anti-aliasing filter before it is downsampled using linear interpolation.
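The two branches described above might be sketched with SciPy as follows. The function name `downsample_sketch`, the 4th-order Butterworth low-pass, and its cutoff at half the output rate are assumptions chosen for illustration; the library's actual filter design is not stated here.

```python
import numpy as np
from scipy import signal


def downsample_sketch(data, fs_in, fs_out):
    """Illustrative two-branch downsampling; not BioPsyKit's implementation."""
    data = np.asarray(data, dtype=float)
    if fs_in % fs_out == 0:
        # integer ratio: decimate() low-pass filters and keeps every q-th sample
        return signal.decimate(data, int(fs_in // fs_out))
    # non-integer ratio: low-pass below the new Nyquist frequency first ...
    b, a = signal.butter(4, 0.5 * fs_out, btype="low", fs=fs_in)
    filtered = signal.filtfilt(b, a, data)
    # ... then resample onto the new time grid by linear interpolation
    t_in = np.arange(len(data)) / fs_in
    n_out = int(len(data) * fs_out / fs_in)
    t_out = np.arange(n_out) / fs_out
    return np.interp(t_out, t_in, filtered)


t = np.arange(200) / 100.0
x = np.sin(2 * np.pi * t)             # 1 Hz sine sampled at 100 Hz
x_25 = downsample_sketch(x, 100, 25)  # divisor branch
x_30 = downsample_sketch(x, 100, 30)  # interpolation branch
```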
- biopsykit.utils.array_handling.bool_array_to_start_end_array(bool_array)
Find regions in bool array and convert those to start-end indices.
Note
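The end index is exclusive, so the indices can be used directly for slicing: in the example below, the region covering indices 2 and 3 is reported as [2, 4].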
- Parameters
bool_array (ndarray with shape (n,)) – boolean array with either 0/1, 0.0/1.0 or True/False elements
- Returns
array of [start, end] indices with shape (n,2)
- Return type
ndarray
Examples
>>> import numpy as np
>>> from biopsykit.utils.array_handling import bool_array_to_start_end_array
>>> example_array = np.array([0, 0, 1, 1, 0, 0, 1, 1, 1])
>>> start_end_list = bool_array_to_start_end_array(example_array)
>>> start_end_list
array([[2, 4],
       [6, 9]])
>>> example_array[start_end_list[0, 0]: start_end_list[0, 1]]
array([1, 1])
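One common way to find such regions is to look for value changes in a padded copy of the mask. The sketch below uses `np.diff` for this; the function name `regions` is hypothetical and this is not necessarily how the library computes the result, but it reproduces the output shown in the example.

```python
import numpy as np


def regions(mask):
    """Return [start, end) index pairs of consecutive True runs (sketch)."""
    mask = np.asarray(mask).astype(bool)
    # pad with False so runs touching either border produce a change
    padded = np.concatenate(([False], mask, [False]))
    # +1 marks the start of a run, -1 marks the first index after it
    changes = np.flatnonzero(np.diff(padded.astype(int)))
    return changes.reshape(-1, 2)


example_array = np.array([0, 0, 1, 1, 0, 0, 1, 1, 1])
start_end = regions(example_array)
print(start_end.tolist())  # [[2, 4], [6, 9]]
```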
- biopsykit.utils.array_handling.split_array_equally(data, n_splits)
Generate indices to split array into parts with equal lengths.
- Parameters
data (array_like) – data to split
n_splits (int) – number of splits
- Returns
list with start and end indices which will lead to splitting array into parts with equal lengths
- Return type
list of tuples
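A minimal sketch of how such start and end indices could be generated is shown below. The helper name `split_indices`, the exclusive end index, and the rounding used when the length is not divisible by n_splits are assumptions; the library's exact boundaries may differ.

```python
def split_indices(data, n_splits):
    """Hypothetical helper returning (start, end) pairs, end exclusive."""
    n = len(data)
    # integer boundaries that divide the index range as evenly as possible
    edges = [i * n // n_splits for i in range(n_splits + 1)]
    return list(zip(edges[:-1], edges[1:]))


data = list(range(12))
bounds = split_indices(data, 4)
parts = [data[start:end] for start, end in bounds]
print(bounds)  # [(0, 3), (3, 6), (6, 9), (9, 12)]
```

When the length is not a multiple of n_splits, this particular rounding yields parts whose lengths differ by at most one sample.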
- biopsykit.utils.array_handling.accumulate_array(data, fs_in, fs_out)
Accumulate 1-d array by summing over windows.