biopsykit.stats.multicoll module
Functions to handle multicollinearity in data.
- biopsykit.stats.multicoll.remove_multicollinearity_correlation(data, threshold=0.8)
Remove features with multicollinearity based on cross-correlation coefficient.
- Parameters
data (pandas.DataFrame) – Input data with features to check for multicollinearity.
threshold (float, optional) – Cross-correlation coefficient threshold. Features with a correlation coefficient above this value will be removed. Default: 0.8
- Returns
Dataframe with the highly collinear features removed.
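The following is a minimal sketch of one common way to remove correlated features by thresholding pairwise correlations. It is not necessarily how biopsykit implements this function. The helper name `drop_correlated_features`, the use of absolute Pearson correlation, and the rule of keeping the first feature of each correlated pair are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def drop_correlated_features(data: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Hypothetical helper: drop one feature from each highly correlated pair."""
    # Absolute pairwise Pearson correlation between all columns (assumption:
    # strong negative correlation counts as collinear too).
    corr = data.corr().abs()
    # Keep only the strict upper triangle so each pair is considered once
    # and the diagonal (self-correlation of 1.0) is ignored.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    # Drop a column if it correlates above the threshold with any earlier column.
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return data.drop(columns=to_drop)


data = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly a linear function of "a"
        "c": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to "a" and "b"
    }
)
reduced = drop_correlated_features(data, threshold=0.8)
print(list(reduced.columns))
```

In this sketch, "b" is dropped because it is almost perfectly correlated with "a", while "c" is kept. Which member of a correlated pair gets removed depends on column order, so the result of the library function may differ.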
- Return type