biopsykit.plotting package

Module providing various customized plotting functions.

biopsykit.plotting.lineplot(data, x, y, hue=None, style=None, **kwargs)[source]

Draw a line plot with error bars with the possibility of several semantic groupings.

This is an extension to seaborn’s lineplot function (seaborn.lineplot()). It offers the same interface, with several improvements:

  • Data points are not only connected by a line, but are also drawn with markers.

  • Lines can have an offset along the categorical (x) axis for better visualization (seaborn equivalent: dodge, which is only available for seaborn.pointplot(), not for seaborn.lineplot()).

  • Further plot parameters (axis labels, ticks, etc.) are inferred from the dataframe.

As in seaborn, the relationship between x and y can be shown for different subsets of the data using the hue and style parameters. If both parameters are assigned, two different grouping variables can be represented.

Error bars are displayed as standard error.

See the seaborn documentation for further information.
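The error bar size can be reproduced by hand for a single x position. The following is a minimal sketch, assuming the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the sample size); the sample values and the helper name standard_error are hypothetical and not part of biopsykit.

```python
import math
import statistics


def standard_error(values):
    """Return the standard error of the mean: sample std / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    return statistics.stdev(values) / math.sqrt(n)


# hypothetical measurements of y at one x position
samples = [2.0, 4.0, 4.0, 6.0]

mean = statistics.mean(samples)
sem = standard_error(samples)

# the error bar would span mean - sem to mean + sem
lower, upper = mean - sem, mean + sem
print(f"mean={mean:.2f}, sem={sem:.3f}, bar=({lower:.3f}, {upper:.3f})")
```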

Parameters
  • data (DataFrame) – data to plot

  • x (str) – column of x axis in data

  • y (str) – column of y axis in data

  • hue (str, optional) – column name of grouping variable that will produce lines with different colors. Can be either categorical or numeric. If None then data will not be grouped.

  • style (str, optional) – column name of grouping variable that will produce lines with different dashes and/or markers. If None then lines will not have different styles.

  • **kwargs

    Additional parameters to configure the plot. Parameters include:

    • x_offset: offset value to move different groups along the x axis for better visualization. Default: 0.05.

    • xlabel: Label for x axis. If not specified it is inferred from the x column name.

    • ylabel: Label for y axis. If not specified it is inferred from the y column name.

    • xticklabels: List of labels for ticks of x axis. If not specified, the values of order are used as tick labels. If order is not specified either, tick labels are inferred from the x values.

    • ylim: y-axis limits.

    • order: list specifying the order of categorical values along the x axis.

    • hue_order: list specifying the order of processing and plotting for categorical levels of the hue semantic.

    • marker: string or list of strings to specify marker style. If marker is a string, then the markers of all lines will have the same style. If marker is a list, then the markers of each line will have a different style.

    • linestyle: string or list of strings to specify line style. If linestyle is a string, then each line will have the same style. If linestyle is a list, then each line will have a different style.

    • legend_fontsize: font size of legend.

    • legend_loc: location of legend in Axes.

    • ax: pre-existing axes for the plot. Otherwise, a new figure and axes object is created and returned.

    • err_kws: additional parameters to control the aesthetics of the error bars. The err_kws are passed down to matplotlib.axes.Axes.errorbar() or matplotlib.axes.Axes.fill_between(), depending on err_style. Parameters include:

      • capsize: length of error bar caps in points
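To illustrate what x_offset does, the sketch below computes x positions for several groups on a categorical axis. It assumes that categories sit at integer positions and that the i-th group is shifted by i times x_offset; this is one plausible scheme, not necessarily the one biopsykit uses internally, and dodged_positions is a hypothetical helper.

```python
def dodged_positions(n_categories, n_groups, x_offset=0.05):
    """Return one list of x positions per group.

    Assumption: categories are placed at 0, 1, 2, ... and group i is
    shifted to the right by i * x_offset.
    """
    return [
        [pos + group_idx * x_offset for pos in range(n_categories)]
        for group_idx in range(n_groups)
    ]


# three categories on the x axis, two hue groups
positions = dodged_positions(n_categories=3, n_groups=2)
for group_idx, xs in enumerate(positions):
    print(group_idx, xs)
```

With the default offset of 0.05, markers and error bars of the second group no longer sit exactly on top of those of the first group.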

Returns

  • fig (Figure) – figure object

  • ax (Axes) – axes object

Return type

Tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes]

See also

seaborn.lineplot()

line plot function of Seaborn

biopsykit.plotting.stacked_barchart(data, **kwargs)[source]

Draw a stacked bar chart.

A stacked bar chart has multiple bar charts along a categorical axis (x axis) where values are stacked along the value axis (y axis). The categorical axis corresponds to the columns in the dataframe whereas the value axis corresponds to the rows.

This is an extension to the already existing function provided by pandas (pandas.DataFrame.plot(kind='bar', stacked=True)).
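The stacking itself can be sketched without any plotting library. The example below assumes the layout described above (one bar per column, one stacked segment per row) and uses a plain dict of lists in place of a dataframe; the column names and the helper stack_segments are hypothetical.

```python
def stack_segments(columns):
    """Map each column name to a list of (bottom, height) pairs, one per row."""
    stacked = {}
    for name, heights in columns.items():
        bottom = 0.0
        segments = []
        for height in heights:
            # each segment starts where the previous one ended
            segments.append((bottom, height))
            bottom += height
        stacked[name] = segments
    return stacked


# hypothetical percentages: columns are bars, rows are stacked segments
percent = {"pre": [20.0, 30.0, 50.0], "post": [10.0, 60.0, 30.0]}
segments = stack_segments(percent)
print(segments["pre"])
```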

Parameters
  • data (DataFrame) – data to plot

  • **kwargs

    Additional parameters to plotting function. For example, this can be:

    • order: order of items along the categorical axis.

    • ylabel: label of y axis.

    • ax: pre-existing axes for the plot. Otherwise, a new figure and axes object is created and returned.

Returns

  • fig (Figure) – figure object

  • ax (Axes) – axes object

Return type

Tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes]

See also

stack_groups_percent()

function to rearrange dataframe to be plotted as stacked bar chart

biopsykit.plotting.feature_boxplot(data, x=None, y=None, order=None, hue=None, hue_order=None, stats_kwargs=None, **kwargs)[source]

Draw boxplot with significance brackets.

This is a wrapper around seaborn’s boxplot function (boxplot()) that allows adding brackets that indicate statistical significance.

Statistical annotations are plotted using statannot (https://github.com/webermarcolivier/statannot). This library can either use existing statistical results or perform statistical tests internally. To plot significance brackets, a list of box pairs where annotations should be added is required. The p values can be provided as well or, alternatively, be computed by statannot. If StatsPipeline was used for statistical analysis, the list of box pairs and p values can be generated using sig_brackets() and passed in the stats_kwargs parameter. Otherwise, see the statannot documentation for a tutorial on how to specify significance brackets.
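The two ways of specifying brackets can be sketched as plain dictionaries. The condition labels, the p values, and the test name "t-test_ind" are assumptions chosen for illustration; check the statannot documentation for the box pair format and the test names it accepts.

```python
# hypothetical category labels on the x axis; each pair gets one bracket
box_pairs = [("Control", "Intervention"), ("Control", "Placebo")]

# variant 1: reuse p values from an earlier analysis (one per box pair)
pvalues = [0.012, 0.300]
stats_kwargs_precomputed = {"box_pairs": box_pairs, "pvalues": pvalues}

# variant 2: let statannot run the test itself
stats_kwargs_computed = {
    "box_pairs": box_pairs,
    "test": "t-test_ind",  # assumed test name
    "comparisons_correction": None,  # no multi-comparison correction
}

# p values and box pairs must line up one to one
assert len(stats_kwargs_precomputed["pvalues"]) == len(box_pairs)
print(stats_kwargs_precomputed)
print(stats_kwargs_computed)
```

Either dictionary would then be passed as stats_kwargs to feature_boxplot().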

Note

The input data is assumed to be in long-format.
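For readers unfamiliar with the term, the sketch below converts a small wide-format table into long format using only the standard library. In long format every row holds a single value together with the identifiers that describe it. The column names and the helper wide_to_long are hypothetical; with a dataframe, pandas offers DataFrame.melt() for the same purpose.

```python
def wide_to_long(rows, id_key, value_keys, var_name="feature", value_name="value"):
    """Turn one row per subject into one row per (subject, feature) pair."""
    long_rows = []
    for row in rows:
        for key in value_keys:
            long_rows.append(
                {id_key: row[id_key], var_name: key, value_name: row[key]}
            )
    return long_rows


# hypothetical wide-format data: one column per feature
wide = [
    {"subject": "s1", "auc": 1.5, "slope": 0.2},
    {"subject": "s2", "auc": 2.5, "slope": 0.4},
]

long_rows = wide_to_long(wide, id_key="subject", value_keys=["auc", "slope"])
for row in long_rows:
    print(row)
```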

Parameters
  • data (DataFrame) – data to plot

  • x (str) – column of x axis in data

  • y (str) – column of y axis in data

  • order (list of str, optional) – order to plot the categorical levels along the x axis.

  • hue (str, optional) – column name of grouping variable. Default: None

  • hue_order (list of str, optional) – order to plot the grouping variable specified by hue

  • stats_kwargs (dict, optional) –

    dictionary with arguments for significance brackets.

    If annotations should be added, the following parameter is required:

    • box_pairs: list of box pairs that should be annotated

    If already existing box pairs and p values should be used the following parameter is additionally required:

    • pvalues: list of p values corresponding to box_pairs

    If statistical tests should be computed by statannot, the following parameters are required:

    • test: type of statistical test to be computed

    • comparisons_correction (optional): Whether (and which) type of multi-comparison correction should be applied. None to not apply any multi-comparison (default).

    The following parameters are optional:

    • pvalue_thresholds: list of p value thresholds for statistical annotations. The default annotation is: ‘*’: 0.01 <= p < 0.05, ‘**’: 0.001 <= p < 0.01, ‘***’: p < 0.001 ([[1e-3, "***"], [1e-2, "**"], [0.05, "*"]])

  • **kwargs

    additional arguments that are passed down to boxplot(), for example:

    • ylabel: label of y axis

    • ax: pre-existing axes for the plot. Otherwise, a new figure and axes object is created and returned.
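How the default pvalue_thresholds translate into annotation labels can be sketched as follows. The threshold list is the default given above; the helper annotate and the "ns" label for non-significant results are assumptions for illustration and do not reproduce statannot's internal code.

```python
DEFAULT_THRESHOLDS = [[1e-3, "***"], [1e-2, "**"], [0.05, "*"]]


def annotate(p, thresholds=DEFAULT_THRESHOLDS, not_significant="ns"):
    """Return the label of the smallest threshold that p lies strictly below."""
    for limit, label in sorted(thresholds):
        if p < limit:
            return label
    # assumption: values at or above the largest threshold get a "ns" label
    return not_significant


for p in (0.0005, 0.004, 0.03, 0.2):
    print(p, annotate(p))
```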

Returns

  • fig (Figure) – figure object

  • ax (Axes) – axes object

Return type

Tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes]

See also

boxplot()

seaborn function to create boxplots

StatsPipeline

class to create statistical analysis pipelines

biopsykit.plotting.multi_feature_boxplot(data, x, y, features, group, order=None, hue=None, hue_order=None, stats_kwargs=None, **kwargs)[source]

Draw multiple features as boxplots with significance brackets.

For each feature, a new subplot will be created. As with feature_boxplot(), subplots can be annotated with statistical significance brackets (specified via the stats_kwargs parameter). For further information, see the feature_boxplot() documentation.

Note

The input data is assumed to be in long-format.

Parameters
  • data (DataFrame) – data to plot

  • x (str) – column of x-axis in data

  • y (str) – column of y-axis in data

  • features (list of str or dict of str) – features to plot. If features is a list, each entry must correspond to one feature category in the index level specified by group. A separate subplot will be created for each feature. If similar features (e.g., different slope or AUC parameters) should be combined into one subplot, features can be provided as a dictionary. Then, the dict keys specify the feature category (a separate subplot will be created for each category) and the dict values specify the feature (or list of features) that are combined into the respective subplot.

  • group (str) – name of index level with feature names. Corresponds to the subplots that are created.

  • order (list of str, optional) – order to plot the categorical levels along the x axis

  • hue (str, optional) – column name of grouping variable. Default: None

  • hue_order (list of str, optional) – order to plot the grouping variable specified by hue

  • stats_kwargs (dict, optional) – nested dictionary with arguments for significance brackets. The behavior and expected parameters are similar to the stats_kwargs parameter in feature_boxplot(). However, the box_pairs and pvalues arguments are expected to be dictionaries instead of lists: the keys correspond to the list entries (or the dict keys) in features, and the values are the box pair / p-value lists.

  • **kwargs

    additional arguments that are passed down to boxplot(). For example:

    • order: specifies x-axis order for subplots. Can be a list if order is the same for all subplots or a dict if order should be individual for subplots

    • xticklabels: dictionary to set tick labels of x-axis in subplots. Keys correspond to the list entries (or the dict keys) in features. Default: None

    • ylabels: dictionary to set y-axis labels in subplots. Keys correspond to the list entries (or the dict keys) in features. Default: None

    • axs: list of pre-existing axes for the plot. Otherwise, a new figure and axes object is created and returned.
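The nested structure of features, stats_kwargs, and the per-subplot keyword arguments can be sketched with plain dictionaries. All feature names, condition labels, p values, and axis labels below are hypothetical, and check_stats_kwargs is an illustrative helper rather than a biopsykit function.

```python
# dict keys are feature categories (one subplot each), values are the features
features = {"slope": ["slope_1", "slope_2"], "auc": ["auc_g", "auc_i"]}

# box pairs and p values are keyed by the same categories as features
stats_kwargs = {
    "box_pairs": {
        "slope": [("Control", "Intervention")],
        "auc": [("Control", "Intervention")],
    },
    "pvalues": {"slope": [0.004], "auc": [0.210]},
}

# per-subplot y-axis labels, again keyed by feature category
ylabels = {"slope": "Slope [a.u.]", "auc": "AUC [a.u.]"}


def check_stats_kwargs(features, stats_kwargs):
    """Verify that every annotated category exists and pairs match p values."""
    for category, pairs in stats_kwargs["box_pairs"].items():
        if category not in features:
            raise KeyError(f"unknown feature category: {category}")
        if len(pairs) != len(stats_kwargs["pvalues"][category]):
            raise ValueError(f"box pairs and p values differ for {category}")
    return True


print(check_stats_kwargs(features, stats_kwargs))
```

Running such a check before plotting catches mismatched keys early, since a missing or misspelled category would otherwise only surface inside the plotting call.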

Returns

  • fig (Figure) – figure object

  • axs (list of Axes) – list of subplot axes objects

Return type

Tuple[matplotlib.figure.Figure, List[matplotlib.axes._axes.Axes]]

See also

boxplot()

seaborn function to create boxplots

feature_boxplot

plot single feature boxplot

StatsPipeline

class to create statistical analysis pipelines and get parameter for plotting significance brackets

biopsykit.plotting.feature_pairplot(data, abbreviate_names=True, **kwargs)[source]

Plot feature pairs of a dataset.

This function provides a convenient interface to seaborn’s pairplot() function, with several additional features, such as showing abbreviated feature names in the plot with a description below the plot.

Parameters
  • data (DataFrame) – DataFrame with feature data to plot. Data is expected to be in wide format.

  • abbreviate_names (bool, optional) – if True, feature names are abbreviated in the plot and a description is shown below the plot.

  • **kwargs – additional keyword arguments are passed down to pairplot()
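The idea behind abbreviate_names can be illustrated with a small sketch. The abbreviation scheme used here (short labels F1, F2, ... plus a legend string) is purely hypothetical and chosen for illustration; the source does not state how biopsykit derives its abbreviations.

```python
def abbreviate(feature_names):
    """Assign short labels F1, F2, ... and build a legend describing them."""
    mapping = {name: f"F{i}" for i, name in enumerate(feature_names, start=1)}
    legend = ", ".join(f"{short}: {name}" for name, short in mapping.items())
    return mapping, legend


# hypothetical feature names (columns of a wide-format table)
names = ["cortisol_auc_ground", "cortisol_slope_s1_s4", "heart_rate_mean"]
mapping, legend = abbreviate(names)

print(mapping)
print(legend)  # text that could be shown below the plot
```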

Returns

FacetGrid object with the pairplot in it.

Return type

FacetGrid

See also

pairplot

seaborn function to create pairplots