scirpy.tl.alpha_diversity

scirpy.tl.alpha_diversity(adata, groupby, *, target_col='clone_id', metric='normalized_shannon_entropy', inplace=True, key_added=None, airr_mod='airr', **kwargs)

Computes the alpha diversity of clonotypes within a group.

Choose a metric from normalized_shannon_entropy, D50, DXX, or any of scikit-bio’s alpha diversity metrics. Alternatively, provide a custom function that computes the diversity from a vector of counts, as explained at http://scikit-bio.org/docs/latest/diversity.html

Normalized Shannon entropy:

Uses the Shannon entropy as the diversity measure. The entropy is normalized to group size.
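The normalization can be illustrated with a minimal pure-Python sketch. It is not scirpy's implementation: the function name is hypothetical, and it assumes that the entropy is divided by the logarithm of the number of distinct clonotypes in the group, so that the result lies between 0 and 1.

```python
import math


def normalized_shannon_entropy(counts):
    """Shannon entropy of clonotype counts, scaled to the range [0, 1].

    Assumption: normalization divides by log(number of distinct clonotypes).
    """
    counts = [c for c in counts if c > 0]
    if len(counts) <= 1:
        # A single clonotype carries no diversity.
        return 0.0
    total = sum(counts)
    freqs = [c / total for c in counts]
    entropy = -sum(p * math.log(p) for p in freqs)
    return entropy / math.log(len(freqs))


# Evenly sized clonotypes give the maximum value; one dominant clone lowers it.
even = normalized_shannon_entropy([5, 5, 5, 5])
skewed = normalized_shannon_entropy([17, 1, 1, 1])
```

Under this assumption, a group in which all clonotypes are equally abundant scores close to 1, and a group dominated by one expanded clonotype scores closer to 0.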

D50:

D50 is the minimum number of distinct clonotypes that together account for more than 50% of the total clonotype counts in a given group, expressed as a percentage of the total number of clonotypes. Adapted from https://patents.google.com/patent/WO2012097374A1/en.

DXX:

Similar to D50, where XX is the threshold percentage of total clonotype counts. Requires passing the percentage keyword argument, which must be between 0 and 100.
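The D50 and DXX definitions can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the library's code: the function name dxx is hypothetical, and the boundary handling (the cumulative count must strictly exceed the threshold, following the "greater than" wording above) may differ from the actual implementation.

```python
def dxx(counts, percentage=50):
    """Percentage of distinct clonotypes needed to exceed `percentage` of all counts.

    Assumption: the threshold must be strictly exceeded.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Largest clonotypes first, so the fewest clonotypes reach the threshold.
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    if not ordered:
        raise ValueError("counts must contain at least one positive value")
    total = sum(ordered)
    cumulative = 0
    n_needed = 0
    for count in ordered:
        cumulative += count
        n_needed += 1
        # Integer comparison avoids floating-point rounding at the threshold.
        if cumulative * 100 > percentage * total:
            break
    return 100 * n_needed / len(ordered)


counts = [60, 20, 10, 5, 5]        # five clonotypes, 100 cells in total
d50 = dxx(counts)                  # the largest clonotype alone exceeds 50%
d85 = dxx(counts, percentage=85)   # 60 + 20 + 10 = 90 exceeds 85%
```

In this example one of five clonotypes is enough for D50 and three of five for the 85% threshold, so lower values indicate a repertoire dominated by a few expanded clonotypes.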

Ignores NaN values.

Parameters:
  • adata (Union[AnnData, MuData, DataHandler]) – AnnData or MuData object that contains AIRR information.

  • groupby (str) – Column of obs by which the grouping will be performed.

  • target_col (str (default: 'clone_id')) – Column on which to compute the alpha diversity.

  • metric (str | Callable[[ndarray], int | float] (default: 'normalized_shannon_entropy')) – The metric used for diversity estimation: one of normalized_shannon_entropy, D50, DXX, any of scikit-bio’s alpha diversity metrics, or a custom function.

  • inplace (bool (default: True)) – If True, a column with the result will be stored in obs. Otherwise the result will be returned.

  • key_added (Optional[str] (default: None)) –

    Key under which the result will be stored in obs, if inplace is True. When the function is running on MuData, the result will be written to both mdata.obs["{airr_mod}:{key_added}"] and mdata.mod[airr_mod].obs[key_added].

    Defaults to alpha_diversity_{target_col}.

  • airr_mod (str (default: 'airr')) – Name of the modality in the MuData object in which the AIRR information is stored. If an AnnData object is passed to the function, this parameter is ignored.

  • **kwargs – Additional arguments passed to the metric function.

Return type:

DataFrame | None

Returns:

Depending on the value of inplace, either returns a DataFrame with the alpha diversity for each group or adds a column to adata.obs.
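To make the roles of groupby, target_col, and a custom metric concrete, here is a minimal pure-Python sketch of the overall computation. It is a conceptual illustration only and does not use or reproduce scirpy's code: alpha_diversity_by_group and richness are hypothetical names, and plain lists stand in for the columns of obs. Each group's clonotype labels are reduced to a count vector, missing values are skipped, and the metric is applied to that vector.

```python
import math
from collections import Counter, defaultdict


def is_missing(value):
    """Treat None and float NaN as missing clonotype labels."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def alpha_diversity_by_group(groups, clonotypes, metric):
    """Apply `metric` to the clonotype count vector of each group.

    `groups` and `clonotypes` are parallel per-cell sequences (hypothetical
    stand-ins for the groupby and target_col columns).
    """
    counters = defaultdict(Counter)
    for group, clonotype in zip(groups, clonotypes):
        if is_missing(clonotype):
            continue  # cells without a clonotype do not contribute
        counters[group][clonotype] += 1
    return {group: metric(list(c.values())) for group, c in counters.items()}


def richness(counts):
    """Example custom metric: the number of distinct clonotypes."""
    return len(counts)


groups = ["A", "A", "A", "B", "B", "B"]
clonotypes = ["c1", "c1", "c2", "c3", float("nan"), "c3"]
result = alpha_diversity_by_group(groups, clonotypes, richness)
```

Here group A has two distinct clonotypes and group B has one, because the cell with a missing clonotype is ignored. Any callable that maps a count vector to a number could take the place of richness.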