scirpy.datasets.vdjdb

scirpy.datasets.vdjdb(cached=None, *, cache_path=None, tag='latest')

Download VDJdb through IggyTop.

VDJdb [BVS+19] is a curated database of T-cell receptor (TCR) sequences with known antigen specificities.

As of v0.24, this is a wrapper around iggytop().

Note

Scirpy datasets are managed through Pooch.

By default, the dataset is downloaded into your operating system’s default cache directory (see pooch.os_cache() for details). If it has already been downloaded, it is retrieved from the cache instead.

You can override the default cache dir by setting the SCIRPY_DATA_DIR environment variable to a path of your preference.
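A minimal sketch of overriding the cache directory from Python, assuming scirpy is installed. The directory name `scirpy_cache` and the helper `load_vdjdb` are hypothetical and used only for illustration. Whether the variable is read at import time or at download time is not stated here, so setting it before importing scirpy is the cautious ordering.

```python
import os
from pathlib import Path

# Hypothetical cache location; any writable directory of your choice works.
cache_dir = Path.home() / "scirpy_cache"

# Set the variable before scirpy is imported, so it is in place
# regardless of when the library reads it.
os.environ["SCIRPY_DATA_DIR"] = str(cache_dir)


def load_vdjdb():
    """Download VDJdb (or fetch it from the cache) and return the AnnData object."""
    # Imported lazily so that the environment variable above is already set.
    import scirpy as ir

    return ir.datasets.vdjdb()
```

Calling `load_vdjdb()` triggers the download on first use. The same effect can be had by exporting `SCIRPY_DATA_DIR` in the shell before starting Python.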

Parameters:
  • cached (bool | None (default: None)) – Deprecated as of v0.24. Has no effect. Caching is now handled through Pooch.

  • cache_path (None (default: None)) – Deprecated as of v0.24. Has no effect.

  • tag (str (default: 'latest')) – The IggyTop release tag to use. Defaults to "latest", which always fetches the most recent release. For reproducibility, pin a specific release tag (e.g. "data-2026.04.25.075304").
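A short sketch of pinning a release for reproducibility, assuming scirpy is installed. The tag string is the example value from the parameter description and may not correspond to a release you want to use; `PINNED_TAG` and `load_pinned_vdjdb` are hypothetical names.

```python
# Example tag copied from the parameter description above; replace it with
# the IggyTop release tag you actually want to pin.
PINNED_TAG = "data-2026.04.25.075304"


def load_pinned_vdjdb(tag=PINNED_TAG):
    """Load VDJdb from a fixed IggyTop release instead of 'latest'."""
    # Imported lazily so this module can be imported without scirpy present.
    import scirpy as ir

    # `tag` is keyword-only in the signature documented above.
    return ir.datasets.vdjdb(tag=tag)
```

Recording the tag alongside analysis results makes it possible to rerun the analysis later against the same database snapshot.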

Return type:

AnnData

Returns:

An AnnData object containing all entries from VDJdb in obsm["airr"]. Each entry is represented as if it were a cell, but without gene expression. Metadata is stored in adata.uns["DB"].
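The returned object can be inspected roughly as follows. This is a sketch that relies only on the attributes named above (`obsm["airr"]`, `uns["DB"]`) plus the standard AnnData attribute `n_obs`; the helper `describe_reference` is hypothetical, and the exact contents of `uns["DB"]` are not specified here, so it is passed through unchanged.

```python
def describe_reference(adata):
    """Summarize an AnnData object as returned by scirpy.datasets.vdjdb()."""
    return {
        # One observation per database entry (each entry is treated like a cell).
        "n_entries": adata.n_obs,
        # Receptor data is expected under obsm["airr"].
        "has_airr": "airr" in adata.obsm,
        # Database metadata, returned as-is; None if the key is absent.
        "db_metadata": adata.uns.get("DB"),
    }
```

With scirpy installed, `describe_reference(scirpy.datasets.vdjdb())` gives a quick sanity check that the download produced the expected structure.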