scirpy.datasets.iggytop
- scirpy.datasets.iggytop(*, deduplicated=True, tag='latest')
Return the IggyTop database as an AnnData object.
IggyTop (Immunological Graph Yielding Top receptor-epitope pairings) is a harmonized database of immunoreceptor-epitope pairings integrating data from multiple sources: IEDB, VDJdb, McPAS-TCR, CEDAR, ITRAP, TRAIT, TCR3d, and NeoTCR. V(D)J genes are normalized to IMGT standards and CDR3 sequences are harmonized following AIRR standards. Pre-built datasets are released bimonthly.
By default, a deduplicated version of the dataset is returned. Use this version if you’d like to work with the integrated resource combining data from all source datasets. If you prefer to work with a single resource, set deduplicated=False and filter the resource of interest via .obs["source"].
Note
Scirpy datasets are managed through Pooch.
By default, the dataset will be downloaded into your operating system’s default cache directory (see pooch.os_cache() for more details). If it has already been downloaded, it will be retrieved from the cache.
You can override the default cache dir by setting the SCIRPY_DATA_DIR environment variable to a path of your preference.
- Parameters:
deduplicated (bool (default: True)) – If True, return the deduplicated and 10X-filtered dataset. If False, return the full merged dataset including all source records.
tag (str (default: 'latest')) – The IggyTop release tag to use. Defaults to "latest", which always fetches the most recent release. For reproducibility, pin a specific release tag (e.g. "data-2026.04.25.075304").
- Return type:
AnnData
- Returns:
An AnnData object containing immunoreceptor-epitope pairings from IggyTop in obsm["airr"]. Each entry is represented as if it were a cell, but without gene expression data.
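Example

The following is a minimal sketch of the single-resource workflow described above: load the full merged dataset with deduplicated=False, then keep only the records from one source via .obs["source"]. The helper names select_source and load_iggytop_source are hypothetical and not part of scirpy. The label "VDJdb" is an assumption about how sources are spelled in the source column, so it is worth inspecting adata.obs["source"].unique() first.

```python
def select_source(adata, source):
    """Keep only the records whose obs["source"] equals `source`.

    Hypothetical helper, not part of scirpy.
    """
    # Boolean mask over observations; converted to NumPy for plain positional indexing.
    mask = (adata.obs["source"] == source).to_numpy()
    return adata[mask].copy()


def load_iggytop_source(source, tag="latest"):
    """Load the full merged IggyTop dataset and keep a single source.

    Hypothetical helper. `source` must match a label that actually occurs in
    obs["source"]; the labels are not listed here, so check them first.
    """
    # Imported here so that select_source can be used without scirpy installed.
    import scirpy as ir

    # deduplicated=False returns the full merged dataset including all source records.
    adata = ir.datasets.iggytop(deduplicated=False, tag=tag)
    return select_source(adata, source)


if __name__ == "__main__":
    # "VDJdb" is an assumed label; replace it with a value found in obs["source"].
    # For reproducible analyses, pass a pinned release tag instead of "latest".
    adata = load_iggytop_source("VDJdb", tag="latest")
    print(adata)
```

The first call downloads the dataset through Pooch. Later calls reuse the cached copy.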