Adaptive Immune Receptor Repertoire. Within the Scirpy documentation, we simply speak of immune receptors (IR).

The AIRR community defines standards around AIRR data. Scirpy supports the AIRR Rearrangement schema and complies with the AIRR Software Guidelines.

Allelically included B-cells#

A B cell with two pairs of IG chains. See Dual IR.

awkward array#

Awkward arrays are a data structure for representing nested, variable-sized data (such as lists of lists or lists of dictionaries). They are computationally efficient and can be manipulated with NumPy-like idioms.

For more details, check out the awkward documentation.


B-cell receptor. A BCR consists of two immunoglobulin (IG) heavy chains and two IG light chains. Each heavy and each light chain contains a variable region; together, these regions are responsible for antigen recognition.


Image by CNX OpenStax under the CC BY-4.0 license, obtained from Wikimedia Commons#


Complementarity-determining region. The diversity and, therefore, antigen specificity of IRs is predominantly determined by three hypervariable loops (CDR1, CDR2, and CDR3) on each of the α and β receptor arms.

CDR1 and CDR2 are fully encoded in germline V genes. In contrast, the CDR3 loops are assembled from V, (D), and J segments and comprise random additions and deletions at the junction sites. Thus, CDR3 regions make up a large part of the adaptive immune receptor variability and are therefore thought to be particularly important for antigen specificity (reviewed in [AHS15]).


Image from [AHS15] under the CC BY-NC-SA-3.0 license.#


Complementarity-determining region 3. See CDR.

Chain locus#

Scirpy supports all valid IMGT locus names:

Loci with a VJ junction:
  • TRA (T-cell receptor alpha)

  • TRG (T-cell receptor gamma)

  • IGL (Immunoglobulin lambda)

  • IGK (Immunoglobulin kappa)

Loci with a VDJ junction:
  • TRB (T-cell receptor beta)

  • TRD (T-cell receptor delta)

  • IGH (Immunoglobulin heavy chain)
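
As a minimal sketch, the split of loci into VJ and VDJ junctions listed above can be written as a lookup table. The helper name `junction_type` is hypothetical and not part of the Scirpy API.

```python
# Junction type for each IMGT locus name, exactly as listed above.
LOCUS_JUNCTION = {
    "TRA": "VJ",
    "TRG": "VJ",
    "IGL": "VJ",
    "IGK": "VJ",
    "TRB": "VDJ",
    "TRD": "VDJ",
    "IGH": "VDJ",
}


def junction_type(locus: str) -> str:
    """Return "VJ" or "VDJ" for an IMGT locus name (hypothetical helper)."""
    try:
        return LOCUS_JUNCTION[locus.upper()]
    except KeyError:
        raise ValueError(f"Not a valid IMGT locus name: {locus!r}") from None


print(junction_type("TRA"))
print(junction_type("IGH"))
```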


A clonotype designates a collection of T or B cells that descend from a common, antecedent cell, and therefore, bear the same adaptive immune receptors and recognize the same epitopes.

In single-cell RNA-sequencing (scRNA-seq) data, T or B cells sharing identical complementarity-determining region 3 (CDR3) nucleotide sequences of both VJ and VDJ chains (e.g. both α and β TCR chains) make up a clonotype.

Scirpy provides a flexible approach to clonotype definition based on CDR3 sequence identity or similarity. Additionally, it is possible to require clonotypes to have the same V gene, which forces the CDR1 and CDR2 regions to be identical.
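
The identity-based definition can be illustrated with a small, self-contained sketch. It is not Scirpy's implementation: the toy barcodes, sequences, and the function name `group_by_identical_cdr3` are assumptions made for illustration, and each cell is assumed to carry exactly one VJ and one VDJ chain.

```python
from collections import defaultdict

# Hypothetical toy data: cell barcode -> (VJ CDR3 nt sequence, VDJ CDR3 nt sequence)
cells = {
    "cell1": ("TGTGCC", "TGTGCCAGC"),
    "cell2": ("TGTGCC", "TGTGCCAGC"),
    "cell3": ("TGTGCT", "TGTGCCAGC"),
}


def group_by_identical_cdr3(cells):
    """Group cells whose VJ *and* VDJ CDR3 nucleotide sequences are identical."""
    groups = defaultdict(list)
    for barcode, (vj_cdr3, vdj_cdr3) in cells.items():
        groups[(vj_cdr3, vdj_cdr3)].append(barcode)
    # Each group of barcodes corresponds to one clonotype in this simplified model.
    return sorted(sorted(members) for members in groups.values())


clonotypes = group_by_identical_cdr3(cells)
print(clonotypes)
```

In this toy data set, `cell3` shares the VDJ sequence with the other two cells but differs in its VJ sequence, so it ends up in a clonotype of its own.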

For more details, see the page about our IR model and the API documentation of

Clonotype cluster#

A higher-order aggregation of clonotypes that have different CDR3 nucleotide sequences, but might recognize the same antigen because they have the same or similar CDR3 amino acid sequence.
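
A rough sketch of such an aggregation is shown below. It links CDR3 amino acid sequences that differ in at most `cutoff` positions (Hamming distance) and reports the connected groups. The choice of Hamming distance, the cutoff, the toy sequences, and the function names are all assumptions for illustration and do not describe Scirpy's own procedure.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def cluster_by_similarity(cdr3_aa_seqs, cutoff=1):
    """Single-linkage grouping of sequences within `cutoff` mismatches."""
    seqs = sorted(set(cdr3_aa_seqs))
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent pointers to the representative of the group.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        dist = hamming(a, b)
        if dist is not None and dist <= cutoff:
            parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for s in seqs:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


clusters = cluster_by_similarity(["CASSLGF", "CASSLGY", "CASRPGF", "CAWT"])
print(clusters)
```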

See also:

Clonotype modularity#

The clonotype modularity measures how densely connected the transcriptomics neighborhood graph underlying the cells in a clonotype is. Clonotypes with a high modularity consist of cells that are transcriptionally more similar than those of a clonotype with a low modularity. See also

Convergent evolution of clonotypes#

It has been proposed that IRs are subject to convergent evolution, i.e. a selection pressure that leads to IRs recognizing the same antigen ([VKP+06]).

Evidence of convergent evolution could be clonotypes with the same CDR3 amino acid sequence, but different CDR3 nucleotide sequences (due to synonymous codons) or clonotypes with highly similar CDR3 amino acid sequences that recognize the same antigen.
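
The first kind of evidence can be searched for with a simple grouping step, sketched here under the assumption that each clonotype is available as a record with its CDR3 nucleotide and amino acid sequence. The record layout, the toy data, and the function name `convergent_candidates` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (clonotype id, CDR3 nucleotide sequence, CDR3 amino acid sequence)
clonotypes = [
    ("ct1", "TGTGCCAGC", "CAS"),
    ("ct2", "TGCGCTTCT", "CAS"),  # synonymous codons, same peptide as ct1
    ("ct3", "TGTGCCTGG", "CAW"),
]


def convergent_candidates(clonotypes):
    """Map each CDR3 amino acid sequence encoded by more than one distinct
    nucleotide sequence to the ids of the clonotypes that carry it."""
    by_aa = defaultdict(set)
    for ct_id, cdr3_nt, cdr3_aa in clonotypes:
        by_aa[cdr3_aa].add((ct_id, cdr3_nt))
    return {
        aa: sorted(ct_id for ct_id, _ in members)
        for aa, members in by_aa.items()
        if len({nt for _, nt in members}) > 1
    }


candidates = convergent_candidates(clonotypes)
print(candidates)
```

Such a hit is only a candidate: whether the clonotypes actually recognize the same antigen cannot be concluded from sequence data alone.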

Dual IR#

IRs with more than one pair of VJ and VDJ sequences. While this was previously thought to be impossible due to the mechanism of allelic exclusion ([BSB10]), there is an increasing amount of evidence for a bona fide dual-IR population ([SB19], [JPG10], [VS10]).

For more information on how Scirpy handles dual IRs, see the page about our IR model.

Dual TCR#

TCRs with more than one pair of α and β (or γ and δ) chains. See Dual IR.


The part of an antigen that is recognized by the TCR, BCR, or antibody.




Immune receptor.

Multi-tissue clonotype#

A clonotype that occurs in multiple tissues of the same patient.


Cells with more than two pairs of VJ and VDJ sequences that do not fit into the Dual IR model. These are usually rare and could be explained by doublets/multiplets, i.e. two or more cells that were captured in the same droplet.


(a) UMAP plot of 96,000 cells from [WMdA+20] with at least one detected CDR3 sequence, with multichain cells (n=474) highlighted in green. (b) Comparison of detected reads per cell in multichain cells and other cells. Multichain cells comprised significantly more reads per cell (p = 9.45 × 10⁻²⁵¹, Wilcoxon-Mann-Whitney test), supporting the hypothesis that (most of) the multichain cells are technical artifacts arising from cell multiplets ([IKK+16]).#

Orphan chain#

An IR chain is called orphan if its corresponding counterpart has not been detected. For instance, if a cell has only a VJ chain (e.g. TCR-alpha) but no VDJ chain (e.g. TCR-beta), the cell will be flagged as “Orphan VJ”.

Orphan chains are most likely the effect of stochastic dropouts due to sequencing inefficiencies.
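
The flagging rule can be sketched as a function of the number of detected VJ and VDJ chains per cell. This is a deliberately simplified illustration: only the “Orphan VJ” label is taken from the text above, while the function name and the remaining labels are assumptions, and dual IRs and multichain cells are not distinguished.

```python
def pairing_status(n_vj: int, n_vdj: int) -> str:
    """Classify a cell by its number of detected VJ and VDJ chains (simplified)."""
    if n_vj < 0 or n_vdj < 0:
        raise ValueError("chain counts must be non-negative")
    if n_vj == 0 and n_vdj == 0:
        return "No IR"  # assumed label
    if n_vdj == 0:
        return "Orphan VJ"  # VJ chain(s) without a VDJ counterpart
    if n_vj == 0:
        return "Orphan VDJ"  # assumed label, by analogy with "Orphan VJ"
    return "Paired"  # assumed label: at least one chain of each kind


print(pairing_status(1, 0))
print(pairing_status(1, 1))
```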

See also

Private clonotype#

A clonotype that is specific for a certain patient.

Productive chain#

Productive chains are IR chains with a CDR3 sequence that produces a functional peptide. Scirpy relies on the preprocessing tools (e.g. CellRanger or TraCeR) for flagging non-productive chains. Typically, chains are flagged as non-productive if they contain a stop codon or are not within the reading frame.
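
The two typical criteria can be illustrated with a minimal check on a CDR3 nucleotide sequence. This sketch is an assumption about how such a check might look, not the logic of CellRanger or TraCeR, which assess the whole rearrangement; the function name `looks_productive` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def looks_productive(cdr3_nt: str) -> bool:
    """Return True if the sequence is in frame and free of stop codons."""
    seq = cdr3_nt.upper()
    # Out of frame: length is not a multiple of three (or sequence is empty).
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


print(looks_productive("TGTGCCAGC"))  # in frame, no stop codon
print(looks_productive("TGTTAAAGC"))  # contains the stop codon TAA
print(looks_productive("TGTGCCAG"))   # length not divisible by three
```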

Public clonotype#

A clonotype that is shared across multiple patients, e.g. a clonotype recognizing a common viral epitope.
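
The distinction between private and public clonotypes amounts to counting the patients in which a clonotype was observed. The sketch below assumes a hypothetical list of (clonotype, patient) observations; the function name `classify_sharing` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical observations: one (clonotype id, patient id) pair per cell
observations = [
    ("ct1", "P1"),
    ("ct1", "P2"),
    ("ct2", "P1"),
    ("ct2", "P1"),
]


def classify_sharing(observations):
    """Label each clonotype "public" (several patients) or "private" (one patient)."""
    patients = defaultdict(set)
    for clonotype, patient in observations:
        patients[clonotype].add(patient)
    return {
        clonotype: "public" if len(seen_in) > 1 else "private"
        for clonotype, seen_in in patients.items()
    }


sharing = classify_sharing(observations)
print(sharing)
```

Replacing the patient identifier with a tissue identifier (within a single patient) gives the analogous distinction between tissue-specific and multi-tissue clonotypes.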


Image from [SMR+18] under the CC BY-4.0 license.#

Receptor subtype#

More fine-grained classification of the receptor type into

  • α/β T cells

  • γ/δ T cells

  • IG-heavy/IG-κ B cells

  • IG-heavy/IG-λ B cells
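
As a hedged sketch, receptor type and subtype can be derived from the set of loci detected in a cell. The labels, the handling of ambiguous cases, and the function names are assumptions for illustration and may differ from what Scirpy reports. In particular, a cell with only an IGH chain is treated as ambiguous here because it fits both B-cell subtypes.

```python
# Loci allowed in each subtype; the label strings are hypothetical.
SUBTYPES = {
    "TRA+TRB": {"TRA", "TRB"},  # α/β T cells
    "TRG+TRD": {"TRG", "TRD"},  # γ/δ T cells
    "IGH+IGK": {"IGH", "IGK"},  # IG-heavy/IG-κ B cells
    "IGH+IGL": {"IGH", "IGL"},  # IG-heavy/IG-λ B cells
}


def receptor_type(loci):
    """Coarse classification into "TCR" or "BCR" based on the locus prefix."""
    observed = set(loci)
    if not observed:
        return "no IR"
    if all(locus.startswith("TR") for locus in observed):
        return "TCR"
    if all(locus.startswith("IG") for locus in observed):
        return "BCR"
    return "ambiguous"


def receptor_subtype(loci):
    """Fine-grained classification; "ambiguous" unless exactly one subtype fits."""
    observed = set(loci)
    if not observed:
        return "no IR"
    matches = [name for name, allowed in SUBTYPES.items() if observed <= allowed]
    return matches[0] if len(matches) == 1 else "ambiguous"


print(receptor_type(["TRA", "TRB"]), receptor_subtype(["TRA", "TRB"]))
print(receptor_type(["IGH", "IGK"]), receptor_subtype(["IGH", "IGK"]))
```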

See also

Receptor type#

Classification of immune receptors into BCR and TCR.

See also


T-cell receptor. A TCR consists of one α and one β chain (or, alternatively, one γ and one δ chain). Each chain consists of a constant and a variable region. The variable region is responsible for antigen recognition, mediated by CDR regions.

For more information on how Scirpy represents TCRs, see the page about our receptor model.


Image from Wikimedia commons under the CC BY-3.0 license.#

Tissue-specific clonotype#

A clonotype that only occurs in a certain tissue of a certain patient.


Unique molecular identifier. Some single-cell RNA-seq protocols label each RNA with a unique barcode prior to PCR-amplification to mitigate PCR bias. With these protocols, UMI-counts replace the read-counts generally used with RNA-seq.
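
The difference between read counts and UMI counts can be shown with a toy example. The read tuples, gene name, and function name below are hypothetical, and real pipelines additionally correct for sequencing errors in the UMIs, which this sketch ignores.

```python
from collections import defaultdict

# Hypothetical aligned reads: (cell barcode, gene, UMI)
reads = [
    ("cellA", "CD3E", "AACG"),
    ("cellA", "CD3E", "AACG"),  # PCR duplicate of the same molecule
    ("cellA", "CD3E", "TTGC"),
    ("cellB", "CD3E", "GGAT"),
]


def count_reads_and_umis(reads):
    """Return (read counts, UMI counts) per (cell, gene) pair."""
    read_counts = defaultdict(int)
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        read_counts[(cell, gene)] += 1
        umis[(cell, gene)].add(umi)  # duplicates collapse in the set
    umi_counts = {key: len(values) for key, values in umis.items()}
    return dict(read_counts), umi_counts


read_counts, umi_counts = count_reads_and_umis(reads)
print(read_counts)
print(umi_counts)
```

For `cellA`, three reads collapse to two UMIs, because two of the reads carry the same barcode and are therefore counted as one original molecule.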


The variability of IR chain sequences originates from the genetic recombination of Variable, Diversity and Joining gene segments. The TCR-α, TCR-γ, IG-κ, and IG-λ chains are assembled from V and J segments only. We refer to these chains as VJ chains in Scirpy. The TCR-β, TCR-δ, and IG-heavy chains are assembled from all three segments. We refer to these chains as VDJ chains in Scirpy.

As an example, the figure below shows how a TCR-α chain is assembled from the TRA locus. V to J recombination joins one of many TRAV segments to one of many TRAJ segments. Next, introns are spliced out, resulting in a TCR-α chain transcript with V, J and C segments directly next to each other (reviewed in [AHS15]).


Image from [AHS15] under the CC BY-NC-SA-3.0 license.#