class scirpy.ir_dist.metrics.ParallelDistanceCalculator(cutoff, *, n_jobs=-1, block_size=None)

Abstract base class for a DistanceCalculator that computes distances in parallel.

Distances are computed block by block. Subclasses must override the function that computes the distances for a single block.

  • n_jobs (int (default: -1)) – Number of jobs to use for the pairwise distance calculation, passed to joblib.Parallel. If -1, use all CPUs (only for ParallelDistanceCalculators). Via the joblib.parallel_config context manager, another backend (e.g. dask) can be selected.

  • block_size (Optional[int] (default: None)) – Deprecated. This is now set in calc_dist_mat.
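The blockwise scheme described above can be sketched without scirpy. The following is a minimal illustration, not the library's implementation: the names `hamming`, `compute_block`, and `blockwise_distances` are hypothetical, the (distance, row, col) triple format is an assumption, and the standard library's `ThreadPoolExecutor` stands in for joblib so that the example has no third-party dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def hamming(a: str, b: str) -> int:
    """Toy metric: number of mismatching positions (equal lengths assumed)."""
    return sum(x != y for x, y in zip(a, b))


def compute_block(seqs1, seqs2, origin):
    """Hypothetical per-block function: the part a subclass would override.

    `origin` is the (row, col) offset of this block in the full matrix.
    Returns a list of (distance, row, col) triples.
    """
    row0, col0 = origin
    return [
        (hamming(s1, s2), row0 + i, col0 + j)
        for i, s1 in enumerate(seqs1)
        for j, s2 in enumerate(seqs2)
    ]


def blockwise_distances(seqs, block_size=2, n_workers=2):
    """Split the all-vs-all problem into blocks and compute them concurrently."""
    starts = range(0, len(seqs), block_size)
    jobs = [
        (seqs[r:r + block_size], seqs[c:c + block_size], (r, c))
        for r, c in product(starts, starts)
    ]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        blocks = list(pool.map(lambda job: compute_block(*job), jobs))
    # Merge the per-block results into one {(row, col): distance} mapping.
    return {(r, c): d for block in blocks for d, r, c in block}


dist = blockwise_distances(["AAA", "AAC", "ACC", "CCC"])
```

Only `compute_block` knows about the metric; the splitting and merging logic is generic, which is the reason a base class can own it and leave the per-block function to subclasses.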

Attributes table

DTYPE

The sparse matrix dtype.

Methods table

calc_dist_mat(seqs[, seqs2, block_size])

Calculate the distance matrix.

squarify(triangular_matrix)

Mirror a triangular matrix at the diagonal to make it a square matrix.


ParallelDistanceCalculator.DTYPE = 'uint8'

The sparse matrix dtype. Defaults to uint8, constraining the max distance to 255.
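The 255 limit follows from the width of an unsigned 8-bit integer. As a small illustration, the sketch below uses the standard library's `array` type with typecode `"B"` (unsigned 8-bit) as a stand-in for a uint8 matrix; `UINT8_MAX` and `fits_in_uint8` are hypothetical names, and how the library itself reacts to larger values is not shown here.

```python
from array import array

UINT8_MAX = 2**8 - 1  # 255, the largest value an unsigned 8-bit integer holds


def fits_in_uint8(distance: int) -> bool:
    """Check whether a distance can be stored without exceeding the dtype."""
    return 0 <= distance <= UINT8_MAX


# Typecode "B" is an unsigned 8-bit integer, comparable to uint8.
stored = array("B", [0, 17, 255])

try:
    stored.append(256)  # one past the maximum
    overflowed = False
except OverflowError:
    overflowed = True
```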


ParallelDistanceCalculator.calc_dist_mat(seqs, seqs2=None, *, block_size=None)

Calculate the distance matrix.

See DistanceCalculator.calc_dist_mat().

  • seqs (Sequence[str]) – Array containing CDR3 sequences. Must not contain duplicates.

  • seqs2 (Optional[Sequence[str]] (default: None)) – Second array containing CDR3 sequences. Must not contain duplicates either.

  • block_size (Optional[int] (default: None)) – The width of a block that is sent to a worker. A block contains block_size ** 2 elements. If None, the block size is determined automatically based on the problem size.

Return type:



Returns:

Sparse pairwise distance matrix.
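To make the block_size arithmetic concrete, the sketch below counts how many blocks a full rectangular problem would be split into and how many elements each block holds at most. `block_layout` is a hypothetical helper; it assumes every block of the full matrix is computed, which may differ from how the library schedules its work.

```python
import math


def block_layout(n_rows: int, n_cols: int, block_size: int) -> dict:
    """Describe how an n_rows x n_cols problem splits into square blocks."""
    blocks_down = math.ceil(n_rows / block_size)
    blocks_across = math.ceil(n_cols / block_size)
    return {
        "n_blocks": blocks_down * blocks_across,
        # A full block is block_size wide and block_size high.
        "max_elements_per_block": block_size**2,
        # (row, col) offset of each block within the full matrix.
        "origins": [
            (r * block_size, c * block_size)
            for r in range(blocks_down)
            for c in range(blocks_across)
        ],
    }


square = block_layout(1000, 1000, 250)
ragged = block_layout(1000, 500, 300)
```

Blocks at the right and bottom edges hold fewer than block_size ** 2 elements whenever the matrix dimensions are not multiples of block_size, as in the second call.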

static ParallelDistanceCalculator.squarify(triangular_matrix)

Mirror a triangular matrix at the diagonal to make it a square matrix.

The input matrix must be upper triangular to begin with; otherwise the results will be incorrect. The input is not validated.

Return type:
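The mirroring performed by squarify, and the reason an input that is not upper triangular gives wrong results, can be illustrated with plain nested lists. This sketch assumes the mirror is built as M + Mᵀ − diag(M), which is one common way to do it and is not confirmed to be the library's implementation; `squarify_sketch` is a hypothetical name.

```python
def squarify_sketch(m):
    """Mirror an upper triangular matrix at the diagonal.

    Computes M + M^T - diag(M) element by element. Like the method
    described above, it does not check that `m` is upper triangular.
    """
    n = len(m)
    return [
        [m[i][j] + m[j][i] - (m[i][i] if i == j else 0) for j in range(n)]
        for i in range(n)
    ]


upper = [
    [1, 2, 3],
    [0, 4, 5],
    [0, 0, 6],
]
mirrored = squarify_sketch(upper)

# An input with a non-zero entry below the diagonal is silently accepted,
# but the off-diagonal values are summed instead of mirrored.
not_upper = [
    [1, 2],
    [7, 4],
]
wrong = squarify_sketch(not_upper)
```

With the valid input, the result is symmetric and keeps the original upper triangle. With the invalid input, the result is still symmetric, so the error is easy to miss.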