scirpy.ir_dist.metrics.LevenshteinDistanceCalculator
- class scirpy.ir_dist.metrics.LevenshteinDistanceCalculator(cutoff=2, **kwargs)
Calculates the Levenshtein edit-distance between sequences.
The edit distance is the total number of deletion, addition and modification events.
This class relies on Python-levenshtein to calculate the distances.
- Choosing a cutoff:
Each unit of distance stands for one deletion, addition or modification event. Although empirical data is lacking, it seems unlikely that CDR3 sequences with more than two modifications still recognize the same antigen.
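To make the definition concrete, here is a minimal pure-Python sketch of the edit distance and of a cutoff check. It is an illustration only: the class itself delegates to Python-levenshtein, and the helper names `levenshtein` and `within_cutoff` as well as the example sequences are assumptions made for this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string inside
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # addition
                    previous[j - 1] + (ch_a != ch_b),  # modification (or match)
                )
            )
        previous = current
    return previous[-1]


def within_cutoff(a: str, b: str, cutoff: int = 2) -> bool:
    """True if the pair would survive a cutoff, i.e. distance <= cutoff."""
    return levenshtein(a, b) <= cutoff


# Hypothetical CDR3-like sequences, chosen only for illustration.
reference = "CASSLGTDTQYF"
for other in ["CASSLGTDTQYF", "CASSLATDTQYF", "CASSLGTQYF"]:
    print(other, levenshtein(reference, other), within_cutoff(reference, other))
```

With the default cutoff of 2, a pair is retained when at most two deletion, addition or modification events separate the two sequences.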
- Parameters:
cutoff (int (default: 2)) – Will eliminate distances > cutoff to make efficient use of sparse matrices. The default cutoff is 2.
n_jobs – Number of jobs to use for the pairwise distance calculation, passed to joblib.Parallel. If -1, use all CPUs (only for ParallelDistanceCalculators). Via the joblib.parallel_config context manager, another backend (e.g. dask) can be selected.
block_size – Deprecated. This is now set in calc_dist_mat.
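The effect of the cutoff on storage can be sketched with a toy example. The code below is not scirpy's internal storage scheme; it uses a plain dictionary as a stand-in for a sparse matrix, and the sequences, the hand-computed distances, and the helper `apply_cutoff` are hypothetical.

```python
# Hypothetical toy input: four short sequences and their hand-computed
# edit distances (row i, column j = distance between seqs[i] and seqs[j]).
seqs = ["AAA", "AAB", "ABB", "CCCCCC"]
dense = [
    [0, 1, 2, 6],
    [1, 0, 1, 6],
    [2, 1, 0, 6],
    [6, 6, 6, 0],
]


def apply_cutoff(dense, cutoff=2):
    """Keep only the entries whose distance does not exceed the cutoff."""
    return {
        (i, j): d
        for i, row in enumerate(dense)
        for j, d in enumerate(row)
        if d <= cutoff
    }


kept = apply_cutoff(dense, cutoff=2)
print(f"{len(kept)} of {len(dense) ** 2} entries kept")
```

The more dissimilar the sequences in a dataset are, the fewer entries survive the cutoff, which is what makes a sparse representation worthwhile.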
Attributes table
DTYPE | The sparse matrix dtype.
Methods table
calc_dist_mat(seqs, seqs2=None, *, block_size=None) | Calculate the distance matrix.
squarify(triangular_matrix) | Mirror a triangular matrix at the diagonal to make it a square matrix.
Attributes
- LevenshteinDistanceCalculator.DTYPE = 'uint8'
The sparse matrix dtype. Defaults to uint8, constraining the max distance to 255.
Methods
- LevenshteinDistanceCalculator.calc_dist_mat(seqs, seqs2=None, *, block_size=None)
Calculate the distance matrix.
See DistanceCalculator.calc_dist_mat().
- Parameters:
seqs (Sequence[str]) – Array containing CDR3 sequences. Must not contain duplicates.
seqs2 (Optional[Sequence[str]] (default: None)) – Second array containing CDR3 sequences. Must not contain duplicates either.
block_size (Optional[int] (default: None)) – The width of a block that’s sent to a worker. A block contains block_size ** 2 elements. If None, the block size is determined automatically based on the problem size.
- Return type:
- Returns:
Sparse pairwise distance matrix.
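The role of block_size can be pictured as tiling the pairwise matrix into square chunks. The sketch below is an assumption about the general idea, not scirpy's actual scheduling code (which may, for instance, skip redundant blocks); `iter_blocks` is a hypothetical helper.

```python
from itertools import product


def iter_blocks(n_rows, n_cols, block_size):
    """Yield (row_slice, col_slice) pairs that tile an n_rows x n_cols matrix."""
    row_starts = range(0, n_rows, block_size)
    col_starts = range(0, n_cols, block_size)
    for r, c in product(row_starts, col_starts):
        # Blocks at the right and bottom edges may be smaller than block_size.
        yield (
            slice(r, min(r + block_size, n_rows)),
            slice(c, min(c + block_size, n_cols)),
        )


# Five sequences compared against themselves, in blocks of width 2.
for rows, cols in iter_blocks(5, 5, block_size=2):
    n_elements = (rows.stop - rows.start) * (cols.stop - cols.start)
    print(rows, cols, n_elements)
```

A full block covers block_size ** 2 pairs, so a larger block_size means fewer, heavier tasks per worker, while a smaller one means more tasks and more scheduling overhead.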
- static LevenshteinDistanceCalculator.squarify(triangular_matrix)
Mirror a triangular matrix at the diagonal to make it a square matrix.
The input matrix must be upper triangular to begin with, otherwise the results will be incorrect. No guard rails!
- Return type:
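Why the input must be upper triangular can be seen from one common way of mirroring a matrix: add the matrix to its transpose and subtract the diagonal once. The sketch below uses nested lists instead of a sparse matrix and is an assumption about the technique, not a copy of scirpy's implementation; `squarify_sketch` is a hypothetical name.

```python
def squarify_sketch(m):
    """Mirror an upper triangular matrix: M + M^T - diag(M)."""
    n = len(m)
    return [
        [
            # Off the diagonal, one of m[i][j] and m[j][i] is expected to be 0.
            m[i][j] + m[j][i] - (m[i][i] if i == j else 0)
            for j in range(n)
        ]
        for i in range(n)
    ]


upper = [
    [1, 2, 3],
    [0, 1, 4],
    [0, 0, 1],
]
print(squarify_sketch(upper))

# A matrix that is NOT upper triangular is silently mangled:
# the off-diagonal entries 2 and 5 are summed instead of mirrored.
print(squarify_sketch([[1, 2], [5, 1]]))
```

Under this approach nothing checks that the lower triangle is empty, so any non-zero values there are added to the mirrored ones rather than rejected.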