scirpy.ir_dist.metrics.HammingDistanceCalculator
- class scirpy.ir_dist.metrics.HammingDistanceCalculator(n_jobs=-1, n_blocks=1, cutoff=2, *, normalize=False, histogram=False)
Computes pairwise distances between gene sequences based on the “hamming” distance metric.
Set normalize to True to use the normalized Hamming distance instead of the standard Hamming distance. The distance is then calculated as the percentage of differing positions relative to the sequence length (e.g. AAGG and AAAA -> 50 (%) normalized Hamming distance). The cutoff is then also given as a normalized Hamming distance in percent.

The code of this class is based on pwseqdist. Reused under MIT license, Copyright (c) 2020 Andrew Fiore-Gartland.
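
The difference between the standard and the normalized metric can be illustrated with a minimal pure-Python sketch. The helper names `hamming` and `normalized_hamming` are hypothetical and are not part of scirpy; the sketch only reproduces the AAGG/AAAA example from the description and says nothing about how the class computes distances internally.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def normalized_hamming(a: str, b: str) -> float:
    """Hamming distance as a percentage of the sequence length."""
    return 100 * hamming(a, b) / len(a)


print(hamming("AAGG", "AAAA"))             # 2 differing positions
print(normalized_hamming("AAGG", "AAAA"))  # 50.0 (percent)
```

With normalize=True, a cutoff of 50 would therefore keep this pair, while the default absolute cutoff of 2 keeps it only because exactly two positions differ.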
- Parameters:
cutoff (int, default: 2) – Eliminates distances > cutoff to make efficient use of sparse matrices.
n_jobs (int, default: -1) – Number of numba parallel threads to use for the pairwise distance calculation.
n_blocks (int, default: 1) – Number of joblib delayed objects (blocks to compute) given to joblib.Parallel.
normalize (bool, default: False) – Determines whether the normalized Hamming distance should be used instead of the standard Hamming distance.
histogram (bool, default: False) – Determines whether a nearest neighbor histogram should be created.
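
To show why a cutoff keeps the result sparse, the following sketch builds the three CSR arrays (data, indices, indptr) by hand for a handful of sequences. It is an illustration under stated assumptions, not the library's implementation: the function `sparse_hamming_csr` is hypothetical, pairs of unequal length are simply skipped, and kept distances are stored as d + 1 so that a distance of 0 stays distinguishable from a dropped pair. None of these three choices is stated in the parameter descriptions above.

```python
def hamming(a, b):
    """Count differing positions of two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def sparse_hamming_csr(seqs, cutoff=2):
    """Return (data, indices, indptr) of a CSR matrix of pairwise distances.

    Assumptions of this sketch: pairs of unequal length and pairs with a
    distance > cutoff are not stored; kept distances are stored as d + 1.
    """
    data, indices, indptr = [], [], [0]
    for a in seqs:
        for j, b in enumerate(seqs):
            if len(a) != len(b):
                continue  # assumption: unequal lengths are never compared
            d = hamming(a, b)
            if d <= cutoff:
                data.append(d + 1)  # offset so that d == 0 is still stored
                indices.append(j)
        indptr.append(len(data))  # end of the current row
    return data, indices, indptr


seqs = ["AAAA", "AAGG", "GGGG", "AAA"]
data, indices, indptr = sparse_hamming_csr(seqs, cutoff=2)
print(len(data), "of", len(seqs) ** 2, "entries stored")
```

Only pairs within the cutoff occupy memory, so a tighter cutoff means fewer stored entries.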
Methods table
| calc_dist_mat(seqs, seqs2=None) | Calculates the pairwise distances between two vectors of gene sequences based on the distance metric of the derived class and returns a CSR distance matrix. |
Methods
- HammingDistanceCalculator.calc_dist_mat(seqs, seqs2=None)
Calculates the pairwise distances between two vectors of gene sequences based on the distance metric of the derived class and returns a CSR distance matrix. Also creates a histogram based on the minimum value per row of the distance matrix if histogram is set to True.
- Return type:
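
The histogram built from the minimum value per row can be sketched in plain Python as follows. The function `nearest_neighbor_histogram` is hypothetical, and the sketch assumes that each sequence's comparison with itself is excluded and that rows without any neighbor within the cutoff contribute nothing; the description above does not specify either detail.

```python
from collections import Counter


def hamming(a, b):
    """Count differing positions of two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_histogram(seqs, cutoff=2):
    """Count how often each nearest-neighbor distance occurs.

    Assumptions of this sketch: self-comparisons are excluded, pairs of
    unequal length are skipped, and distances > cutoff are dropped first.
    """
    row_minima = []
    for i, a in enumerate(seqs):
        dists = [
            hamming(a, b)
            for j, b in enumerate(seqs)
            if j != i and len(a) == len(b)
        ]
        kept = [d for d in dists if d <= cutoff]
        if kept:  # rows with no neighbor within the cutoff are ignored
            row_minima.append(min(kept))
    return Counter(row_minima)


hist = nearest_neighbor_histogram(["AAAA", "AAAG", "AAGG", "GGGG"])
print(sorted(hist.items()))
```

Here three sequences have a nearest neighbor at distance 1, and GGGG has its nearest neighbor (AAGG) at distance 2.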