MOAL.algorithms.coding_theory package

Submodules

MOAL.algorithms.coding_theory.hamming_distance module

MOAL.algorithms.coding_theory.hamming_distance.hamming(str1, str2)[source]

Algorithm based on an old project of mine, ported from yavascript: christabor.github.io/etude/09-02-2014/.

A superior algorithm exists at wikipedia.org/wiki/Hamming_distance#Algorithm_example, but copying it would defeat the purpose.

MOAL.algorithms.coding_theory.lee_distance module

MOAL.algorithms.coding_theory.lee_distance.lee_distance(str1, str2, q=6)[source]

See https://en.wikipedia.org/wiki/Lee_distance for definition.

MOAL.algorithms.coding_theory.levenshtein_distance module

MOAL.algorithms.coding_theory.levenshtein_distance._branch(word, dist)[source]
MOAL.algorithms.coding_theory.levenshtein_distance._check_degenerates(s1, s2)[source]

“In mathematics, a degenerate case is a limiting case in which an element of a class of objects is qualitatively different from the rest of the class and hence belongs to another, usually simpler, class.” -Wikipedia

The degenerates here are strings that are equal; these are not in the class of strings that need comparison by Levenshtein distance.

MOAL.algorithms.coding_theory.levenshtein_distance.lev_iterative(s1, s2)[source]

The iterative approach to the Levenshtein algorithm. Code is ported from (pseudo-code? c?) example on Wikipedia: wikipedia.org/wiki/Levenshtein_distance#Iterative_with_two_matrix_rows

One important thing to note: the Levenshtein function is NOT transitive, so order matters! That is, lev(s1, s2) != lev(s2, s1).

MOAL.algorithms.coding_theory.levenshtein_distance.lev_recursive(s1, s2, l1=None, l2=None, ct=0)[source]

Recursive implementation of the Levenshtein distance algorithm. Algorithm from Wikipedia pseudocode: wikipedia.org/wiki/Levenshtein_distance#Computing_Levenshtein_distance Canonical math example, but not useful for real implementations.

MOAL.algorithms.coding_theory.minhash module

MOAL.algorithms.coding_theory.minhash.DEBUG = False

See http://matthewcasperson.blogspot.com/2013/11/minhash-for-dummies.html for a description of the steps required.

MOAL.algorithms.coding_theory.minhash.hash_fnv1a(data)[source]
MOAL.algorithms.coding_theory.minhash.hash_shingles(shingles)[source]
MOAL.algorithms.coding_theory.minhash.jaccard_coefficient(set1, set2)[source]
MOAL.algorithms.coding_theory.minhash.jaccard_coefficient_naive(set1, set2)[source]
MOAL.algorithms.coding_theory.minhash.min_hash(set1, set2)[source]
MOAL.algorithms.coding_theory.minhash.randoms(total)[source]
MOAL.algorithms.coding_theory.minhash.rehash(hash, randoms)[source]
MOAL.algorithms.coding_theory.minhash.shingleify(pieces, offset=4)[source]

Converts a sentence into “shingles” of length offset