Specific Substring Problem

Specific Substring Problem: an application in bioinformatics

Given two sets of sequences A and B, the Specific Substring Problem is to find all minimum substrings in A that have distance at least k to every subsequence in B. This work presents three new implementations of the Maaß algorithm for the case where the distance is the Hamming distance: a naive cubic-time algorithm and two quadratic-time algorithms. We ran tests to compare the running times of these implementations with that of another recently described implementation of an algorithm that uses the edit distance. In addition, we conducted preliminary tests on a large Tara Ocean database, looking for efficient and effective strategies for finding the sequences in one set that are unique when compared with the other set.
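
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the Maaß algorithm and not any of the implementations evaluated in this work; it only illustrates one reading of the definition. The assumptions are: "minimum" is read as minimal under inclusion (no proper substring has the same property), a "subsequence in B" is read as a contiguous window of the same length as the candidate, and a candidate longer than every sequence in B is treated as trivially specific. The function names hamming, is_specific and minimal_specific_substrings are hypothetical.

```python
def hamming(u, w):
    """Number of mismatching positions between two equal-length strings."""
    return sum(a != b for a, b in zip(u, w))


def is_specific(u, B, k):
    """True if u has Hamming distance >= k to every len(u)-window in B.

    Assumption: the empty string is never specific, and a string longer
    than every sequence in B is vacuously specific (no window to compare).
    """
    m = len(u)
    if m == 0:
        return False
    for b in B:
        for j in range(len(b) - m + 1):
            if hamming(u, b[j:j + m]) < k:
                return False
    return True


def minimal_specific_substrings(A, B, k):
    """Brute-force enumeration of minimal specific substrings of A.

    Returns a set of (sequence index, start, end) triples, end exclusive.
    """
    found = set()
    for idx, a in enumerate(A):
        for i in range(len(a)):
            for j in range(i + 1, len(a) + 1):
                u = a[i:j]
                if not is_specific(u, B, k):
                    continue
                # u is minimal if dropping its first or its last character
                # yields a string that is no longer specific.
                if not is_specific(u[1:], B, k) and not is_specific(u[:-1], B, k):
                    found.add((idx, i, j))
                # Any longer substring starting at i contains u as a prefix,
                # so it cannot be minimal; move on to the next start.
                break
    return found
```

For example, with A = ["ACGT"], B = ["ACGA"] and k = 1, the only minimal specific substring under these assumptions is "T" at positions 3 to 4, because every shorter or earlier candidate matches some window of B exactly. The sketch runs in worse than cubic time and is meant only to pin down the definition.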
