Welcome!
To use the personalized features of this site, please log in or register.
If you have forgotten your username or password, we can help.
My Menu
Saved Items

A Hash Trie Filter Approach to Approximate String Matching for Genomic Databases

Ye-In Chang23 Contact Information, Jiun-Rung Chen23 and Min-Tze Hsu23

(23)  Dept. of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
Abstract
For genomic databases, approximate string matching with k errors is often considered for genomic sequences, where k errors could be caused by substitution, insertion, or deletion operations. In this paper, we propose a new approximate string matching method, the hash trie filter, for efficiently searching in genomic databases. Our method not only reduces the number of candidates by pruning some unreasonable matched positions, but also dynamically decides the number of ordered matched grams of one candidate, which results in the increase of precision. The experiment results show that the hash trie filter outperforms the well-known (k + s) q-samples filter in terms of the response time and the precision, under different lengths of the query patterns and different error levels.

Keywords  Approximate string matching - filter - hash trie


Contact Information Ye-In Chang
Email: changyi@cse.nsysu.edu.tw
Fulltext Preview (Small, Large)
Image of the first page of the fulltext

References secured to subscribers.



Export this chapter
Export this chapter as RIS | Text
 
Remote Address: 38.107.191.113 • Server: mpweb05
HTTP User Agent: CCBot/1.0 (+http://www.commoncrawl.org/bot.html)