Practical methods for constructing suffix trees
Hankins, Richard A.; Tata, Sandeep; Patel, Jignesh M.; Tian, Yuanyuan
2005-09
Citation
Tian, Yuanyuan; Tata, Sandeep; Hankins, Richard A.; Patel, Jignesh M.; (2005). "Practical methods for constructing suffix trees." The VLDB Journal 14(3): 281-299. <http://hdl.handle.net/2027.42/47869>
Abstract
Sequence datasets are ubiquitous in modern life-science applications, and querying sequences is a common and critical operation in many of these applications. The suffix tree is a versatile data structure that can be used to evaluate a wide variety of queries on sequence datasets, including evaluating exact and approximate string matches, and finding repeat patterns. However, methods for constructing suffix trees are often very time-consuming, especially for suffix trees that are large and do not fit in the available main memory. Even when the suffix tree fits in memory, it turns out that the processor cache behavior of theoretically optimal suffix tree construction methods is poor, resulting in poor performance. Currently, there are a large number of algorithms for constructing suffix trees, but the practical tradeoffs in using these algorithms for different scenarios are not well characterized.Publisher
Springer-Verlag
ISSN
0949-877X 1066-8888
Other DOIs
Types
Article
Metadata
Show full item recordAccessibility: If you are unable to use this file in its current format, please select the Contact Us link and we can modify it to make it more accessible to you.