Practical methods for constructing suffix trees
Hankins, Richard A.; Tata, Sandeep; Patel, Jignesh M.; Tian, Yuanyuan
Tian, Yuanyuan; Tata, Sandeep; Hankins, Richard A.; Patel, Jignesh M.; (2005). "Practical methods for constructing suffix trees." The VLDB Journal 14(3): 281-299. <http://hdl.handle.net/2027.42/47869>
AbstractSequence datasets are ubiquitous in modern life-science applications, and querying sequences is a common and critical operation in many of these applications. The suffix tree is a versatile data structure that can be used to evaluate a wide variety of queries on sequence datasets, including evaluating exact and approximate string matches, and finding repeat patterns. However, methods for constructing suffix trees are often very time-consuming, especially for suffix trees that are large and do not fit in the available main memory. Even when the suffix tree fits in memory, it turns out that the processor cache behavior of theoretically optimal suffix tree construction methods is poor, resulting in poor performance. Currently, there are a large number of algorithms for constructing suffix trees, but the practical tradeoffs in using these algorithms for different scenarios are not well characterized.
MetadataShow full item record
Accessibility: If you are unable to use this file in its current format, please select the Contact Us link and we can modify it to make it more accessible to you.