Abstract:
The main goal of this paper is to give an efficient algorithm for the Tree
Pattern Matching problem. We also introduce and give an efficient algorithm
for the Subset Matching problem.
The Subset Matching problem is to find all occurrences of a pattern string p
of length m in a text string t of length n, where each pattern and text
location is a set of characters drawn from some alphabet. The pattern is
said to occur at text position i if the set p[j] is a subset of the set
t[i+j-1], for all j, 1 <= j <= m. We give an O((s+n)\log^3 m) randomized
algorithm for this problem, where s denotes the sum of the sizes of all the
sets.
Then we reduce the Tree Pattern Matching problem to a number of instances of
the Subset Matching problem. This reduction takes linear time and the sum
of the sizes of the Subset Matching problems obtained is also linear.
Coupled with our first result, this implies an O(nlog^3 m) time randomized
of metric spaces are also considered.