Richard Cole, Zvi Galil, Ramesh Hariharan, S. Muthukrishnan, Kunsoo Park
Proceedings of the Thirty Fourth Annual ACM Symposium on the Foundations of Computer Science, 1993, 248-258.
An optimal parallel CRCW-PRAM algorithm to compute witnesses for all non-period vectors of an m1 times m2 pattern is given. The algorithm takes O(log log m) time and does O(m1.m2) work, where m=max{m1,m2}. This yields a work optimal algorithm for 2D pattern matching which takes O(log log m) preprocessing time and O(1) text processing time.
postscript of full paper