To: alfred.pescatore@gmail.com, apulvirenti@dmi.unict.it, ferro@dmi.unict.it, giugno@dmi.unict.it, pigola@dmi.unict.it, shasha@cs.nyu.edu Subject: Re: Algorithm for motifs. This works!! INPUT (Q,G) Q=query graph; G= target We must decide if Q is a motif and what's is p-value. Step 1) Let n be the number of all possible DFS traversal of Q. >> Can't this number be proportional to |Q|! Step 2) Consider the target graph G consider k random permutations of the set of all pairs (V,j) where V is a vertex of G and j is the node that is the start of one DFS traversal of Q Therefore j:= 1,2,....,n. For each such permutation p let p* be the index of the first pair (V,j) that is a MATCH of Q in G. Le m(G) the averag= e of those p*. Step3) FOR I:=3D1 to 1000 Let G_I be a random variant of G; Let m(G_I) be the result of applying step2 to G_I. IF m(G_I)> m(G) then mark I POSITIVE ELSE mark I NEGATIVE END FOR Step3) OUTPUT p-value:=3D #NEGATIVES/1000 Questo funziona di sicuro!! La dimostrazione =E8 la stessa dei vettori di 0= /1. Concerning optimizations of the above procedure: 1) Choosing the data structures to represent compactly all DFS traversals (ex.Trie) 2) Each verification step (V,j) can stop when a failure node is achieved an= d all traversal continuing in that node can be set to FAILING. 3) One should go on in the execution of negative pairs on each G_I until th= e temporary mean is above m(G). If one can not do that the case will be marke= d NEGATIVE. This will avoid expensive searches in graphs with few or zero occurrences.