GraphGrep1.0, 1.1 |
Rosalba Giugno
Universita' di Catania
Dipartimento di Matematica e Informatica
giugno@dmi.unict.it
Dennis Shasha
Courant Institute of Mathematical Sciences
Department of Computer Science
New York University
shasha@cs.nyu.edu
Date 1/1/2002.
Installation
Usage Glide References
GraphGrep1.0 is implemented in ANSI C and in a high performance interpreted
language called K. It has been ported on Unix, and Windows (Cygwin)
platforms.
Follow these instructions to install and test GraphGrep1.0.
GraphGrep1.1 is implemented
only in ANSI C. It has been ported on Unix, and Windows (Cygwin)
platforms.
Follow these instructions to install and test GraphGrep1.0.
To run the examples go to the GraphGrep1.0 or GraphGrep1.1 directory and do the following:
Run "graphbuild ex1/dbsmile.dat". This will create a directory called graphgrepdata under your working directory with the abstract representation of the graphs.
Search for subgraphs:
run "graphgrep -m ex1/mquery.dat";
inspect output in ex1/mquery.dat.out;
run "graphgrep -x ex1/xquery2.dat ";
inspect output in ex1/xquery2.dat.out.
run "graphgrep -x ex1/xquery.dat";
inspect output in ex1/xquery.dat..output.
Note: mquery.dat is represented as a list of node labels and edges; xquery2.dat and xquery.dat are expressed in Glide language. Use the
command line option -x if
the query is in Glide language, otherwise use -m. The wildcards can be used only
in the Glide language.
Replace an existing graph
find the GraphGrep identification number (id) associated with the graph that you want replace in the file "graphgrepdata/indexgraph.dat" (id is the integer in the first column);
run "graphgrep -f 1";
edit the file graph1.dat, inserting or deleting nodes and edges;
run "graphgrep -r 1".
For bugs and questions about GraphGrep please contact giugno@dmi.unict.it and shasha@cs.nyu.edu
To use GraphGrep1.0, 1.1 you have to: (1) create a dataset file; (2) preprocess the collection of graphs; and (3) run a query. The output of "graphgrep" contains all the occurrences of the query in the collection of graphs.
searching for substructures on the database
The command is "graphgrep [-m, -x ] query_file_name"
replace a
graph
To replace a
graph (you can delete or insert edges and nodes in existing graphs) use "graphgrep
-r id ". id is an integer identification given by GraphGrep for that graph.
The list of
the id numbers and the associated graph labels are given in graphgrepdata/indexgraph.dat
(on a Unix machine you may use "grep graph_label graphgrepdata/indexgraph.dat"
to get the graph id)
Glide (Graph LInear DEscription) is a language designed to express graphs. Glide represents a graph with a linear notation. The main idea is to represent a graph as a set of branches where each node is presented only once. The expressions in Glide, called regular graph expressions, allow the description of portions of graphs. There are powerful languages that work only with specific applications. Glide is designed to be as general as possible without compromising efficiency.
Let us formulate the Glide regular graph expressions using examples. First consider the graph in Fig. 1.
Fig. 1: Generic graph.
a/ | |
b/ |
a/b/ | |
h/c/ | |
a/h/c/f/ |
a/(h/c/)b/ | |
i/(b/a/(l/)h/c/)d/ |
c%1/ f/ i%1/ |
a%1/h/c%1%2/d/h/i%2/ |
Note: by node label we mean the value used in a matching algorithm to establish the equivalency between two nodes.
Glide uses wildcards for approximate searches. The
wildcards are used to match single nodes or paths.
Wildcards must be always followed by '/' .
The wildcards used by Glide and their semantics are given below.
Wildcard | Semantic (matching with) | Example | |
. | any (single) node | a/./c/ | |
* | zero or more nodes | a/*/c/ | |
? | zero or one node | a/?/c/ | |
+ | one or more nodes | a/+/c/ |
Glide manipulates any combination of wildcards. A recursive semantics is provided (see examples).
Glide expression | Semantic |
a/./././b/ | any path of length 4 beginning with a node 'a 'and ending with a node 'b' |
a/+/./b/ | any path of length at least 3 beginning with a node 'a' and ending with a node 'b' |
a/?/./ | any path of length at least 2 and at most 3 starting with a node 'a' |
././a/ | any path of length 2 beginning with a node 'a ' |
a%1/(*/b/)./c/d%1/ | any path where 'a' and 'c' are connected through a node, and a path between a and b exists. |
The graph expressions (E's) in Glide are generated by the following grammar, where digit is a positive integer and label is a string.
E <-- 'label/' | 'label{%digit}/' | './' | '*/' | '?/' | '+/'
E <-- EE | '('E')'
Graph | Incorrect graph expression | Correct graph expression |
a/((b/i/c%1)/h/c%1/) | a/(b/i/c%1/)h/c%1/ |
R. Giugno, D. Shasha,
GraphGrep: A Fast and Universal Method for Querying
Graphs.
Proceeding of the International Conference in Pattern recognition
(ICPR), Quebec, Canada, August 2002.
pdf
D. Shasha, J.T-L Wang, R. Giugno,
Algorithmics and Applications of Tree and
Graph Searching
Proceeding of the ACM Symposium on Principles of Database
Systems (PODS), Madison, Wisconsin, June 2002.
pdf