GraphBlast

 

Rosalba Giugno
Universita' di Catania
Dipartimento di Matematica e Informatica
giugno@dmi.unict.it

Dennis Shasha
Courant Institute of Mathematical Sciences
Department of Computer Science
New York University
shasha@cs.nyu.edu



What is GraphGrep?
   

Compare GraphBlast      Demo  

Download 

  1. GraphGrepSX Installation and Usage
  2. GraphBlast
  3. GraphGrep1.0,1.1

References    Collaborators

 


What is GraphGrep?

GraphGrep  is a software package to search for a query graph in a database of graphs. Given a collection of graphs and a pattern graph, GraphGrep finds all the occurrences of the pattern in each graph. The pattern is a subgraph and it can be also a tree, a path, or  a node. The  pattern is expressed as a list of nodes and a list of edges.

The components of GraphGrep are the following:

  1. graphbuild, to index the graphs in the collection and store their abstract representation;
  2. graphgrep, a graph query language;
    2.1 algebra,  a collection of operations to manipulate and select data.
GraphGrep manipulates heterogeneous graph collections and does not depend on a specific application.

Why GraphGrep? Many applications in industry, science, and engineering depend on efficient information retrieval methods from structured or semistructured  databases. These databases contain items with a complex structure where the relationships between the  components  (of the items) have to be maintained. The items are typically  represented by graphs. For example, shapes, images (computer vision),  XML or OEM data (web exchange data model), molecules and proteins (computational biology), graphical sentences and grammars (visual languages) are represented by graphs. By using graphs the problem of information retrieval on a database  is transformed to that of finding  subgraphs  in a collection of graphs. This problem is well known to be NP  complete. Although graph-to-graph matching  algorithms can be used, efficiency reasons dictate the use of special  techniques in order to reduce the search space and the time complexity.

Home


Compare GraphGrep

To use any version of GraphGrep you have to: (1) create a dataset file; (2) preprocess the collection of graphs; and (3) run a query. The output of "graphgrep" contains all the occurrences of the query in the collection of graphs.

GraphGrep2.0 is implemented in C++, it uses an efficient storage system, it is fast and accurate. Compared to GraphGrep1.0 or GraphGrep1.1, GraphGrep2.0 is faster but it does not allow inexact matchings.

  Input
Data Format
Input Query Format Output Format Language Storage System Filtering
System
 Wildcards
GraphGrep2.1 LNE LNE, Glide List of corresponding nodes C++ Berkeley DB Compacted
Hashed Paths
Vector
Yes
GraphGrep2.0 LNE LNE List of corresponding nodes C++ Berkeley DB Compacted
Hashed Paths
Vector
No
GraphGrep1.0

GraphGrep1.1

LNE LNE,Glide List of corresponding paths C, K

C

Binary Files Extended Hashed paths Yes

 

Home


Download

In order to build a very fast system we are exploring different filtering algorithms, storage  and matching techniques. Please specify to us which version you would like to use.

To download GraphGrep, send email to shasha@cs.nyu.edu  and to giugno@dmi.unict.it.
If you care to describe your application, we'd be glad to hear about it. In any case, we will send you instructions for downloading the program.

Home


Installation of GraphGrep2.0

GraphGrep2.0 is implemented in ANSI C++. GraphGrep2.0 interfaces BerkeleyDB as the embedded database system to handle the data. It has been ported on Unix, and  Windows  platforms.
 

Follow these instructions to install GraphGrep2.0.

  1. Install BerkeleyDB

      1. Download BerkeleyDB 4.01 at http://www.sleepycat.com/update/snapshot/db-4.1.25.NC.tar.gz
      2. Unzip it to a folder
      3. Switch to the inside folder "build_unix"
      4. Type "../dist/configure --enable-cxx" and execute
      5. Type "make"  and execute
      6. Type "make install"  and execute
      7. Make sure there is a directory "/usr/local/BerkeleyDB.4.1/"  in your system.
      8. Make sure "libdb_cxx-4.1.a" " libdb_cxx.a" are in the "/usr/lib" folder. You need this library to build GraphGrep
     
      If you have any trouble building BerkelyDB or to see detail information, please check "http://www.sleepycat.com/docs/ref/build_unix/intro.html"
     

  2.  Build GraphGrep2.0

      1. Install successfully  BerkeleyDB.
      2. Unzip the graphgrep2.0 package
      3. Type "cd graphgrep2.0"
      4. Type "cd src; make; cd .. "
      5. Check if there are "graphgrep" and "graphbuild" existing in the current folder.
     

For bugs and questions please contact the giugno@dmi.unict.it and shasha@cs.nyu.edu.

Home


 

Usage of GraphGrep2.0

File formats

 

Run GraphGrep2.0

To run the examples go to the GraphGrep2.0 directory and do the following (If you want to run GraphGrep2.0 in a different directory: make sure that graphgrep and graphbuild are in your path):

Home

 


References


R. Giugno, D. Shasha, GraphGrep: A Fast and Universal Method for Querying Graphs.
Proceeding of the International Conference in Pattern recognition (ICPR), Quebec, Canada, August 2002. pdf
 

D. Shasha, J.T-L Wang, R. Giugno, Algorithmics and Applications of Tree and Graph Searching
Proceeding of the ACM Symposium on Principles of Database Systems (PODS), Madison, Wisconsin, June 2002. pdf

 


Collaborators

Grant Support

This work has been partly supported by the U.S. National Science Foundation under grants IIS-0414763, DBI-0445666, N2010 IOB-0519985, N2010 DBI-0519984, DBI-0421604, and MCB-0209754. This support is greatly appreciated.