Home Research Teaching Misc

Comparison of seven program query languages

 
Feature
Basic Features?
AST Based
Yes
Yes
Yes
Yes
No
No
Trace Based
No
No
No
No
Yes
Yes
Aggregation
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Path Expression
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Partial Path Expression
Yes
Yes
No?
Yes
Yes
Yes
Yes
Recursion
No
Yes
No
No
Yes
Yes
Yes
Looping
Yes
Yes
No
No
No
Yes
Yes
Quantification
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Expression Binding
Yes
Yes
Yes
Yes
No
No
No
Concrete Syntax
No
Yes
No
No
Yes
No
No
Conditionals
Yes
Yes
No
Yes
Yes
Yes
Yes
Namespace
No
Yes
Yes
Yes
No
Yes?
No
Set Operations
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
?
Yes
?
Yes
?
No?
No?
?
Yes
?
Yes
?
Yes
Yes
?
Yes
?
Yes
?
Transformation
Item Generation
Yes
Yes
No
Yes
Yes
Yes
Yes
Replacement
Yes
Yes
No
No
No **
Yes?
No
Deletions
Yes
Yes
No
No
No **
Yes?
No
User Extensibility
External Code
Yes
No
Yes
No
No
No
No
Grammar Based
Yes
Yes
No
No
No
No
No
Data Flow Queries
Reaching
No
No?
Yes
No
Yes
Yes
Yes
Liveness
No
No?
No
No
Yes
Yes
Yes
Lock graph
Control Flow Queries
Dominator Analysis
No
No?
Yes
Yes
Yes
Yes
Yes
Loop Detection              

* Datalog is a general relational database that can be used to store both ast and program trace information. Datalog is included in this comparison because some program query systems (For example, CodeQuest), are Datalog implementations. PQL is also a sugar layer for the bddbddb implementation of Datalog.

** features simulated with instrumentation

 

Feature Description

AST Based: operates on abstract syntax tree (intermediate representation)

Trace Based: operates on a program trace. i.e a set of ordered program events.

Aggregation:

Path Expression: queries can be specified on nested program elements e.g.

Partial Path Expression: path expressions do not need to be exact/complete. (some parts can be wildcarded)

Recursion: queries can recursively call other queries

Looping : query language supports looping

Quantification : report all occurences of objects of type X

Expression Binding : complex expressions can be

Concrete Syntax: queries on a program can be expressed in a syntax identical to, or closely resembling, the syntax of the program's language.

Conditionals:

Namespace: queries discriminate among program objects by the namespace of the objects

Collections: query matches returned as a set of objects

Intersections: find elements common to multiple distinct query matches

Difference: separate elements according to criteria

Union: combine results from different subqueries

Item Generation: queries can create new program objects

Replacement: queries can exchange existing program objects with other objects

Deletions: queries can remove program objects

External Code: users can write additonal code to extend the function of the query language and execute this code from withing queries

Grammar Based: users can easily add features/functionality to the query language by changing the grammar

Reaching: Instructions between the assignment of a value to a variable and the uses of that value (without intervening assignments)

Liveness: range in which values/variables are still used