Retrieving 3D Models

The problem of matching geometric models is hard because:

In practice, researchers don't even bother considering the kinds of geometric difference in (3) (you almost never see any discussion of these in this literature). Rather, they look for a signature S(M) -- i.e. a vector of numbers computable from the model -- and a matching function over signatures d(S1,S2) with the following properties:

The signature function may or may not be invertible. If the signature function is not invertible, then there exist M1, M2 that are substantially different such that S(M1)=S(M2).

The invariances mentioned in (4) above are achieved (when desired) as follows:

Invariance under translation: Choose the origin of the coordinate system to be a uniquely defined point relative to the model (e.g. the center of mass; the center of mass of the boundary; the center of the minimal circumscribing sphere, etc.) These are reasonably stable, and easily computed.

Invariance under scale. Define the unit length in terms of some characteristic dimension of the model e.g. diameter. (Funkhouser uses the median distance from the center to a point on the boundary, because it is less sensitive to outliers: do these occur much?)
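
A minimal sketch of the translation and scale normalizations just described, assuming the model is given as a list of sample points on its boundary. The function name normalize is hypothetical, and using the centroid of the samples as a stand-in for the center of mass is an assumption made for illustration; the unit length is the median distance from the center, as in the Funkhouser remark.

```python
import math
from statistics import median

def normalize(points):
    """Center the points on their centroid and rescale so that the
    median distance from the center is 1 (assumed input: (x, y, z) tuples)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    # Translation invariance: move the origin to the centroid.
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    # Scale invariance: the unit length is the median distance to the center.
    scale = median(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0:
        raise ValueError("degenerate model: median distance to center is zero")
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]

# The corners of a unit cube and a shifted, enlarged copy of it
# normalize to the same point set (up to rounding).
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
moved = [(3 * x + 10, 3 * y - 5, 3 * z + 2) for x, y, z in cube]
same = all(
    abs(a - b) < 1e-9
    for p, q in zip(normalize(cube), normalize(moved))
    for a, b in zip(p, q)
)
print(same)
```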

Invariance under rotation. Two methods:

Invariance under reflection

Data size

Funkhouser's system has collected about 20,000 solid models off the Web. (Compare, of course, 2 billion documents indexed by Google.) In the object models provided by Viewpoint, the mean number of triangles per model is 3504 (median is 1536). The number of vertices is half the number of triangles = 1752 on average. Figure that each coordinate of each vertex is a 32-bit = 4-byte floating-point number, hence 12 bytes per vertex. Connectivity data: each triangle needs two bytes for each of its three vertices. Overall, the shape description takes 42 KBytes on average per model, or about 800 MBytes for the whole collection.
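
The shape-description estimate can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the figures quoted above; the variable names are made up for illustration.

```python
MODELS = 20_000                       # size of the collection
MEAN_TRIANGLES = 3504                 # mean triangles per model
MEAN_VERTICES = MEAN_TRIANGLES // 2   # 1752, half the number of triangles

BYTES_PER_VERTEX = 3 * 4              # three 4-byte floating-point coordinates
BYTES_PER_TRIANGLE = 3 * 2            # three 2-byte vertex indices

vertex_bytes = MEAN_VERTICES * BYTES_PER_VERTEX       # 21024
index_bytes = MEAN_TRIANGLES * BYTES_PER_TRIANGLE     # 21024
per_model = vertex_bytes + index_bytes                # 42048, about 42 KBytes
collection = MODELS * per_model                       # 840960000, about 0.8 GBytes

print(f"per model: {per_model / 1000:.1f} KBytes")
print(f"collection: {collection / 1e6:.0f} MBytes")
```

Note that a two-byte index can address at most 65,536 vertices, so this estimate only applies to models below that size.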

Color information will add 3 or 4 bytes per triangle, i.e. about 10 to 14 KBytes per model = roughly a quarter to a third as much information again = total of about 1.1 GBytes.

Comparing 3D models to text documents.

Princeton System

A Search Engine for 3D Models by Thomas Funkhouser
The search engine can be found at Princeton 3D Models Search Engine

Repertory of 20,000 solid models collected around the web. Searchable by (1) solid model; (2) 2D hand-drawn sketch; (3) text.



Seed: Results returned by Google and other search engines for queries such as "3D and (models or meshes)".

Guided search: score(P) is computed as follows:
If P is a 3D model, then score = log(number of triangles).
If P is an HTML page, then score = count of keywords in title and text that suggest a relation to 3D modelling.
If P is unvisited, then score is a weighted sum of
1. the distance-weighted average of the scores of documents linking to it;
2. the distance-weighted average of the scores of models nearby in the link graph;
3. a site score reflecting the proportion of documents retrieved from the site that are models.
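
The scoring rules above can be sketched as follows. Several details are assumptions made for illustration, since these notes do not give them: the keyword list, the use of 1/distance as the distance weight, and the equal default weights in the weighted sum. All function names are hypothetical.

```python
import math

# Hypothetical keyword list; the actual list is not given in these notes.
KEYWORDS = ("3d", "model", "mesh", "vrml")

def score_model(num_triangles):
    """Score of a 3D model: log of its triangle count."""
    return math.log(num_triangles)

def score_html(title, text):
    """Score of an HTML page: number of words containing a keyword."""
    words = (title + " " + text).lower().split()
    return sum(1 for w in words if any(k in w for k in KEYWORDS))

def distance_weighted_average(scored):
    """scored: list of (score, link_distance) pairs with link_distance >= 1.
    Assumption: each score is weighted by 1 / link_distance."""
    if not scored:
        return 0.0
    total_weight = sum(1.0 / d for _, d in scored)
    return sum(s / d for s, d in scored) / total_weight

def score_unvisited(linking_docs, nearby_models, site_model_fraction,
                    weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three components for a page not yet fetched."""
    w1, w2, w3 = weights
    return (w1 * distance_weighted_average(linking_docs)
            + w2 * distance_weighted_average(nearby_models)
            + w3 * site_model_fraction)

print(score_html("3D models", "meshes and textures"))
print(score_unvisited([(2.0, 1), (4.0, 2)], [(6.9, 1)], 0.5))
```

A crawler would presumably keep unvisited pages in a priority queue ordered by this score and fetch the highest-scoring page next.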

Searching by shape

Either upload your own model from a file, or pick a shape from the library and click on "Similar Shape".

3D shape representation

Spherical Harmonic Descriptor
Spherical harmonic analysis is basically Fourier analysis on the surface of a sphere. The Kth descriptor is basically the energy associated with the Kth Fourier component. The following account is not mathematically precise, but it gives the flavor:

The 0th descriptor is just the area of Vr.

For K > 0,
Let P, Q be two random points on Sr such that the distance from P to Q on the sphere is PI*r/K.
(Thus, the angle from P to the center of the grid to Q is PI/K.)
Let C(P) and C(Q) be the circles in Sr centered at P and Q of spherical radius PI*r/2K.
Let D(P,Q) = abs(area(C(P) intersect V) - area(C(Q) intersect V))
Let QK,r be the average of D(P,Q) for all such P,Q.
Then QK,r is roughly the Kth spherical harmonic descriptor of Vr.
(I think. Don't quote me on this until I've had a chance to work through the math more carefully.)

Correct features of this rough account:

The advantages of the actual harmonic analysis are

Indexing and Retrieval

A model is indexed by the 512 (= 32 spheres * 16 descriptors) values of QK,r.
(Actually, it seems very unlikely that they compute higher-order descriptors for small spheres, as they would be meaningless. So probably more like 256 values.)

Note that this is 1 KByte per model, as opposed to the average of 42 Kbytes quoted before.

The dissimilarity of models M1 and M2 is taken to be the Euclidean distance between the two vectors of descriptors.
The best matches for M are just the K nearest neighbors in the database.
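
A minimal sketch of this retrieval step, assuming the database is a dictionary from model name to descriptor vector. The names best_matches and database, and the toy three-component vectors, are made up for illustration; real descriptor vectors would have several hundred components. This is a plain linear scan, not necessarily how the actual system organizes its index.

```python
import heapq
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matches(query, database, k):
    """Return the names of the k models whose descriptors are nearest to query."""
    return heapq.nsmallest(
        k, database, key=lambda name: euclidean(query, database[name])
    )

database = {
    "chair": [1.0, 0.0, 0.0],
    "stool": [0.9, 0.1, 0.0],
    "plane": [0.0, 5.0, 5.0],
}
print(best_matches([1.0, 0.0, 0.0], database, k=2))
```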

Sources of error:

2D-shape representation

1. Compute silhouette of 3D model from 13 view directions: Four corners of the coordinate cube, three faces, and 6 edge centers. (Note that the silhouette from the antipode is just the reflection.)
Thus, any view of the object is within 22.5 degrees of one of these standard views.

2. Compute the distance transform -- value at each pixel = distance to nearest boundary element.

3. Compute circular harmonic descriptors: exactly analogous to spherical harmonic descriptors.
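
Steps 1 and 2 can be illustrated with a short sketch. The first function enumerates the 13 view directions by taking the 26 corner, face and edge directions of the coordinate cube and keeping one from each antipodal pair. The second is a brute-force distance transform on a small grid in which 1 marks a boundary pixel. Both function names and the input format are assumptions; rendering the silhouette itself is not shown, and a real system would use a faster distance-transform algorithm.

```python
import math
from itertools import product

def view_directions():
    """One direction from each antipodal pair among the nonzero
    vectors with coordinates in {-1, 0, 1}."""
    dirs = []
    for v in product((-1, 0, 1), repeat=3):
        if v == (0, 0, 0):
            continue
        # Keep the member of each pair whose first nonzero coordinate is positive.
        first = next(c for c in v if c != 0)
        if first > 0:
            dirs.append(v)
    return dirs

def distance_transform(image):
    """image: list of rows of 0/1, with 1 marking a boundary pixel.
    Returns, for each pixel, the Euclidean distance to the nearest boundary pixel."""
    boundary = [(r, c) for r, row in enumerate(image)
                for c, v in enumerate(row) if v]
    if not boundary:
        raise ValueError("image has no boundary pixels")
    return [[min(math.hypot(r - br, c - bc) for br, bc in boundary)
             for c in range(len(row))]
            for r, row in enumerate(image)]

dirs = view_directions()
print(len(dirs))
dt = distance_transform([[0, 0, 0],
                         [0, 1, 0],
                         [0, 0, 0]])
print(dt[0])
```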

Comment: I don't quite see how this works for user sketches:

Text queries

Extract all the text you can from the model (e.g. filename, captions, informational fields, etc.) and from the anchors, and match queries in the usual way. Stemming, synonyms for file name.

Multimodal queries

Allowed, either initially or as successive feedback.


Shape matching

Test database: 1890 models, between 120 and 120,000 triangles.
85 classes of sizes between 5 and 153. 610 models not in any class.
E.g. 5 classes of chairs: 153 dining room chairs, 10 desk chairs, 5 director's chairs, 25 living room chairs, 6 lounge chairs.
Wide range of shapes: 8 forks, 5 cannons, 6 hearts, 17 plates of food.

Shape matching algorithms:

(As mentioned above, principal axis calculation is inherently unstable.)

Results: Recall-precision curves very much better for 3D harmonics than for others. Over all classes, at recall = 40%, precision = 30% for 3D harmonics; 20% for D2 and shape histograms; 10% for EGI; 5% for moments. Over living-room chairs, at recall = 40%, precision for 3D harmonics = 65%.
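
For reference, a point on a recall-precision curve can be computed from a ranked result list as in the sketch below. The function name is hypothetical, and the sketch assumes that every model of the query's class appears somewhere in the ranked list. The example list is invented for illustration and is not data from the experiment.

```python
def precision_at_recall(ranked_relevance, recall_level):
    """ranked_relevance: list of booleans, best match first, True when the
    retrieved model belongs to the query's class.
    Returns the precision at the first rank where recall reaches recall_level."""
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        raise ValueError("no relevant models in the ranked list")
    found = 0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            found += 1
            if found / total_relevant >= recall_level:
                return found / rank
    return 0.0  # only reached if recall_level exceeds 1

# Invented ranking with 5 relevant models among 10 results.
ranking = [True, False, True, False, False, True, False, False, True, True]
print(precision_at_recall(ranking, 0.4))
print(precision_at_recall(ranking, 1.0))
```

In the example, recall of 40% means 2 of the 5 relevant models have been retrieved; that happens at rank 3, so the precision there is 2/3.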

Search time of less than 0.25 seconds in database of 17,500 models.

Sketch Interface Experiment

43 students. Shown model rotating for 15 seconds. Asked to
A. Provide a query of up to 5 words for retrieving model.
B. Sketch model from top, front, and side.

Query / sketch / query and sketch used for retrieval.

Results: very variable, for different types of objects. It is not even the case that both combined did better than either singly. Two different evaluation measures -- median rank of target object, and percentage of queries where target object in top 16 -- give rather different results.
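
The two evaluation measures are easy to state in code, and a small example shows how they can disagree. The rank lists below are invented purely for illustration and are not results from the experiment; the function names are hypothetical.

```python
from statistics import median

def median_rank(ranks):
    """Median position of the target object over all queries (1 = best)."""
    return median(ranks)

def fraction_in_top(ranks, n=16):
    """Fraction of queries whose target object appears in the top n results."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Invented ranks of the target object for five queries under two methods.
method_a = [1, 2, 3, 40, 50]      # often excellent, sometimes far off
method_b = [10, 12, 14, 15, 16]   # never excellent, always on the first page

for name, ranks in (("A", method_a), ("B", method_b)):
    print(name, median_rank(ranks), fraction_in_top(ranks))
```

Here method A wins on median rank (3 against 14) while method B wins on the top-16 measure (100% against 60%), so the two measures can rank the same pair of methods in opposite orders.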

Interactive Search Results

Two versions created:
(1) "Similar shape" button retrieves by model match;
(2) "Similar shape" button retrieves by text match.

Task: to find a particular model. Users are constrained to use one or the other version of the search engine.

Text search: 48 seconds search time, 2.8 iterations, 60% find on first query, 77% find within first 10 iterations.

3D shape search: 40 seconds search time, 2.4 iterations, 54% find on first query, 89% find within first 10 iterations.

Other work on indexing 3D models

A Web-based Retrieval System for 3D Polygonal Models Motofumi Suzuki

"Nefertiti: a query by content system for three-dimensional model and image databases management" Eric Paquet and Marc Rioux, Image and Vision Computing Vol. 17, No. 2, pp. 157-166. Can be accessed from NYU accounts here

Both of the above use color information as well as shape information.

3D Model Retrieval Dejan V. Vranic and D. Saupe
Uses principal component analysis.

A 3D Digital Library System: Capture, Analysis, Query, and Display Jeremy Rowe et al.
Library of archeological artifacts.