Internet Vision

Semi-Supervised Learning for Gigantic Image Collections

Humans can manually tag only a tiny fraction of the images on the Internet. Using semi-supervised learning methods, we can propagate these sparse labels to unlabeled images. The challenge is doing this when there are billions of unlabeled images.
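
As a rough illustration of what propagating sparse labels means, here is a minimal sketch of one generic graph-based scheme: iterative label spreading over a Gaussian affinity graph. It is not necessarily the method used in this work. The function name propagate_labels, the parameters sigma, alpha and n_iter, and the toy data are all assumptions made for the example.

```python
import numpy as np

def propagate_labels(X, y, sigma=1.0, alpha=0.9, n_iter=50):
    """Spread a few known labels over a similarity graph.

    X: (n, d) feature vectors. y: (n,) integer labels, -1 meaning unlabeled.
    Hypothetical helper for illustration only.
    """
    n = X.shape[0]
    classes = np.unique(y[y >= 0])

    # Dense Gaussian affinities between all pairs of points (O(n^2) memory).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degree = W.sum(axis=1)
    S = W / np.sqrt(np.outer(degree, degree))

    # One-hot rows for labeled points, zero rows for unlabeled points.
    Y = np.zeros((n, len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0

    # Repeatedly mix each point's scores with those of its neighbors,
    # while pulling labeled points back toward their known labels.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return classes[F.argmax(axis=1)]

# Toy data (assumed): two well-separated clusters, one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
y = np.full(40, -1)
y[0], y[20] = 0, 1
pred = propagate_labels(X, y)
```

The dense n-by-n affinity matrix is exactly what becomes infeasible at the scale of billions of images, so this sketch shows the idea rather than a scalable solution.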

80 Million Tiny Images

One recent direction of investigation is a "brute force" approach to recognition that uses a dataset of 80 million images gathered from the web. With overwhelming amounts of data, very simple algorithms can perform surprisingly well.
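
To make "very simple algorithms" concrete, the sketch below shows one such algorithm: k-nearest-neighbor voting over raw pixel vectors with a sum-of-squared-differences distance. It is an illustration under assumptions, not a description of the exact procedure used with the dataset. The function name knn_classify, the 8x8x3 image size, and the synthetic "dark" and "bright" classes are all invented for the example.

```python
import numpy as np
from collections import Counter

def knn_classify(query, images, labels, k=5):
    """Label a query image by majority vote among its k nearest neighbors.

    query: (d,) flattened image. images: (n, d) flattened images.
    labels: length-n list of label strings. Hypothetical helper.
    """
    diffs = images - query
    # Sum of squared pixel differences to every stored image.
    dists = np.einsum("ij,ij->i", diffs, diffs)
    # Indices of the k smallest distances (order among them is not needed).
    nearest = np.argpartition(dists, k - 1)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Synthetic stand-in for a labeled image collection (assumed data).
rng = np.random.default_rng(0)
d = 8 * 8 * 3
dark = 0.2 + 0.05 * rng.standard_normal((100, d))
bright = 0.8 + 0.05 * rng.standard_normal((100, d))
images = np.vstack([dark, bright])
labels = ["dark"] * 100 + ["bright"] * 100

query = np.full(d, 0.75)
result = knn_classify(query, images, labels)
```

The classifier has no training step at all. Whatever accuracy it reaches comes from how densely the stored images cover the space of possible queries, which is the point of using so much data.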

Improving Image Search Engines

One possibility is to use Internet image search engines to provide a diverse set of images from which object category models may be trained. In turn, these models may be applied to collections of images from the Internet or elsewhere, enabling search by visual content (known as content-based image retrieval, CBIR), rather than the text-based searches that are currently employed.
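
The two stages described above, training a category model from search-engine results and then applying it to rank another collection by visual content, can be sketched as follows. This is a deliberately simple stand-in that assumes each image has already been reduced to a feature vector. The "model" is just a normalized mean of the training features, scored by cosine similarity. The function names, the 16-dimensional features, and the random data are assumptions for illustration, not the models actually trained in this work.

```python
import numpy as np

def train_category_model(features):
    """Build a toy category model: the unit-length mean of unit-length features.

    features: (n, d) vectors for images returned by a text query (hypothetical).
    """
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    centroid = unit.mean(axis=0)
    return centroid / np.linalg.norm(centroid)

def rank_by_content(model, collection):
    """Rank a collection by cosine similarity to the category model."""
    unit = collection / np.linalg.norm(collection, axis=1, keepdims=True)
    scores = unit @ model
    order = np.argsort(-scores)  # best match first
    return order, scores

# Assumed data: noisy "search results" around a prototype, plus an
# unrelated collection in which items 3 and 7 depict the same category.
rng = np.random.default_rng(0)
prototype = rng.standard_normal(16)
search_results = prototype + 0.3 * rng.standard_normal((30, 16))
collection = rng.standard_normal((10, 16))
collection[3] = prototype + 0.3 * rng.standard_normal(16)
collection[7] = prototype + 0.3 * rng.standard_normal(16)

model = train_category_model(search_results)
order, scores = rank_by_content(model, collection)
```

Here the ranking uses only visual features of the collection, with no text attached to its images. A real system would replace the centroid with a stronger category model and would have to cope with the irrelevant images that search engines return.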
