These are the final papers.
Date | Class | Meta-Topic | Topic | Presenters | Paper 1 | Paper 2 | Presentation Review Panels | Things Due |
Aug 30 2021 | 0 | Getting situated | Intro | fouhey | | | | |
Sep 1 2021 | 1 | | Reading, writing, & arithmetic | fouhey, NA, NA | Unsupervised Visual Representation Learning by Context Prediction (Doersch et al. 2015) | None | | |
Sep 6 2021 | 2 | The basics | Labor Day – no class | NA, NA, NA | | | | |
Sep 8 2021 | 3 | | Architectures | tbd, tbd, tbd | Densely Connected Convolutional Networks (Huang et al. CVPR 2017) | Exploring Randomly Wired Neural Networks for Image Recognition (Xie et al. ICCV 2019) | A; F | |
Sep 13 2021 | 4 | Getting rid of supervision | Self-supervised learning for pretraining | tbd, tbd, tbd | Momentum Contrast for Unsupervised Visual Representation Learning (He et al. 2019) | Exploring Simple Siamese Representation Learning (Chen et al. CVPR 2021) | B; G | |
Sep 15 2021 | 5 | | Reducing supervision for tasks | tbd, tbd, tbd | Space-Time Correspondence as a Contrastive Random Walk (Jabri et al. NeurIPS 2020) | Shelf-Supervised Mesh Prediction in the Wild (Ye et al. CVPR 2021) | C; H | |
Sep 20 2021 | 6 | Detection | Detection | tbd, tbd, tbd | Mask R-CNN (He et al. ICCV 2017) | PointRend: Image Segmentation as Rendering (Kirillov et al. CVPR 2020) | D; I | |
Sep 22 2021 | 7 | | Naming everything | tbd, tbd, tbd | LVIS: A Dataset for Large Vocabulary Instance Segmentation (Gupta et al. CVPR 2019) | Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details (Dave et al. Arxiv 2021) | E; J | |
Sep 27 2021 | 8 | Data gone wrong | Dataset bias | tbd, tbd, tbd | An Unbiased Look at Dataset Bias (Torralba and Efros CVPR 2011) | Do ImageNet Classifiers Generalize to ImageNet? (Recht et al. ICML 2019) | F; A | Re-Proposal Due |
Sep 29 2021 | 9 | | Domain adaptation | tbd, tbd, tbd | Understanding Self-Training for Gradual Domain Adaptation (Kumar et al. ICML 2020) | SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation (Prabhu et al. ICCV 2021) | G; B | |
Oct 4 2021 | 10 | What are our models doing? | Ethics & fairness | tbd, tbd, tbd | Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (Buolamwini and Gebru FAT 2018) | Dissecting racial bias in an algorithm used to manage the health of populations (Obermeyer et al. Science 2019) | H; C | |
Oct 6 2021 | 11 | What are we doing? | If I had 100 million dollars | NA, NA, NA | | | I; D | |
Oct 11 2021 | 12 | 3D | 3D – Implicit functions | tbd, tbd, tbd | PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization (Saito et al. ICCV 2019) | Occupancy Networks: Learning 3D Reconstruction in Function Space (Mescheder et al. CVPR 2019) | J; E | Project Proposal Due |
Oct 13 2021 | 13 | | 3D – Correspondence and pose | tbd, tbd, tbd | SuperGlue: Learning Feature Matching with Graph Neural Networks (Sarlin et al. CVPR 2020) | Bootstrap Your Own Correspondences (El Banani and Johnson, ICCV 2021) | A; F | |
Oct 18 2021 | 14 | Embodiment | No class, but here are some nice papers | tbd, tbd, tbd | Semantic Visual Navigation by Watching YouTube Videos (Chang et al. NeurIPS 2020) | RMA: Rapid Motor Adaptation for Legged Robots (Kumar et al. 2021) | B; G | |
Oct 20 2021 | 15 | | 3D – Multiview | tbd, tbd, tbd | DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras (Teed et al. Arxiv 2021) | SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans (Dai et al. CVPR 2021) | C; H | |
Oct 25 2021 | 16 | | 3D – Radiance fields | tbd, tbd, tbd | NeRF (Mildenhall et al. ECCV 2020) | pixelNeRF (Yu et al. CVPR 2021) | D; I | |
Oct 27 2021 | 17 | Embodiment | Hands | tbd, tbd, tbd | Reconstructing Hand-Object Interactions in the Wild (Cao et al. ICCV 2021) | Towards unconstrained joint hand-object reconstruction from RGB videos (Hasson et al. Arxiv 2021) | E; J | |
Nov 1 2021 | 18 | People | Reconstruction | tbd, tbd, tbd | PARE: Part Attention Regressor for 3D Human Body Estimation (Kocabas et al. ICCV 2021) | Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos (Jafarian et al. CVPR 2021) | F; A | |
Nov 3 2021 | 19 | | Understanding interaction | tbd, tbd, tbd | Rearrangement: A Challenge for Embodied AI (Batra et al. Arxiv 2020) | Where2Act: From Pixels to Actions for Articulated 3D Objects (Mo et al. ICCV 2021) | G; B | |
Nov 8 2021 | 20 | Transformers and other models | Transformers | tbd, tbd, tbd | Attention Is All You Need (Vaswani et al. NeurIPS 2017) | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Dosovitskiy et al. Arxiv 2020) | H; C | |
Nov 10 2021 | 21 | | Others | tbd, tbd, tbd | MLP-Mixer: An all-MLP Architecture for Vision (Tolstikhin et al. Arxiv 2021) | Perceiver IO: A General Architecture for Structured Inputs & Outputs (Jaegle et al. Arxiv 2021) | I; D | |
Nov 15 2021 | 22 | Videos | Video | tbd, tbd, tbd | Rescaling Egocentric Vision (Damen et al. ECCV 2020) | Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos (Li et al. CVPR 2021) | J; E | |
Nov 17 2021 | 23 | Science | Science | tbd, tbd, tbd | OpenMonkeyStudio: Automated Markerless Pose Estimation in Freely Moving Macaques (Bala et al. Nature Communications 2020) | Detecting and Tracking Communal Bird Roosts in Weather Radar Data (Cheng et al. AAAI 2020) | A; F | Project Preliminary Report Due |
Nov 22 2021 | 24 | Where are we going? | Where are we going? | NA, NA, NA | | | | |
Nov 24 2021 | 25 | Thanksgiving | Thanksgiving – no class | NA, NA, NA | | | | |
Nov 29 2021 | 26 | Other modalities | Language | tbd, tbd, tbd | Learning Transferable Visual Models From Natural Language Supervision (Radford et al. 2021) | Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling (Lei et al. CVPR 2021) | C; H | |
Dec 1 2021 | 27 | | Sound | tbd, tbd, tbd | Audio-Visual Scene Analysis with Self-Supervised Multisensory Features (Owens et al. ECCV 2018) | Self-supervised object detection from audio-visual correspondence (Afouras et al. Arxiv 2021) | D; I | |
Dec 6 2021 | 28 | Generating pictures | Generative models | tbd, tbd, tbd | Generating Diverse High-Fidelity Images with VQ-VAE-2 (Razavi et al. NeurIPS 2019) | Zero-Shot Text-to-Image Generation (Ramesh et al. 2021) | E; J | Project Final Report Draft |