NYC Computer Vision Day is an invite-only, informal event where the computer vision community from NYC and its surroundings can share ideas and meet. A primary focus is visibility for graduate students and early-career researchers. In addition to a strong showing of ≈260 researchers from 60+ research labs and 15+ universities, we anticipate a small number of our industry friends.
Our schedule gives near-equal time to talks and informal conversation, to encourage discussion; our talks come primarily from students, to give them visibility in the community; and our start is slightly later than usual, since many attendees are traveling from far away.
Breakfast, lunch, and coffee are all provided, and there is time for informal discussion before the official start time of 10AM and after the official end time of 6PM.
9:30 - 10AM | Breakfast Available and Informal Conversations (not part of official program)

10AM - Noon | Talk Session 1: 2 Keynotes, 9 Lightning Talks

Keynote 1: Chuang Gan (UMass Amherst) - Learning World Models for Embodied Generalist Agents

⚡ Rundi Wu (Columbia): ReconFusion: 3D Reconstruction with Diffusion Priors
⚡ Yueyu Hu (NYU): Towards 3D Telepresence via Point Cloud Videos: Compression, Streaming, and Rendering
⚡ Jason Ma (Penn): Foundation Reward Models for Robot Learning from Human Videos
⚡ Akshaj Veldanda (NYU): Hyper-parameter Tuning for Fair Classification without Sensitive Attribute Access

Keynote 2: Christine Allen-Blanchette (Princeton) - Representing Color as a Symmetry
We introduce a convolutional neural network equivariant to color variation by design. We leverage the observation that changes in hue and saturation can be modeled geometrically to encode color information explicitly, resulting in improved interpretability, accuracy, and generalizability over conventional counterparts.

⚡ Ruoshi Liu (Columbia): Learning to Design Tools in the Real World
⚡ Zeliang Zhang (U. Rochester): Discover and Mitigate Multiple Biased Subgroups in Image Classifiers
⚡ Aishik Konwer (Stony Brook University): Less is Enough: Representation Learning with Low-resource Medical Imaging Datasets
⚡ Katherine Xu / Huzheng Yang (Penn): Amodal Completion via Progressive Mixed Context Diffusion / Brain Decodes Deep Nets
⚡ Shengyi Qian (Michigan/NYU): Understanding 3D Object Interaction from Ordinary Images
Noon - 1:30PM | Lunch & Networking

1:30 - 2:40PM | Talk Session 2: 10 Lightning Talks

⚡ Mahi Shafiullah (NYU): On Bringing Robots Home
⚡ Ruyi Lian (Stony Brook University): CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network
⚡ Faith Johnson (Rutgers): Feudal Networks for Visual Navigation
⚡ Aditya Chattopadhyay (Penn/JHU): An Information-theoretic Framework for Explainable ML
⚡ R. Kenny Jones (Brown): Learning to Infer Generative Template Programs for Visual Concepts
⚡ Xuan Wang (CUNY): GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks
⚡ Nate Gillman (Brown): Self-Correcting Self-Consuming Loops for Generative Model Training
⚡ Lahav Lipson (Princeton): Rapid 3D Mapping
⚡ Rahul Sajnani (Brown): GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
⚡ Mingzhen Huang (SUNY Buffalo): Detecting Text-Image Inconsistency with Diffusion Models
2:40 - 3:40PM | Poster Session 1 & Coffee Break

3:40 - 5:00PM | Talk Session 3: 1 Keynote, 7 Lightning Talks

Keynote 3: Lingjie Liu (Penn) - Single-view 3D Reconstruction with Diffusion Priors
Single-view 3D reconstruction is an ill-posed problem for traditional reconstruction algorithms. In this talk, I will present our recent work using diffusion priors for single-view reconstruction.

⚡ Alexandros Graikos (Stony Brook University): Diffusion Models for Synthesis of Large Digital Histopathology Images
⚡ Shimian Zhang (Penn State): Recurrence in Human and Machine Perception
⚡ Sunnie S. Y. Kim (Princeton): Bridging Computer Vision and HCI: Understanding End-Users' Trust and Explainability Needs in a Real-World Computer Vision Application
⚡ Cheng Phoo (Cornell) & Utkarsh Mall (Cornell/Columbia): Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
⚡ Xudong Lin (Columbia): Stop Wasting Computation on Crossmodal Pretraining for Large Multimodal Models
⚡ Xichen Pan (NYU): Image Sculpting: Precise Object Editing with 3D Geometry Control
⚡ Irving Fang (NYU): EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction

5:00 - 6:00PM | Poster Session 2 & Snacks

6:00 - 6:30PM | Snacks Available and Informal Conversations (not part of official program)
We will have both oral presentations and posters.
Posters: Each attending PI will be given one 24" (high) x 36" (wide) posterboard in one of the poster sessions. This board can be used as the PI sees fit: for instance, a single larger poster or multiple smaller posters.
Presentations: Lightning talk speakers will have 5 minutes to present; keynote speakers will have 17 minutes to present plus 5 minutes for questions.
NYU has strict building security, and our room has limited capacity (≤284 in the configuration that allows us posters). We therefore maintain a strict guest list: if you are not a confirmed guest, you will not be admitted to the building or the event. There are no exceptions.
NYU Tandon ECE
NYU Courant CS