NYC Computer Vision Day 2024

Tentative Schedule

Our schedule gives near-equal time to talks and informal discussion to encourage conversation. Most talks are given by students to give them visibility in the community, and we start slightly later than usual because many attendees are traveling from far away.

Breakfast, lunch, and coffee are all provided, and there is time for informal discussion before the official start time of 10AM and after the official end time of 6PM.

9:30 - 10AM	Breakfast Available and Informal Conversations (Not part of official program)
10AM - Noon	Talk Session 1: 2 Keynotes, 9 Lightning Talks
		Keynote 1: Chuang Gan (UMass Amherst): Learning World Models for Embodied Generalist Agents
	⚡ Rundi Wu (Columbia): ReconFusion: 3D Reconstruction with Diffusion Priors
	⚡ Yueyu Hu (NYU): Towards 3D Telepresence via Point Cloud Videos: Compression, Streaming, and Rendering
	⚡ Jason Ma (Penn): Foundation Reward Models for Robot Learning from Human Videos
	⚡ Akshaj Veldanda (NYU): Hyper-parameter Tuning for Fair Classification without Sensitive Attribute Access
		Keynote 2: Christine Allen-Blanchette (Princeton): Representing color as a symmetry
		We introduce a convolutional neural network equivariant to color variation by design. We leverage the observation that changes in hue and saturation can be modeled geometrically to encode color information explicitly, resulting in improved interpretability, accuracy, and generalizability over conventional counterparts.
	⚡ Ruoshi Liu (Columbia): Learning to Design Tools in the Real World
	⚡ Zeliang Zhang (U. Rochester): Discover and Mitigate Multiple Biased Subgroups in Image Classifiers
	⚡ Aishik Konwer (Stony Brook University): Less is Enough: Representation Learning with Low-resource Medical Imaging Datasets
	⚡ Katherine Xu / Huzheng Yang (Penn): Amodal Completion via Progressive Mixed Context Diffusion / Brain Decodes Deep Nets
	⚡ Shengyi Qian (Michigan/NYU): Understanding 3D Object Interaction from Ordinary Images
Noon - 1:30PM	Lunch & Networking
1:30 - 2:40PM	Talk Session 2: 10 Lightning Talks
	⚡ Mahi Shafiullah (NYU): On Bringing Robots Home
	⚡ Ruyi Lian (Stony Brook University): CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network
	⚡ Faith Johnson (Rutgers): Feudal Networks for Visual Navigation
	⚡ Aditya Chattopadhyay (Penn/JHU): An Information-theoretic Framework for Explainable ML
	⚡ R. Kenny Jones (Brown): Learning to Infer Generative Template Programs for Visual Concepts
	⚡ Xuan Wang (CUNY): GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks
	⚡ Nate Gillman (Brown): Self-Correcting Self-Consuming Loops for Generative Model Training
	⚡ Lahav Lipson (Princeton): Rapid 3D Mapping
	⚡ Rahul Sajnani (Brown): GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
	⚡ Mingzhen Huang (SUNY Buffalo): Detecting Text-Image Inconsistency with Diffusion Models
2:40 - 3:40PM	Poster Session 1 & Coffee Break
3:40 - 5:00PM	Talk Session 3: 1 Keynote, 7 Lightning Talks
		Keynote 3: Lingjie Liu (Penn): Single-view 3D Reconstruction with Diffusion Priors
		Single-view 3D reconstruction is an ill-posed problem for traditional reconstruction algorithms. In this talk, I will present our recent work using diffusion priors for single-view reconstruction.
	⚡ Alexandros Graikos (Stony Brook University): Diffusion models for synthesis of large digital histopathology images
	⚡ Shimian Zhang (Penn State): Recurrence in Human and Machine Perception
	⚡ Sunnie S. Y. Kim (Princeton): Bridging Computer Vision and HCI: Understanding End-Users' Trust and Explainability Needs in a Real-World Computer Vision Application
	⚡ Cheng Phoo (Cornell) & Utkarsh Mall (Cornell/Columbia): Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
	⚡ Xudong Lin (Columbia): Stop Wasting Computation on Crossmodal Pretraining for Large Multimodal Models
	⚡ Xichen Pan (NYU): Image Sculpting: Precise Object Editing with 3D Geometry Control
	⚡ Irving Fang (NYU): EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction
5:00 - 6:00PM	Poster Session 2 & Snacks
6:00 - 6:30PM	Snacks Available and Informal Conversations (Not part of official program)