I am a Senior Staff Research Scientist at Google’s Cambridge office (Massachusetts), working on multiple topics in machine learning: generative models (images and text/LLMs), multimodal representation learning, computer vision and large language models. My research has been deployed in multiple Google product areas (PAs) such as Ads, YouTube and Cloud. From August 2013 to November 2014, I was a Postdoctoral Associate with Bill Freeman at MIT CSAIL. In June 2013, I received my PhD from the Computer Science department at New York University, under the supervision of Rob Fergus. I was awarded a Microsoft Research PhD Fellowship for 2010-2011, a Dean’s Dissertation Fellowship for 2012-2013 from the Graduate School of Arts and Sciences, and the Janet Fabri Prize (2013-2014) for outstanding dissertation in Computer Science. I am also an active angel investor and startup advisor.
Email: dilipkay@gmail.com.
CV
Publications and Products:
- L. Fan, D. Krishnan, P. Isola, D. Katabi, Y. Tian. Improving CLIP Training with Language Rewrites. 2023.
- In this paper, we introduce Language augmented CLIP (LaCLIP), a simple yet highly effective approach to enhance CLIP training through language rewrites. For ImageNet zero-shot accuracy, LaCLIP outperforms CLIP by 8.2% when pre-training on CC12M and by 2.4% on LAION-400M. Code is available at https://github.com/LijieFan/LaCLIP.
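- A minimal sketch of the core idea (function names are hypothetical; the actual pipeline is in the linked repository): at each training step, the text view paired with an image is sampled from its original caption plus a pool of LLM-generated rewrites.

```python
import random

def sample_text_view(caption, rewrites, rng=random):
    """Return the text view for this training step: the original caption
    or one of its LLM-generated rewrites, chosen uniformly at random."""
    return rng.choice([caption] + list(rewrites))

# Hypothetical use inside a CLIP training step:
# text = sample_text_view(caption, rewrites_for[caption_id])
# loss = clip_loss(image_encoder(augment(image)), text_encoder(text))
```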
- H. Chang, H. Zhang, J. Barber, AJ Maschinot, J. Lezama, L. Jiang, M.-H. Yang, K. Murphy, W. T. Freeman, M. Rubinstein, Y. Li, D. Krishnan. Muse: Text-To-Image Generation via Masked Generative Transformers, ICML 2023.
- We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. More results are available at https://muse-model.github.io
- T. Li, H. Chang, S. Mishra, H. Zhang, D. Katabi, D. Krishnan. MAGE: MAsked Generative Encoder To Unify Representation Learning and Image Synthesis, CVPR 2023.
- In this work, we propose MAsked Generative Encoder (MAGE), the first framework to unify SOTA image generation and self-supervised representation learning. Our key insight is that using variable masking ratios in masked image modeling pre-training can allow generative training (very high masking ratio) and representation learning (lower masking ratio) under the same training framework.
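- A minimal sketch of the variable-masking idea (a clipped Gaussian stands in here for the paper's truncated Gaussian, and the parameter values are illustrative):

```python
import numpy as np

def sample_mask(num_tokens, rng, mu=0.55, sigma=0.25, lo=0.5, hi=1.0):
    """Sample a masking ratio, then mask that fraction of token positions.
    Ratios near 1.0 drive the generative objective; lower ratios favor
    representation learning."""
    ratio = float(np.clip(rng.normal(mu, sigma), lo, hi))
    num_masked = int(np.ceil(ratio * num_tokens))
    mask = np.zeros(num_tokens, dtype=bool)
    mask[rng.choice(num_tokens, size=num_masked, replace=False)] = True
    return mask

mask = sample_mask(num_tokens=256, rng=np.random.default_rng(0))
```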
- S. Mishra, J. Robinson, H. Chang, D. Jacobs, A. Sarna, AJ Maschinot, D. Krishnan. A simple, efficient and scalable contrastive masked autoencoder for learning visual representations. 2022.
- We introduce CAN, a simple, efficient and scalable method for self-supervised learning of visual representations. Our framework is a minimal and conceptually clean synthesis of (C) contrastive learning, (A) masked autoencoders, and (N) the noise prediction approach used in diffusion models. CAN outperforms MAE and SimCLR when pre-training on ImageNet, but is especially useful for pre-training on larger uncurated datasets such as JFT-300M.
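- Schematically, each masked and noised image view is trained with a weighted combination of the three losses; the lambda weights below are placeholders, not the paper's settings:

```latex
\mathcal{L}_{\text{CAN}}
  \;=\; \lambda_C\,\mathcal{L}_{\text{contrastive}}
  \;+\; \lambda_A\,\mathcal{L}_{\text{reconstruction}}
  \;+\; \lambda_N\,\mathcal{L}_{\text{noise}}
```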
- A. Sellergren, C. Chen, Z. Nabulsi, Y. Li, A. Maschinot, A. Sarna, J. Huang, C. Lau, S. R. Kalidindi, M. Etemadi, F. Garcia-Vicente, D. Melnick, Y. Liu, K. Eswaran, D. Tse, N. Beladia, D. Krishnan, S. Shetty. Simplified Transfer Learning for Chest Radiography Models using Less Data. Radiology, 2022.
- Supervised contrastive learning achieved performance comparable to state-of-the-art deep learning models on multiple clinical tasks using as few as 45 images, making it a promising method for predictive modeling with small data sets and for predicting outcomes in shifting patient populations.
- C. Hermann, K. Sargent, L. Jiang, H. Chang, R. Zabih, C. Liu, D. Krishnan, D. Sun. Pyramid Adversarial Training Improves ViT Performance. CVPR 2022, Oral. Best Paper Finalist.
- A new multi-scale adversarial data augmentation mechanism that provides significant boosts in top-1 accuracy for both clean and robustness metrics.
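- A rough sketch of the multi-scale perturbation structure (scales and weights are illustrative, and random noise is a placeholder for the adversarially optimized components used in the paper):

```python
import numpy as np

def pyramid_perturbation(shape, scales=(1, 4, 16), weights=(1.0, 2.0, 4.0),
                         rng=None):
    """Build a multi-scale perturbation: noise drawn at several coarser
    resolutions, nearest-neighbor upsampled to full size, scaled, and
    summed into one additive perturbation."""
    rng = rng or np.random.default_rng()
    h, w, c = shape
    delta = np.zeros(shape, dtype=np.float64)
    for s, m in zip(scales, weights):
        coarse = rng.uniform(-1.0, 1.0, size=(h // s, w // s, c))
        delta += m * np.kron(coarse, np.ones((s, s, 1)))  # upsample by s
    return delta

delta = pyramid_perturbation((224, 224, 3))
```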
- S. K. Mishra, A. Shah, A. Bansal, A. N. Jagganatha, A. Sharma, D. Jacobs, D. Krishnan. Object-aware Cropping for Self-Supervised Learning. 2021.
- We show that object-aware cropping performs significantly better than random cropping as a data augmentation for self-supervised learning. On the OpenImages dataset, we achieve an 8.8% mAP improvement using MoCo-v2 pre-training with object-aware cropping.
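- A minimal sketch of the cropping strategy (the box format and the scale_jitter parameter are assumptions for illustration):

```python
import random

def object_aware_crop(img_w, img_h, box, scale_jitter=0.2, rng=random):
    """Return a crop window (x0, y0, x1, y1) centered on an object box
    rather than a uniformly random region, with mild random rescaling."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    s = 1.0 + rng.uniform(-scale_jitter, scale_jitter)
    half_w, half_h = s * (x1 - x0) / 2.0, s * (y1 - y0) / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))

crop = object_aware_crop(640, 480, box=(100, 120, 300, 360))
```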
- A. Kalia, D. Krishnan, S. Hassoun. Contrastive Multiview Coding for Enzyme-Substrate Interaction Prediction. Machine Learning in Computational Biology, 2021 (Spotlight).
- We apply contrastive learning to the problem of predicting enzymatic interactions, by leveraging auxiliary data in the KEGG database.
- A. Burns, A. Sarna, D. Krishnan, A. Maschinot. Unsupervised Disentanglement without Autoencoding: Pitfalls and Future Directions. ICML 2021 Workshop: Self-Supervised Learning for Reasoning and Perception.
- We explore regularization methods with contrastive learning, which could result in disentangled representations that are powerful enough for large scale datasets and downstream applications.
- P. Teterwak, C. Zhang, D. Krishnan, M. Mozer. Understanding Invariance via Feedforward Inversion of Discriminatively Trained Classifiers. ICML 2021.
- Using modern network inversion techniques, we show that surprisingly large amounts of information are stored in the logits of a discriminatively trained deep network.
- Y. Tian, C. Sun, B. Poole, D. Krishnan, C. Schmid, P. Isola. What Makes for Good Views for Contrastive Learning?, NeurIPS 2020.
- We explore the connection between information minimization and views for contrastive representation learning, drawing insights into the tradeoffs in view choices.
- P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, D. Krishnan. Supervised Contrastive Learning. NeurIPS 2020.
- A new loss function to train supervised deep networks, based on contrastive learning. Our new loss performs significantly better than cross-entropy across a range of architectures and data augmentations.
- Reference implementation on CIFAR-10 here.
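- For reference, the loss (the "L_out" variant from the paper) has the following form, where the z's are normalized embeddings, P(i) is the set of in-batch positives sharing anchor i's label, A(i) is the set of all other batch elements, and tau is a temperature:

```latex
\mathcal{L}^{sup}_{out}
  \;=\; \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)}
  \log \frac{\exp\!\left(z_i \cdot z_p / \tau\right)}
            {\sum_{a \in A(i)} \exp\!\left(z_i \cdot z_a / \tau\right)}
```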
- Y. Tian, Y. Wang, D. Krishnan, J. Tenenbaum and P. Isola. Rethinking Few-Shot Image Classification: a good embedding is all you need?, ECCV 2020.
- We show that simply learning good embeddings provides much better results on few-shot learning benchmarks than fancy optimization algorithms.
- Y. Jiang, B. Neyshabur, H. Mobahi, D. Krishnan and S. Bengio. Fantastic Generalization Measures and Where to Find Them, ICLR 2020.
- We perform a comprehensive study of generalization measures by comparing around 40 complexity measures over 10,000 deep models. We uncover surprising findings about the predictive power of these measures, and identify promising areas for further study.
- Y. Tian, D. Krishnan, P. Isola. Contrastive Representation Distillation, ICLR 2020.
- We combine representation learning and knowledge distillation to achieve state-of-the-art results in knowledge transfer across a wide range of tasks. To the best of our knowledge, this is the first method to consistently outperform the original Knowledge Distillation approach.
- Y. Tian, D. Krishnan, P. Isola. Contrastive Multiview Coding, ECCV, 2020.
- We study unsupervised representation learning when multiple views of the data are present, and draw connections to maximization of mutual information.
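- One connection made precise in this line of work: with k negative samples, the contrastive loss lower-bounds the mutual information between the representations of the two views:

```latex
I(z_1;\, z_2) \;\ge\; \log(k) \;-\; \mathcal{L}_{\text{contrast}}
```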
- P. Teterwak, A. Sarna, D. Krishnan, C. Liu, A. Maschinot, W. T. Freeman. Boundless: Generative Adversarial Networks for Image Extension, ICCV 2019.
- We adapt GAN models to the image extension problem using a novel training mechanism. To our knowledge, this is the first work applying GAN models to image extrapolation.
- C. Qin, J. Martens, S. Gowal, D. Krishnan, A. Fawzi, S. De, R. Stanforth, P. Kohli. Adversarial Robustness through Local Linearization, NeurIPS 2019.
- We present a novel regularizer that encourages the loss surface of a deep network to be locally linear. This promotes robustness to adversarial perturbations, achieving a new state of the art on ImageNet.
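- Roughly, the regularizer penalizes how far the loss deviates from its first-order Taylor expansion within an epsilon-ball around each input:

```latex
\gamma(\varepsilon, x) \;=\; \max_{\|\delta\| \le \varepsilon}
\bigl|\, \ell(x + \delta) \;-\; \ell(x) \;-\; \delta^{\top} \nabla_x \ell(x) \,\bigr|
```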
- V. Birodkar, H. Mobahi, D. Krishnan, S. Bengio. A Closed-form Learned Pooling for Deep Classification Networks, June 2019.
- We introduce a new form of learned pooling that is non-uniform across space and shows benefits for robustness to geometric corruptions and perturbations. Our pooling operator can be computed in closed-form by spectral decomposition of matrices associated with class separability.
- Co-organizer of the ICML 2019 workshop, Understanding and Improving Generalization in Deep Networks, with Peter Bartlett and Dawn Song (UC Berkeley), Nati Srebro (TTIC/Google), Hossein Mobahi (Google Research) and Behnam Neyshabur (NYU).
- Y. Jiang, D. Krishnan, H. Mobahi and S. Bengio, Predicting the Generalization Gap in Deep Networks with Margin Distributions. Presented at ICLR 2019.
- Building on our previous large-margin work, we show that margin distributions measured at all layers of a deep network are a powerful predictor of generalization on the CIFAR-10 and CIFAR-100 datasets, doing significantly better than other generalization bounds published in the literature.
- G. Elsayed, D. Krishnan, H. Mobahi, K. Regan and S. Bengio. Large Margin Deep Networks for Classification. Presented at NeurIPS 2018.
- We extend the concept of margin, used in Support Vector Machines, to deep networks, allowing for margin at input, hidden or output layers of the network. The resulting models show promising results for generalization, learning from noisy labels and robustness to adversarial examples.
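- The key quantity is a first-order approximation of the distance from an input to the decision boundary between classes i and j, computable at the input or at any hidden layer (q denotes the dual norm):

```latex
d^{(i,j)}(x) \;\approx\;
\frac{f_i(x) - f_j(x)}
     {\bigl\| \nabla_x f_i(x) - \nabla_x f_j(x) \bigr\|_q}
```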
- T. Dekel, C. Gan, D. Krishnan, C. Liu, W. T. Freeman. Sparse, Smart Contours to Represent and Edit Images, CVPR 2018.
- We use Generative Adversarial Networks (GANs) to learn the reconstruction of images from sparse contour information, with applications in editing.
- D. Mané, D. Smilkov, D. Krishnan, D. Fritz, F. Viégas, J. Wexler, J. Wilson, K. Wongsuphasawat, M. Wattenberg. TensorFlow Graph Visualizer: Drawing Dataflow Graphs of Deep Learning Models. IEEE VAST 2017, Best Paper Award.
- Y. Ovadia, Y. Halpern, D. Krishnan, J. Livni, D. Newburger, R. Poplin, T. Zha and D. Sculley. Learning to Count Mosquitoes for the Sterile Insect Technique, KDD 2017.
- May 2017, Google product launch (Deep-learning based smart stickers):
- https://techcrunch.com/2017/05/11/googles-neural-network-generated-custom-face-stickers-are-like-bitmoji-that-arent-horrible/
- https://research.googleblog.com/2017/05/neural-network-generated-illustrations.html
- https://www.fastcodesign.com/90124964/exclusive-new-google-tool-uses-ai-to-create-custom-emoji-of-you-from-a-selfie
- F. Cole, D. Belanger, D. Krishnan, I. Mosseri, A. Sarna, W. T. Freeman. Face Synthesis from Facial Identity Features, CVPR 2017 (Spotlight).
- K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, D. Krishnan. Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks, CVPR 2017 (Oral presentation). Presentation video.
- A paper by Apple Inc. received a Best Paper award at CVPR 2017 with an idea and technique very similar to the one we present here; the differences lie in the specific datasets and applications considered, and in some details of the GAN optimization.
- Follow-up work by teams at Google achieved impressive performance on robotic grasping tasks: https://research.googleblog.com/2017/10/closing-simulation-to-reality-gap-for.html
- Y. Ovadia, Y. Halpern, D. Krishnan, J. Livni, D. Newburger, R. Poplin, T. Zha and D. Sculley. Learning to Count and Classify Mosquitoes for the Sterile Insect Technique, NIPS 2016 Machine Learning for Healthcare Workshop.
- K. Bousmalis, G. Trigeorgis, N. Silberman, D. Krishnan and D. Erhan. Domain Separation Networks, NIPS 2016.
- P. Isola, D. Zoran, D. Krishnan and E. H. Adelson. Learning Visual Groups from Co-occurrences in Space and Time, ICLR (Workshop Paper) 2016.
- D. Zoran, P. Isola, D. Krishnan and W. T. Freeman. Learning Ordinal Relationships for Mid-Level Vision, ICCV 2015.
- Y. Shih, D. Krishnan, F. Durand and W. T. Freeman. Reflection Removal using Ghosting Cues, CVPR 2015.
- MIT News profile of our work.
- D. Zoran, D. Krishnan, J. Bento and W. T. Freeman. Shape and Illumination from Shading using the Generic Viewpoint Assumption, NIPS 2014.
- P. Isola, D. Zoran, D. Krishnan and E. H. Adelson. Crisp Boundary Detection Using Pointwise Mutual Information, ECCV 2014.
- D. Krishnan, J. Bruna and R. Fergus. Blind Deconvolution with Non-local Sparsity Reweighting, arXiv, June 2014.
- D. Eigen, D. Krishnan and R. Fergus. Restoring An Image Taken Through a Window Covered with Dirt or Rain, ICCV 2013.
- D. Krishnan, R. Fattal and R. Szeliski. Efficient Preconditioners for Laplacians Arising in Computer Graphics and Computational Photography, SIGGRAPH 2013.
- D. Krishnan, R. Szeliski. Multigrid and Multilevel Preconditioners for Computational Photography, SIGGRAPH Asia 2011.
- D. Krishnan, T. Tay, R. Fergus. Blind Deconvolution using a Normalized Sparsity Measure, CVPR 2011.
- M. Zeiler, D. Krishnan, G. Taylor, R. Fergus. Deconvolutional Networks, CVPR 2010.
- D. Krishnan, R. Fergus. Fast Image Deconvolution using Hyper-Laplacian Priors, NIPS 2009.
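- For context, this method places a sparse hyper-Laplacian prior on image gradients (density proportional to exp(-|g|^alpha)) and minimizes the resulting non-convex deconvolution objective with a half-quadratic splitting whose per-pixel subproblem admits an analytic solution for alpha = 1/2 and 2/3:

```latex
\min_x \;\; \frac{\lambda}{2}\, \| k \ast x - y \|_2^2
\;+\; \sum_i \bigl| (\nabla x)_i \bigr|^{\alpha},
\qquad \alpha \in [1/2,\, 2/3]
```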
- D. Krishnan, R. Fergus. Dark Flash Photography, SIGGRAPH 2009.
- D. Krishnan, Q. V. Pham, A. Y. Yip. A primal-dual active-set algorithm for bilaterally constrained total variation deblurring and piecewise constant Mumford-Shah segmentation problems, Advances in Computational Mathematics. Published Online, September 2008.
- D. Krishnan, P. Lin, A. Y. Yip. A primal-dual active-set method for non-negativity constrained total variation deblurring problems, IEEE Transactions on Image Processing, 16(11), pp. 2766-2777, November 2007.
- D. Krishnan, P. Lin, X. C. Tai. An efficient operator-splitting method for noise removal in images, Communications in Computational Physics, 1(5), pp. 847-858, October 2006.
- D. Krishnan, M. N. Chong, S. Kalra. On the computational aspects of Gibbs-Markov random field modelling of missing data in image sequences, IEEE Transactions on Image Processing, 8(8), pp. 1139-1142, August 1999.
- S. Kalra, D. Krishnan, M. N. Chong. An MRF model-based scheme for accurate detection and adaptive interpolation of missing data in highly corrupted image sequences, IEEE International Conference on Image Processing, 1999.
- M. N. Chong, D. Krishnan. An edge-preserving MRF model for the detection of missing data in image sequences, IEEE Signal Processing Letters, 5(4), pp. 81-83, April 1998.
Patents (USPTO): List of issued and applied patents here.
Entrepreneurship/Angel Investing/Advising:
- Angel investor/advisor: helm.ai, jabali.ai, AgShift, Piction Health, Folia Health, and many others.
- Co-founder and CTO of Nirvana Digital, later acquired by Blackmagic Design. Nirvana Digital developed Revival, a film and video restoration system now used by post-production companies worldwide, and Resolve, a world-leading solution that combines editing, color correction, visual effects, motion graphics and audio post-production in one tool. These products are installed on millions of laptops worldwide; Resolve was featured in Apple’s 2022 Keynote.
Professional Activities: Reviewer/Area Chair for the following conferences and journals:
- NeurIPS, ICML, ICLR, ICCV, CVPR, ECCV, and others.
Talks (not updated recently):
- Efficient Preconditioning for Laplacian Matrices. Microsoft Research, Cambridge, England, January 2014.
- Invited Talk – Fast Image Deconvolution Using Hyper Laplacian Priors. Group Meeting of Fredo Durand’s graphics group, MIT (CSAIL), October 22, 2010.
- Invited Talk – Dark Flash Photography and Fast Image Deconvolution Using Hyper Laplacian Priors. Rick Szeliski’s Interactive Visual Media Group, Microsoft Research, Seattle, December 3, 2009.
- Dark Flash Photography. SIGGRAPH 2009, New Orleans, August 7, 2009.
- Dark Flash Photography. Group meeting of Laboratory for Computational Vision, NYU, May 28, 2009.
- Dark Flash Photography. Group meeting of Bill Freeman’s Vision Group, MIT (CSAIL), May 19, 2009.
- Dark Flash Photography. Graphics Seminar, NYU, May 8, 2009.