Publications
You can also find publications on my Google Scholar profile.
Preprints
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
Amrith Setlur*, Saurabh Garg*, Virginia Smith, Sergey Levine
Generate to Discriminate: Expert Routing for Continual Learning
Yewon Byun, Sanket Vaibhav Mehta, Saurabh Garg, Emma Strubell, Bryan Wilder, Zachary Chase Lipton
PRO: Pseudo-label Regularized Optimization on Unlabeled Test Data
Tzu-Ching Yen, Saurabh Garg, Alex Smola, Zachary Chase Lipton, Francesco Locatello
Publications
TiC-CLIP: Continual Training of CLIP Models
Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri
Oral at NeurIPS DistShift Workshop, 2023
Paper / Code / Talk / Poster / Summary / Bibtex
We highlight temporal distribution shift problems in OpenAI CLIP models and propose a benchmark of 12.7B image-text pairs with time metadata for continual training of CLIP.
Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift
Saurabh Garg*, Amrith Setlur*, Zachary Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan
Advances in Neural Information Processing Systems (NeurIPS), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
Our work examines the synergy between self-training and contrastive learning, finding complementary benefits under distribution shift and accuracy improvements of up to 8% on certain datasets.
Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms
Dheeraj Baby*, Saurabh Garg*, Thomson Yen*, Sivaraman Balakrishnan, Zachary Lipton, Yu-Xiang Wang
Spotlight at Advances in Neural Information Processing Systems (NeurIPS), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
The paper presents new algorithms that adapt to shifting label distributions in supervised and unsupervised online learning settings, achieving optimal dynamic regret without knowledge of the extent of label distribution drift. The algorithms remain sample- and computationally efficient while improving accuracy by 1-3% across various scenarios.
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
Elan Rosenfeld, Saurabh Garg
Advances in Neural Information Processing Systems (NeurIPS), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
The paper introduces a new upper bound on the error of deep neural networks under distribution shift, leveraging unlabeled test data, an intuitive condition, and a novel 'disagreement loss' to obtain reliable and tight bounds.
RLSbench: A Large-Scale Empirical Study of Domain Adaptation Under Relaxed Label Shift
Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Siva Balakrishnan, Zachary Lipton
NeurIPS Workshop on Distribution Shifts (DistShift), 2022
International Conference on Machine Learning (ICML), 2023
Website / Paper / Code / Talk / Poster / Summary / Bibtex
A large-scale study of domain adaptation methods in scenarios where both the label distribution and the conditionals p(x|y) may shift, highlighting the brittleness of existing methods and simple fixes that improve their performance.
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna, Saurabh Garg, Jeffrey Bigham, Zachary Lipton
NeurIPS Workshop on Transfer Learning for NLP, 2022
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
We observe that pretraining only on the downstream dataset can perform comparably to, and often even better than, pretraining on a huge upstream corpus.
CHILS: Zero-shot Image Classification with Hierarchical Label Sets
Zachary Novack, Julian McAuley, Zachary Lipton, Saurabh Garg
First Workshop on Multimodal Representation Learning at ICLR, 2023
International Conference on Machine Learning (ICML), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
This work introduces CHiLS, a zero-shot classification strategy that improves CLIP-like models by focusing on better class names, leveraging implicit semantic hierarchies to enhance accuracy without requiring additional training.
Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun*, Nikhil Ghosh*, Saurabh Garg, Boaz Barak, Preetum Nakkiran
NeurIPS Workshop on Distribution Shifts (DistShift), 2022
International Conference on Learning Representations (ICLR), 2023
Paper / Code / Talk / Poster / Summary / Bibtex
We propose a new lens for studying the pointwise performance of learning algorithms that reveals new insights into their behavior and goes beyond traditional notions of in-distribution and out-of-distribution learning.
Disentangling the Mechanisms Behind Implicit Regularization in SGD
Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary Lipton
Spotlight at NeurIPS Workshop on The Benefits of Higher-Order Optimization in Machine Learning, 2022
International Conference on Learning Representations (ICLR), 2023
Paper / Code / Talk / Poster / Bibtex
Domain Adaptation under Open Set Label Shift
Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton
ICML Workshop on Spurious Correlations, Invariance, and Stability (SCIS), 2022
Advances in Neural Information Processing Systems (NeurIPS), 2022
Paper / Code / Talk / Poster / Summary / Bibtex
We introduce the problem of domain adaptation under Open Set Label Shift (OSLS) where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant.
Unsupervised Learning under Latent Label Shift
Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton
ICML Workshop on Spurious Correlations, Invariance, and Stability (SCIS), 2022
Advances in Neural Information Processing Systems (NeurIPS), 2022
Paper / Code / Talk / Poster / Summary / Bibtex
We introduce unsupervised learning under Latent Label Shift (LLS), where we have access to unlabeled data from multiple domains such that the label marginals p_d(y) can shift across domains but the class conditionals p(x|y) do not.
Characterizing Datapoints via Second-Split Forgetting
Pratyush Maini, Saurabh Garg, Zachary Lipton, Zico Kolter
Spotlight at ICML Workshop on Spurious Correlations, Invariance, and Stability (SCIS), 2022
Advances in Neural Information Processing Systems (NeurIPS), 2022
Paper / Code / Talk / Poster / Summary / Bibtex
We analyze the forgetting and learning dynamics of neural networks to characterize different types of hard examples as belonging to mislabeled, rare, and complex categories.
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance
Saurabh Garg, Sivaraman Balakrishnan, Zachary Lipton, Behnam Neyshabur, Hanie Sedghi
NeurIPS Workshop on Distribution Shift (DistShift), 2021
International Conference on Learning Representations (ICLR), 2022
Paper / Code / Talk / Poster / Summary / Bibtex
Given access to labeled source data and unlabeled target data, we investigate methods to predict target domain performance and find a simple method that does surprisingly well. We also explore the theoretical foundations of the problem, proving that identifying the accuracy is just as hard as identifying the optimal predictor.
Mixture Proportion Estimation and PU Learning: A Modern Approach
Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary Lipton
Spotlight at Advances in Neural Information Processing Systems (NeurIPS), 2021
ICML Workshop on Uncertainty in Deep Learning, 2021
Paper / Code / Talk / Poster / Summary / Bibtex
Given only Positive (P) data and Unlabeled (U) data containing both P and Negative (N) samples, we propose new approaches to estimate the fraction of P in U and to learn a P-versus-N classifier.
RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton
Long Talk at International Conference on Machine Learning (ICML), 2021
ICLR Workshop on RobustML, 2021
Paper / Code / Talk / Poster / Summary / Bibtex
We introduce a method that leverages unlabeled data to produce a generalization bound. When a trained model fits clean training data well but fits added randomly labeled training data poorly, we show that its generalization to the population is guaranteed.
On Proximal Policy Optimization’s Heavy-Tailed Gradients
Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, Zico Kolter, Zachary Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar
Short talk at International Conference on Machine Learning (ICML), 2021
ICLR Workshop on Science and Engineering of Deep Learning, 2021
Paper / Code / Talk / Poster / Summary / Bibtex
We empirically characterize PPO's gradients, demonstrating that they become more heavy-tailed as training proceeds. We examine issues caused by the heavy-tailed nature of the gradients and show that PPO's clipping heuristics offset this heavy-tailedness.
A Unified View of Label Shift Estimation
Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary Lipton
Advances in Neural Information Processing Systems (NeurIPS), 2020
Contributed Talk at ICML Workshop on Uncertainty in Deep Learning, 2020
Paper / Talk / Poster / Summary / Bibtex
We provide a unified framework relating techniques that use off-the-shelf predictors for label shift estimation. We argue that these methods all employ calibration, either explicitly or implicitly, differing only in the choice of calibration method and their optimization objective.
Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus
Uma Sawant, Saurabh Garg, Soumen Chakrabarti, Ganesh Ramakrishnan
Information Retrieval Journal, 2019
Invited Oral at European Conference on Information Retrieval (ECIR), 2020
Paper / Talk / Bibtex
Estimating Uncertainty in MRF-based Image Segmentation: An Exact-MCMC Approach
Suyash Awate, Saurabh Garg, Rohit Jena
Medical Image Analysis Journal, 2019
Paper / Bibtex
Code-Switched Language models using Dual RNNs and Same-Source Pretraining
Saurabh Garg*, Tanmay Parekh*, Preethi Jyothi (*joint first authors)
Awarded EMNLP Non-Student Travel Grant
Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2018
Paper / Bibtex
Uncertainty Estimation in Segmentation with Perfect MCMC Sampling in Bayesian MRFs
Saurabh Garg, Suyash Awate
Proceedings of Medical Image Computing & Computer Assisted Intervention (MICCAI), 2019
Paper / Bibtex
Dual Language Models for Code Mixed Speech Recognition
Saurabh Garg, Tanmay Parekh, Preethi Jyothi
Awarded ISCA Student Travel Grant
Proceedings of Interspeech 2018 (19th Annual Conference of ISCA)
Paper / Bibtex