Standard analyses of DP-SGD rely on the unrealistic bounded-gradient assumption. We propose an algorithm with high-probability convergence guarantees that avoids this restriction while achieving an optimal privacy-utility tradeoff [13]. Complementary SDE analyses [19] show that adaptive methods such as DP-SignSGD and DP-Adam perform better under strong privacy constraints and are less sensitive to learning-rate tuning.
Works on distributed RL remains limited, and such problems are inherently nonsmooth due to operations like maximization. In [15], we develop an algorithm with provable convergence for nonsmooth convex objectives that simultaneously handles both the objective and constraints. The method significantly outperforms several baselines in distributed training of humanoid agents.
Compression is a practical approach to reducing communication overhead. Biased compressors (e.g., Top-K) are particularly effective in practice, but introduce bias that can degrade performance. We design provably convergent algorithms for smooth nonconvex, convex, and strongly convex objectives that achieve optimal asymptotic complexity while reducing communication in both centralized [9] and decentralized [10] settings. In [5], we further develop adaptive biased compressors that adjust compression strength during training.
Standard convergence guarantees for local methods do not capture the benefits of local updates, despite their strong empirical performance. In [6], we identified a class of functions for which local methods provably achieve faster convergence.
Transmitting full Hessians in federated settings is impractical. We address this bottleneck by designing second-order methods that employ Hessian and gradient compression to reduce communication overhead [1-4]. Crucially, these methods retain the hallmark properties of Newton's method, achieving local superlinear convergence independent of the condition number.
These theoretical results advance the understanding of optimization algorithms in federated and distributed settings by providing stronger guarantees under realistic conditions, improving communication and computation efficiency, and explaining the robustness and effectiveness of modern methods in practice.
14th International Conference on Learning Representations (ICLR 2026)
PDF
arXiv
ICLR
Poster
Abstract
42nd International Conference on Machine Learning (ICML 2025)
PDF
arXiv
ICML
Poster
Abstract
28th International Conference on Artificial Intelligence and Statistics (Oral, AISTATS 2025)
PDF
arXiv
AISTATS
Poster
Abstract
13th International Conference on Learning Representations (ICLR 2025)
PDF
arXiv
ICLR
Poster
Abstract
12th International Conference on Learning Representations (ICLR 2025)
PDF
arXiv
ICLR
Poster
Abstract
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)
PDF
arXiv
AISTATS
Poster
Abstract
Transactions on Machine Learning Research (TMLR 2025)
PDF
arXiv
TMLR
Abstract
Transactions on Machine Learning Research (TMLR 2023)
PDF
arXiv
TMLR
Abstract
Transactions on Machine Learning Research (TMLR 2023)
PDF
arXiv
TMLR
poster
HOO-22 NeurIPS
Abstract
25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022)
PDF
arXiv
AISTATS
Poster
Abstract
39th International Conference on Machine Learning (ICML 2022)
PDF
arXiv
ICML
FL-ICML
BFOM-ICML
Poster
3-min video
Abstract
38th International Conference on Machine Learning (ICML 2021)
PDF
arXiv
ICML
NSF-TRIPODS
Poster
5-min video
Abstract