Education

PhD in Computer Science

UniBas, Department of Mathematics and Computer Science

Oct. 2023 – Present · Basel, Switzerland
Supervisor: Aurelien Lucchi

Master of Science in Data Science

IP Paris, Department of Applied Mathematics

Sept. 2021 – Aug. 2023 · GPA 17.65/20 · Palaiseau, France
Thesis: Unified Analysis of Asynchronous Algorithms
Thesis Supervisors: Mher Safaryan, Dan Alistarh

Bachelor of Science in Applied Mathematics and Physics

MIPT, Phystech School of Applied Mathematics and Informatics

Sept. 2017 – Jul. 2021 · GPA 4.95/5 (9.27/10) · Dolgoprudny, Russia
Thesis: Distributed Second Order Methods with Fast Rates and Compressed Communication
Thesis Supervisor: Peter Richtárik

Recent Posts

I am happy to share some exciting news from ICML 2026. I co-authored 4 papers accepted to the conference, including one spotlight! I am extremely grateful to all my collaborators, and lucky to have the opportunity to work with outstanding people in the field. The list of papers includes:
1️⃣ Non-Euclidean Edge of Stability (spotlight). This is a joint work with Michael Crawshaw, Jeremy Cohen, and Robert Gower.
2️⃣ BST rule. This is a joint work with Roman Machacek, Aurelien Lucchi, Antonio Silveti-Falls, Eduard Gorbunov, and Volkan Cevher.
3️⃣ H0-H1 condition. This is a joint work with Foivos Alimisis and Aurelien Lucchi.
4️⃣ SDE under L0-L1 condition. This is a joint work with Enea Monzio Compagnoni, Aurelien Lucchi, Antonio Orvieto, and Eduard Gorbunov.

I am happy to share some news regarding my research from the beginning of 2026.
1️⃣ Our paper on DP-SGD with SDEs has been accepted to ICLR 2026 in Rio de Janeiro, Brazil, as a poster presentation.
2️⃣ Our new paper on the Edge of Stability of non-Euclidean descent methods is now available on arXiv. This is a joint work with Michael Crawshaw, Jeremy Cohen, and Robert Gower.

🚨 New paper alert! 🚨
Our latest work has been accepted to NeurIPS! We show how combining the NGN stepsize with momentum enhances optimization both theoretically and practically (a toy sketch of the update is at the end of this post).
What’s inside?
📚 Theory: Convergence in convex and non-convex settings.
🔧 Cleaner analysis: We drop restrictive assumptions (no bounded gradients, no interpolation), improving over prior work on the Stochastic Polyak stepsize.
🚀 Practice: NGN + momentum and NGN + Adam = way more robust to learning rate choices, from ResNets to large-scale LMs.
Special credit to my amazing coauthors: Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi.
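For the curious, here is a minimal toy sketch of what an NGN step with heavy-ball momentum can look like. The stepsize follows the NGN formula for a non-negative loss, γ = σ / (1 + σ‖∇f(x)‖² / (2 f(x))); the heavy-ball coupling and all parameter values below are illustrative assumptions, not the exact method from the paper.

```python
import torch

# Toy non-negative loss: least squares on random data.
torch.manual_seed(0)
A, b = torch.randn(32, 8), torch.randn(32)
x = torch.zeros(8, requires_grad=True)
m = torch.zeros(8)             # heavy-ball momentum buffer

sigma, beta = 0.5, 0.9         # NGN parameter and momentum coefficient (assumed values)
for _ in range(200):
    loss = 0.5 * (A @ x - b).pow(2).mean()
    (g,) = torch.autograd.grad(loss, x)
    with torch.no_grad():
        # NGN stepsize: gamma = sigma / (1 + sigma * ||g||^2 / (2 * f(x)))
        gamma = sigma / (1 + sigma * g.pow(2).sum() / (2 * loss))
        m = beta * m + (1 - beta) * g   # heavy-ball averaging (illustrative form)
        x -= gamma * m
```

Note how gamma automatically shrinks when the gradient is large relative to the loss, which is the source of the robustness to the choice of sigma.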

I am happy to announce that our recent paper, which studies the Safe-EF algorithm for distributed non-smooth optimization with constraints, has been accepted to ICML 2025! Quick summary of our contributions:
⚙️ The Safe-EF algorithm, which provably converges in federated non-smooth optimization with constraints;
🚨 Algorithms like Compressed SGD and EF21 fail to converge in such a setting;
🛠️ Lower bounds for first-order methods in the compressed non-smooth convex setting;
🔄 High-probability analysis in the stochastic setting;
🤖 Extensive experiments, including distributed humanoid training, showing that Safe-EF efficiently minimizes the objective while keeping constraint violations small.
This is a joint work with Yarden As and Ilyas Fatkhullin. Check out our paper for more details, and see the error-feedback sketch below for background!
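For background on the error-feedback family this result lives in, here is a minimal sketch of a classical EF21-style round, where workers communicate only compressed differences of their gradient estimates. The top-k compressor, function names, and parameters are illustrative assumptions; Safe-EF's actual update and its constraint-handling mechanism are specified in the paper.

```python
import torch

def topk(v, k):
    """Top-k compressor: keep only the k largest-magnitude entries."""
    out = torch.zeros_like(v)
    idx = v.abs().topk(k).indices
    out[idx] = v[idx]
    return out

def ef21_round(x, worker_grads, g_workers, g_server, lr=0.1, k=2):
    """One EF21-style round (background sketch, not Safe-EF itself).

    worker_grads: current local gradients grad f_i(x).
    g_workers: per-worker gradient estimates g_i; g_server: their average.
    """
    deltas = []
    for i, grad in enumerate(worker_grads):
        c = topk(grad - g_workers[i], k)   # transmit only a compressed difference
        g_workers[i] = g_workers[i] + c    # worker keeps its estimate in sync
        deltas.append(c)
    g_server = g_server + torch.stack(deltas).mean(dim=0)
    return x - lr * g_server, g_workers, g_server
```

Plain compressed SGD and EF21 as above can fail once non-smoothness and constraints enter; Safe-EF is designed precisely for that regime.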

Our new paper on distributed optimization with strong optimization and differential privacy (DP) guarantees is out! We introduce Clip21-SGD2M, a method featuring a double momentum mechanism: one momentum for managing stochastic noise and another for averaging out the DP noise. We establish optimal convergence guarantees in both deterministic and stochastic settings, along with a near-optimal privacy-utility tradeoff in the DP framework. Finally, our method demonstrates competitive performance in practice, efficiently handling the noise when training neural networks.
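To convey the double momentum idea, here is a highly simplified one-round sketch. The clipping rule, the placement of the DP noise, and all names and parameters are illustrative assumptions rather than the exact Clip21-SGD2M update from the paper.

```python
import torch

def clip(v, tau):
    """Scale v so that its norm is at most tau."""
    return v * min(1.0, tau / (v.norm().item() + 1e-12))

def double_momentum_step(x, grads, m_workers, v_server, lr=0.1,
                         beta1=0.9, beta2=0.9, tau=1.0, dp_std=1.0):
    """One illustrative distributed step with two momentum buffers.

    m_workers: per-worker momentum, smooths stochastic gradient noise.
    v_server: server-side momentum, averages out the injected DP noise.
    """
    noisy = []
    for i, g in enumerate(grads):
        # First momentum: damp the stochastic noise in the local gradient.
        m_workers[i] = beta1 * m_workers[i] + (1 - beta1) * g
        # Clip for bounded sensitivity, then add Gaussian DP noise.
        noisy.append(clip(m_workers[i], tau) + dp_std * torch.randn_like(g))
    # Second momentum: average the zero-mean DP noise across iterations.
    v_server = beta2 * v_server + (1 - beta2) * torch.stack(noisy).mean(dim=0)
    return x - lr * v_server, m_workers, v_server
```

The intuition is that the worker-side buffer reduces the variance that clipping has to fight, while the server-side buffer lets the zero-mean DP noise cancel out over time.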

Contact