Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Newtonian-Shampoo: Modified Newton-Schulz Adapted for Shampoo Preconditioners
Published:
The Shampoo optimizer was proposed in [1].
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post.
Publications
A Panda? No, It’s a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference
Sanghyun Hong, Yiğitcan Kaya, Ionuţ-Vlad Modoranu, Tudor Dumitraş
Published in ICLR 2021 (Spotlight 🔦)
A new adversarial attack that introduces delays in the predictions of multi-exit deep neural networks.
Error Feedback Can Accurately Compress Preconditioners
Ionut-Vlad Modoranu, Aleksei Kalinov, Eldar Kurtic, Elias Frantar, Dan Alistarh
Published in ICML 2024
Reduce the memory usage of the M-FAC optimizer via sparsity, low-rank compression, and error feedback.
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
Ionut-Vlad Modoranu, Mher Safaryan, Grigory Malinovsky, Eldar Kurtic, Thomas Robert, Peter Richtarik, Dan Alistarh
Published in NeurIPS 2024
Reduce the memory usage of the Adam optimizer via sparsity and error feedback.
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Thomas Robert, Mher Safaryan, Ionut-Vlad Modoranu, Dan Alistarh
Published in ICLR 2025
Low-rank optimization for LLMs that improves over GaLore.
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
Diyuan Wu, Ionut-Vlad Modoranu, Mher Safaryan, Denis Kuznedelev, Dan Alistarh
Published in NeurIPS 2024
Theoretical guarantees for sparse, second-order pruning.
Unified Scaling Laws for Compressed Representations
Andrei Panferov, Alexandra Volkova, Ionut-Vlad Modoranu, Vage Egiazarian, Mher Safaryan, Dan Alistarh
Published on arXiv
Scaling laws for quantization and sparsity.
Optimizers Qualitatively Alter Solutions and We Should Leverage This
Razvan Pascanu, Clare Lyle, Ionut-Vlad Modoranu, Naima Elosegui Borras, Dan Alistarh, Petar Velickovic, Sarath Chandar, Soham De, James Martens
Published on arXiv
Optimizers have traditionally been introduced and benchmarked with respect to how fast they reach a specific loss. In this work we hypothesize that they also have other qualitative effects, such as inducing certain biases in the solutions they find.
FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models
Ionut-Vlad Modoranu, Mher Safaryan, Erik Schultheis, Max Ryabinin, Artem Chumachenko, Dan Alistarh
Published on arXiv
FFT-based low-rank optimization for LLMs.
DASH: Faster Shampoo via Batched Block Preconditioning and Efficient Inverse-Root Solvers
Ionut-Vlad Modoranu, Philip Zmushko, Erik Schultheis, Mher Safaryan, Dan Alistarh
Published on arXiv
A faster implementation of Distributed Shampoo that stacks the preconditioner blocks into 3D tensors.
LoRDO: Distributed Low-Rank Optimization with Infrequent Communication
Andrej Jovanović, Alex Iacob, Mher Safaryan, Ionut-Vlad Modoranu, Lorenzo Sani, William F Shen, Xinchi Qiu, Dan Alistarh, Nicholas D Lane
Published on arXiv
A framework that unifies low-rank optimization with infrequent synchronization.
