Reinforcement learning ∩ LLMs, Generative models, Artificial intelligence
- San Francisco, CA
- alecwangcq.github.io
f-divergence-dpo Public
Direct preference optimization with f-divergences.
EigenDamage-Pytorch Public
Code for "EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis" https://arxiv.org/abs/1905.05934
GraSP Public
Code for "Picking Winning Tickets Before Training by Preserving Gradient Flow" https://openreview.net/pdf?id=SkgsACVKPH
KFAC-Pytorch Public
PyTorch implementation of KFAC and E-KFAC (natural gradient).