I am a Senior Research Scientist at Google DeepMind in NYC, where I work on sparsity, information retrieval, foundation models, and their intersections.
Before Google, I was a postdoc at UC Berkeley working with Prof. Yi Ma. I received my PhD in ECE at Johns Hopkins University in 2018, advised by Prof. René Vidal. Prior to Hopkins, I received my B.S. and M.S. degrees from Peking University.
My research areas broadly include machine learning, computer vision, optimization, and signal processing. I am interested in developing mathematical principles and practical numerical algorithms for analyzing and interpreting modern data.
My CV can be found here.
News
- [2025.07] Gemma 3n, available on Hugging Face, ships with activation sparsity via our Statistical Top-K technique, introduced in the Spark Transformer paper.
- [2025.06] Paper release: Spark Transformer, a follow-up to our earlier works (Lazy Neuron, 2022 and HiRE, 2024) that brings strong activation sparsity (8% nonzeros in the FFN and top-256 in attention) to modern LLMs; a toy illustration of activation sparsity follows this list.
- [2025.03] Co-organized the Conference on Parsimony and Learning (CPAL) at Stanford, CA.
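For readers unfamiliar with the term, here is a minimal NumPy sketch of what per-token activation sparsity in an FFN looks like: each token keeps only its k largest hidden activations and zeros out the rest. The hidden size (2048), the value of k (about 8% of the hidden units), and the plain top-k masking rule are all illustrative assumptions; this is not the Statistical Top-K procedure from the Spark Transformer paper.

```python
import numpy as np

def topk_mask(x, k):
    """Keep each row's k largest entries and zero out the rest.

    A plain per-token top-k mask used only to illustrate activation sparsity;
    it is not the Statistical Top-K procedure from the Spark Transformer paper.
    """
    # k-th largest value of each row, kept as a column for broadcasting.
    thresh = np.sort(x, axis=-1)[..., -k][..., None]
    return np.where(x >= thresh, x, 0.0)

# Toy FFN hidden activations: 4 tokens x 2048 hidden units (hypothetical sizes).
rng = np.random.default_rng(0)
h = np.maximum(rng.normal(size=(4, 2048)), 0.0)  # ReLU-style activations

k = 164  # roughly 8% of 2048, mirroring the "8% nonzeros in FFN" figure above
h_sparse = topk_mask(h, k)
print((h_sparse != 0).mean(axis=-1))  # fraction of nonzeros per token, ~0.08
```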
Recent Publications
Activation Sparsity in Transformers
- Spark Transformer: Reactivating Sparsity in FFN and Attention
  Chong You*, Kan Wu*, Zhipeng Jia*, Lin Chen*, Srinadh Bhojanapalli, Jiaxian Guo, Utku Evci, Jan Wassenberg, Praneeth Netrapalli, Jeremiah J. Willcock, Suvinay Subramanian, Felix Chern, Alek Andreev, Shreya Pathak, Felix Yu, Prateek Jain, David E. Culler, Henry M. Levy, Sanjiv Kumar
  Neural Information Processing Systems (NeurIPS), 2025
  [Arxiv]
- HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference
  Yashas Samaga B L, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli
  [Arxiv]
- The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
  Zonglin Li*, Chong You*, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar
  International Conference on Learning Representations (ICLR), 2022
  [Arxiv]
Retrieval and Generative Models
- ReTuning: Towards Scalable In-context Retrieval
  Nilesh Gupta, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Inderjit S Dhillon, Felix Yu
  Neural Information Processing Systems (NeurIPS), 2025
- Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
  Chong You, Rajesh Jayaram, Ananda Theertha Suresh, Robin Nittka, Felix X. Yu, Sanjiv Kumar
  Neural Information Processing Systems (NeurIPS), 2025
  [Arxiv]
- Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
  Haotian Ye, Himanshu Jain, Chong You, Ananda Theertha Suresh, Haowei Lin, James Zou, Felix Yu
  International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
  [Arxiv]
- Functional Interpolation for Relative Positions Improves Long Context Transformers
  Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli
  International Conference on Learning Representations (ICLR), 2024
  [Arxiv]