Welcome to Adams Wei Yu's Homepage

Adams Wei Yu

Adams Wei Yu

I am a research scientist at Google DeepMind (prev. Google Brain). I got my PhD from MLD at CMU (old page). I work on the following projects:

Gemini: Co-lead of multimodal thinking and reasoning (SOTA on MMMU etc). Core contributor of post-training (SOTA on LMSYS etc).

Gemini App (a.k.a. Bard before): Co-lead of multimodality understanding and retrieval (first multimodal launch of Bard, a.k.a. Multi-Bard). Co-lead of post-training data that enabled the first launch of Bard.

Veo: Core contributor.

Email: adams(my last name)wei AT gmail DOT com

(Last updated: 04/2025)

Publications (Google Scholar)

"Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities".
Gemini Team, Google.

"Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context".
Gemini Team, Google.

"Gemini: A Family of Highly Capable Multimodal Models".
Gemini Team, Google.

"Veo".
Abhishek Sharma, Adams Yu, Ali Razavi, Andeep Toor, Andrew Pierson, Ankush Gupta, Austin Waters, Daniel Tanis, Dumitru Erhan, Eric Lau, Eleni Shaw, Gabe Barth-Maron, Greg Shaw, Han Zhang, Henna Nandwani, Hernan Moraldo, Hyunjik Kim, Irina Blok, Jakob Bauer, Jeff Donahue, Junyoung Chung, Kory Mathewson, Kurtis David, Lasse Espeholt, Marc van Zee, Matt McGill, Medhini Narasimhan, Miaosen Wang, Mikołaj Bińkowski, Mohammad Babaeizadeh, Mohammad Taghi Saffar, Nick Pezzotti, Pieter-Jan Kindermans, Poorva Rane, Rachel Hornung, Robert Riachi, Ruben Villegas, Rui Qian, Sander Dieleman, Serena Zhang, Serkan Cabi, Shixin Luo, Shlomi Fruchter, Signe Nørly, Srivatsan Srinivasan, Tobias Pfaff, Tom Hume, Vikas Verma, Weizhe Hua, William Zhu, Xinchen Yan, Xinyu Wang, Yelin Kim, Yuqing Du, and Yutian Chen.

"Haloquest: A visual hallucination dataset for advancing multimodal reasoning".
(ECCV'24) Zhecan Wang, Garrett Bingham, Adams Wei Yu, Quoc Le, Thang Luong, Golnaz Ghiasi.

"Large Language Models Cannot Self-Correct Reasoning Yet".
(ICLR'24) Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou.

"DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining". (Spotlight) (code)
(NeurIPS'23) Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc Le, Tengyu Ma, Adams Wei Yu.

"Scaling instruction-finetuned language models".
(JMLR) Hyung Won Chung*, Le Hou*, Shayne Longpre*, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Wei Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei*. (* Equal contribution).

"Glam: Efficient scaling of language models with mixture-of-experts".
(ICML'22) Nan Du*, Yanping Huang*, Andrew M Dai*, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten P Bosma, Zongwei Zhou, Tao Wang, Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc Le, Yonghui Wu, Zhifeng Chen, Claire Cui. (* Equal contribution)

"Finetuned language models are zero-shot learners". (Oral)
(ICLR'22) Jason Wei*, Maarten Bosma*, Vincent Y Zhao*, Kelvin Guu*, Adams Wei Yu, Brian Lester, Nan Du, Andrew M Dai, Quoc V. Le. (* Equal contribution)

"Simvlm: Simple visual language model pretraining with weak supervision".
(ICLR'22) Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao.

"Deepfusion: Lidar-camera deep fusion for multi-modal 3d object detection".
(CVPR'22) Yingwei Li*, Adams Wei Yu*, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Yifeng Lu, Denny Zhou, Quoc V Le, Alan Yuille, Mingxing Tan. (* Equal contribution)

"Rethinking of graph pretraining on molecular representation".
(NeurIPS'22) Ruoxi Sun, Hanjun Dai, Adams Wei Yu.

"Combined scaling for open-vocabulary image classification".
Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V Le.

"Towards zero-label language learning".
Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao.

"AutoHAS: Efficient hyperparameter and architecture search".
Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, Quoc V. Le.

"Compositional generalization via neural-symbolic stack machines".
(NeurIPS'20) Xinyun Chen, Chen Liang, Adams Wei Yu, Dawn Song, Denny Zhou.

"Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension". (Spotlight)
(ICLR'20) Xinyun Chen, Chen Liang, Adams Wei Yu, Denny Zhou, Dawn Song, Quoc V. Le.

"Block-normalized gradient method: An empirical study for training deep neural network".
Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell.

"Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem".
Adams Wei Yu, Qihang Lin, Tianbao Yang.

"DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization".
(JMLR) Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen.

"QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension".
(ICLR'18) Adams Wei Yu, David Dohan, Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc Le.

No.1 on SQuAD dataset in both accuracy and speed, as of April 23, 2018.
Featured at Google I/O 2018.

"Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks". (Full Oral)
(AAAI'18) Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Bo Li.

"On Computationally Tractable Selection of Experiments in Regression Models". (Accepted, to appear)
(JMLR) Yining Wang, Adams Wei Yu, Aarti Singh.

"Learning to Skim Text". (Long Paper)
(ACL'17) Adams Wei Yu, Hongrae Lee, Quoc Le.

"An Improved Gap-Dependency Analysis of the Noisy Power Method". (Full oral)
(COLT'16) Maria Florina Balcan**, Simon S. Du**, Yining Wang**, Adams Wei Yu**. (** α-β order)

"Adadelay: Delay Adaptive Distributed Stochastic Optimization".
(AISTATS'16) Suvrit Sra, Adams Wei Yu, Mu Li, Alex Smola.

"Efficient Structured Matrix Rank Minimization".
(NIPS'14) Adams Wei Yu, Wanli Ma, Yaoliang Yu, Jaime G. Carbonell, Suvrit Sra.

"Saddle Points and Accelerated Perceptron Algorithms". (Full oral, INFORMS Data Mining Best Student Paper Finalist)
(ICML'14) Adams Wei Yu, Fatma Kılınç-Karzan, Jaime G. Carbonell.

"Reverse Top-k Search using Random Walk with Restart". (oral)
(VLDB'14) Adams Wei Yu, Nikos Mamoulis, Hao Su.

"Efficient Euclidean Projections onto the Intersection of Norm Balls". (oral)
(ICML'12) Adams Wei Yu*, Hao Su*, Li Fei-Fei. (* Indicates equal contribution)

PhD Thesis:

"Effective and Efficient Learning at Scale".
Adams Wei Yu.
Committee: Jaime Carbonell (advisor), Alex Smola (advisor), Ruslan Salakhutdinov, Quoc Le (Google Brain), Chris Manning (Stanford).
PDF.

Interns and residents (Chronological)
I have the privilege to have worked with the following interns and residents at Google.

Mingyang Deng (MIT PhD)
Sang Michael Xie (Stanford PhD --> Meta GenAI)
Yingwei Li (JHU PhD --> Waymo)
Zirui Wang (CMU PhD --> Google Brain)
Jason Wei (Google AI Resident --> Google Brain)
Xuanyi Dong (UTS PhD --> Google Brain)
Xinyun Chen (Berkeley PhD --> Google Brain)