(Adams) Wei Yu

I am a research scientist at Google DeepMind (previously Google Brain), working on alignment and leading the multimodality efforts in Bard (a.k.a. MultiBard). More broadly, I work on large language models and multimodality through data-centric methodologies. Before that, I received my PhD from the Machine Learning Department (MLD) at CMU (old page).

My work has contributed significantly to Bard, the PaLM API, YouTube, and Waymo. Alongside this product work, I also pursue state-of-the-art research.

Email: adams(my last name)wei AT gmail DOT com

(Last updated: 05/2023)

Publications (Google Scholar)

  • "DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining". (code)
    Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc Le, Tengyu Ma, Adams Wei Yu.

  • "Scaling instruction-finetuned language models".
    Hyung Won Chung*, Le Hou*, Shayne Longpre*, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Wei Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei*. (* Equal contribution).

  • "GLaM: Efficient scaling of language models with mixture-of-experts".
    (ICML'22) Nan Du*, Yanping Huang*, Andrew M Dai*, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten P Bosma, Zongwei Zhou, Tao Wang, Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc Le, Yonghui Wu, Zhifeng Chen, Claire Cui. (* Equal contribution)

  • "Finetuned language models are zero-shot learners". (Oral)
    (ICLR'22) Jason Wei*, Maarten Bosma*, Vincent Y Zhao*, Kelvin Guu*, Adams Wei Yu, Brian Lester, Nan Du, Andrew M Dai, Quoc V. Le. (* Equal contribution)

  • "SimVLM: Simple visual language model pretraining with weak supervision".
    (ICLR'22) Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao.

  • "DeepFusion: Lidar-camera deep fusion for multi-modal 3D object detection".
    (CVPR'22) Yingwei Li*, Adams Wei Yu*, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Yifeng Lu, Denny Zhou, Quoc V Le, Alan Yuille, Mingxing Tan. (* Equal contribution)

  • "Rethinking of graph pretraining on molecular representation".
    (NeurIPS'22) Ruoxi Sun, Hanjun Dai, Adams Wei Yu.

  • "Combined scaling for open-vocabulary image classification".
    Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V Le.

  • "Towards zero-label language learning".
    Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao.

  • "AutoHAS: Efficient hyperparameter and architecture search".
    Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, Quoc V. Le.

  • "Compositional generalization via neural-symbolic stack machines".
    (NeurIPS'20) Xinyun Chen, Chen Liang, Adams Wei Yu, Dawn Song, Denny Zhou.

  • "Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension". (Spotlight)
    (ICLR'20) Xinyun Chen, Chen Liang, Adams Wei Yu, Denny Zhou, Dawn Song, Quoc V. Le.

  • "Block-normalized gradient method: An empirical study for training deep neural network".
    Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell.

  • "Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem".
    Adams Wei Yu, Qihang Lin, Tianbao Yang.

  • "DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization".
    (JMLR) Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen.

  • "QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension".
    (ICLR'18) Adams Wei Yu, David Dohan, Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc Le.

  • "Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks". (Full Oral)
    (AAAI'18) Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Bo Li.

  • "On Computationally Tractable Selection of Experiments in Regression Models". (Accepted, to appear)
    (JMLR) Yining Wang, Adams Wei Yu, Aarti Singh.

  • "Learning to Skim Text". (Long Paper)
    (ACL'17) Adams Wei Yu, Hongrae Lee, Quoc Le.

  • "An Improved Gap-Dependency Analysis of the Noisy Power Method". (Full oral)
    (COLT'16) Maria Florina Balcan**, Simon S. Du**, Yining Wang**, Adams Wei Yu**. (** α-β order)

  • "Adadelay: Delay Adaptive Distributed Stochastic Optimization".
    (AISTATS'16) Suvrit Sra, Adams Wei Yu, Mu Li, Alex Smola.

  • "Efficient Structured Matrix Rank Minimization".
    (NIPS'14) Adams Wei Yu, Wanli Ma, Yaoliang Yu, Jaime G. Carbonell, Suvrit Sra.

  • "Saddle Points and Accelerated Perceptron Algorithms". (Full oral, INFORMS Data Mining Best Student Paper Finalist)
    (ICML'14) Adams Wei Yu, Fatma Kılınç-Karzan, Jaime G. Carbonell.

  • "Reverse Top-k Search using Random Walk with Restart". (Oral)
    (VLDB'14) Adams Wei Yu, Nikos Mamoulis, Hao Su.

  • "Efficient Euclidean Projections onto the Intersection of Norm Balls". (Oral)
    (ICML'12) Adams Wei Yu*, Hao Su*, Li Fei-Fei. (* Equal contribution)

PhD Thesis

  • "Effective and Efficient Learning at Scale".
    Adams Wei Yu.
    Committee: Jaime Carbonell (advisor), Alex Smola (advisor), Ruslan Salakhutdinov, Quoc Le (Google Brain), Chris Manning (Stanford).

Interns and residents (Chronological)

I have had the privilege of working with the following interns and residents at Google.
  • Sang Michael Xie (Stanford PhD --> ?)
  • Yingwei Li (JHU PhD --> Waymo)
  • Zirui Wang (CMU PhD --> Google Brain)
  • Jason Wei (Google AI Resident --> Google Brain)
  • Xuanyi Dong (UTS PhD --> Google Brain)
  • Xinyun Chen (Berkeley PhD --> Google Brain)