Matthew W. Hoffman

I’m a research scientist at Google DeepMind, where I work primarily on reinforcement learning, sequential decision-making problems, and RLHF.

Recent News

Recent Papers

Below are some recent preprints and publications. I try to keep this list relatively up to date, but it almost inevitably falls behind. Check out my Google Scholar entry for more publications.

  1. P. G. Sessa, R. Dadashi, L. Hussenot, J. Ferret, N. Vieillard, A. Ramé, B. Shahriari, S. Perrin, A. Friesen, G. Cideron, S. Girgin, P. Stanczyk, A. Michi, D. Sinopalnikov, S. Ramos, A. Héliou, A. Severyn, M. W. Hoffman, N. Momchev, and O. Bachem. (2024). BOND: Aligning LLMs with Best-of-N Distillation. Google DeepMind. [pdf] [bibtex]

  2. Gemma Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, L. Hussenot, T. Mesnard, B. Shahriari, A. Ramé, J. Ferret, P. Liu, P. Tafti, A. Friesen, M. Casbon, S. Ramos, R. Kumar, C. L. Lan, S. Jerome, A. Tsitsulin, N. Vieillard, P. Stanczyk, S. Girgin, N. Momchev, M. W. Hoffman, S. Thakoor, J.-B. Grill, B. Neyshabur, O. Bachem, et al. (2024). Gemma 2: Improving Open Language Models at a Practical Size. Google DeepMind. [pdf] [bibtex]

  3. M. W. Hoffman, B. Shahriari, J. Aslanides, G. Barth-Maron, N. Momchev, D. Sinopalnikov, P. Stańczyk, S. Ramos, A. Raichuk, D. Vincent, L. Hussenot, R. Dadashi, G. Dulac-Arnold, M. Orsini, A. Jacq, J. Ferret, N. Vieillard, S. K. S. Ghasemipour, S. Girgin, O. Pietquin, F. Behbahani, T. Norman, A. Abdolmaleki, A. Cassirer, F. Yang, K. Baumli, S. Henderson, A. Friesen, R. Haroun, A. Novikov, S. G. Colmenarejo, S. Cabi, C. Gulcehre, T. L. Paine, S. Srinivasan, A. Cowie, Z. Wang, B. Piot, and N. de Freitas. (2022). Acme: A Research Framework for Distributed Reinforcement Learning. Google DeepMind. [pdf] [bibtex]