Matthew W. Hoffman

I’m a research scientist at Google DeepMind, where I work primarily on reinforcement learning, sequential decision-making problems, and RLHF.

Recent News

July, 2024. An updated version of Gemma, an open-weights LLM with fantastic performance, was released; see here for a blog post describing this work and here for a more in-depth paper. I couldn’t be more pleased to have been involved in this work.
September, 2022. We have just released a second version of our Acme paper, a significant rewrite that includes many more algorithms and an additional focus on batch/offline algorithms. We also give a more in-depth description of the distributed backbone of Acme. And of course we have open-sourced all of this work here.
April, 2021. We have just released Launchpad, a system for defining and launching distributed programs that is particularly tuned towards machine learning applications. It forms part of the backbone we use for the distributed variants of RL algorithms in Acme.

more news

Recent Papers

Below are some recent preprints and publications. And while I try to keep this relatively up-to-date, it is almost inevitable that I fall behind. Check out my Google Scholar entry for more publications.

P. G. Sessa, R. Dadashi, L. Hussenot, J. Ferret, N. Vieillard, A. Ramé, B. Shahriari, S. Perrin, A. Friesen, G. Cideron, S. Girgin, P. Stanczyk, A. Michi, D. Sinopalnikov, S. Ramos, A. Héliou, A. Severyn, M. W. Hoffman, N. Momchev, and O. Bachem. (2024). BOND: Aligning LLMs with Best-of-N Distillation. Google DeepMind. [pdf] [bibtex]

@techreport{sessa2024bond,
  title = {BOND: Aligning {LLMs} with Best-of-{N} Distillation},
  author = {Sessa, Pier Giuseppe and Dadashi, Robert and Hussenot, Léonard and Ferret, Johan and Vieillard, Nino and Ramé, Alexandre and Shahriari, Bobak and Perrin, Sarah and Friesen, Abe and Cideron, Geoffrey and Girgin, Sertan and Stanczyk, Piotr and Michi, Andrea and Sinopalnikov, Danila and Ramos, Sabela and Héliou, Amélie and Severyn, Aliaksei and Hoffman, Matthew W. and Momchev, Nikola and Bachem, Olivier},
  year = {2024},
  month = jul,
  institution = {Google DeepMind},
  howpublished = {arXiv:2407.14622},
  link = {https://arxiv.org/pdf/2407.14622.pdf}
}

Gemma Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, L. Hussenot, T. Mesnard, B. Shahriari, A. Ramé, J. Ferret, P. Liu, P. Tafti, A. Friesen, M. Casbon, S. Ramos, R. Kumar, C. L. Lan, S. Jerome, A. Tsitsulin, N. Vieillard, P. Stanczyk, S. Girgin, N. Momchev, M. W. Hoffman, S. Thakoor, J.-B. Grill, B. Neyshabur, O. Bachem, et al. (2024). Gemma 2: Improving Open Language Models at a Practical Size. Google DeepMind. [pdf] [bibtex]

@techreport{gemma2,
  title = {Gemma 2: Improving Open Language Models at a Practical Size},
  author = {{Gemma Team} and Riviere, Morgane and Pathak, Shreya and Sessa, Pier Giuseppe and Hardin, Cassidy and Bhupatiraju, Surya and Hussenot, Léonard and Mesnard, Thomas and Shahriari, Bobak and Ramé, Alexandre and Ferret, Johan and Liu, Peter and Tafti, Pouya and Friesen, Abe and Casbon, Michelle and Ramos, Sabela and Kumar, Ravin and Lan, Charline Le and Jerome, Sammy and Tsitsulin, Anton and Vieillard, Nino and Stanczyk, Piotr and Girgin, Sertan and Momchev, Nikola and Hoffman, Matthew W. and Thakoor, Shantanu and Grill, Jean-Bastien and Neyshabur, Behnam and Bachem, Olivier and others},
  year = {2024},
  month = jul,
  institution = {Google DeepMind},
  howpublished = {arXiv:2408.00118},
  link = {https://arxiv.org/pdf/2408.00118.pdf}
}

M. W. Hoffman, B. Shahriari, J. Aslanides, G. Barth-Maron, N. Momchev, D. Sinopalnikov, P. Stańczyk, S. Ramos, A. Raichuk, D. Vincent, L. Hussenot, R. Dadashi, G. Dulac-Arnold, M. Orsini, A. Jacq, J. Ferret, N. Vieillard, S. K. S. Ghasemipour, S. Girgin, O. Pietquin, F. Behbahani, T. Norman, A. Abdolmaleki, A. Cassirer, F. Yang, K. Baumli, S. Henderson, A. Friesen, R. Haroun, A. Novikov, S. G. Colmenarejo, S. Cabi, C. Gulcehre, T. L. Paine, S. Srinivasan, A. Cowie, Z. Wang, B. Piot, and N. de Freitas. (2022). Acme: A Research Framework for Distributed Reinforcement Learning. Google DeepMind. [pdf] [bibtex]

@techreport{hoffman:2022,
  title = {Acme: A Research Framework for Distributed Reinforcement Learning},
  author = {Hoffman, Matthew W. and Shahriari, Bobak and Aslanides, John and Barth-Maron, Gabriel and Momchev, Nikola and Sinopalnikov, Danila and Stańczyk, Piotr and Ramos, Sabela and Raichuk, Anton and Vincent, Damien and Hussenot, Léonard and Dadashi, Robert and Dulac-Arnold, Gabriel and Orsini, Manu and Jacq, Alexis and Ferret, Johan and Vieillard, Nino and Ghasemipour, Seyed Kamyar Seyed and Girgin, Sertan and Pietquin, Olivier and Behbahani, Feryal and Norman, Tamara and Abdolmaleki, Abbas and Cassirer, Albin and Yang, Fan and Baumli, Kate and Henderson, Sarah and Friesen, Abe and Haroun, Ruba and Novikov, Alex and Colmenarejo, Sergio Gómez and Cabi, Serkan and Gulcehre, Caglar and Paine, Tom Le and Srinivasan, Srivatsan and Cowie, Andrew and Wang, Ziyu and Piot, Bilal and de Freitas, Nando},
  year = {2022},
  month = sep,
  note = {Revised edition; earlier version Jul. 2020},
  institution = {Google DeepMind},
  howpublished = {arXiv:2006.00979},
  link = {https://arxiv.org/pdf/2006.00979.pdf}
}

more publications