I’m a research scientist at Google DeepMind, where I work primarily on reinforcement learning, sequential decision-making problems, and RLHF.
July, 2024. An updated version of Gemma, an open-weights LLM with fantastic performance, was released; see here for a blog post describing this work and here for a more in-depth paper. I couldn’t be more pleased to have been involved in this work.
September, 2022. We have just released a second version of our Acme paper, a significant rewrite that includes many more algorithms and an additional focus on batch/offline algorithms. We also give a more in-depth description of the distributed backbone of Acme. And of course we have open-sourced all of this work here.
April, 2021. We have just released Launchpad, a system for defining and launching distributed programs that is particularly tuned towards machine learning applications. Launchpad forms part of the backbone we use for the distributed variants of RL algorithms in Acme.
Below are some recent preprints and publications. And while I try to keep this relatively up-to-date, it is almost inevitable that I fall behind. Check out my Google Scholar entry for more publications.
P. G. Sessa, R. Dadashi, L. Hussenot, J. Ferret, N. Vieillard, A. Ramé, B. Shahriari, S. Perrin, A. Friesen, G. Cideron, S. Girgin, P. Stanczyk, A. Michi, D. Sinopalnikov, S. Ramos, A. Héliou, A. Severyn, M. W. Hoffman, N. Momchev, and O. Bachem. (2024). BOND: Aligning LLMs with Best-of-N Distillation. Google DeepMind. [pdf] [bibtex]
Gemma Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, L. Hussenot, T. Mesnard, B. Shahriari, A. Ramé, J. Ferret, P. Liu, P. Tafti, A. Friesen, M. Casbon, S. Ramos, R. Kumar, C. L. Lan, S. Jerome, A. Tsitsulin, N. Vieillard, P. Stanczyk, S. Girgin, N. Momchev, M. W. Hoffman, S. Thakoor, J.-B. Grill, B. Neyshabur, O. Bachem, et al. (2024). Gemma 2: Improving Open Language Models at a Practical Size. Google DeepMind. [pdf] [bibtex]
M. W. Hoffman, B. Shahriari, J. Aslanides, G. Barth-Maron, N. Momchev, D. Sinopalnikov, P. Stańczyk, S. Ramos, A. Raichuk, D. Vincent, L. Hussenot, R. Dadashi, G. Dulac-Arnold, M. Orsini, A. Jacq, J. Ferret, N. Vieillard, S. K. S. Ghasemipour, S. Girgin, O. Pietquin, F. Behbahani, T. Norman, A. Abdolmaleki, A. Cassirer, F. Yang, K. Baumli, S. Henderson, A. Friesen, R. Haroun, A. Novikov, S. G. Colmenarejo, S. Cabi, C. Gulcehre, T. L. Paine, S. Srinivasan, A. Cowie, Z. Wang, B. Piot, and N. de Freitas. (2022). Acme: A Research Framework for Distributed Reinforcement Learning. Google DeepMind. [pdf] [bibtex]