I’m a research scientist at Google DeepMind, where I work primarily on reinforcement learning and sequential decision-making problems.
September, 2022. We have just released a second version of our Acme paper, a significant rewrite that covers many more algorithms and adds a focus on batch/offline algorithms. We also give a more in-depth description of the distributed backbone of Acme. And of course we have open-sourced all of this work here.
April, 2021. We have just released Launchpad, a system for defining and launching distributed programs that is particularly tuned towards machine learning applications. It forms part of the backbone we use for the distributed variants of RL algorithms in Acme.
June, 2020. Along with some great colleagues at DeepMind, we’re releasing Acme, an RL framework that we’ve been working on and using for our own research for quite some time. You can check it out here or take a look at our whitepaper!
Below are some recent preprints and publications. I try to keep this list relatively up to date, but it is almost inevitable that I fall behind. Check out my Google Scholar entry for more publications.
Hoffman, M., Shahriari, B., Aslanides, J., Barth-Maron, G., Behbahani, F., Norman, T., Abdolmaleki, A., Cassirer, A., Yang, F., Baumli, K., Henderson, S., Novikov, A., Colmenarejo, S. G., Cabi, S., Gulcehre, C., Paine, T. L., Cowie, A., Wang, Z., Piot, B., and de Freitas, N. (2020). Acme: A Research Framework for Distributed Reinforcement Learning. arXiv:2006.00979. [pdf] [bibtex]
Gu, A., Gulcehre, C., Paine, T. L., Hoffman, M., and Pascanu, R. (2019). Improving the Gating Mechanism of Recurrent Neural Networks. arXiv:1910.09890. [pdf] [bibtex]
Paine, T. L., Gulcehre, C., Shahriari, B., Denil, M., Hoffman, M., Soyer, H., Tanburn, R., Kapturowski, S., Rabinowitz, N., Williams, D., Barth-Maron, G., Wang, Z., de Freitas, N., and Worlds Team (2019). Making Efficient Use of Demonstrations to Solve Hard Exploration Problems. arXiv:1909.01387. [pdf] [bibtex]