Publications

Below is a collection of my publications, including preprints and technical reports. My Google Scholar entry has a list that is more likely to be up to date.

Papers

P. G. Sessa, R. Dadashi, L. Hussenot, J. Ferret, N. Vieillard, A. Ramé, B. Shahriari, S. Perrin, A. Friesen, G. Cideron, S. Girgin, P. Stanczyk, A. Michi, D. Sinopalnikov, S. Ramos, A. Héliou, A. Severyn, M. W. Hoffman, N. Momchev, and O. Bachem. (2024). BOND: Aligning LLMs with Best-of-N Distillation. Google DeepMind. [pdf] [bibtex]

@techreport{sessa2024bond,
  title = {BOND: Aligning {LLMs} with Best-of-{N} Distillation},
  author = {Sessa, Pier Giuseppe and Dadashi, Robert and Hussenot, Léonard and Ferret, Johan and Vieillard, Nino and Ramé, Alexandre and Shahriari, Bobak and Perrin, Sarah and Friesen, Abe and Cideron, Geoffrey and Girgin, Sertan and Stanczyk, Piotr and Michi, Andrea and Sinopalnikov, Danila and Ramos, Sabela and Héliou, Amélie and Severyn, Aliaksei and Hoffman, Matthew W. and Momchev, Nikola and Bachem, Olivier},
  year = {2024},
  month = jul,
  institution = {Google DeepMind},
  howpublished = {arXiv:2407.14622},
  link = {https://arxiv.org/pdf/2407.14622.pdf}
}

Gemma Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, L. Hussenot, T. Mesnard, B. Shahriari, A. Ramé, J. Ferret, P. Liu, P. Tafti, A. Friesen, M. Casbon, S. Ramos, R. Kumar, C. L. Lan, S. Jerome, A. Tsitsulin, N. Vieillard, P. Stanczyk, S. Girgin, N. Momchev, M. W. Hoffman, S. Thakoor, J.-B. Grill, B. Neyshabur, O. Bachem, et al. (2024). Gemma 2: Improving Open Language Models at a Practical Size. Google DeepMind. [pdf] [bibtex]

@techreport{gemma2,
  title = {Gemma 2: Improving Open Language Models at a Practical Size},
  author = {{Gemma Team} and Riviere, Morgane and Pathak, Shreya and Sessa, Pier Giuseppe and Hardin, Cassidy and Bhupatiraju, Surya and Hussenot, Léonard and Mesnard, Thomas and Shahriari, Bobak and Ramé, Alexandre and Ferret, Johan and Liu, Peter and Tafti, Pouya and Friesen, Abe and Casbon, Michelle and Ramos, Sabela and Kumar, Ravin and Lan, Charline Le and Jerome, Sammy and Tsitsulin, Anton and Vieillard, Nino and Stanczyk, Piotr and Girgin, Sertan and Momchev, Nikola and Hoffman, Matthew W. and Thakoor, Shantanu and Grill, Jean-Bastien and Neyshabur, Behnam and Bachem, Olivier and others},
  year = {2024},
  month = jul,
  institution = {Google DeepMind},
  howpublished = {arXiv:2408.00118},
  link = {https://arxiv.org/pdf/2408.00118.pdf}
}

M. W. Hoffman, B. Shahriari, J. Aslanides, G. Barth-Maron, N. Momchev, D. Sinopalnikov, P. Stańczyk, S. Ramos, A. Raichuk, D. Vincent, L. Hussenot, R. Dadashi, G. Dulac-Arnold, M. Orsini, A. Jacq, J. Ferret, N. Vieillard, S. K. S. Ghasemipour, S. Girgin, O. Pietquin, F. Behbahani, T. Norman, A. Abdolmaleki, A. Cassirer, F. Yang, K. Baumli, S. Henderson, A. Friesen, R. Haroun, A. Novikov, S. G. Colmenarejo, S. Cabi, C. Gulcehre, T. L. Paine, S. Srinivasan, A. Cowie, Z. Wang, B. Piot, and N. de Freitas. (2022). Acme: A Research Framework for Distributed Reinforcement Learning. Google DeepMind. [pdf] [bibtex]

@techreport{hoffman:2022,
  title = {Acme: A Research Framework for Distributed Reinforcement Learning},
  author = {Hoffman, Matthew W. and Shahriari, Bobak and Aslanides, John and Barth-Maron, Gabriel and Momchev, Nikola and Sinopalnikov, Danila and Stańczyk, Piotr and Ramos, Sabela and Raichuk, Anton and Vincent, Damien and Hussenot, Léonard and Dadashi, Robert and Dulac-Arnold, Gabriel and Orsini, Manu and Jacq, Alexis and Ferret, Johan and Vieillard, Nino and Ghasemipour, Seyed Kamyar Seyed and Girgin, Sertan and Pietquin, Olivier and Behbahani, Feryal and Norman, Tamara and Abdolmaleki, Abbas and Cassirer, Albin and Yang, Fan and Baumli, Kate and Henderson, Sarah and Friesen, Abe and Haroun, Ruba and Novikov, Alex and Colmenarejo, Sergio Gómez and Cabi, Serkan and Gulcehre, Caglar and Paine, Tom Le and Srinivasan, Srivatsan and Cowie, Andrew and Wang, Ziyu and Piot, Bilal and de Freitas, Nando},
  year = {2022},
  month = sep,
  note = {Revised edition; earlier version Jul. 2020},
  institution = {Google DeepMind},
  howpublished = {arXiv:2006.00979},
  link = {https://arxiv.org/pdf/2006.00979.pdf}
}

F. Yang, G. Barth-Maron, P. Stańczyk, M. W. Hoffman, S. Liu, M. Kroiss, A. Pope, and A. Rrustemi. (2021). Launchpad: A programming model for distributed machine learning research. Google DeepMind. [pdf] [bibtex]

@techreport{yang:2021,
  title = {Launchpad: A programming model for distributed machine learning
      research},
  author = {Yang, Fan and Barth-Maron, Gabriel and Stańczyk, Piotr and Hoffman, Matthew W. and Liu, Siqi and Kroiss, Manuel and Pope, Aedan and Rrustemi, Alban},
  year = {2021},
  month = jun,
  institution = {Google DeepMind},
  howpublished = {arXiv:2106.04516},
  link = {https://arxiv.org/pdf/2106.04516.pdf}
}

C. Gulcehre, S. G. Colmenarejo, Z. Wang, J. Sygnowski, T. Paine, K. Zolna, Y. Chen, M. W. Hoffman, R. Pascanu, and N. de Freitas. (2021). Regularized behavior value estimation. Google DeepMind. [pdf] [bibtex]

@techreport{gulcehre:2021,
  title = {Regularized behavior value estimation},
  author = {Gulcehre, Caglar and Colmenarejo, Sergio G{\'o}mez and Wang, Ziyu and Sygnowski, Jakub and Paine, Thomas and Zolna, Konrad and Chen, Yutian and Hoffman, Matthew W and Pascanu, Razvan and de Freitas, Nando},
  year = {2021},
  institution = {Google DeepMind},
  howpublished = {arXiv:2103.09575},
  link = {https://arxiv.org/pdf/2103.09575.pdf}
}

C. Gulcehre, Z. Wang, A. Novikov, T. Paine, S. Gómez Colmenarejo, K. Zolna, R. Agarwal, J. S. Merel, D. J. Mankowitz, C. Paduraru, G. Dulac-Arnold, J. Li, M. Norouzi, M. W. Hoffman, N. Heess, and N. de Freitas. (2020). RL unplugged: A suite of benchmarks for offline reinforcement learning. In Neural Information Processing Systems. [bibtex]

@inproceedings{gulcehre:2020,
  title = {{RL} unplugged: A suite of benchmarks for offline reinforcement
      learning},
  author = {Gulcehre, Caglar and Wang, Ziyu and Novikov, Alexander and Paine, Thomas and {Gómez Colmenarejo}, Sergio and Zolna, Konrad and Agarwal, Rishabh and Merel, Josh S and Mankowitz, Daniel J and Paduraru, Cosmin and Dulac-Arnold, Gabriel and Li, Jerry and Norouzi, Mohammad and Hoffman, Matthew W and Heess, Nicolas and de Freitas, Nando},
  booktitle = {Neural Information Processing Systems},
  month = dec,
  year = {2020}
}

Y. Chen, A. L. Friesen, F. Behbahani, A. Doucet, D. Budden, M. W. Hoffman, and N. de Freitas. (2020). Modular meta-learning with shrinkage. In Neural Information Processing Systems. [bibtex]

@inproceedings{chen:2020,
  title = {Modular meta-learning with shrinkage},
  author = {Chen, Yutian and Friesen, Abram L and Behbahani, Feryal and Doucet, Arnaud and Budden, David and Hoffman, Matthew W and de Freitas, Nando},
  booktitle = {Neural Information Processing Systems},
  month = dec,
  year = {2020}
}

A. Gu, C. Gulcehre, T. L. Paine, M. W. Hoffman, and R. Pascanu. (2019). Improving the Gating Mechanism of Recurrent Neural Networks. Google DeepMind. [pdf] [bibtex]

@techreport{gu:2019,
  title = {Improving the Gating Mechanism of Recurrent Neural Networks},
  author = {Gu, Albert and Gulcehre, Caglar and Paine, Tom Le and Hoffman, Matthew W. and Pascanu, Razvan},
  year = {2019},
  month = oct,
  institution = {Google DeepMind},
  howpublished = {arXiv:1910.09890},
  link = {https://arxiv.org/pdf/1910.09890.pdf}
}

T. L. Paine, C. Gulcehre, B. Shahriari, M. Denil, M. W. Hoffman, H. Soyer, R. Tanburn, S. Kapturowski, N. Rabinowitz, D. Williams, G. Barth-Maron, Z. Wang, N. de Freitas, and Worlds Team. (2019). Making Efficient Use of Demonstrations to Solve Hard Exploration Problems. Google DeepMind. [pdf] [bibtex]

@techreport{paine:2019,
  title = {Making Efficient Use of Demonstrations to Solve Hard Exploration
      Problems},
  author = {Paine, Tom Le and Gulcehre, Caglar and Shahriari, Bobak and Denil, Misha and Hoffman, Matthew W. and Soyer, Hubert and Tanburn, Richard and Kapturowski, Steven and Rabinowitz, Neil and Williams, Duncan and Barth-Maron, Gabriel and Wang, Ziyu and de Freitas, Nando and {Worlds Team}},
  year = {2019},
  month = oct,
  institution = {Google DeepMind},
  howpublished = {arXiv:1909.01387},
  link = {https://arxiv.org/pdf/1909.01387.pdf}
}

B. Shillingford, Y. Assael, M. W. Hoffman, T. Paine, C. Hughes, U. Prabhu, H. Liao, H. Sak, K. Rao, L. Bennett, M. Mulville, B. Coppin, B. Laurie, A. Senior, and N. de Freitas. (2019). Large-scale visual speech recognition. In INTERSPEECH. [pdf] [bibtex]

@inproceedings{shillingford:2019,
  title = {Large-scale visual speech recognition},
  author = {Shillingford, Brendan and Assael, Yannis and Hoffman, Matthew W and Paine, Thomas and Hughes, C{\'i}an and Prabhu, Utsav and Liao, Hank and Sak, Hasim and Rao, Kanishka and Bennett, Lorrayne and Mulville, Marie and Coppin, Ben and Laurie, Ben and Senior, Andrew and de Freitas, Nando},
  booktitle = {INTERSPEECH},
  month = sep,
  year = {2019},
  link = {https://arxiv.org/pdf/1807.05162.pdf}
}

T. L. Paine, S. G. Colmenarejo, Z. Wang, S. Reed, Y. Aytar, T. Pfaff, M. W. Hoffman, G. Barth-Maron, S. Cabi, D. Budden, and N. de Freitas. (2018). One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL. arXiv:1810.05017. [pdf] [bibtex]

@techreport{paine:2018,
  title = {One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets
      with RL},
  author = {Paine, Tom Le and Colmenarejo, Sergio Gómez and Wang, Ziyu and Reed, Scott and Aytar, Yusuf and Pfaff, Tobias and Hoffman, Matt W. and Barth-Maron, Gabriel and Cabi, Serkan and Budden, David and de Freitas, Nando},
  year = {2018},
  month = oct,
  howpublished = {arXiv:1810.05017},
  link = {https://arxiv.org/pdf/1810.05017.pdf}
}

G. Barth-Maron, M. W. Hoffman, D. Budden, W. Dabney, D. Horgan, D. TB, A. Muldal, N. Heess, and T. Lillicrap. (2018). Distributed Distributional Deterministic Policy Gradients. In International Conference on Learning Representations. [pdf] [bibtex]

@inproceedings{barth-maron:2018,
  title = {Distributed Distributional Deterministic Policy Gradients},
  author = {Barth-Maron, Gabriel and Hoffman, Matthew W and Budden, David and Dabney, Will and Horgan, Dan and TB, Dhruva and Muldal, Alistair and Heess, Nicolas and Lillicrap, Timothy},
  booktitle = {International Conference on Learning Representations},
  month = apr,
  year = {2018},
  link = {https://arxiv.org/pdf/1804.08617.pdf}
}

S. Cabi, S. G. Colmenarejo, M. W. Hoffman, M. Denil, Z. Wang, and N. de Freitas. (2017). The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously. In Conference on Robot Learning. [pdf] [bibtex]

@inproceedings{cabi:2017,
  title = {The Intentional Unintentional Agent: Learning to Solve Many Continuous
      Control Tasks Simultaneously},
  author = {Cabi, Serkan and Colmenarejo, Sergio G{\'o}mez and Hoffman, Matthew W and Denil, Misha and Wang, Ziyu and de Freitas, Nando},
  booktitle = {Conference on Robot Learning},
  month = nov,
  year = {2017},
  link = {https://arxiv.org/pdf/1707.03300.pdf}
}

Y. Chen, M. W. Hoffman, S. G. Colmenarejo, M. Denil, T. P. Lillicrap, M. Botvinick, and N. de Freitas. (2017). Learning to learn without gradient descent by gradient descent. In International Conference on Machine Learning. [pdf] [bibtex]

@inproceedings{chen:2017,
  title = {Learning to learn without gradient descent by gradient descent},
  author = {Chen, Yutian and Hoffman, Matthew W and Colmenarejo, Sergio G{\'o}mez and Denil, Misha and Lillicrap, Timothy P and Botvinick, Matt and de Freitas, Nando},
  booktitle = {International Conference on Machine Learning},
  month = aug,
  year = {2017},
  link = {https://arxiv.org/pdf/1611.03824.pdf}
}

O. Wichrowska, N. Maheswaranathan, M. W. Hoffman, S. G. Colmenarejo, M. Denil, N. de Freitas, and J. Sohl-Dickstein. (2017). Learned optimizers that scale and generalize. In International Conference on Machine Learning. [pdf] [bibtex]

@inproceedings{wichrowska:2017,
  title = {Learned optimizers that scale and generalize},
  author = {Wichrowska, Olga and Maheswaranathan, Niru and Hoffman, Matthew W and Colmenarejo, Sergio G{\'o}mez and Denil, Misha and de Freitas, Nando and Sohl-Dickstein, Jascha},
  booktitle = {International Conference on Machine Learning},
  month = aug,
  year = {2017},
  link = {https://arxiv.org/pdf/1703.04813.pdf}
}

M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, and N. de Freitas. (2016). Learning to learn by gradient descent by gradient descent. In Neural Information Processing Systems. [pdf] [bibtex]

@inproceedings{andrychowicz:2016,
  title = {Learning to learn by gradient descent by gradient descent},
  author = {Andrychowicz, Marcin and Denil, Misha and Gomez, Sergio and Hoffman, Matthew W. and Pfau, David and Schaul, Tom and de Freitas, Nando},
  booktitle = {Neural Information Processing Systems},
  month = dec,
  year = {2016},
  link = {https://arxiv.org/pdf/1606.04474.pdf}
}

J. M. Hernández-Lobato, M. A. Gelbart, R. P. Adams, M. W. Hoffman, and Z. Ghahramani. (2016). A general framework for constrained Bayesian optimization using information-based search. Journal of Machine Learning Research, 17. [pdf] [bibtex]

@article{hernandez-lobato:2016,
  title = {A general framework for constrained Bayesian optimization using
      information-based search},
  author = {Hern{\'a}ndez-Lobato, Jos{\'e} Miguel and Gelbart, Michael A and Adams, Ryan P and Hoffman, Matthew W and Ghahramani, Zoubin},
  journal = {Journal of Machine Learning Research},
  volume = {17},
  month = jun,
  year = {2016},
  link = {https://arxiv.org/pdf/1511.09422.pdf}
}

M. W. Hoffman and Z. Ghahramani. (2015). Output-Space Predictive Entropy Search for Flexible Global Optimization. In NIPS workshop on Bayesian optimization. [pdf] [bibtex]

@inproceedings{hoffman:2015,
  title = {Output-Space Predictive Entropy Search for Flexible Global
      Optimization},
  author = {Hoffman, Matthew W. and Ghahramani, Zoubin},
  booktitle = {NIPS workshop on Bayesian optimization},
  month = dec,
  year = {2015}
}

J. M. Hernández-Lobato, M. A. Gelbart, M. W. Hoffman, R. P. Adams, and Z. Ghahramani. (2015). Predictive Entropy Search for Bayesian Optimization with Unknown Constraints. In International Conference on Machine Learning. [pdf] [bibtex]

@inproceedings{hernandez-lobato:2015,
  title = {Predictive Entropy Search for Bayesian Optimization with Unknown
      Constraints},
  author = {Hern{\'a}ndez-Lobato, Jos{\'e} Miguel and Gelbart, Michael A. and Hoffman, Matthew W. and Adams, Ryan P. and Ghahramani, Zoubin},
  booktitle = {International Conference on Machine Learning},
  month = aug,
  year = {2015},
  link = {https://arxiv.org/pdf/1502.05312.pdf}
}

B. Shahriari, Z. Wang, M. W. Hoffman, A. Bouchard-Côté, and N. de Freitas. (2015). An Entropy Search Portfolio for Bayesian Optimization. arXiv:1406.4625. [pdf] [bibtex]

@techreport{shahriari:2015,
  title = {An Entropy Search Portfolio for Bayesian Optimization},
  author = {Shahriari, Bobak and Wang, Ziyu and Hoffman, Matthew W. and Bouchard-C{\^o}t{\'e}, Alexandre and de Freitas, Nando},
  howpublished = {arXiv:1406.4625},
  year = {2015},
  link = {https://arxiv.org/pdf/1406.4625.pdf}
}

M. W. Hoffman and B. Shahriari. (2014). Modular mechanisms for Bayesian optimization. In NIPS workshop on Bayesian optimization. [pdf] [bibtex]

@inproceedings{hoffman:2014b,
  title = {Modular mechanisms for Bayesian optimization},
  author = {Hoffman, Matthew W. and Shahriari, Bobak},
  booktitle = {NIPS workshop on Bayesian optimization},
  month = dec,
  year = {2014}
}

J. M. Hernández-Lobato, M. W. Hoffman, and Z. Ghahramani. (2014). Predictive Entropy Search for Efficient Global Optimization of Black-box Functions. In Neural Information Processing Systems. [pdf] [bibtex]

@inproceedings{hernandez-lobato:2014,
  title = {Predictive Entropy Search for Efficient Global Optimization of
      Black-box Functions},
  author = {Hern{\'a}ndez-Lobato, Jos{\'e} Miguel and Hoffman, Matthew W. and Ghahramani, Zoubin},
  booktitle = {Neural Information Processing Systems},
  month = dec,
  year = {2014},
  link = {https://arxiv.org/pdf/1406.2541}
}

M. W. Hoffman, B. Shahriari, and N. de Freitas. (2014). On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In International Conference on Artificial Intelligence and Statistics. [pdf] [bibtex]

@inproceedings{hoffman:2014,
  title = {On correlation and budget constraints in model-based bandit
      optimization with application to automatic machine learning},
  author = {Hoffman, Matthew W and Shahriari, Bobak and de Freitas, Nando},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  month = apr,
  year = {2014}
}

M. W. Hoffman and N. de Freitas. (2012). Inference strategies for solving semi-Markov decision processes. In L. E. Sucar, E. F. Morales, and J. Hoey (Eds.), Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions. IGI Global. [pdf] [bibtex]

@incollection{hoffman:2012a,
  title = {Inference strategies for solving semi-{Markov} decision processes},
  author = {Hoffman, Matthew W. and de Freitas, Nando},
  booktitle = {Decision Theory Models for Applications in Artificial
      Intelligence: Concepts and Solutions},
  editor = {Sucar, L. Enrique and Morales, Eduardo F. and Hoey, Jesse},
  publisher = {IGI Global},
  year = {2012}
}

M. W. Hoffman, A. Lazaric, M. Ghavamzadeh, and R. Munos. (2012). Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization. In European Workshop on Reinforcement Learning. [pdf] [bibtex]

@inproceedings{hoffman:2012b,
  title = {Regularized Least Squares Temporal Difference Learning with Nested
      {$\ell_2$} and {$\ell_1$} Penalization},
  author = {Hoffman, Matthew W and Lazaric, Alessandro and Ghavamzadeh, Mohammad and Munos, R{\'e}mi},
  booktitle = {European Workshop on Reinforcement Learning},
  series = {Recent Advances in Machine Learning},
  year = {2012}
}

M. Ghavamzadeh, A. Lazaric, M. W. Hoffman, and R. Munos. (2011). Finite-Sample Analysis of Lasso-TD. In International Conference on Machine Learning. [pdf] [bibtex]

@inproceedings{ghavamzadeh:2011,
  title = {Finite-Sample Analysis of {Lasso-TD}},
  author = {Ghavamzadeh, Mohammad and Lazaric, Alessandro and Hoffman, Matthew W. and Munos, R{\'e}mi},
  booktitle = {International Conference on Machine Learning},
  year = {2011}
}

M. W. Hoffman, E. Brochu, and N. de Freitas. (2011). Portfolio Allocation for Bayesian Optimization. In Uncertainty in Artificial Intelligence. [pdf] [bibtex]

@inproceedings{hoffman:2011,
  title = {Portfolio Allocation for {Bayesian} Optimization},
  author = {Hoffman, Matthew W and Brochu, Eric and de Freitas, Nando},
  booktitle = {Uncertainty in Artificial Intelligence},
  year = {2011}
}

M. W. Hoffman, H. Kueck, N. de Freitas, and A. Doucet. (2009). New inference strategies for solving Markov decision processes using reversible jump MCMC. In Uncertainty in Artificial Intelligence. [pdf] [bibtex]

@inproceedings{hoffman:2009b,
  title = {New inference strategies for solving {Markov} decision processes
      using reversible jump {MCMC}},
  author = {Hoffman, Matthew W and Kueck, Hendrik and de Freitas, Nando and Doucet, Arnaud},
  booktitle = {Uncertainty in Artificial Intelligence},
  year = {2009}
}

M. W. Hoffman, N. de Freitas, A. Doucet, and J. Peters. (2009). An Expectation Maximization algorithm for continuous Markov Decision Processes with arbitrary reward. In International Conference on Artificial Intelligence and Statistics. [pdf] [code] [bibtex]

@inproceedings{hoffman:2009a,
  title = {An {Expectation Maximization} algorithm for continuous {Markov}
      Decision Processes with arbitrary reward},
  author = {Hoffman, Matthew W. and de Freitas, Nando and Doucet, Arnaud and Peters, Jan},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year = {2009},
  code = {https://github.com/mwhoffman/mogmdp}
}

H. Kueck, M. W. Hoffman, A. Doucet, and N. de Freitas. (2009). Inference and Learning for Active Sensing, Experimental Design and Control. In Iberian Conference on Pattern Recognition and Image Analysis. [pdf] [bibtex]

@inproceedings{kueck:2009,
  title = {Inference and Learning for Active Sensing, Experimental Design and
      Control},
  author = {Kueck, Hendrik and Hoffman, Matthew W. and Doucet, Arnaud and de Freitas, Nando},
  booktitle = {Iberian Conference on Pattern Recognition and Image Analysis},
  year = {2009}
}

M. W. Hoffman, A. Doucet, N. de Freitas, and A. Jasra. (2007). Bayesian policy learning with trans-dimensional MCMC. In Neural Information Processing Systems. [pdf] [bibtex]

@inproceedings{hoffman:2007,
  title = {Bayesian policy learning with trans-dimensional {MCMC}},
  author = {Hoffman, Matthew W. and Doucet, Arnaud and de Freitas, Nando and Jasra, Ajay},
  booktitle = {Neural Information Processing Systems},
  year = {2007}
}

M. W. Hoffman, A. Doucet, N. de Freitas, and A. Jasra. (2007). On solving general state-space sequential decision problems using inference algorithms (No. TR-2007-04). University of British Columbia, Computer Science. [pdf] [bibtex]

@techreport{hoffman:2007a,
  title = {On solving general state-space sequential decision problems using
      inference algorithms},
  author = {Hoffman, Matthew W. and Doucet, Arnaud and de Freitas, Nando and Jasra, Ajay},
  institution = {University of British Columbia, Computer Science},
  number = {TR-2007-04},
  year = {2007}
}

M. W. Hoffman, D. B. Grimes, A. P. Shon, and R. P. N. Rao. (2006). A probabilistic model of gaze imitation and shared attention. Neural Networks, 19. [pdf] [bibtex]

@article{hoffman:2006,
  title = {A probabilistic model of gaze imitation and shared attention},
  author = {Hoffman, Matthew W. and Grimes, David B. and Shon, Aaron P. and Rao, Rajesh P.~N.},
  journal = {Neural Networks},
  volume = {19},
  year = {2006}
}

A. P. Shon, D. B. Grimes, C. L. Baker, M. W. Hoffman, S. Zhou, and R. P. N. Rao. (2005). Probabilistic gaze imitation and saliency learning in a robotic head. In International Conference on Robotics and Automation. [pdf] [bibtex]

@inproceedings{shon:2005,
  title = {Probabilistic gaze imitation and saliency learning in a robotic head},
  author = {Shon, Aaron P and Grimes, David B and Baker, Chris L and Hoffman, Matthew W and Zhou, Shengli and Rao, Rajesh P. N.},
  booktitle = {International Conference on Robotics and Automation},
  year = {2005}
}

Thesis

M. W. Hoffman. (2013). Decision making with inference and learning methods (PhD thesis). University of British Columbia. [pdf] [bibtex]

@phdthesis{hoffman:2013:thesis,
  title = {Decision making with inference and learning methods},
  author = {Hoffman, Matthew W},
  school = {University of British Columbia},
  year = {2013}
}