Tom Bush

Aspiring Machine Learning Researcher & Current Machine Learning Student

Basic MCMC - The Metropolis-Hastings Algorithm

16 minute read

In machine learning, we often face computations over random variables that are analytically intractable, and we are hence forced to use Monte Carlo (MC) approxim...

BadTorch Part 3: Language Model Decoding and Dropout

15 minute read

In my spare moments for the past few weeks I have been finishing off my “BadTorch” project of re-creating PyTorch’s module-level API for recurrent language m...

Bandit Algorithms (& The Exploration-Exploitation Tradeoff)

18 minute read

RL problems are unique in that RL agents face much greater uncertainty (e.g. about rewards, the environment) than is faced by models in supervised learning. ...

Variational Dropout in Recurrent Models

10 minute read

A workhorse of deep learning is dropout, which is typically thought to help limit the extent to which models overfit to training data. However, the question o...

Exploring Tradeoffs Between Safety Metrics with MNIST

19 minute read

Despite huge advances in their capabilities when measured along standard performance dimensions (e.g. recognising images, producing language, forecasting wea...