Markov Chain Monte Carlo
The Bayesian calculations used in occupational hygiene are mathematically mind-boggling. Neither you* nor I would be able to solve by hand the equations that underpin ExpoStats or IH_DataAnalyst. And yet we still get useful and accurate solutions to our stats analysis. How is this possible? By using Markov Chain Monte Carlo (MCMC)^.
We will go through an insightful analogy soon, but before that let’s touch on the fundamentals.
How it works
Firstly, what are we calculating? We want to know what the exposure monitoring data we have collected says about the world, given our pre-existing beliefs. Say you have a materials lab on a metalliferous mine. The mine next door, doing exactly the same process, says “our testing shows that this process results in extreme levels of arsenic!”. But when you do your own testing, your exposure results are moderate. You decide both datasets are valid but incomplete, and you want to combine them to get a more nuanced position. You have your prior (the next-door data) and your likelihood (your results), and you want to calculate a posterior (the more nuanced position). Based on a previous post, it sounds like it should be simple multiplication. Unfortunately the math is… well… it’s density integration in multiple dimensions involving hierarchical parameters with potentially non-conjugate priors. If that scares you, don’t worry, that only means you are still sane. Instead we do it one baby step at a time.
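To see what “simple multiplication” looks like when the problem is small enough, here is a rough sketch in Python. Everything in it is an assumption for illustration: the numbers are made up, the exposures are treated as lognormal with a known spread, and only one unknown (the geometric mean) is estimated by brute force on a grid. It is not a description of how ExpoStats or IH_DataAnalyst work internally.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# All values below are invented for illustration (nominally mg/m3).
log_gsd = math.log(2.5)                      # assumed known spread on the log scale
prior_mean, prior_sd = math.log(0.05), 1.0   # next door's "extreme" belief about the geometric mean
my_results = [0.004, 0.009, 0.006, 0.012, 0.005]  # your own "moderate" results

# Candidate values for log(geometric mean), from 0.001 up to about 1.
grid = [math.log(0.001) + i * 0.01 for i in range(700)]

unnormalised = []
for mu in grid:
    prior = normal_pdf(mu, prior_mean, prior_sd)
    # Likelihood of the log-results given this candidate; the 1/x term of the
    # lognormal density does not depend on mu, so it is left out.
    likelihood = math.prod(normal_pdf(math.log(x), mu, log_gsd) for x in my_results)
    unnormalised.append(prior * likelihood)  # the "simple multiplication"

# Dividing by the total stands in for the integral that is hard in real problems.
total = sum(unnormalised)
posterior = [u / total for u in unnormalised]

posterior_gm = math.exp(sum(mu * p for mu, p in zip(grid, posterior)))
print(f"Posterior geometric mean: {posterior_gm:.4f}")
```

With one unknown, a grid like this is cheap. With several unknowns the number of grid points multiplies with each one, which is why the real calculators take a different route.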
Secondly, MCMC can be broken down into its two parts. Monte Carlo (named after the casino district in Monaco) is a means of approximating an answer using random sampling. With enough samples, the hope is that the underlying pattern, which at first may look like just noise, will reveal itself. A Markov chain is a step-by-step process where your next position depends only on your current location. For example, imagine you are drunk and trying to walk home. Before every step, you stop and decide whether that step is going to be forward or backward. So you stumble back and forth like an idiot. But wherever you are currently standing depends on where you were a step before, as opposed to teleporting around the street. Combine these two parts and we get MCMC.
A Markov chain being used to calculate a frequency distribution
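The drunk walk home can be sketched in a few lines of Python. This is only a toy, under the assumption that each step is one pace forward or backward with equal chance; the function name `drunk_walk` and the step count are made up for the example.

```python
import random
from collections import Counter

def drunk_walk(n_steps, seed=0):
    """Return every position visited on a forward/backward stumble."""
    rng = random.Random(seed)   # fixed seed so the walk is repeatable
    position = 0
    path = [position]
    for _ in range(n_steps):
        # Markov chain: the next position depends only on the current one.
        position += rng.choice([-1, 1])
        path.append(position)
    return path

path = drunk_walk(1000)

# Monte Carlo: tally how often each spot on the street was visited.
visits = Counter(path)
for spot, count in sorted(visits.items()):
    print(spot, count)
```

The tally at the end is the frequency distribution: no single step tells you much, but the pile of them does.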
MCMCs are sometimes referred to as “random walks through the parameter space”. Still too math-y for me. I prefer the idea that you are an explorer incrementally mapping a mountainous forest. You can’t see the landscape all at once but you can piece it together. See? Stats is an exciting adventure!
Becoming president
One more analogy I have taken directly from a textbook to drive it all home**:
Imagine you are on the campaign trail to become president of an archipelago. You want to spend your campaigning time on each island in proportion to its population. But oh no! All the census data was burned, and you don’t know what the population of each island is. Each night, you need to decide: are you going to island hop east, hop west, or stay another day where you are? Thankfully, you can call ahead and ask the mayor of an adjacent island how their population compares with that of the island you are on.
Genius that you are, you come up with a plan. First, you flip a (fair) coin to decide whether to propose the island to the east or the one to the west. If the proposed island has a larger population than the current island, then you definitely go to that island. On the other hand, if the proposed island has a smaller population than the current island, then you go only with a probability equal to the proposed population divided by the current population; otherwise you stay put for another day. For example, if the population of the proposed island is only half as big as that of the current island, the probability of going there is only 50%.
What’s amazing about this heuristic is that it works: In the long run, the probability that you are campaigning on any one of the islands exactly matches the relative population of the island!
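Here is a minimal Python sketch of the island-hopping plan, if you want to watch it work. The details are assumptions for the sake of the example: seven islands in a row with relative populations 1 to 7, and a proposal to sail off either end of the chain is treated as an island with zero population, so you stay put. The name `island_hopping` is made up.

```python
import random
from collections import Counter

def island_hopping(populations, n_days, start=0, seed=1):
    """Return the share of days spent on each island."""
    rng = random.Random(seed)
    n_islands = len(populations)
    current = start
    days_on = Counter()
    for _ in range(n_days):
        days_on[current] += 1
        # Fair coin flip: propose the island to the west (-1) or east (+1).
        proposal = current + rng.choice([-1, 1])
        # Off the end of the chain there is nobody to campaign to: stay.
        if 0 <= proposal < n_islands:
            ratio = populations[proposal] / populations[current]
            # Bigger island: always go. Smaller island: go with probability = ratio.
            if rng.random() < min(1.0, ratio):
                current = proposal
    return [days_on[i] / n_days for i in range(n_islands)]

populations = [1, 2, 3, 4, 5, 6, 7]   # hypothetical relative populations
share = island_hopping(populations, n_days=200_000)
target = [p / sum(populations) for p in populations]

for island, (s, t) in enumerate(zip(share, target)):
    print(f"island {island}: visited {s:.3f}, population share {t:.3f}")
```

Notice that the code only ever compares two neighbouring populations. It never needs the total population of the archipelago, which is exactly the piece that is hard to calculate in a real Bayesian problem.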
Conclusion
Hopefully this gives you a feel for what is happening each time you use a Bayesian calculator. Obviously it is a bit more complex than our analogies, and there are many things to consider when designing these algorithms: autocorrelation, number of steps, starting value, etc. But from the perspective of everyday users, a conceptual understanding is all that is needed.
Reference
J. Kruschke (2015). Doing Bayesian Data Analysis. Elsevier.
*Or maybe you are a mathematician or statistician. That’s cool, let’s talk.
^ Fact: it’s way more fun to say Monte Carlo Markov Chain
** This is taken from the referenced textbook. It has been slightly shortened and reworded. It’s a great (if a bit challenging) introduction to Bayesian stats in R.