Statistics
DISCLAIMER: This blog is unfinished and may contain typos or mistakes. If you notice any, please send me your feedback or suggestions.
When I was in college, I hated statistics. I thought it was tedious—a myopic mess of obscure procedures that blighted “Methods” sections across the arXiv, distracting from the core science.
After just a few years of doing X-ray astronomy, I’m completely converted.
First, statistics doesn’t distract from the science; it enables the science. A good statistical treatment can expose a hidden signal, dismiss false positives, and, most importantly, form a foundation reliable enough to support years of future research.
Second, I’ve learned that statistics is a much more motivated field than I had thought. Bayesian statistics in particular is founded on a few simple axioms, and if you follow them you will find yourself re-deriving the common techniques from first principles. In this way, it’s rather like physics itself.
I’m writing this blog to lay out these founding principles of statistics, in the hope that other people will find them as useful as I have. My view is heavily influenced by practices in astronomy; for example, I’ll discuss Bayesian statistics more than frequentist, though I will cover both. I will do my best to provide interactive visualizations to explain key concepts. However, I’ll stop short of providing example code for solving statistical problems, focusing on the core principles instead.
Bayesian Statistics
- Introduction to probability
- Expected values, Variance, and the Central Limit Theorem
- What is Bayesian statistics?
- Parameter fitting and uncertainties
- Example: Fitting a line
- Calculating uncertainty on the prediction
- More general example: Gaussian posteriors
- Most general example: MCMCs
- Model comparison
- A gallery of likelihoods
- Priors: dos and don’ts
Frequentist Statistics
Common problems and their solutions
- I have too many parameters to fit for (Gibbs sampling)
- I don’t know my likelihood distribution, but I have lots of data (Jackknifing and bootstrapping)
- I don’t know my likelihood distribution, but I have a simulator (Extracting covariances from simulations)
- I don’t have a good model, but I have a simulator (Machine learning)
Mathematical review
I am a PhD candidate in physics at Stanford University. I study high-energy astrophysical objects such as pulsars.
This blog was built using wikid, a Markdown-to-HTML converter that I wrote to be easy to use for science blogs.