(Michael Chinen)

31 Jan 21
22:19

Books I Read in 2020: Stats

Right at the start of this year, I wanted to write in-depth reviews of the books I enjoyed in 2020. Well, a month has already gone by and I haven’t gotten to it.
So to kickstart it, I’m going to make the task easier and list some of the noteworthy books I read in 2020, with one or two sentences on each. And to narrow it down further, I’m only going to talk about stats-related books, because I read a few of them.

Science Fictions, Stuart Ritchie (2020)

This is a relentless, even thrilling debunking of bad science, starting with the replication crisis in psychology and ending up touching much more than I expected. The stories are captivating, and the explanations focus on the system’s perverse incentives for fraud, hype and negligence more than on blaming individuals (although there is definitely shaming where it is called for). The criticism of Kahneman’s overconfidence in Thinking, Fast and Slow was refreshing to read, because he (and Yudkowsky in his Sequences) says something to the effect of ‘you have no choice but to believe after reading these studies’, which felt like it didn’t match the larger message about questioning your beliefs and updating them on new information. It is a good lesson, and not without a healthy helping of irony, that rationalists need to be told to be less confident because they don’t know everything yet. Another great criticism was of Matthew Walker’s hugely popular and often unchallenged Why We Sleep, which I had taken at face value before reading this book.

Statistical Rethinking, Richard McElreath, 2nd edition (2020)

This is an excellent introduction to Bayesian statistics, and pairs wonderfully with the author’s engaging, enlightening, and entertaining recorded 2019 lectures on YouTube as well as the homework problems on GitHub. Miraculously, McElreath manages to pull off a new video phenomenon that mixes statistics with hints of standup comedy. There are no mesmerizing 3blue1brown-like plots, but McElreath picks interesting problems and datasets to play with, and breaks each model down into the core components that a newcomer would need to understand it. I also appreciate that the course is designed for a wide range of people, so there are very few assumptions about math background, but if you know calculus, information theory, and linear algebra, there are nice little asides that go deeper. It’s also great how up-to-date the book is. I didn’t expect to be so interested in the developments in Hamiltonian Monte Carlo from the past few years, but it seems the field is undergoing rapid development. This book also helped me understand causal inference and confounders, and is a great follow-up to the casual-reader-oriented Pearl book mentioned below.

The Book of Why, Judea Pearl (2018)

This book was the first thing I read on causal inference and causality in general. It’s a light non-fiction book that assumes no math background, and does a good job of explaining how to disentangle correlation from causation. The book has interesting problems and examples, such as how controlling for certain types of variables (colliders, for example) can actually cause you to find spurious correlations, and how to go about showing that smoking really does cause cancer when it is unethical to do a randomized controlled trial and there are tons of confounders. There is a fair amount of interesting history in it, and because of that, there are traces of personal politics that seem slightly out of place, but they don’t detract from the book too much. This book is sort of a teaser for Pearl’s deeper textbook, Causality, which I haven’t read. The Book of Why doesn’t really go into how you would create a model of your own, or how such a Bayesian model would compare to the deep neural network/stochastic gradient descent models that are driving the computing industry today. Perhaps Causality covers some of these things, but I felt The Book of Why would benefit from a few comparisons between the popular frameworks, using a toy problem that deep learning can’t solve. Still, it could be argued that this was not the point of the book. In any case it was an enlightening introduction that gave me other questions to pursue, which is the type of book I am after.
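
A toy simulation can make the point about controlling for the wrong variable concrete. The sketch below is an assumed illustration, not an example taken from the book: x and y are generated independently, z is a hypothetical variable caused by both of them (a collider), and "controlling" for z is approximated by keeping only the samples where z is large. The variable names, the noise level, and the threshold are all arbitrary choices.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0, threshold=1.0):
    """Return (correlation in all samples, correlation where z > threshold)."""
    rng = random.Random(seed)
    # x and y are independent by construction.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    # z is a collider: it is caused by both x and y, plus some noise.
    z = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]
    # Selecting on z is a crude stand-in for conditioning on it.
    kept = [(xi, yi) for xi, yi, zi in zip(x, y, z) if zi > threshold]
    kept_x = [xi for xi, _ in kept]
    kept_y = [yi for _, yi in kept]
    return pearson(x, y), pearson(kept_x, kept_y)


overall, conditioned = simulate()
print(f"all samples:       r = {overall:+.3f}")
print(f"selected on z > 1: r = {conditioned:+.3f}")
```

With these assumed parameters, the correlation over all samples should come out close to zero, while the selected subset shows a clearly negative correlation, even though x and y never influence each other.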
