Useful resources on methods

Introductory Texts

These are some of the texts that helped me get started with quant methods. I still return to them regularly when I want a refresher. Essential.

Wooldridge, J. M. (2015). Introductory Econometrics: A modern approach. Nelson Education.

Angrist, J. D., & Pischke, J. S. (2014). Mastering ‘metrics: The path from cause to effect. Princeton University Press.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.

Imai, K. (2017). Quantitative social science: an introduction. Princeton University Press.

Gerber, A. S., & Green, D. P. (2012). Field experiments: Design, analysis, and interpretation. WW Norton.

Causal Inference

Rosenbaum, P. R. (2005). Observational study. In Everitt, B., & Howell, D. C. (Eds.). Encyclopedia of statistics in behavioral science (pp. 1809–1814). Link here.

Imai, K., King, G., & Stuart, E. A. (2008). Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society: Series A, 171(2), 481-502.

Shadish, W. R. (2010). Campbell and Rubin: A primer and comparison of their approaches to causal inference in field settings. Psychological Methods, 15(1), 3.

Imbens, G. W., & Rubin, D. B. (2015). A Classification of Assignment Mechanisms. In their book Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.

Imbens, G. W., & Rubin, D. B. (2015). A Brief History of the Potential Outcomes Approach to Causal Inference. In their book Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.

Rubin, D. B. (2008). For objective causal inference, design trumps analysis. The Annals of Applied Statistics, 808-840.

Steiner, P. M., Kim, Y., Hall, C. E., & Su, D. (2017). Graphical models for quasi-experimental designs. Sociological Methods & Research, 46(2), 155-188.

Surveys & Tutorials: Statistics & Modelling

Textbooks are long and expensive. Sometimes you just want a clear overview.

Steve Strand’s open access course on using regression methods in educational research. Link here.

McNeish, D., & Kelley, K. (2018). Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations. Psychological Methods.

Visualising Hierarchical Models. Link here.

Kreuter, F., & Valliant, R. (2007). A survey on survey statistics: What is done and can be done in Stata. Stata Journal, 7(1), 1.

Waldmann, E. (2018). Quantile regression: A short story on how and why. Statistical Modelling, 18(3-4), 203-218.

Landau, S. (2002). Using survival analysis in psychology. Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences, 1(4), 233-270.

Lei, P. W., & Wu, Q. (2007). Introduction to structural equation modeling: Issues and practical considerations. Educational Measurement: Issues and Practice, 26(3), 33-43.

Sterba, S. K. (2009). Alternative model-based and design-based frameworks for inference from samples to populations: From polarization to integration. Multivariate Behavioral Research, 44(6), 711-740.

Lee, Y. R., & Pustejovsky, J. E. (2023). Comparing random effects models, ordinary least squares, or fixed effects with cluster robust standard errors for cross-classified data. Psychological Methods.

Surveys & Tutorials: Evaluation methods

Textbooks are long and expensive. Sometimes you just want a clear overview.

Deaton, A., & Cartwright, N. (2017). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine.

St Clair, T., & Cook, T. D. (2015). Difference-in-differences methods in public finance. National Tax Journal, 68(2), 319.

Olden, A., & Møen, J. (2022). The triple difference estimator. The Econometrics Journal.

Hallberg, K., Williams, R., Swanlund, A., & Eno, J. (2018). Short Comparative Interrupted Time Series Using Aggregate School-Level Data in Education Research. Educational Researcher, 47(5), 295-306.

Lee, D. S., & Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of Economic Literature, 48(2), 281-355.

Lousdal, M. L. (2018). An introduction to instrumental variable assumptions, validation and estimation. Emerging Themes in Epidemiology, 15(1), 1-7.

Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3), 399-424…

…this survey article is twinned with this case study and tutorial article…

…Austin, P. C. (2011). A tutorial and case study in propensity score analysis: an application to estimating the effect of in-hospital smoking cessation counseling on mortality. Multivariate Behavioral Research, 46(1), 119-151.

Richwine, C., Luo, Q. E., Thorkildsen, Z., Chong, N. J., Morris, R., Barnow, B. S., & Pandey, S. K. (2022). Defining and assessing the value of canonical mixed methods research designs in public policy and public administration. Journal of Policy Analysis and Management.

Randomisation inference

Senn, S. (2012). Tea for three: of infusions and inferences and milk in first. Significance, 9(6), 30-33.

Rosenbaum, P. (2017). Observation and experiment. Harvard University Press. Part 1.

Rosenberger, W. F., Uschner, D., & Wang, Y. (2019). Randomization: The forgotten component of the randomized clinical trial. Statistics in Medicine, 38(1), 1-12.

Keele, L., McConnaughy, C., & White, I. (2012). Strengthening the experimenter’s toolbox: Statistical estimation of internal validity. American Journal of Political Science, 56(2), 484-499.

Young, A. (2019). Channeling Fisher: Randomization tests and the statistical insignificance of seemingly significant experimental results. The Quarterly Journal of Economics, 134(2), 557-598.

Heß, S. (2017). Randomization inference with Stata. The Stata Journal, 17(3), 630-651.

Rosenberger, W. F., & Lachin, J. M. (2015). Randomization in clinical trials: theory and practice. John Wiley & Sons.

Within-Study Comparisons

There are two ways to think about whether a research design will give you a causal effect. One is to use econometric theory to think about the identifying assumptions. This is currently the dominant approach. The other is to conduct empirical studies evaluating whether particular research designs reproduce the results from randomised controlled trials. The latter approach is now beginning to provide substantive guidance on when observational studies do isolate causal effects. Having a good pretest, for example, is worth its weight in gold; using “focal, local” matches helps; stable pretest trends make the analyst’s life much easier; and RDDs are in many ways as good as RCTs.
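To make the logic of a within-study comparison concrete, here is a minimal simulated sketch in Python. Everything in it is an assumption made for illustration and is not taken from any of the papers below: the data-generating process, the function names `mean_difference` and `within_study_comparison`, the sample size and the true effect of 0.5. An RCT estimate serves as the benchmark, and two observational estimates (one naive, one using pretest gain scores) are compared against it. Real within-study comparisons typically draw the experimental and observational arms from different samples; reusing the same simulated units here is a simplification.

```python
import random
import statistics


def mean_difference(values, treated):
    """Difference in mean outcome between treated and untreated units."""
    treated_values = [v for v, t in zip(values, treated) if t]
    control_values = [v for v, t in zip(values, treated) if not t]
    return statistics.fmean(treated_values) - statistics.fmean(control_values)


def within_study_comparison(n=20000, true_effect=0.5, seed=1):
    """Compare observational estimates against an RCT benchmark (toy simulation)."""
    rng = random.Random(seed)
    pretest = [rng.gauss(0, 1) for _ in range(n)]

    # Experimental arm: a coin flip decides who is treated.
    randomised = [rng.random() < 0.5 for _ in range(n)]
    # Observational arm (assumed selection rule): units with higher
    # pretest scores are more likely to opt in to treatment.
    self_selected = [p + rng.gauss(0, 1) > 0 for p in pretest]

    def posttest(treated):
        # Assumed outcome model: posttest = pretest + effect * treated + noise.
        return [p + true_effect * t + rng.gauss(0, 1)
                for p, t in zip(pretest, treated)]

    post_rct = posttest(randomised)
    post_obs = posttest(self_selected)
    # Pretest adjustment via gain scores (posttest minus pretest).
    gain_obs = [y - p for y, p in zip(post_obs, pretest)]

    return {
        "rct_benchmark": mean_difference(post_rct, randomised),
        "obs_naive": mean_difference(post_obs, self_selected),
        "obs_pretest_adjusted": mean_difference(gain_obs, self_selected),
    }


results = within_study_comparison()
for name, estimate in results.items():
    print(f"{name}: {estimate:.2f}")
```

In this toy setup the naive observational estimate overshoots the benchmark because selection depends on the pretest, while the gain-score estimate lands close to it. That clean result relies on the assumed outcome model, in which the pretest enters with a coefficient of one and is the only driver of selection; with real data, how well a pretest adjustment recovers the benchmark is exactly what the studies below investigate.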

Ferraro, P. J., & Miranda, J. J. (2017). Panel data designs and estimators as substitutes for randomized controlled trials in the evaluation of public programs. Journal of the Association of Environmental and Resource Economists, 4(1), 281-317.

Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which experiments and observational studies produce comparable causal estimates: New findings from within‐study comparisons. Journal of Policy Analysis and Management, 27(4), 724-750.

Fortson, K., Gleason, P., Kopa, E., & Verbitsky-Savitz, N. (2015). Horseshoes, hand grenades, and treatment effects? Reassessing whether nonexperimental estimators are biased. Economics of Education Review, 44, 100-113.

Hallberg, K., Wong, V. C., & Cook, T. D. (2016). Evaluating Methods for Selecting School-Level Comparisons in Quasi-Experimental Designs: Results from a Within-Study Comparison.

St. Clair, T., Hallberg, K., & Cook, T. D. (2016). The validity and precision of the comparative interrupted time-series design: three within-study comparisons. Journal of Educational and Behavioral Statistics, 41(3), 269-299.

Hallberg, K., Wing, C., Wong, V., & Cook, T. (2014). Clinical trials and regression discontinuity designs. The Oxford Handbook of Quantitative Methods.

Chaplin, D. D., Cook, T. D., Zurovac, J., Coopersmith, J. S., Finucane, M. M., Vollmer, L. N., & Morris, R. E. (2018). The internal and external validity of the regression discontinuity design: A meta-analysis of 15 within-study comparisons. Journal of Policy Analysis and Management, 37(2), 403-429.

Weidmann, B., & Miratrix, L. (2020). Lurking inferential monsters? Quantifying selection bias in evaluations of school programmes. Journal of Policy Analysis and Management. Link here.

Common Errors and Misinterpretations

Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337-350.

Wang, L. L., Watts, A. S., Anderson, R. A., & Little, T. D. (2013). Common fallacies in quantitative research methodology. In Masyn, K. E., Nathan, P., & Little, T. (Eds.). The Oxford Handbook of Quantitative Methods, Vol. 2: Statistical Analysis, 718.

Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1(3). Link here.

Gelman, A., & Carlin, J. (2014). Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641-651.

Senn, S. (2013). Seven myths of randomisation in clinical trials. Statistics in Medicine, 32(9), 1439-1450.

Berk, R. (2010). What you can and can’t properly do with regression. Journal of Quantitative Criminology, 26(4), 481-487.

Bollen, K. A., & Pearl, J. (2013). Eight myths about causality and structural equation models. In Handbook of causal analysis for social research (pp. 301-328). Springer, Dordrecht.

@MartenvSmeden has an epic Twitter thread on misconceptions here.

Trafimow, D. (2022). A New Way to Think About Internal and External Validity. Perspectives on Psychological Science, 17456916221136117.

Measurement


A distinctive feature of the social sciences is that many variables of interest are not directly observable. Unless you are willing to restrict yourself to only conducting policy evaluations, this necessitates careful engagement with the science of measurement.

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061-1071.

McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412.

Spector, P. E. (2014). Survey design and measure development. The Oxford Handbook of Quantitative Methods, Vol. 1, 170.

Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift’s electric factor analysis machine. Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences, 2(1), 13-43.

Schmitt, T. A. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of Psychoeducational Assessment, 29(4), 304-321.

Sass, D. A., & Schmitt, T. A. (2010). A comparative investigation of rotation criteria within exploratory factor analysis. Multivariate Behavioral Research, 45(1), 73-103.

Favero, N., & Bullock, J. B. (2014). How (not) to solve the problem: An evaluation of scholarly responses to common source bias. Journal of Public Administration Research and Theory, 25(1), 285-308.

Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: the state of the art and future directions for psychological research. Developmental Review, 41, 71-90.

Weidman, A. C., Steckler, C. M., & Tracy, J. L. (2017). The jingle and jangle of emotion assessment: Imprecise measurement, casual scale usage, and conceptual fuzziness in emotion research. Emotion, 17(2), 267-295.

Cheema, J. R. (2014). A review of missing data handling methods in education research. Review of Educational Research, 84(4), 487-508.

White, I. R., Royston, P., & Wood, A. M. (2011). Multiple imputation using chained equations: issues and guidance for practice. Statistics in Medicine, 30(4), 377-399.

The Role of Theory

Theory is essential for any evaluation that is not an RCT because it’s necessary to assess the plausibility of the identifying assumptions. A strong theory of selection can also help model the assignment mechanism properly when using propensity score matching. It’s also important for generalising the findings from RCTs, which generally have weak external validity.

Clarke, B., Gillies, D., Illari, P., Russo, F., & Williamson, J. (2014). Mechanisms and the evidence hierarchy. Topoi, 33(2), 339-360.

Illari, P. M., & Williamson, J. (2012). What is a mechanism? Thinking about mechanisms across the sciences. European Journal for Philosophy of Science, 2(1), 119-135.

Cook, T. D. (2014). Generalizing causal knowledge in the policy sciences: External validity as a task of both multiattribute representation and multiattribute extrapolation. Journal of Policy Analysis and Management, 33(2), 527-536.

Healy, K. (2017). Fuck nuance. Sociological Theory, 35(2), 118-127.

Van Lange, P. A. (2013). What we should expect from theories in social psychology: Truth, abstraction, progress, and applicability as standards (TAPAS). Personality and Social Psychology Review, 17(1), 40-55.

Cartwright, N. D. (2013). Evidence: for policy and wheresoever rigor is a must. Link here.

Michie, S., Richardson, M., Johnston, M., Abraham, C., Francis, J., Hardeman, W., … & Wood, C. E. (2013). The behavior change technique taxonomy (v1) of 93 hierarchically clustered techniques: building an international consensus for the reporting of behavior change interventions. Annals of Behavioral Medicine, 46(1), 81-95.

Scheel, A. M., Tiokhin, L., Isager, P. M., & Lakens, D. (in press). Why Hypothesis Testers Should Spend Less Time Testing Hypotheses. Perspectives on Psychological Science.

Trafimow, D. (2022). A New Way to Think About Internal and External Validity. Perspectives on Psychological Science, 17456916221136117.

Stats Visualisations

Sampling Distributions. Link here.

Visualising Hierarchical Models. Link here.

Random Assignment. Links here and here.

Seeing Statistical Theory. Link here.

Leckie, G., Charlton, C., & Goldstein, H. (2016). Communicating uncertainty in school value-added league tables. Centre for Multilevel Modelling, University of Bristol. URL:

Causal Inference Animated Plots. Link here.

R Psychologist stats visualizations. The Cohen’s D page is brilliant. Link here.

Regression and causality. Link here.

Marc Lajeunesse here

Sam Sims Quantitative Education Research
