Computational probabilistic modeling

Bayesian workflow

  • On Bayesian Workflow (70min with discussion)
    Talk given in a seminar. This is the latest version of the talk; it is gradually evolving, with slight differences in emphasis depending on the audience.
    • The pre-print Bayesian workflow.
    • Abstract: I discuss some parts of Bayesian workflow with focus on the need and justification for iterative workflow. The talk is partly based on a review paper by Gelman, Vehtari, Simpson, Margossian, Carpenter, Yao, Kennedy, Gabry, Bürkner, and Modrák with the following abstract: "The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit Bayesian models, but this still leaves us with many options regarding constructing, evaluating, and using these models, along with many remaining challenges in computation. Using Bayesian inference to solve real-world problems requires not only statistical skills, subject matter knowledge, and programming, but also awareness of the decisions made in the process of data analysis. All of these aspects can be understood as part of a tangled workflow of applied Bayesian statistics. Beyond inference, the workflow also includes iterative model building, model checking, validation and troubleshooting of computational problems, model understanding, and model comparison. We review all these aspects of workflow in the context of several examples, keeping in mind that applied research can involve fitting many models for any given problem, even if only a subset of them are relevant once the analysis is over."

  • On Bayesian Workflow (45min)
    Talk given in Finnish Center for Artificial Intelligence (FCAI) Machine learning coffee seminar.

  • On Bayesian Workflow (65min)
    Talk given in the Generable seminar.

Pareto-\(\hat{k}\) diagnostic

Cross-validation

  • Uncertainty in Bayesian leave-one-out cross-validation based model comparison (30min)
    CMStatistics conference talk.
    • The pre-print Uncertainty in Bayesian leave-one-out cross-validation based model comparison.
    • Abstract: Leave-one-out cross-validation (LOO-CV) is a popular method for comparing Bayesian models based on their estimated predictive performance on new, unseen data. Estimating the uncertainty of the resulting LOO-CV estimate is a complex task, and it is known that the commonly used standard error estimate is often too small. We analyse the frequency properties of the LOO-CV estimator and study the uncertainty related to it. We provide new results on the properties of the uncertainty both theoretically and empirically and discuss the challenges of estimating it. We show that problematic cases include: comparing models with similar predictions, misspecified models, and small data. In these cases, there is a weak connection between the skewness of the sampling distribution and the distribution of the error of the LOO-CV estimator. We show that in certain situations the problematic skewness of the error distribution, which occurs when the models make similar predictions, does not fade away even when the data size grows to infinity.
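The commonly used standard error estimate mentioned in the abstract can be sketched in a few lines. This is an illustrative sketch only, not code from the talk or paper: the pointwise elpd values are simulated placeholders standing in for leave-one-out log predictive densities of two fitted models.

```python
import numpy as np

# Simulated stand-ins for pointwise LOO log predictive densities
# (elpd_i) of two models; in practice these come from model fits.
rng = np.random.default_rng(0)
n = 100
elpd_a = rng.normal(-1.0, 0.5, size=n)  # model A, pointwise
elpd_b = rng.normal(-1.1, 0.5, size=n)  # model B, pointwise

diff = elpd_a - elpd_b                   # pointwise elpd differences
elpd_diff = diff.sum()                   # estimated elpd difference
se_diff = np.sqrt(n * diff.var(ddof=1))  # normal-approximation SE

print(elpd_diff, se_diff)
```

The abstract's point is that this normal-approximation standard error is often too small, particularly when the models make similar predictions, the models are misspecified, or the data set is small.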

posteriordb

Stan

Gaussian processes