Accelerated regression-based summary statistics for discrete stochastic systems via approximate simulators.

Jiang RM, Wrede F, Singh P, Hellander A, Petzold LR

BMC Bioinformatics 22 (1) 339 [2021-06-23; online 2021-06-23]

Approximate Bayesian Computation (ABC) has become a key tool for calibrating the parameters of discrete stochastic biochemical models. For higher dimensional models and data, its performance is strongly dependent on having a representative set of summary statistics. While regression-based methods have been demonstrated to allow for the automatic construction of effective summary statistics, their reliance on first simulating a large training set creates a significant overhead when applying these methods to discrete stochastic models for which simulation is relatively expensive. In this work, we present a method to reduce this computational burden by leveraging approximate simulators of these systems, such as ordinary differential equations and τ-Leaping approximations. We have developed an algorithm to accelerate the construction of regression-based summary statistics for Approximate Bayesian Computation by selectively using the faster approximate algorithms for simulations. By posing the problem as one of ratio estimation, we use state-of-the-art methods in machine learning to show that, in many cases, our algorithm can significantly reduce the number of simulations from the full resolution model at a minimal cost to accuracy and little additional tuning from the user. We demonstrate the usefulness and robustness of our method with four different experiments. We provide a novel algorithm for accelerating the construction of summary statistics for stochastic biochemical systems. Compared to the standard practice of exclusively training from exact simulator samples, our method is able to dramatically reduce the number of required calls to the stochastic simulator at a minimal loss in accuracy. This can immediately be implemented to increase the overall speed of the ABC workflow for estimating parameters in complex systems.
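The contrast between an exact stochastic simulator and a cheap approximate one, which the abstract relies on, can be illustrated with a minimal sketch. This is not the authors' algorithm: the birth-death model, the rate values, and the function names `ssa_birth_death` and `ode_birth_death` are all assumptions chosen for illustration. The exact simulator steps through every reaction event (Gillespie's direct method), while the ODE approximation is evaluated in closed form at negligible cost.

```python
import math
import random


def ssa_birth_death(k_birth, k_death, x0, t_end, rng):
    """Exact Gillespie simulation of 0 -> X (rate k_birth) and X -> 0 (rate k_death * x).

    Returns the copy number of X at time t_end. Hypothetical toy model.
    """
    t, x = 0.0, x0
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            return x  # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            return x
        # Choose which reaction fires, proportional to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def ode_birth_death(k_birth, k_death, x0, t_end):
    """Closed-form solution of the ODE approximation dx/dt = k_birth - k_death * x."""
    steady = k_birth / k_death
    return steady + (x0 - steady) * math.exp(-k_death * t_end)


rng = random.Random(0)
# Assumed illustrative parameters: k_birth = 10, k_death = 0.1, x0 = 0, t_end = 50.
samples = [ssa_birth_death(10.0, 0.1, 0, 50.0, rng) for _ in range(200)]
ssa_mean = sum(samples) / len(samples)
ode_value = ode_birth_death(10.0, 0.1, 0, 50.0)
print(f"SSA mean over 200 runs: {ssa_mean:.2f}, ODE value: {ode_value:.2f}")
```

For this linear toy model the ODE matches the mean of the exact process, so the approximate simulator is a good stand-in; for nonlinear or low-copy-number systems the two can diverge, which is the situation in which a selective choice between simulators becomes relevant.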

Prashant Singh

PubMed 34162329

DOI 10.1186/s12859-021-04255-9

Crossref 10.1186/s12859-021-04255-9

pii: 10.1186/s12859-021-04255-9
pmc: PMC8220802
