Data-based optimal control of gene regulatory networks


Aivar Sootla, Natalja Strelkowa, Damien Ernst, Mauricio Barahona, Guy-Bart Stan

Imperial College London, United Kingdom

We consider the problem of optimal exogenous feedback control of gene regulatory networks. Solutions to this problem are feedback control policies, i.e., functions that compute the next optimal control input from the currently measured output of the system. In our setting, these policies are inferred without a mathematical model of the system, directly from measurements of the system's response to external control inputs. Using the measurements directly is beneficial because the modelling of gene regulatory networks is accompanied by a large degree of stochasticity, variability and uncertainty. Our approach to this control problem consists in adapting and further developing an established reinforcement learning algorithm called fitted-Q iteration. To perform its computations, the fitted-Q iteration algorithm requires data sets consisting of inputs (e.g., a schedule of light pulses of fixed amplitude and wavelength, which affect the expression of certain genes in the gene regulatory network) and the consequent outputs (e.g., fluorescence measurements as a proxy for protein concentrations). These data sets can either be collected from wet-lab experiments or generated artificially by computer simulations of stochastic dynamical models of the system. The developed algorithm is applicable to a wide range of biological systems owing to its inherent ability to deal with highly stochastic system dynamics. To illustrate the performance of the approach, two benchmark control problems are considered: regulation of the toggle switch system, where the objective is to drive the concentrations of two specific proteins to prescribed constant levels, and reference tracking of the generalised repressilator system, where the objective is to force the system to follow a prescribed reference trajectory. In both cases, the objective is formulated as a trade-off between a minimal-time response and a minimal gene expression burden imposed on the host cell.
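
To make the data-driven idea concrete, the following is a minimal sketch of fitted-Q iteration on a toy problem, not the implementation used in this work. Everything specific in it is an assumption made for illustration: the one-dimensional model in `simulate_step`, the binary light input, the constants `TARGET`, `BURDEN` and `GAMMA`, and the cost, which only mimics the trade-off between reaching a prescribed level and limiting input usage. For simplicity, the regression step averages targets within bins of the measured output; this piecewise-constant regressor is swapped in here for whatever regression method an actual study would use.

```python
import random

TARGET = 0.7      # assumed prescribed (normalised) protein level
BURDEN = 0.01     # assumed weight on input usage (proxy for cell burden)
GAMMA = 0.9       # discount factor
N_BINS = 20       # resolution of the piecewise-constant regressor
ACTIONS = (0, 1)  # light pulse off / on


def simulate_step(x, u, rng):
    """Hypothetical stochastic model: input u pulls the level x towards u."""
    x_next = x + 0.2 * (u - x) + rng.gauss(0.0, 0.02)
    return min(1.0, max(0.0, x_next))


def bin_of(x):
    """Map a level in [0, 1] to the index of its bin."""
    return min(int(x * N_BINS), N_BINS - 1)


def collect_data(n, rng):
    """One-step samples (x, u, cost, x_next) under random inputs."""
    data = []
    for _ in range(n):
        x = rng.random()
        u = rng.choice(ACTIONS)
        x_next = simulate_step(x, u, rng)
        cost = (x_next - TARGET) ** 2 + BURDEN * u
        data.append((x, u, cost, x_next))
    return data


def fitted_q_iteration(data, n_iter=60):
    """Repeatedly regress Bellman targets onto (bin, action) cells."""
    q = [[0.0 for _ in ACTIONS] for _ in range(N_BINS)]
    for _ in range(n_iter):
        sums = [[0.0 for _ in ACTIONS] for _ in range(N_BINS)]
        counts = [[0 for _ in ACTIONS] for _ in range(N_BINS)]
        for x, u, cost, x_next in data:
            # Bellman target: immediate cost plus discounted best future cost.
            target = cost + GAMMA * min(q[bin_of(x_next)])
            sums[bin_of(x)][u] += target
            counts[bin_of(x)][u] += 1
        # Regression step: average the targets in each cell; cells without
        # samples keep their previous value.
        q = [[sums[b][a] / counts[b][a] if counts[b][a] else q[b][a]
              for a in ACTIONS] for b in range(N_BINS)]
    return q


rng = random.Random(0)
data = collect_data(2000, rng)
q = fitted_q_iteration(data)
# Greedy feedback policy: for each measured bin, pick the cheapest input.
policy = [min(ACTIONS, key=lambda a: q[b][a]) for b in range(N_BINS)]
print(policy)
```

The algorithm itself only sees the recorded tuples in `data`; the simulator is used solely to generate them, so the same loop could in principle be fed with measurements from an experiment instead.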