De novo protein design – From novel backbone construction to fully designed sequences


Michael Doran

United Kingdom

Proteins are attractive candidates for synthetic components, but the limited structural diversity found in nature, relative to the range of possible structures, may restrict the design of new functions and interactions. From a topological point of view, the space of possible structures is vastly larger than the space nature appears to have used: predictions of the number of natural folds, based on known structures, converge on about 2,000 to 3,000. By contrast, the number of possible folds, even with a restricted number of secondary structure elements, is larger by several orders of magnitude, providing many possibilities for combining functional sites and interaction surfaces. The periodic table of protein structures offers, uniquely among current descriptions of fold space, a means of describing many of these possible structures at the topological level. We propose a pipeline that takes these abstract structural descriptions all the way to full-atom models with amino acid sequences. It starts from rough alpha-carbon-only models and progresses through several stages of high-resolution refinement, sequence design, and filtering according to known principles of protein structure, coarse-grained and high-resolution knowledge-based potentials, and secondary/tertiary structure prediction methods. The resulting sequences are tested using a medium-throughput expression pipeline and analyzed using 2D NMR HSQC spectra, after which soluble, folded proteins can be taken forward for full structural characterization.
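The "several orders of magnitude" claim can be made concrete with a toy count. The sketch below is an illustration only, not the enumeration used by the periodic table of protein structures: it assumes a single beta-sheet in which every strand order and every parallel/antiparallel pattern is allowed, and it counts an arrangement and its 180-degree-rotated mirror as one. The function name `sheet_topologies` is hypothetical.

```python
from math import factorial


def sheet_topologies(n_strands: int) -> int:
    """Count strand arrangements for a single beta-sheet (toy model).

    Assumptions (illustrative, not taken from the abstract):
    - any spatial ordering of the strands is allowed: n! orderings;
    - each strand is parallel or antiparallel relative to the first
      strand: 2**(n - 1) orientation patterns;
    - rotating the sheet by 180 degrees gives the same topology, so
      each arrangement is counted twice and the total is halved.
    """
    if n_strands < 2:
        return 1
    orders = factorial(n_strands)
    orientations = 2 ** (n_strands - 1)
    return orders * orientations // 2


if __name__ == "__main__":
    for n in range(2, 9):
        print(f"{n} strands: {sheet_topologies(n)} arrangements")
```

Under these assumptions, eight strands already give more than two million arrangements, against the 2,000 to 3,000 folds estimated for nature. Many of these arrangements would be sterically or energetically implausible, so the figure is an upper bound on this toy model rather than an estimate of designable folds.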