Life is often presented as a pinnacle of complexity, with the root of the difficulty lying in the multiplicity of its constituents and the intricacy of their interactions. Knowing every constituent and interaction is, however, unlikely to solve all problems. Proteins are a case in point: their physical principles are well established, and their composition and structure are precisely known in many instances, yet we generally do not know how to read the function of a protein from its sequence or how to design a sequence for a given function. But detailed knowledge may not be necessary for either task. In fact, natural proteins have homologs with similar functions despite sometimes very different sequences, indicating that many of their amino acids can be substituted without fundamentally altering their function.

More generally, an exhaustive characterization of living systems may be neither sufficient nor necessary for understanding and engineering them. Instead, a critical challenge for biology is to achieve a proper “coarse-grained”, low-dimensional description of living systems that captures the relative functional significance of their constituents and interactions.

Our team is taking two complementary approaches based on evolutionary principles to meet this challenge:

• A top-down analytic approach to decompose biomolecules into functional units by statistically comparing homologous systems, on the premise that evolutionary conservation provides a generic measure of functional significance.

• A bottom-up synthetic approach to generate quantitative data both from controlled evolutionary experiments and from mathematical models to verify the consistency and sufficiency of the inferred coarse-grained descriptions.
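
To make the premise of the top-down approach concrete, the sketch below scores each position of a set of aligned homologous sequences by how conserved it is. It is only an illustration under stated assumptions: the text does not specify which statistics are used, so a normalized Shannon entropy per alignment column stands in as one common, simple choice. The function name `column_conservation` and the toy alignment are hypothetical and are not real data.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return one conservation score per alignment column, in [0, 1].

    Assumption: conservation is measured as 1 - H / log2(20), where H is the
    Shannon entropy of the residues observed in the column (gaps ignored).
    A score of 1.0 means the column is perfectly conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for i in range(length):
        # Collect the residues at position i, skipping gap characters.
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)  # an all-gap column carries no information
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment of four homologs (illustrative only).
toy_alignment = ["MKVLA", "MKILG", "MRVLS", "MKVLT"]
scores = column_conservation(toy_alignment)

for position, score in enumerate(scores):
    print(f"position {position}: conservation {score:.2f}")
```

In this toy alignment, the first and fourth positions never vary and receive the maximal score, while the last position differs in every sequence and scores lowest. Single-column scores of this kind ignore correlations between positions, so they should be read as a starting point rather than as a decomposition into functional units.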