Selections from minimal protein libraries

Next-generation sequencing data – Using phage display, we performed 3 rounds of selection and amplification on minimal protein libraries consisting of Vh antibody domains in which 4 positions were systematically varied over all 20 amino acids. The data presented in the paper cited below are available here: data.

A Python implementation of our extreme value analysis – High-throughput selections from large libraries of biomolecules (proteins, RNAs) can identify the rare sequences that satisfy a given selective pressure. The “selectivity” (fitness) of each of these sequences can be estimated by next-generation sequencing from the ratio of its frequencies in the population before and after selection. We showed that the distribution of these selectivities can be interpreted and quantitatively characterized by extreme value theory, the mathematical theory of extreme events, which here correspond to the rare finding of a biomolecule meeting the selective constraints. Available here: Jupyter notebook & associated data set.
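As an illustration of the selectivity estimate described above, here is a minimal sketch in Python. It is not taken from the notebook: the function names, the `min_count` threshold and the toy read counts are hypothetical, and the selectivity is assumed to be the frequency after selection divided by the frequency before selection, up to a multiplicative constant.

```python
def frequencies(counts):
    """Convert raw read counts {sequence: count} into frequencies."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def selectivities(counts_before, counts_after, min_count=1):
    """Estimate a selectivity for each sequence seen in both rounds.

    Assumption: selectivity = frequency after / frequency before.
    Sequences with fewer than `min_count` reads in either round are
    skipped, since their ratio would be undefined or very noisy.
    """
    f_before = frequencies(counts_before)
    f_after = frequencies(counts_after)
    result = {}
    for seq, n_before in counts_before.items():
        n_after = counts_after.get(seq, 0)
        if n_before < min_count or n_after < min_count:
            continue
        result[seq] = f_after[seq] / f_before[seq]
    return result


def top_selectivities(sel, k):
    """Return the k largest selectivities, i.e. the upper tail."""
    return sorted(sel.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy read counts for three hypothetical 4-residue variants.
before = {"AAAA": 50, "CCCC": 30, "DDDD": 20}
after = {"AAAA": 10, "CCCC": 60, "DDDD": 30}

sel = selectivities(before, after)
best = top_selectivities(sel, 2)
```

The largest values returned by `top_selectivities` form the upper tail of the distribution, which is the part that an extreme value analysis would characterize.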

Reference: S. Boyer, D. Biswas, A. K. Soshee, N. Scaramozzino, C. Nizak, O. Rivoire (2016). Hierarchy and extremes in selections from pools of randomized proteins.


A Python implementation of the Statistical Coupling Analysis (SCA) – The Statistical Coupling Analysis (SCA) is an approach for characterizing the pattern of evolutionary constraints on and between amino acid positions in a protein family. Given a representative multiple sequence alignment of the family, the analysis provides methods for quantitatively measuring the overall functional constraint at each sequence position (the position-specific, or “first-order” analysis of conservation), and for measuring and analyzing the coupled functional constraint on all pairs of sequence positions (the pairwise-correlated, or “second-order” analysis of conservation). The premise is that extending the traditional definition of conservation to include correlations between positions will contribute to defining the architecture of functional interactions between amino acids and, more importantly, help define the basic physical principles underlying protein structure, function, and evolution. Available here: Download from GitHub.
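To make the distinction between first-order and second-order conservation concrete, here is a simplified sketch in Python. It is not the code distributed on GitHub and does not use its API: the function names and the toy alignment are hypothetical, the background amino acid distribution is assumed to be uniform, and the sequence weighting and conservation weighting that a full analysis would apply are left out.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column(alignment, i):
    """Return the residues found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def conservation(alignment, i):
    """First-order measure: relative entropy (in nats) of column i.

    Assumption: the background distribution is uniform over the 20
    amino acids. Gaps and other symbols are ignored.
    """
    residues = [a for a in column(alignment, i) if a in AMINO_ACIDS]
    background = 1.0 / len(AMINO_ACIDS)
    total = len(residues)
    divergence = 0.0
    for count in Counter(residues).values():
        f = count / total
        divergence += f * math.log(f / background)
    return divergence


def consensus(alignment, i):
    """Most frequent residue at position i (first seen wins ties)."""
    return Counter(column(alignment, i)).most_common(1)[0][0]


def coupling(alignment, i, j):
    """Second-order measure: covariance of consensus-residue indicators.

    Computes f_ij - f_i * f_j, where f_i is the fraction of sequences
    carrying the consensus residue at position i and f_ij the fraction
    carrying the consensus residues at both i and j.
    """
    a_i, a_j = consensus(alignment, i), consensus(alignment, j)
    n = len(alignment)
    f_i = sum(seq[i] == a_i for seq in alignment) / n
    f_j = sum(seq[j] == a_j for seq in alignment) / n
    f_ij = sum(seq[i] == a_i and seq[j] == a_j for seq in alignment) / n
    return f_ij - f_i * f_j


# Toy alignment: position 1 is invariant, positions 0 and 3 co-vary,
# and position 2 varies independently of position 0.
msa = ["ACDA", "ACEA", "GCDG", "GCEG"]

conserved = conservation(msa, 1)
coupled = coupling(msa, 0, 3)
uncoupled = coupling(msa, 0, 2)
```

In this toy alignment the invariant position has the highest conservation, the co-varying pair has a positive coupling, and the independent pair has a coupling of zero.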

Reference: O. Rivoire, K. Reynolds, R. Ranganathan (2016). Evolution-based functional decomposition of proteins.