Design principles of multi-map variation in biological systems

Complexity in biology is often described using a multi-map architecture, in which the genotype, representing the encoded information, is mapped to the functional level, known as the phenotype, which is in turn connected to a latent phenotype we refer to as fitness. This underlying architecture governs the processes that drive evolution. Moreover, natural selection, along with neutral forces, can modify these maps. Variation is observed at each hierarchical level. Here, I argue for the need to establish principles that aid in understanding how variation is transformed within this multi-map architecture. Specifically, I introduce three principles, related to the presence of modulators, constraints, and the channeling of variation. By understanding these design principles in various biological systems, we can gain better insight into the mechanisms underlying these maps and their evolutionary dynamics.

More here. Design principles of multi-map variation in biological systems


Disentangling protein metabolic costs in human cells and tissues

What factors define the necessary proteome in cells and tissues, and how do transformed tumor tissues modulate it? While it is clear that intrinsic functional limitations contribute to this definition, whether energy limitations are also important remains disputed. In this manuscript, we review the energetic model, which emphasizes the metabolic cost of assembling proteins as a contributing determinant of the proteome.

The main results confirm the metabolic efficiency model, where highly expressed proteins in individual cells tend to be short and use biosynthetically cheap amino acids. However, there are cases of highly expressed proteins that are short but use expensive amino acids. We provide a framework decoupling protein length and biosynthetic cost to characterize the energy of tissue proteomes. In the context of tumor tissues, we find that tumors overexpress short and "cheap" proteins, while underexpressing long and "expensive" ones, indicating a link between cancer progression and cost reduction. This framework sheds light on the significance of energy considerations in understanding the behavior of individual cells, tissues, and cancer progression.
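One way to picture the decoupling described above is to treat a protein's total cost as its length times its mean cost per residue, and to track the two factors separately. The Python sketch below does this under loudly stated assumptions: the cost table `PLACEHOLDER_COST`, the cutoffs, and the helper names are all hypothetical, and the numbers are arbitrary units rather than measured biosynthetic costs. It is an illustration only, not the framework used in the manuscript.

```python
# Illustrative sketch only. The per-residue costs are PLACEHOLDER numbers in
# arbitrary units (an assumption), not measured biosynthetic costs.
PLACEHOLDER_COST = {"G": 1.0, "A": 1.0, "S": 1.0, "H": 3.0, "F": 4.0, "W": 6.0}


def protein_cost_profile(sequence, cost_table):
    """Return (length, mean cost per residue, total cost) for one protein."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    per_residue = [cost_table[aa] for aa in sequence]
    length = len(per_residue)
    total = sum(per_residue)
    # Total cost factorizes as length * mean cost, so the two can be
    # examined independently.
    return length, total / length, total


def classify(sequence, cost_table, length_cutoff, cost_cutoff):
    """Label a protein on both axes, e.g. 'short/expensive'.

    The cutoffs are hypothetical thresholds chosen by the caller.
    """
    length, mean_cost, _ = protein_cost_profile(sequence, cost_table)
    size = "short" if length <= length_cutoff else "long"
    price = "cheap" if mean_cost <= cost_cutoff else "expensive"
    return f"{size}/{price}"


# Toy sequences (made up for illustration).
for seq in ("GASGAS", "WFHW", "GASGASGASGAS"):
    print(seq, protein_cost_profile(seq, PLACEHOLDER_COST),
          classify(seq, PLACEHOLDER_COST, length_cutoff=8, cost_cutoff=2.0))
```

Separating the two axes makes cases such as a short protein built from expensive residues visible, whereas a single total-cost number would hide them.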

More here. Disentangling protein metabolic costs in human cells and tissues

The limits of phenotype prediction

Phenotype prediction is perhaps the most unifying research program among those currently pursued in biology. From an applied point of view, quantitative genetics has led this program over the years, most recently through the explosion of genome-wide association studies (GWAS). GWAS combine genetic and phenotypic information from a training population to generate polygenic scores as tools to anticipate complex phenotypes, including diseases. But these tools are black-box models, which are still far from unraveling the biological foundations behind their successes and failures. More fundamentally, we do not know how particular features of the genotype-to-phenotype (GP) map eventually influence the emergence of the phenotype.

In this manuscript, we aim to understand prediction by “opening” the black box. We follow the approach of quantitative genetics to study the emergence of statistical associations and the capacity for phenotypic prediction in metabolism. This work also clarifies the apparent paradox that linear statistical frameworks can be highly effective at predicting phenotypes generated by non-linear GP maps, and it contributes to a broader call to develop interpretable approaches to black-box phenotype prediction models. This call extends to other prediction contexts, such as deep neural networks.
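The apparent paradox mentioned above can be illustrated with a toy simulation. The sketch below is not the model used in the manuscript: it assumes a hypothetical saturating GP map, phenotype = activity / (K + activity), where activity is a sum of per-locus effects, and it measures how much phenotypic variance the additive score explains. All names and parameter values (`simulate`, `K`, the number of loci, the effect range) are illustrative assumptions.

```python
import random

K = 25.0  # hypothetical half-saturation constant of the toy GP map


def simulate(n_individuals=2000, n_loci=50, seed=0):
    """Simulate additive scores and phenotypes under a saturating GP map."""
    rng = random.Random(seed)
    # Hypothetical per-locus effects; not taken from any data set.
    effects = [rng.uniform(0.5, 1.5) for _ in range(n_loci)]
    scores, phenotypes = [], []
    for _ in range(n_individuals):
        genotype = [rng.randint(0, 1) for _ in range(n_loci)]
        activity = sum(e * g for e, g in zip(effects, genotype))
        scores.append(activity)                       # linear in the genotype
        phenotypes.append(activity / (K + activity))  # non-linear phenotype
    return scores, phenotypes


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


scores, phenotypes = simulate()
r_squared = pearson_r(scores, phenotypes) ** 2
print(f"variance explained by the additive score: {r_squared:.3f}")
```

Because the additive score is itself a linear function of the genotype, its squared correlation with the phenotype is a lower bound on what a multi-locus linear regression could explain in this simulated population. In this toy setting the population samples only a limited stretch of the saturating curve, which is one common intuition for why a linear description can work well on a non-linear map.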

More here. The limitations of phenotype prediction in metabolism


The costs of complex pleiotropic mutations

The fitness cost of complex pleiotropic mutations is generally difficult to assess. On the one hand, it is necessary to identify which molecular properties are directly altered by the mutation. On the other hand, this alteration modifies the activity of many genetic targets, with uncertain consequences.

Here, we examine the possibility of addressing these challenges by identifying unique predictors of these costs. To this end, we consider mutations in the RNA polymerase (RNAP) of Escherichia coli as a model of complex mutations. Changes in RNAP modify the global program of transcriptional regulation, with many consequences. One of them is the difficulty of decoupling the direct effect of the mutation from the response of the whole system to it, a problem that we solve quantitatively using data from a set of constitutive genes, which better read out the global program.

We provide a statistical framework that incorporates the direct effects and other molecular variables linked to this program as predictors, and it shows that some genes are more suitable predictors than others. We therefore not only identify which molecular properties best anticipate fitness costs, but also present the paradoxical result that, despite pleiotropy, specific genes serve as better predictors. These results have implications for understanding the architecture of robustness in biological systems.

More here. Predicting the fitness costs of complex mutations


Immunity to SARS-CoV-2: an open question

The duration of immunity to SARS-CoV-2 is uncertain. Delineating immune memory typically requires longitudinal serological studies that track antibody prevalence in the same cohort for an extended time. However, this information is needed on shorter timescales. Notably, the dynamics of an epidemic in which recovered patients remain immune for some period should differ significantly from those of one in which the recovered promptly become susceptible again.

We have exploited this difference to provide a reliable protocol that can estimate immunity early in an epidemic. We verify this protocol with synthetic data, discuss its limitations, and then apply it to evaluate human immunity to SARS-CoV-2 using mortality data series from New York City. Our results indicate that New York’s mortality figures are incompatible with immunity lasting less than 105 or more than 211 days (90% CI), and they set an example of how to assess immune memory in emerging pandemics before serological studies can be deployed.
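The difference in epidemic dynamics that the protocol relies on can be illustrated with a standard SIRS compartmental model, in which recovered individuals lose immunity at a rate equal to one over the mean immunity duration. The sketch below is a generic textbook model integrated with a simple Euler scheme, not the protocol from the paper; the function name `simulate_sirs` and all parameter values (transmission rate, recovery rate, a 30-day immunity period) are hypothetical choices made for illustration.

```python
def simulate_sirs(beta, gamma, immunity_days, days, i0=1e-4, dt=0.1):
    """Euler integration of an SIRS model on population fractions.

    beta: transmission rate per day; gamma: recovery rate per day.
    immunity_days: mean duration of immunity; None means permanent
    immunity, which reduces the model to a classic SIR.
    Returns the infected fraction at the end of each day.
    """
    omega = 0.0 if immunity_days is None else 1.0 / immunity_days
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1.0 / dt)
    infected_by_day = []
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i
            recoveries = gamma * i
            waning = omega * r  # recovered individuals becoming susceptible
            s += dt * (waning - new_infections)
            i += dt * (new_infections - recoveries)
            r += dt * (recoveries - waning)
        infected_by_day.append(i)
    return infected_by_day


# Hypothetical parameters: same epidemic, two immunity scenarios.
permanent = simulate_sirs(beta=0.3, gamma=0.1, immunity_days=None, days=365)
waning_30 = simulate_sirs(beta=0.3, gamma=0.1, immunity_days=30, days=365)
print(f"infected after one year, permanent immunity: {permanent[-1]:.2e}")
print(f"infected after one year, 30-day immunity:    {waning_30[-1]:.2e}")
```

With permanent immunity the infected fraction dies out after a single wave, whereas with waning immunity the infection settles at a persistent level. Differences of this kind between trajectories are what make it possible, in principle, to infer immunity duration from an epidemic time series.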

More here. Evidence for immunity to SARS-CoV-2 from epidemiological data series