Genetic information is stored in DNA as a linear string of nucleotides. This information is read during translation to produce proteins, using a strict system of triplets. Each triplet of nucleotides (a codon) either corresponds to an amino acid, the basic building block of proteins, or encodes a stop instruction. If mutations add or delete nucleotides, and the number added or deleted is not divisible by three, the reading frame shifts. Apart from a few exceptions, such frameshifts have been assumed to produce “gibberish” sequences: potentially dangerous or nonfunctional proteins that differ significantly from the originally encoded protein, or that are even prematurely truncated.
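The triplet reading described above can be illustrated with a minimal Python sketch. It is not taken from the study: the DNA sequence is a made-up example, and the `translate` helper is a hypothetical name introduced here. The codon table is the standard genetic code, and translation simply stops at the first stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # "*" marks a stop codon


def translate(dna):
    """Read `dna` codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop instruction: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene (assumption, for illustration only).
original = "ATGGCCATTGTAATGGGCCGCTGA"
one_deleted = original[:3] + original[4:]    # delete 1 nucleotide: frame shifts
three_deleted = original[:3] + original[6:]  # delete 3 nucleotides: frame kept

print(translate(original))       # MAIVMGR
print(translate(one_deleted))    # MPL
print(translate(three_deleted))  # MIVMGR
```

In this toy case, deleting a single nucleotide changes every downstream codon and introduces an early stop, whereas deleting a whole codon only removes one amino acid and leaves the rest of the sequence intact.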
However, proteins are not merely abstract sequences of letters: “they are physical objects, with physicochemical properties like charge or hydrophobicity. It is ultimately these properties that determine their structure, dynamics, and biological function,” says Bojan Zagrovic, senior author of the study. In their publication in PNAS, Lukas Bartonek, Daniel Braun and Bojan Zagrovic analyze the complete sets of proteins from representative organisms in all three domains of life (archaea, bacteria, eukaryotes). The authors demonstrate that in many cases the key physicochemical properties of the amino-acid sequences remain largely unaffected by frameshifts. These properties, such as hydrophobicity or affinity for certain nucleotides, ultimately determine the key features of a protein, for example how it folds or how it interacts with its binding partners.
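To make the idea of comparing physicochemical properties concrete, here is a rough Python sketch that compares the hydrophobicity profile of a sequence with that of its frameshifted counterpart. It is an illustration under stated assumptions, not the authors' analysis pipeline: the Kyte-Doolittle hydropathy scale, the window size, the toy DNA sequence, the choice to skip stop codons, and all function names are assumptions made here. Reading from an offset of one nucleotide mimics the downstream effect of a single-nucleotide deletion.

```python
from statistics import mean

# Standard genetic code ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Kyte-Doolittle hydropathy scale (assumption: one of several possible scales).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def translate_frame(dna, frame):
    """Translate from offset `frame`, skipping stop codons (a simplification)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*")


def hydropathy_profile(protein, window=3):
    """Sliding-window average of hydropathy values along the protein."""
    values = [KD[aa] for aa in protein]
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def pearson(x, y):
    """Pearson correlation of two profiles, truncated to the shorter length."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy sequence (assumption, for illustration only).
dna = "ATGGCTCTGATCGTTGCCAAAGAAGACCGTTCTGGCTAA"
in_frame = translate_frame(dna, 0)  # original reading frame
shifted = translate_frame(dna, 1)   # frame shifted by one nucleotide

r = pearson(hydropathy_profile(in_frame), hydropathy_profile(shifted))
print(in_frame, shifted, round(r, 2))
```

A single short sequence like this one says nothing about the general trend. A comparison of this kind only becomes meaningful when repeated over many real protein-coding sequences, as the study does for entire proteomes.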
What are the implications of these findings? The researchers hypothesize that they can be understood in an evolutionary context. Evolution is typically a slow process: proteins are changed and tweaked through gradual adjustments, one amino acid at a time. But if, for example, environmental conditions change dramatically, it may be advantageous to rapidly explore very different protein sequences in search of optimal adaptation. This is where frameshifts could come into play. “Frameshift stability is embedded directly in the structure of the genetic code and could enable shortcuts in the evolutionary exploration of protein sequence space. Frameshifts could lead to completely novel protein sequences which retain the functionally relevant physicochemical properties of the sequences they derive from,” explain Lukas Bartonek and Daniel Braun, lead authors of the study.
Original Publication:
Lukas Bartonek, Daniel Braun, and Bojan Zagrovic: "Frameshifting preserves key physicochemical properties of proteins". PNAS (2020)