The protein folding problem is a fascinating one. Proteins are linear strings of amino acids, but they are not floppy strings. They take unique 3-D shapes, and those shapes are critical for how they function. Clearly, there must be some rule or mechanism contained within the amino acid sequence that tells a protein how to bend, but discovering those rules has not been easy. Now an AI algorithm, given a sequence of amino acids, has been able to predict with considerable success the shape of that protein.
In 1972, Christian Anfinsen was awarded a Nobel Prize for his work showing that it should be possible to determine the shape of proteins based on the sequence of their amino acid building blocks.
Every two years, scores of teams from more than 20 countries use computers to blindly predict the shapes of a set of around 100 proteins from their amino acid sequences alone.
At the same time, the 3-D structures are worked out in the lab by biologists using traditional techniques like X-ray crystallography and NMR spectroscopy, which determine the location of each atom relative to the others in the protein molecule.
A team of scientists from Casp (the Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction) then compares these predictions with 3-D structures solved using experimental methods.
In the latest round of the challenge, Casp-14, AlphaFold determined the shape of around two thirds of the proteins with accuracy comparable to laboratory experiments.
The assessors said accuracy with most of the other proteins was also high, though not quite at that level.
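To give a feel for what comparing a predicted structure with an experimental one can look like, here is a minimal sketch in Python. It is not the scoring method the Casp assessors use. It computes a simple distance-based RMSD: the root-mean-square difference between all pairwise atom distances in the two structures, which has the convenient property of not depending on how either structure is positioned or rotated in space. The function names and the toy coordinates are invented for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return the distance between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(predicted, experimental):
    """Root-mean-square difference between the two sets of pairwise distances."""
    if len(predicted) != len(experimental):
        raise ValueError("structures must have the same number of atoms")
    if len(predicted) < 2:
        raise ValueError("need at least two atoms")
    d_pred = pairwise_distances(predicted)
    d_exp = pairwise_distances(experimental)
    squared = [(p - e) ** 2 for p, e in zip(d_pred, d_exp)]
    return math.sqrt(sum(squared) / len(squared))


# Toy coordinates (in angstroms) for four atoms; invented for illustration.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

# The same shape moved elsewhere in space: a perfect prediction.
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in experimental]

# A prediction with the last atom misplaced by 3 angstroms.
off = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.0)]

print(distance_rmsd(shifted, experimental))  # essentially zero
print(distance_rmsd(off, experimental))      # clearly greater than zero
```

A score near zero means the predicted shape matches the experimental one; the larger the number, the more the two shapes disagree.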
AlphaFold is based on a concept called deep learning. In this process, the structure of a folded protein is represented as a spatial graph.
The program then “learns” using information on the 3-D shapes of known proteins held in a worldwide database.
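The idea of a protein as a spatial graph can be made concrete with a small sketch. This is only my illustration of the general concept, not a description of how AlphaFold represents proteins internally. Each amino acid becomes a node, and two nodes are linked if they sit close together in 3-D space. The 8 angstrom cutoff, the function name, and the coordinates are all assumptions made for the example.

```python
import math


def contact_graph(coords, cutoff=8.0):
    """Build an adjacency list linking residues whose points lie within `cutoff`."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                graph[i].append(j)
                graph[j].append(i)
    return graph


# Invented coordinates (in angstroms), one point per amino acid:
# the chain runs out in a straight line and then doubles back on itself.
residues = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
    (11.4, 5.0, 0.0), (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0),
]

graph = contact_graph(residues)
print(graph[0])  # neighbours of the first residue
```

In this toy chain the first and last residues are far apart in the sequence but end up as neighbours in the graph, because the fold brings them together. That is the kind of information the sequence alone does not make obvious.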
The article claims that the protein folding mystery has been solved. While this is a big step, I am not sure I would go that far. There is a difference between being able to predict something based on patterns extracted from a database and being able to predict it based on an underlying mechanism that one has unearthed. The former is what Aristotle might have called ‘know how’ while the latter is ‘know why’. It is the difference between predictions of planetary motion before Newtonian mechanics, which were based on patterns and rules like Kepler’s laws, and those after Newtonian mechanics, which were based on the laws of motion and gravity.
As far as I am aware, we still do not know the underlying mechanism that determines how a protein folds. To me, finding that mechanism is what would constitute having really solved the problem.
Any biologists want to chime in?