Scientists at the European Bioinformatics Institute (EMBL-EBI) and the University of Oxford have analyzed 10,000 protein structures to identify the 2,000 most common structural motifs.
The resulting database, known as Folddisco, lets researchers search for and identify patterns across proteins at unprecedented scale. It is a significant step toward understanding how proteins function and interact with each other.
To put that in perspective, the human genome contains over 20,000 protein-coding genes, and each protein has its own unique structure and function. Proteins are the building blocks of life, and understanding their structures is key to understanding how they work.
The search algorithm, developed by researchers at EMBL-EBI and the University of Oxford, was trained on over 10,000 protein structures. Analyzing these structures let the team identify the 2,000 most common structural motifs.
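The ranking step described here, counting how often each motif appears across a set of structures and keeping the most frequent ones, can be illustrated with a minimal Python sketch. The motif labels, structure names, and the `most_common_motifs` helper are all hypothetical and chosen for illustration; the article does not describe how Folddisco itself performs this step.

```python
from collections import Counter

# Hypothetical input: the motif labels found in each structure.
motifs_per_structure = {
    "structure_a": ["helix-turn-helix", "zinc-finger"],
    "structure_b": ["helix-turn-helix", "beta-hairpin"],
    "structure_c": ["helix-turn-helix", "zinc-finger"],
}

def most_common_motifs(motifs_by_structure, top_n):
    """Return the top_n motifs ranked by the number of structures containing them."""
    counts = Counter()
    for motifs in motifs_by_structure.values():
        # set() ensures a motif is counted once per structure,
        # even if it occurs several times within it.
        counts.update(set(motifs))
    return counts.most_common(top_n)

top = most_common_motifs(motifs_per_structure, top_n=2)
print(top)
```

With 10,000 structures and `top_n=2000`, the same counting logic would yield a ranked list of the size mentioned above.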
### What makes Folddisco special?
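Folddisco uses a novel machine learning approach called motif search. It lets researchers search protein structures for patterns that are not necessarily linear: the residues that form a motif can sit far apart in the amino-acid sequence yet lie close together once the protein folds. The catalytic triad of serine proteases is a classic example.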
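To make the idea of a non-linear motif concrete, here is a small brute-force sketch in Python. It looks for a set of residue types whose members are all within a distance cutoff of each other, regardless of their positions in the sequence. This is not Folddisco's actual method, which the article does not describe; the `find_motif` function, the 8 Å cutoff, and the toy coordinates are assumptions made for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical toy structure: (residue index, residue type, (x, y, z) in angstroms).
# The coordinates are invented for this example.
structure = [
    (57, "HIS", (0.0, 0.0, 0.0)),
    (102, "ASP", (3.0, 0.0, 0.0)),
    (150, "GLY", (20.0, 20.0, 20.0)),
    (195, "SER", (0.0, 3.0, 0.0)),
]

def find_motif(residues, motif_types, max_distance=8.0):
    """Return index tuples of residues whose types match motif_types and
    whose members all lie within max_distance of one another."""
    wanted = sorted(motif_types)
    hits = []
    # Try every group of the right size, ignoring sequence order.
    for group in combinations(residues, len(wanted)):
        if sorted(r[1] for r in group) != wanted:
            continue
        # Every pair in the group must be spatially close.
        if all(dist(a[2], b[2]) <= max_distance for a, b in combinations(group, 2)):
            hits.append(tuple(r[0] for r in group))
    return hits

matches = find_motif(structure, ["SER", "HIS", "ASP"])
print(matches)
```

The matched residues are separated by dozens of positions in the sequence but only a few angstroms in space. An exhaustive scan like this grows quickly with protein size, which is why searching at database scale requires a more efficient approach.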
### What this means
For researchers, Folddisco provides a powerful tool for understanding protein function and evolution. It can help identify how proteins interact with each other, which can lead to new insights into disease mechanisms and potential treatments.
For the wider scientific community, the database could have major implications for fields such as medicine, biotechnology, and synthetic biology.
### Next steps
Folddisco is an open-access database. Researchers are encouraged to contribute their own protein structures and to use the database in their own work.



