
March 12, 2019

Machine-learning model provides detailed insight on proteins

A novel machine-learning 'toolbox' that can read and analyse the sequences of proteins has been described today in the open-access journal eLife.

The study demonstrates that, when trained to read sequence data, artificial neural networks called Restricted Boltzmann Machines (RBM) can provide a wealth of information on protein structure, function and evolutionary features. It is believed to be the first method that can extract this level of detail from sequence data alone.
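As a rough illustration of what "reading sequence data" can mean in practice, the sketch below encodes an aligned sequence as a one-hot array and computes the input each hidden unit of an RBM would receive from it. The alphabet, the toy sequence, the random weights and the function names are all assumptions made for this example; they are not taken from the study.

```python
import numpy as np

# Assumed alphabet: 20 standard amino acids plus "-" for an alignment gap.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq):
    """Encode an aligned sequence as a (length, 21) one-hot array."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    x[np.arange(len(seq)), [AA_INDEX[a] for a in seq]] = 1.0
    return x


def hidden_inputs(x, weights):
    """Input to each hidden unit: the sum, over sites, of the weight
    attached to the residue observed at that site.

    weights has shape (n_hidden, length, 21).
    """
    return np.tensordot(weights, x, axes=([1, 2], [0, 1]))


rng = np.random.default_rng(0)
seq = "MKV-LA"  # hypothetical toy sequence
x = one_hot(seq)
# Random weights stand in for trained ones (illustration only).
weights = rng.normal(scale=0.1, size=(3, len(seq), len(AMINO_ACIDS)))
inputs = hidden_inputs(x, weights)  # one number per hidden unit
```

In a trained model, a large input to a given hidden unit would indicate that the sequence carries the pattern of residues that unit has learned to detect.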

Proteins are formed of sequences of molecules called amino acids, which determine a given protein's structural and functional properties. But understanding which parts of the sequences are responsible for which properties is challenging. "Answering this question could have significant implications for pharmaceutical development," explains co-author Jérôme Tubiana, former Ph.D. student in the Physics Laboratory at l'École Normale Supérieure (ENS), Paris, France. "For example, it could help with the design of new proteins that have desired functions, or with predicting the future sequence evolution of proteins in living organisms, such as pathogens, and identifying appropriate drug targets."

To explore this question, Tubiana and his collaborators applied RBM to 20 protein 'families', groups of proteins that share a common evolutionary origin. The researchers presented detailed results for four protein families: two short protein domains called Kunitz and WW, one long chaperone protein called Hsp70, and synthetic lattice proteins used for benchmarking.

They discovered that, after learning, the connections between the artificial neurons in the RBM are interpretable and relate to the protein's structure, function (such as activity) or phylogeny, that is, the evolutionary relationships between protein sequences. Additionally, the team found that they could use the RBM to design new protein sequences by combining the different artificial neural units and turning them up or down at will.
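The idea of designing sequences by turning hidden units up or down can be sketched with a toy RBM whose visible units are categorical (one residue per site). Given fixed hidden-unit values, the sites are independent, so each residue can be drawn from a per-site softmax. Everything below is an assumption made for illustration: the hand-set weights, the zero fields, the sequence length and the function names do not come from the study, whose model may differ in its details.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"  # assumed alphabet, "-" is a gap


def site_probabilities(h, weights, fields):
    """P(residue at each site | hidden values h) for categorical visible units."""
    # logits[i, a] = fields[i, a] + sum over units mu of h[mu] * weights[mu, i, a]
    logits = fields + np.tensordot(h, weights, axes=(0, 0))
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def sample_sequence(h, weights, fields, rng):
    """Draw one sequence, site by site, from the conditional distribution."""
    p = site_probabilities(h, weights, fields)
    idx = [rng.choice(len(AMINO_ACIDS), p=row) for row in p]
    return "".join(AMINO_ACIDS[i] for i in idx)


rng = np.random.default_rng(0)
n_hidden, length = 2, 8
weights = np.zeros((n_hidden, length, len(AMINO_ACIDS)))
# Hypothetical hand-set weight: hidden unit 0 favours tryptophan (W) at site 3.
weights[0, 3, AMINO_ACIDS.index("W")] = 4.0
fields = np.zeros((length, len(AMINO_ACIDS)))

p_off = site_probabilities(np.array([0.0, 0.0]), weights, fields)  # unit 0 off
p_on = site_probabilities(np.array([2.0, 0.0]), weights, fields)   # unit 0 turned up
designed = sample_sequence(np.array([2.0, 0.0]), weights, fields, rng)
```

With unit 0 switched off, every residue at site 3 is equally likely; with it turned up, W dominates at that site while the other sites are unaffected. A trained model would have many such units, each tied to a learned pattern across several sites.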

"Our RBM model shows how machine-learning techniques can solve complex data recognition and draw conclusions from data in an interpretable way," says co-author Simona Cocco, CNRS Director of Research at the ENS Physics Laboratory. "This runs counter to the more complex, black-box models that are traditionally used in data science, as statistical analyses provided by these tools are largely uninterpretable. The interpretability of our method is a major benefit to scientists—it bears the promise of allowing them to generate proteins with desired functions in a controlled way."

"It will now be interesting to apply our model to proteins in pathogens," adds senior author Rémi Monasson, also CNRS Director of Research at the ENS Physics Laboratory, and Deputy Director of the Henri Poincaré Institute (CNRS/Sorbonne University), France. "Pathogens, particularly viruses, can often escape drugs through mutations that make treatments ineffective. Our method could be used to predict the mutational escape paths that are accessible to the functional protein from its current sequence, and help identify which combination of protein sites should be targeted by drugs to block all paths."

More information: Jérôme Tubiana et al, Learning protein constitutive motifs from sequence data, eLife (2019). DOI: 10.7554/eLife.39397

Journal information: eLife

Provided by eLife

Citation: Machine-learning model provides detailed insight on proteins (2019, March 12) retrieved 25 April 2024 from https://phys.org/news/2019-03-machine-learning-insight-proteins.html

