Nov 30, 2023

Unmasking Dark Matter: Ocean Genome Legacy Helps Reveal a Hidden World of Proteins 

We all know that protein is essential to life and that our muscles, vital organs, and enzymes—the tiny molecular machines that drive life’s processes—are all made of protein. In fact, your body contains about 20,000 different proteins, each with its own unique function. But did you ever wonder how many kinds of protein exist in the world?  

In a recent Nature article, an international group of scientists tackled this question using metagenomic data from many research labs worldwide, including the Ocean Genome Legacy (OGL) and the Bowen Labs at Northeastern University. Metagenomes (DNA fragments isolated from bulk environmental samples) contain diverse DNA from plants, animals, fungi, bacteria, archaea, and viruses. Typically, researchers identify the proteins encoded in this metagenomic DNA by comparing them to known proteins found in reference species. Proteins that match nothing already known, sometimes called metagenomic “dark matter,” are generally excluded from the analyses.

The new study reversed the typical approach by isolating only the “dark matter” from metagenomic datasets. Specifically, the scientists obtained all environmental DNA sequences from a public database funded by the U.S. Department of Energy. Then, they removed all sequences that matched known proteins. Afterward, researchers grouped sequences with no known protein matches by similarity, revealing thousands of novel protein families.  

After restricting the data to protein families with at least 100 representatives, the scientists found over 106,000 unknown families. That is about the same as the number of known families. In other words, they roughly doubled the number of recognized protein families, suggesting that there are at least as many new proteins to discover as have already been described! So, scientists don’t have to worry about running out of proteins to study any time soon.
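For readers who like to see the logic spelled out, the three steps described above (discard anything that matches a known protein, group what is left by similarity, keep only the larger groups) can be sketched in a few lines of Python. This is a toy illustration only, not the authors' pipeline: the function names, the made-up sequences, the 0.8 similarity cutoff, and the use of Python's general-purpose `difflib` text matcher in place of real sequence-alignment software are all assumptions made for the example, and the minimum family size is shrunk from 100 to 2 so the result fits on screen.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Crude stand-in for sequence alignment: compare two strings as text."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def find_novel_families(sequences, known_proteins, threshold=0.8, min_size=2):
    """Toy version of the 'dark matter' workflow (hypothetical helper)."""
    # Step 1: discard every sequence that resembles a known protein.
    unmatched = [
        seq for seq in sequences
        if not any(similar(seq, known, threshold) for known in known_proteins)
    ]

    # Step 2: greedily group the leftovers. Each sequence joins the first
    # family whose founding member it resembles, or starts a new family.
    families = []
    for seq in unmatched:
        for family in families:
            if similar(seq, family[0], threshold):
                family.append(seq)
                break
        else:
            families.append([seq])

    # Step 3: keep only families with enough members.
    return [family for family in families if len(family) >= min_size]


# Made-up example data (not real proteins).
known = ["MKTAYIAKQR"]
environmental = [
    "MKTAYIAKQR",  # identical to a known protein, so it is discarded
    "MKTAYIAKQK",  # close to a known protein, so it is discarded
    "GGSLWPPHHE",  # unmatched: founds a new family
    "GGSLWPPHHD",  # unmatched: joins that family
    "GGSLWPPHQE",  # unmatched: joins that family
    "CCCCVVVVNN",  # unmatched, but a family of one is filtered out
]

for family in find_novel_families(environmental, known):
    print(len(family), "members:", family)
```

In the actual study the same idea was applied to a vastly larger collection of sequences, and a family had to contain at least 100 members to be counted.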

A recent Nature publication predicts that there may be as many unidentified protein families as there are known families! Here are some examples of novel predicted protein structures the researchers found (Pavlopoulos et al., 2023). 

This study shows that metagenomic exploration is crucial to discovering new proteins and understanding their function. OGL hopes to contribute more to this exciting field of proteomics, helping to uncover more new proteins hidden around us! 

Are you interested in helping to advance exciting breakthroughs like this? Support OGL here.  

