Clark University professor and fungal evolutionary biologist David Hibbett is one of the scientists erecting the tree — research limb by research limb.
Scientists have been building evolutionary trees for more than 150 years, ever since Charles Darwin drew the first sketches in his notebook. But despite significant progress in fleshing out the major branches of the tree of life, today there is still no central place where researchers can go to browse and download the entire tree.
“Where can you go to see their collective results in one resource? The surprising thing is you can’t — at least not yet,” said Karen Cranston, of the National Evolutionary Synthesis Center (NESCent).
But now, thanks to a three-year, $5.76 million grant from the National Science Foundation (NSF), a team of scientists and developers from 10 universities aims to make that a reality, through a new initiative dubbed Open Tree of Life.
“It is very exciting to be part of this large collaborative effort, which includes some tremendously talented evolutionary bioinformaticians, as well as scientists with expert knowledge on the diversity of various groups of organisms,” Hibbett said.
Hibbett, who is the Warren Litsky Endowed Chair in Biology, was a collaborator in the NSF-supported Assembling the Fungal Tree of Life (AFTOL) project to develop a higher-level phylogenetic framework for the fungi.
The Open Tree of Life project continues the work that was started in the Assembling the Fungal Tree of Life project, which is winding up in August, Hibbett notes. The new project will also integrate the results of many evolutionary studies of fungi that were not part of AFTOL.
To learn more about Hibbett’s work, visit his lab website.
According to NESCent, figuring out how the millions of species on Earth are related to one another isn’t just important for pinpointing an aardvark’s closest cousins, or determining if hagfish are more closely related to sand dollars or sea squirts. Information about evolutionary relationships has helped scientists identify promising new medicines, develop hardier, higher-yielding crops, and fight infectious diseases such as HIV, anthrax and influenza.
If evolutionary trees are so widely used, why has assembling them across all of life been so hard to achieve? It’s not for lack of research, or data. Thanks in large part to advances in DNA sequencing, thousands of new phylogenetic trees are published in scientific journals each year, most of them focused on isolated branches of the tree of life, for everything from birds to botflies.
“There’s a fire hose of data,” said Cranston, principal investigator of the Open Tree of Life project. “[Over the years] scientists have published tens of thousands of evolutionary trees, but there’s been very little work to connect the dots and put them all together into a single resource.”
Part of the difficulty lies in the sheer enormity of the task. Assembling the branches for all two million named species of animals, plants, fungi and microbes — not to mention the countless more still being named or discovered — will require new tools for analyzing large data sets and stitching together vast numbers of published trees.
Another difficulty lies in how scientists typically disseminate their results. Only a tiny fraction of all published evolutionary trees, which researchers estimate at a mere 4 percent, ends up in a database in digital form, NESCent reports. Instead, most of that knowledge is locked up in figures in journal articles, as PDFs or other file formats that are impossible for other researchers to download, reanalyze, or merge with new information.
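To make the contrast concrete, here is a minimal sketch of what a tree in "digital form" can look like. It assumes the widely used Newick text format, in which nested parentheses group closest relatives, and handles only the simplest case of names and parentheses. The function names `parse_newick` and `leaves` are illustrative and are not part of any Open Tree of Life software. The example tree groups hagfish with sea squirts because both are chordates, while sand dollars are echinoderms.

```python
def parse_newick(text):
    """Parse a simple Newick string (names and parentheses only) into nested lists.

    Illustrative only: branch lengths, internal node labels, and quoted
    names are not handled.
    """
    text = text.strip()
    if text.endswith(";"):
        text = text[:-1]  # drop the terminating semicolon
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            return children
        # Otherwise this is a leaf: read the name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    return parse_node()


def leaves(node):
    """Return the species names at the tips of a parsed tree, left to right."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node:
        result.extend(leaves(child))
    return result


tree = parse_newick("((hagfish,sea_squirt),sand_dollar);")
print(tree)
print(leaves(tree))
```

Once a tree is stored as text like this rather than as a picture in a PDF, another researcher can load it, list its species, and compare or combine it with other trees using a few lines of code.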
What makes this project different from previous efforts, the researchers say, is its scope. “This is the first real attempt to put together the entire tree of life,” Cranston said.
According to Hibbett, the Clark lab team’s responsibility in the Open Tree project will be to develop the fungal branch of the tree of life. There are about 100,000 described species of fungi, but it is estimated that there could be 1.5 million extant species, or maybe more. “Having a tree that contains all the known species will make it easier to place the new species as they are discovered,” Hibbett said.
With a draft tree in hand, scientists will be able to go online and compare their own trees to others that have already been published, or download the draft for further study. They’ll also be able to expand the tree, filling in the missing branches and placing newly named or discovered species among their relatives. Eventually, the team’s goal is to be able to detect when new trees are published and incorporate them automatically, so that the complete tree can be continuously updated.
If the project is to succeed, one of the biggest challenges will be encouraging more scientists to publish their results in digital form. Growing numbers of scientific journals now require authors to deposit phylogenetic data in a digital database, but many published trees never make it there.
“A major focus of this project is to develop new software tools that make it possible to combine data from different sources. The Open Tree project illustrates how bioinformatics has become central to evolutionary biology and taxonomy,” Hibbett said.
At Clark, part of this project will involve the creation of a course on the tree of life, with both classroom and online components. Hibbett’s lab has hired a postdoctoral fellow, Dr. Romina Gazis, who will help develop the course, which will be offered to Clark students for the first time in spring 2013. Already engaged in the research is Clark undergraduate Rachael Martin ’13, who is working with the lab this summer to gather data on the Boletales, a large group of mycorrhizal fungi that includes the edible porcini mushrooms.
“Displaying large trees is a hard problem that has so far resisted solution,” one scientist told The New York Times in the article “Tree of Life Project Aims for Every Twig and Leaf.” “We are still waiting for the equivalent of a Google Maps.”