"There are ambiguities with the alignments, you have to make certain judgment calls, and so an alignment that I do is not going to be the same as an alignment that somebody else does," said lead author Bryan Drew, a postdoctoral researcher in UF's biology department. "It's hard to assess a publication's validity in a lot of cases if you don't have access to the alignments. To me, that's the biggest problem with all of this."
Challenges include complicated mechanisms for uploading data and inconsistencies between journals: some require or strongly recommend that data be stored in an online database, and others do not, Drew said. The most widely used publicly accessible databases include GenBank, TreeBASE and Dryad. Most journals require that DNA sequences be deposited in GenBank, but comparatively few require the sequence alignments to be publicly archived. When study co-authors emailed researchers to obtain missing information, a majority did not respond, and the co-authors were rarely successful in retrieving the data.
"A lot of the authors I contacted said their data was in TreeBASE, but they were unaware of the next step needed after acceptance by the journal the researchers didn't know they had to go back into TreeBASE and actually make the data available to the public," Drew said.
Elizabeth Kellogg, a professor in the department of biology at the University of Missouri-St. Louis who was not involved with the study, said she is not surprised about the large amount of missing information.
"They're absolutely right that when people are publishing papers, you want to document your results as much as you can," Kellogg said. "But many journals aren't requiring that extra step, so some researchers are only submitting the minimum to have their studies published. "There are databases for archiving, but some of their i
Contact: Doug Soltis
University of Florida