New algorithm helps evaluate, rank scientific literature

Keeping up with current scientific literature is a daunting task, considering that hundreds to thousands of papers are published each day. Now researchers from North Carolina State University have developed a computer program to help them evaluate and rank scientific articles in their field.

The researchers use a text-mining algorithm to prioritize which research papers to read and include in their Comparative Toxicogenomics Database (CTD), a public database of manually curated and coded data from the scientific literature describing how environmental chemicals interact with genes to affect human health.

"Over 33,000 scientific papers have been published on heavy metal toxicity alone, going as far back as 1926," explains Dr. Allan Peter Davis, a biocuration project manager for CTD at NC State who worked on the project and co-lead author of an article on the work. "We simply can't read and code them all. And, with the help of this new algorithm, we don't have to."

To help select the most relevant papers for inclusion in the CTD, Thomas Wiegers, a research bioinformatician at NC State and the other co-lead author of the report, developed a sophisticated algorithm as part of a text-mining process. The application evaluates the text from thousands of papers and assigns a relevancy score to each document. "The score ranks the set of articles to help separate the wheat from the chaff, so to speak," Wiegers says.
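
The article does not describe how the CTD algorithm computes its score, so the following is only a minimal sketch of the general idea of scoring and ranking documents. It assumes a simple weighted keyword count; the term weights, function names and sample texts are all hypothetical and are not taken from the researchers' work.

```python
import re
from collections import Counter

# Hypothetical term weights. The article does not list the features
# the real CTD algorithm uses; these are illustrative only.
TERM_WEIGHTS = {
    "arsenic": 2.0,
    "cadmium": 2.0,
    "gene": 1.5,
    "expression": 1.0,
    "toxicity": 1.0,
}


def relevancy_score(text):
    """Sum the weights of the weighted terms, counting every occurrence."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    return sum(weight * counts[term] for term, weight in TERM_WEIGHTS.items())


def rank_documents(docs):
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, relevancy_score(text)) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up abstracts for illustration.
docs = {
    "paper_a": "Arsenic exposure alters gene expression in human liver cells.",
    "paper_b": "A history of mining towns in the nineteenth century.",
    "paper_c": "Cadmium toxicity: a review.",
}
ranking = rank_documents(docs)
for doc_id, score in ranking:
    print(doc_id, score)
```

In this toy setup, a paper that mentions a chemical, a gene and an effect rises to the top of the list, while an unrelated paper scores zero and sinks to the bottom.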

But how good is the algorithm at determining the best papers? To test that, the researchers text-mined 15,000 articles and sent a representative sample to their team of biocurators to manually read and evaluate on their own, blind to the computer's score. "The results were impressive," Davis says. The biocurators concurred with the algorithm 85 percent of the time with respect to the highest-scored papers.
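
One way such a comparison could be tallied is sketched below, assuming each sampled paper has a computer score and a yes/no relevance judgment made by a biocurator who did not see that score. The function name, the cutoff `k` and the sample data are hypothetical; the article does not say how the researchers calculated their 85 percent figure.

```python
def top_k_agreement(scores, curator_relevant, k):
    """Fraction of the k highest-scored papers that curators judged relevant."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    if not top:
        raise ValueError("k must be positive and scores must not be empty")
    agreed = sum(1 for doc_id in top if curator_relevant[doc_id])
    return agreed / len(top)


# Made-up scores and blind curator judgments for six sampled papers.
scores = {"p1": 9.1, "p2": 8.4, "p3": 7.7, "p4": 6.0, "p5": 1.2, "p6": 0.3}
curator_relevant = {
    "p1": True,
    "p2": True,
    "p3": False,  # high score, but dismissed by the curator: an outlier
    "p4": True,
    "p5": False,
    "p6": False,
}

agreement = top_k_agreement(scores, curator_relevant, k=4)
print(f"Agreement on top-scored papers: {agreement:.0%}")
```

Papers such as `p3` in this example, scored highly by the computer but rejected by a human reader, are the kind of disagreement worth reviewing by hand.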

Using the algorithm to rank papers allowed biocurators to focus on the most relevant papers, increasing productivity by 27 percent and novel data content by 100 percent. "It's a tremendous time-saving step," Davis explains. "With this we can allocate our resources much more effectively by having the team focus on the most informative papers."

There are always outliers in these types of experiments: occasions where the algorithm assigns a very high score to an article that a human biocurator quickly dismisses as irrelevant. The team that looked at those outliers was often able to see a pattern as to why the algorithm mistakenly identified a paper as important. "Now, we can go back and tweak the algorithm to account for this and fine-tune the system," Wiegers says.

"We're not at the point yet where a computer can read and extract all the relevant data on its own," Davis concludes, "but having this text-mining process to direct us toward the most informative articles is a huge first step."


Contact: Matt Shipman
North Carolina State University
