Building a protein name dictionary from full text: a machine learning term extraction approach

The majority of information in the biological literature resides in full text articles rather than in abstracts. Yet abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants used to refer to biological entities in the literature.

Results
We present an approach to recognize named entities in full text. The approach collects high-frequency terms in an article and uses support vector machines (SVM) to identify biological entity names. It is computationally efficient and robust to the noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text.
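As a rough illustration of the first step described above, collecting high-frequency terms from one article, here is a minimal Python sketch. The function name `candidate_terms`, the tokenizer, the n-gram length limit, and the frequency threshold are all assumptions made for this example and are not taken from the paper. The SVM classification step is not shown.

```python
import re
from collections import Counter

def candidate_terms(text, max_len=3, min_count=2):
    """Return word n-grams (1..max_len words) seen at least min_count times.

    Hypothetical helper: the tokenizer and thresholds are illustrative
    assumptions, not the settings used by the authors.
    """
    # Simple tokenizer: alphanumeric tokens, allowing internal hyphens.
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    counts = Counter()
    # Count every n-gram; this sketch ignores sentence boundaries.
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    # Keep only terms that recur within the article.
    return {term: c for term, c in counts.items() if c >= min_count}

sample = ("Cyclin D1 binds CDK4. Cyclin D1 levels rise in G1, "
          "and CDK4 activity follows Cyclin D1.")
terms = candidate_terms(sample)
```

In a pipeline like the one summarized here, the surviving terms would then be passed to a trained classifier that decides which of them are protein names.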
Conclusions
This dictionary term lookup method compares favorably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.
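A dictionary term lookup of this general kind can be sketched as a greedy longest-match scan over the tokens of a text. The summary does not describe the authors' lookup procedure, so the function `find_names`, the case-insensitive comparison, and the example dictionary below are assumptions made purely for illustration.

```python
import re

def find_names(text, dictionary, max_len=4):
    """Return dictionary names found in text, preferring the longest match.

    Hypothetical helper: case-insensitive matching and the max_len limit
    are illustrative assumptions.
    """
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    lowered = {name.lower() for name in dictionary}
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest span starting at token i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span.lower() in lowered:
                hits.append(span)
                i += n  # skip past the matched name
                break
        else:
            i += 1  # no name starts at this token
    return hits

names = {"cyclin D1", "CDK4", "p53"}  # toy dictionary, for illustration only
found = find_names("Loss of p53 raised Cyclin D1 and CDK4 levels.", names)
```

Preferring the longest span keeps a multi-word name such as "Cyclin D1" from being split into shorter partial matches.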



Source: BMC Bioinformatics


