Major grant funds UCSC researchers using big data to predict cancer outcomes

Despite some successes, predicting cancer outcomes based on the molecular signatures in cancer cells remains a major challenge. A new effort, funded by the National Cancer Institute and led by researchers at the University of California, Santa Cruz, aims to clear several key roadblocks that have stymied progress in this field.

The $3.5 million project will use the latest in "big data" technology to bridge the gap between the petabytes of raw genomic data in centralized repositories like UCSC's Cancer Genomics Hub (CGHub) and the higher levels of interpretive information that can lead to clinically useful predictions, such as which drugs are most effective against tumors with certain mutations. Project leader Joshua Stuart, an associate professor of biomolecular engineering at UCSC's Baskin School of Engineering, compares the raw genomic data to the binary code running on a computer.

"Your web browser doesn't understand zeros and ones. There are layers and layers of software programs between that and what you see on a web page. We need to do the same thing for DNA sequences to reach the higher levels of interpretation needed for scientific discovery," Stuart said.

Stuart's group will build a separate database, called the Biomedical Evidence Graph (BMEG), for storing and analyzing interpretive information derived from the raw sequence data stored in the CGHub. Like Facebook's social graph, the BMEG will use a graph database structure designed for lightning-fast access to complex, interconnected datasets.
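As a rough illustration of the graph idea described above, the sketch below links tumor samples that share mutated genes. It is not the BMEG schema or code: the sample IDs, the gene assignments, and the function names `build_sample_graph` and `neighbors` are all hypothetical, and a production graph database would store far richer node and edge types.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample IDs and mutated-gene sets, for illustration only.
profiles = {
    "sample_A": {"TP53", "PIK3CA"},
    "sample_B": {"TP53", "KRAS"},
    "sample_C": {"KRAS", "BRAF"},
    "sample_D": {"EGFR"},
}


def build_sample_graph(profiles, min_shared=1):
    """Link every pair of samples that shares at least min_shared mutated genes.

    Returns an adjacency map: sample -> {neighbor: set of shared genes}.
    """
    graph = defaultdict(dict)
    for a, b in combinations(sorted(profiles), 2):
        shared = profiles[a] & profiles[b]
        if len(shared) >= min_shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return dict(graph)


def neighbors(graph, sample):
    """Return the samples directly connected to the given sample, sorted by ID."""
    return sorted(graph.get(sample, {}))


graph = build_sample_graph(profiles)
print(neighbors(graph, "sample_B"))
```

In this toy data, sample_B connects to sample_A through one shared gene and to sample_C through another, while sample_D has no connections. Following edges like these from one sample to its neighbors is the kind of query a graph structure is suited to.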

"Our analyses can reveal connections between different tumor samples based on their molecular profiles, and the natural way to represent that in a database is with the graph structures used for Facebook and other social networks," Stuart said.

A UCSC team led by bioinformatics expert and BMEG co-investigator David Haussler established CGHub in 2012 to manage data from the Cancer Genome Atlas (TCGA) consortium and other NIH cancer genomics research programs. Because CGHub holds genome sequences from thousands of individual patients, access is strictly controlled and limited to researchers approved by NIH. But the BMEG will hold higher-level data derived from analyses of the raw genome sequences and will not require the same level of security restrictions.

"The idea is to build a shared knowledge base and create a playground where lots of researchers can interact, test their algorithms, and compare results," Stuart said. "TCGA researchers have built a lot of great tools for data analysis, and we need to get those installed in the BMEG so the rest of the world can engage in that higher level analysis. We want to establish the tools and data analysis pipelines that will be useful for current and future collections of data."

The BMEG complements a parallel project, called Medbook, which will link together patients, biopsy samples, doctors, and researchers into a social network framework. Stuart and recent Ph.D. graduate Ted Goldstein created Medbook for their work on a prostate cancer project funded by Stand Up To Cancer (SU2C). "The BMEG will provide patient-level genome information to Medbook, creating a powerful partnership between the two efforts," Stuart said.

The BMEG will enable outside researchers to securely analyze CGHub-related data without needing to transmit vast amounts of data over the internet. Currently, TCGA data analysis centers download huge files of raw data from CGHub for analysis on their own computers. BMEG will be co-located with the CGHub servers at the San Diego Supercomputer Center, and researchers will be able to run their analyses as apps on the BMEG platform.

Cancer genomics researchers involved in TCGA and related projects are still working to develop and refine the analytical tools needed to extract predictive evidence from their rapidly growing databases. This includes the very first layer of analysis, the identification of mutations in a patient's tumor genome. Different research groups use different algorithms for this, and CGHub scientists found that these algorithms give inconsistent answers.
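To make the inconsistency concrete, the sketch below compares the call sets from several mutation-finding algorithms and keeps only the variants reported by a majority. This is a simple majority-vote illustration, not the method used by any TCGA group; the caller names, the variant tuples, and the functions `jaccard`, `pairwise_agreement`, and `consensus` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical callers and calls; each variant is (chromosome, position, alt base).
calls = {
    "caller_1": {("chr1", 100, "A"), ("chr2", 200, "T"), ("chr3", 300, "G")},
    "caller_2": {("chr1", 100, "A"), ("chr2", 200, "T")},
    "caller_3": {("chr1", 100, "A"), ("chr4", 400, "C")},
}


def jaccard(a, b):
    """Overlap between two call sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(calls):
    """Jaccard overlap for every pair of callers."""
    return {
        (x, y): jaccard(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }


def consensus(calls, min_callers=2):
    """Variants reported by at least min_callers of the callers."""
    counts = Counter(variant for call_set in calls.values() for variant in call_set)
    return {variant for variant, n in counts.items() if n >= min_callers}


for pair, score in pairwise_agreement(calls).items():
    print(pair, round(score, 2))
print(sorted(consensus(calls)))
```

A low pairwise overlap signals that two algorithms disagree on the same tumor. A voting threshold is only one way to reconcile them.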

"That was a big shocker to me when the TCGA and International Cancer Genome consortiums started comparing the algorithms used by different institutions," Stuart said. "You'd think we'd know which one to install, but there are more than a dozen of them and they all came up with different answers. Only in the last year have we sorted that out, and David Haussler has created a unified effort to identify mutations for TCGA."

The mutation data, together with information on gene locations, then feeds into higher-level analyses. Stuart's lab, for example, uses this information to identify the genetic pathways affected by a patient's mutations. Using knowledge about how genes work in groups by signaling and regulating each other, the lab can find connections between different mutations that affect the same pathways. In previous work on TCGA data, results from Stuart's pathway analysis proved to be better predictors of overall survival in glioblastoma patients than lower-level genomic analysis, and pathway signatures also revealed novel connections between mutations and drug response in breast cancers.
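The core idea of mapping mutations onto pathways can be sketched in a few lines. This is only a lookup-based illustration under stated assumptions, not the pathway analysis used in Stuart's lab: the gene and pathway names are placeholders, and the functions `affected_pathways` and `shared_pathways` are hypothetical.

```python
from collections import defaultdict

# Placeholder gene-to-pathway annotations, for illustration only.
gene_pathways = {
    "GENE_A": {"pathway_1"},
    "GENE_B": {"pathway_1", "pathway_2"},
    "GENE_C": {"pathway_3"},
}


def affected_pathways(mutated_genes, gene_pathways):
    """Map each affected pathway to the mutated genes that belong to it."""
    hits = defaultdict(set)
    for gene in mutated_genes:
        # Genes without annotations contribute nothing.
        for pathway in gene_pathways.get(gene, ()):
            hits[pathway].add(gene)
    return dict(hits)


def shared_pathways(genes_x, genes_y, gene_pathways):
    """Pathways hit in both patients, even when the mutated genes differ."""
    hits_x = affected_pathways(genes_x, gene_pathways)
    hits_y = affected_pathways(genes_y, gene_pathways)
    return sorted(set(hits_x) & set(hits_y))


# Two patients with no mutated gene in common still share a pathway.
print(shared_pathways({"GENE_A"}, {"GENE_B", "GENE_C"}, gene_pathways))
```

Here the two patients have no mutated gene in common, yet both have a mutation in pathway_1, which is the kind of connection a gene-by-gene comparison would miss.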

As with the mutation-finding algorithms, higher-level analyses are likely to differ among research groups. To find out who has the best algorithm for any particular type of analysis, the algorithms have to be compared side by side in blinded competitions. For the BMEG project, Stuart has teamed up with Adam Margolin of Sage Bionetworks, a nonprofit bioinformatics company, to run a series of such competitions.
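One way to picture a side-by-side comparison is a small harness that runs every entry on the same test data and ranks the entries by score. The sketch below assumes that "blinded" means entrants never see the held-out labels. The team names, the toy data, the accuracy metric, and the `evaluate` function are all hypothetical and do not describe the competition design used by the project.

```python
def evaluate(submissions, test_inputs, hidden_labels):
    """Score each submitted predictor on the same held-out data.

    submissions maps an entry name to a function taking one input and
    returning a prediction. Returns (name, accuracy) pairs, best first.
    """
    if len(test_inputs) != len(hidden_labels):
        raise ValueError("inputs and labels must have the same length")
    scores = {}
    for name, predict in submissions.items():
        predictions = [predict(x) for x in test_inputs]
        correct = sum(p == y for p, y in zip(predictions, hidden_labels))
        scores[name] = correct / len(hidden_labels)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: the organizers hold the labels; entrants supply only code.
test_inputs = [1, 2, 3, 4]
hidden_labels = [False, False, True, True]

submissions = {
    "team_x": lambda value: value > 2,
    "team_y": lambda value: value > 3,
}

leaderboard = evaluate(submissions, test_inputs, hidden_labels)
print(leaderboard)
```

Because every entry sees identical inputs and is scored against labels it cannot access, differences in the ranking reflect the algorithms rather than the data each group happened to use.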

"We've done this before, and it will be fun. We'll run competitions and then hold a conference at the end to go over the results," Stuart said. "To participate, people will give us their code and we'll run it on a test dataset. That's how we'll build out this enterprise, starting with the first level of data analysis and building up to predictive algorithms for things like patient survival and response to drug treatments."

By establishing a robust pipeline for analysis of cancer genomics data, the BMEG will help researchers discover clinically useful molecular signatures. In addition to their work with TCGA, Stuart and Haussler are involved in several other collaborative cancer genomics projects that will feed into the BMEG. These include the Library of Integrated Network-based Cellular Signatures (LINCS), the Stand Up To Cancer (SU2C) dream teams, and the I-SPY breast cancer trial. These projects provide a wealth of curated genomic and clinical data that can be used for developing and assessing cancer genomics tools.

The lead architect on the BMEG project at UCSC is Kyle Ellrott, a software developer who worked closely with Stuart and Goldstein to craft the original proposal. Ellrott has been leading the coordination of datasets and results for TCGA's new "Pan-Cancer" project to compare many different forms of cancer. He has broad experience in applying cutting-edge computational tools to cancer genome analysis.


Contact: Tim Stephens
University of California - Santa Cruz
