Big genomics data, big scientific impact: New challenges for further development of life science

November 28, Shenzhen and Hong Kong, China - BGI, the world's largest genomics organization, today announced its latest advances in the analysis, management and dissemination of "Big Genomics Data" at its 3rd bioinformatics software and data release conference. Releases at the conference include new bioinformatics analysis pipelines and software, including SOAP-Hecate v2.5 and SOAP-Gaea, as well as an updated version of EasyGenomics, a cloud-based bioinformatics solution. Additionally, BGI's big-data journal GigaScience provided an update on its integrated GigaDB database and reported on plans for its data analysis platform based on the Galaxy workflow system.

Genomics and next-generation sequencing (NGS) technologies have revolutionized life sciences research. With the cost of DNA sequencing steadily plummeting, the amount of data generated with NGS technologies continues to grow at an unprecedented pace. In recent years this has led to an increase of more than 400,000% in daily sequencing data generation. In the age of "Big Genomics Data", the handling, storing and sharing of these tremendous volumes of data have become a significant research bottleneck.

To deal with big data efficiently, BGI has integrated the Apache Hadoop MapReduce framework into algorithms for NGS analyses. Based on this framework, it has also developed two new algorithms: SOAP-Hecate and SOAP-Gaea. These algorithms are two of the key components of the flexible green cloud computing infrastructure at BGI for de novo assembly and NGS analysis. They have been successfully applied to the analysis of sequencing data in clinical and biological research, with fast turnaround time, high efficiency and low cost.

At the conference, Yan Li, Director of Bioinformatics Products at BGI, announced a new version of the distributed genome assembler, SOAP-Hecate v2.5. This software not only outputs linearized sequences, but is also a flexible and easy-to-use platform for studying the complexity and characteristics of a genome. The scalability of SOAP-Hecate enables researchers to control the assembly time by choosing the size of the cluster. SOAP-Hecate v2.5 is able to finish the assembly of a human whole genome within two days.

Yan Li also presented the application of SOAP-Gaea in genetics research. The new version of SOAP-Gaea is fully integrated into a computational pipeline for variation detection. At present, the application of SOAP-Gaea in cancer genomics studies has reduced the analysis time from two weeks to two days.

EasyGenomics - Next Generation Bioinformatics on the Cloud

"Our goal is to make NGS analysis easier and faster, and get answers everyone can trust," said Dr. Xing Xu, Director of Cloud Computing Products at BGI. Dr. Xu gave an overview of EasyGenomics and its advantages at the conference. He said, "EasyGenomics is a robust cloud-based bioinformatics solution for NGS data analysis, which is engineered with BGI's state-of-the-art technologies and platforms. Through this excellent platform, researchers have the ability to access these informatics resources at their fingertips."

"EasyGenomics can liberate scientists from the burden of handling and processing the huge amount of NGS data," said Dr. Xu. As sequencing costs drop sharply, vast amounts of NGS data have been and will continue to be generated; BGI, for example, generates terabytes of data per day. This flood of data has become one of the biggest obstacles for researchers seeking to further explore the mysteries of life science, and so Dr. Xu added, "EasyGenomics emerges as the times require."

According to Xu's introduction, EasyGenomics offers a SaaS (Software as a Service) NGS bioinformatics analysis solution built on BGI's cloud infrastructure and cutting-edge Aspera fasp file transport technology. Users need only a standard web browser to access the services of EasyGenomics from anywhere in the world. With EasyGenomics, users are freed from tedious resource maintenance and management as well as counter-intuitive command-line tools. Xu said, "Performing bioinformatics analyses becomes as easy as doing online shopping with just a few clicks."

EasyGenomics offers high-speed data exchange solutions that are 10 to 100 times faster than conventional FTP. This robust platform contains popular workflows for whole genome and exome sequencing analysis as well as RNA-seq, de novo assembly, and small RNA data analysis tools. It also provides extensive quality control statistics and various informative reports, such as reports on sequencing quality, mapping, coverage, and enrichment statistics, allowing users to assess data quality, evaluate analysis performance, and identify potential issues.

GigaDB and Galaxy - Revolutionizing Data Dissemination, Organization and Analysis

GigaDB is the database of the journal GigaScience. It is a repository that hosts publicly available, large-scale data sets, each with a unique Digital Object Identifier (DOI) for citation and tracking. GigaScience was launched by BGI and its publishing partner BioMed Central. It is an online, open-access and open-data journal that publishes studies involving large-scale data from the life and biomedical sciences. This big-data journal provides a new model in scientific publishing: combining article and data publication.

At the conference, Peter Li of GigaScience presented what can be expected from the new version of GigaDB, to be released in early 2013. The updated database will provide a user-friendly interface for querying and downloading data sets. It currently contains over 39 data sets, many of them previously unpublished data from BGI, including genomic, mass spectrometry, transcriptomic, epigenomic and metagenomic data.

In addition to GigaDB, Peter said that GigaScience and the CUHK-BGI Innovation Institute of Trans-omics (CBIIT) have been developing a data analysis platform based on the Galaxy workflow system, through which they will make the tools and data processing pipelines reported in GigaScience available to the research community. As a pilot project, they have integrated the next-generation sequencing data analysis tools from the BGI SOAP suite into their Galaxy platform, giving the community the opportunity to use these tools within automated pipelines.

Peter said, "We expect to make GigaScience workflows available from myExperiment, an online repository of scientific workflows, in the near future. GigaDB will eventually be integrated with our Galaxy-based data analysis platform so the results in our papers and the data hosted by us can be analyzed and viewed in a more reproducible and reusable manner."


Contact: Jia Liu
BGI Shenzhen
