
The Genomic Data Commons

Launch the Genomic Data Commons

Access, analyze, and visualize genomic and clinical data through the Genomic Data Commons.

Join the Monthly Support Webinar

Have questions or need help getting started? The webinar is held on the last Monday of every month, 2-3 p.m. ET.

The Genomic Data Commons (GDC) is a unified knowledge base that promotes sharing of genomic and clinical data between researchers and facilitates precision medicine in oncology.

Cancer is fundamentally a disease of the genome, caused by mutations and other harmful genomic changes that alter the genome's function and contribute to the malignant behavior of cancer cells. Genomic aberrations can influence the aggressiveness of tumors and the response of tumors to particular drugs.

Cancer genomics makes use of advanced DNA sequencing technology to give scientists enormous power to uncover how these genomic changes drive cancer formation and growth.

The GDC contains genomic data from more than 33,000 patients with cancer. About 14,500 of those cases are derived from large-scale NCI programs, such as The Cancer Genome Atlas (TCGA).

The GDC also includes data from about 18,000 cancer patients provided by Foundation Medicine, Inc., a molecular information company, and nearly 1,000 patients with multiple myeloma contributed by the Multiple Myeloma Research Foundation (MMRF), a nonprofit advocacy organization. 

These represent some of the largest and most comprehensive cancer genomics datasets in the world, together comprising more than three petabytes of data.

A Data Sharing Platform to Promote Precision Oncology: The Genomic Data Commons

By providing an expandable data sharing platform to the cancer research community, the GDC aims to accelerate discoveries in cancer research and promote precision medicine in oncology.

Breaking Down Research Barriers

As the success of the landmark TCGA program demonstrates, releasing the knowledge available in these huge datasets requires collaboration and data sharing across the cancer research community.

In today's cancer research framework, several barriers prevent most researchers from fully exploiting all of the genomic data that is available. Below is how the GDC addresses each of those barriers to facilitate progress:

  • Genomic data from different projects, clinical trials, and cancer types are siloed in different locations with local management systems, making it difficult to share the data.

The GDC brings genomics datasets and associated clinical data into one location that any researcher may access.

  • The data are often generated using different methods, so that even if researchers can access two different datasets, they cannot use both in a single study.

The GDC “harmonizes” the data, enabling datasets generated from different protocols to be studied side-by-side. Combining these datasets also increases their potential analytical power.

  • Sophisticated analysis tools that allow researchers to derive useful knowledge from large, complex data sets are not available to all researchers.

The GDC offers Data Analysis, Visualization, and Exploration (DAVE) tools to empower the broader cancer research community. The latest analytic technologies are applied to GDC data, allowing researchers to select a custom cohort of patients to study, perform a variety of analyses, and produce publication-ready figures using an interactive web interface.

The GDC makes these data and tools available from secure servers operated by the University of Chicago Center for Data Intensive Science, and GDC data are also being made available in a secure cloud computing environment. By democratizing access to these resources, the GDC makes it possible for any researcher to ask new and fundamental questions about cancer.

A Strong Foundation for Cancer Research

The NCI GDC is more than just a data repository; it is a cancer analysis system that continues to evolve by encouraging independent groups such as clinical research consortia, companies, and advocacy organizations to contribute their own cancer genomic data to the GDC. Submitters can use GDC tools to analyze their data and compare their results with other data sets in the GDC. Within 6 months, contributed data are made available to qualified researchers, thereby expanding the genomic and clinical data available to the cancer research community, deepening our understanding of cancer mechanisms and enabling advances in cancer diagnosis and treatment.

The GDC will also house data from a new era of NCI programs that will sequence the DNA of patients enrolled in NCI clinical trials. These datasets will lead to a much deeper understanding of which therapies are most effective for individual cancer patients.

With each new addition, the GDC will evolve into a smarter, more comprehensive knowledge base that will foster important discoveries in cancer research and increase the success of cancer treatment for patients.

This NCI initiative is being built and managed by the University of Chicago Center for Data Intensive Science under a subcontract with Leidos Biomedical Research.

Learn more about the GDC from CCG Director Louis M. Staudt, M.D., Ph.D., and other GDC experts in their article published in the New England Journal of Medicine.

Visit the GDC website to find the latest information and use its data access, analysis, and sharing resources.