CRDC, NCI Cloud Resources, and Data Nodes
The goal of the National Cancer Institute’s Cancer Research Data Commons (CRDC) is to empower researchers to accelerate data-driven scientific discovery by connecting diverse datasets with analytical tools in the cloud. The CRDC is built upon an expandable data science infrastructure that provides secure access to many different data types across scientific domains via the Data Commons Framework. The CRDC enables users to search and aggregate data across repositories via the Cancer Data Aggregator using a common data model, CRDC-H. The ability to combine diverse data types and perform cross-domain analysis of large cancer datasets can lead to new discoveries in cancer prevention, treatment and diagnosis, further supporting the goals of precision medicine and the Cancer Moonshot℠. The CRDC will encompass and connect multiple cloud-based data repositories and serve as a central location to support public data sharing for NCI-funded programs.
https://datacommons.cancer.gov/cancer-research-data-commons
The National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) provides an ecosystem that enables access to data from many NCI-funded programs through interconnected infrastructure components such as the following (text from CRDC):
Data Commons: These are repositories of related, harmonized, and accessible data, with an analysis infrastructure built for elastic compute capability and interoperability. Examples include:
Cloud Resources: Each Cloud Resource provides users secure access to CRDC data through NCI DCFS and a platform to analyze data and store research results in a secure, compliant cloud environment. The three Cloud Resources are:
FireCloud Powered by Terra
“Broad Institute FireCloud is a NCI Cloud Resource project powered by Terra for biomedical researchers to access data, run analysis tools, and collaborate.”
FireCloud: https://firecloud.terra.bio
Resource Portal: https://firecloud.terra.bio
Institute for Systems Biology Cancer Genomics Cloud (ISB-CGC)
“The ISB Cancer Genomics Cloud is democratizing access to NCI Cancer Data (TCGA, TARGET, CCLE) and coupling it with unprecedented computational power to allow researchers to explore and analyze this vast data-space.”
ISB-CGC: https://portal.isb-cgc.org
Resource Portal: https://portal.isb-cgc.org
Seven Bridges Cancer Genomics Cloud
“The Cancer Genomics Cloud (CGC), powered by Seven Bridges, is one of three systems funded by the National Cancer Institute to explore the paradigm of colocalizing massive public datasets, like The Cancer Genome Atlas (TCGA), alongside secure and scalable computational resources to analyze them. The CGC makes more than two petabytes of multi-dimensional data available immediately to authorized researchers. You can add your own data to analyze alongside the public datasets using predefined analytical workflows or your own tools. Every execution is fully reproducible, and collaborating with your team is simple and secure.”
Seven Bridges CGC: http://www.cancergenomicscloud.org/
Resource Portal: https://cgc-accounts.sbgenomics.com/