Architecture of the DCF

[Figure: narrow middle architecture (narrow-middle-arch-18-v4.jpg)]

The DCF is based on the narrow middle architecture for a data commons. This architecture standardizes only a small set of core services: those essential for the system to function and for making data Findable, Accessible, Interoperable, and Reusable (FAIR). Other services are not standardized, so different approaches can be used depending on goals, costs, and functionality.

DCF Services

Indexd – permanent digital ID service

The Indexd service provides permanent digital IDs for data objects. These IDs can be used to retrieve the data or to query the metadata associated with an object. Indexd tracks the locations and hashes of every file in the data commons object store. It exposes RESTful APIs for registering a new file and for retrieving data for an existing file.
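As an illustration of how a client might use such a record, the following minimal sketch builds a lookup URL for a digital ID and checks downloaded bytes against the size and hashes in a record. The `/index/<id>` path layout, the record field names (`did`, `size`, `hashes`, `urls`), the host name, and both helper functions are assumptions made for this example; consult the Indexd repository for the actual API.

```python
import hashlib
from urllib.parse import quote


def record_url(base_url: str, guid: str) -> str:
    """Build a lookup URL for one record (the path layout is an assumption)."""
    return f"{base_url.rstrip('/')}/index/{quote(guid)}"


def matches_record(data: bytes, record: dict) -> bool:
    """Return True if data agrees with the size and hashes listed in record."""
    size = record.get("size")
    if size is not None and size != len(data):
        return False
    for algorithm, expected in record.get("hashes", {}).items():
        # Skip hash types that hashlib cannot compute on this platform.
        if algorithm not in hashlib.algorithms_available:
            continue
        if hashlib.new(algorithm, data).hexdigest() != expected.lower():
            return False
    return True


# Hypothetical record, shaped the way this sketch assumes Indexd returns it.
payload = b"example file contents"
record = {
    "did": "dg.0000/00000000-0000-0000-0000-000000000000",
    "size": len(payload),
    "hashes": {"md5": hashlib.md5(payload).hexdigest()},
    "urls": ["s3://example-bucket/example.txt"],
}

print(record_url("https://commons.example.org", record["did"]))
print(matches_record(payload, record))
```

Checking both size and hash after download is what makes the stored hash useful: a client can detect a truncated or corrupted copy regardless of which listed location it was fetched from.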

The Indexd GitHub repository is publicly available at https://github.com/uc-cdis/indexd

Fence & Arborist – authentication and authorization services

The Fence and Arborist services control access to metadata, submission, indexing, and the data itself.

Fence is an authentication (AuthN) and authorization (AuthZ) service that uses the OpenID Connect flow (an extension of OAuth2) to generate tokens for clients. It can also provide tokens directly to a user. Clients and users may then use those tokens (JWTs) with other Gen3 Data Commons services to access protected endpoints that require specific permissions. Fence can be configured to support different Identity Providers (IDPs) for AuthN. Currently supported IDPs include Google, Cognito, Synapse, Microsoft, ORCID, and Researcher Auth Service (RAS). Fence works together with Arborist to implement attribute-based access control for commons users.
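To make the token handling concrete, here is a minimal sketch of how a client might inspect the claims in a JWT and attach the token to a request. It uses only the standard JWT layout (three dot-separated base64url segments) and the standard OAuth2 `Authorization: Bearer` header; the helper names and the example claims are hypothetical, and the decoder deliberately does not verify the signature, so it is suitable for inspection only, never for access decisions.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    segment = parts[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def bearer_header(token: str) -> dict:
    """Build the HTTP header that carries a token to a protected endpoint."""
    return {"Authorization": f"Bearer {token}"}


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hand-built, unsigned token with made-up claims, for demonstration only.
example_token = ".".join(
    [
        _b64url({"alg": "none", "typ": "JWT"}),
        _b64url({"sub": "user-123", "exp": 1700000000}),
        "signature",
    ]
)

claims = decode_jwt_claims(example_token)
print(claims["sub"])
print(bearer_header(example_token)["Authorization"][:7])
```

A service receiving such a token must validate its signature and expiry before trusting any claim; the sketch only shows what a client sees when it looks inside a token it was issued.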

Arborist is an attribute-based access control policy engine, designed for use with Fence. Arborist tracks resources requiring access control, along with actions which users may perform to operate on these resources, and roles, which aggregate permissions to perform one or more actions. Finally, policies tie together a set of resources with a set of roles; when granted to a user, a policy grants authorization to act as one of the roles over one of the resources. Resources are arranged hierarchically like a filesystem, and access to one resource implies access to its subresources.
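The relationship between resources, roles, and policies described above can be sketched as a toy model. This is not Arborist's implementation or data model: the class names, the use of plain strings for actions, and the example resource paths are all assumptions chosen to illustrate how hierarchical resources make a grant on a parent apply to its subresources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A named set of permitted actions (actions are plain strings here)."""
    name: str
    actions: frozenset


@dataclass(frozen=True)
class Policy:
    """Ties a set of resource paths to a set of roles."""
    resources: frozenset
    roles: frozenset


def covers(granted: str, requested: str) -> bool:
    """True if requested is the granted resource or one of its subresources."""
    granted = granted.rstrip("/")
    return requested == granted or requested.startswith(granted + "/")


def is_authorized(policies, action: str, resource: str) -> bool:
    """True if any policy pairs a covering resource with a role allowing action."""
    for policy in policies:
        if not any(covers(path, resource) for path in policy.resources):
            continue
        if any(action in role.actions for role in policy.roles):
            return True
    return False


# Hypothetical role, policy, and resource paths.
reader = Role("reader", frozenset({"read"}))
user_policies = [Policy(frozenset({"/programs/demo"}), frozenset({reader}))]

print(is_authorized(user_policies, "read", "/programs/demo/projects/p1"))
print(is_authorized(user_policies, "write", "/programs/demo/projects/p1"))
print(is_authorized(user_policies, "read", "/programs/demo2"))
```

Matching on a path prefix followed by a separator, rather than on a bare string prefix, keeps a grant on `/programs/demo` from leaking to a sibling such as `/programs/demo2`.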

Fence & Arborist provide centralized authentication and authorization services for CRDC Framework Services via RAS and dbGaP.

The Fence GitHub repository is publicly available at https://github.com/uc-cdis/fence
The Arborist GitHub repository is publicly available at https://github.com/uc-cdis/arborist

API

All DCF services expose APIs that allow them to interact with each other and with external users. These APIs enable extensible application development for future services and users. You can integrate the APIs into your own services and use our open-source software libraries to develop new tools for sharing and analyzing data with your group, collaborators, or the broader community.

Learn more about using the API at https://gen3.org/resources/user/using-api/