
Governance & Access

How the ETAF Data Commons is designed around controlled access, consent, transparency, and responsible data use — and the aspirational roadmap for how the initiative develops over time.

Designed around trust, consent, and responsible use

Governance design has been central to the ETAF Data Commons from the outset. The following describes the intended approach. All policies are under development and subject to refinement in consultation with contributing cohorts, ethics experts, and the broader research community. Language such as "planned," "intended," and "under development" reflects that nothing is final.

Controlled access

Data are not intended for unrestricted public download. Access will be project-based and subject to a data access review process.

In development

De-identified data

Individual-level data are intended to be de-identified prior to inclusion. Participant privacy and responsible use are central design principles.

Secure cloud analysis environment

Analysis is expected to take place within a secure cloud environment. Individual-level data will not be broadly downloadable outside approved secure workspaces.

In development

Consent constraints respected

Cohort-specific consent and use constraints will be tracked and respected. Cohorts retain the right to define the conditions under which their data may be used.

Transparent documentation

Versioned data releases, cohort-level documentation, and reproducible workflow support are planned. Acknowledgment and publication policies will be developed with contributing cohorts.

Governance under development

Data access review procedures, training requirements, and publication policies are being developed. Forthcoming governance documentation will define these processes in detail.

In development

Where we are and where we're going

The ETAF Data Commons is in early development. The following is an aspirational roadmap representing current intentions. Timelines and priorities are subject to change as the initiative develops, secures support, and incorporates feedback from cohort leaders and stakeholders.

Active · 2026

Initiative planning and scientific foundation

  • Formation of initiative leadership and articulation of scientific vision
  • Development of scientific white paper and rationale document
  • Early conversations with potential contributing cohorts
  • Protocol development for data standards and metadata
  • Website and public presence established

Coming Soon

Governance framework and cohort inclusion protocol

  • Data access governance model finalized
  • Cohort participation criteria and onboarding procedures published
  • Data-use agreement templates drafted
  • Ethics review frameworks for multi-cohort integration

Coming Soon

Phenotype harmonization blueprint

  • Priority phenotypic domains identified with scientific working groups
  • Cross-cohort variable mapping documentation
  • Harmonization protocols and decision guidelines
  • Item-level and cohort-specific data retention policies

Coming Soon

Genomic data processing plan

  • Genotyping and imputation standards defined
  • Quality control pipeline development
  • DNA banking pathway guidance for participating cohorts
  • Family-based genomic analysis infrastructure planning

Future

Secure controlled-access platform

  • Cloud-based secure analysis environment deployed
  • Initial cohort data integrated and documented
  • Versioned data release framework established
  • Pilot access granted to approved research teams

Future

Researcher onboarding and training resources

  • Researcher application portal launched
  • Training materials and technical documentation published
  • Data access review board fully operational
  • Community of practice established

Future

Cohort expansion and versioned data releases

  • Additional cohorts onboarded through established protocols
  • Versioned data releases issued with release notes and changelogs
  • Expanded phenotype domains and genomic data coverage
  • Long-term sustainability model secured

This roadmap is aspirational and will evolve. Timelines depend on funding, cohort engagement, governance development, and other factors. Updates will be posted as the initiative progresses.

What the seed phase would accomplish

The ETAF Data Commons is seeking seed funding to establish the foundational infrastructure, governance structures, and protocols that would allow the initiative to begin accepting and integrating cohorts. The following describes what a funded seed phase would accomplish. Seed funding has not yet been awarded.

Infrastructure and governance

  • Establish and test the secure cloud analysis environment
  • Develop and finalize the data access governance framework
  • Draft and finalize data-use agreement templates
  • Convene a data access review committee
  • Define cohort onboarding criteria and step-by-step accession procedures
  • Develop researcher application and project review workflows

Science and harmonization

  • Develop phenotype harmonization protocols with scientific working groups
  • Establish genotyping, imputation, and quality-control standards
  • Pilot the onboarding process with a small number of seed-phase cohorts
  • Produce reproducible analysis pipelines for family-based genetic analyses
  • Generate a first versioned data release for pilot access and review
  • Publish pre-registration, protocol, and cohort documentation

Current status: The initiative is actively pursuing seed funding and welcomes conversations with funders. Cohort accession, researcher access, and formal governance will begin only after seed funding is secured and the seed-phase infrastructure is tested and reviewed. No commitments regarding timelines can be made at this stage.

Proposed funding and sustainability model

A multi-component funding model is envisioned to sustain the initiative over time. The following reflects current planning and is subject to change based on funding outcomes, governance input, and the needs of participating cohorts.

Central infrastructure costs

Core infrastructure — including the secure cloud platform, coordination staff, governance operations, data access review, and shared analysis pipelines — is expected to be funded through competitive grants, institutional partnerships, and potentially subscription or service-fee arrangements with institutions accessing the resource. Grant funding from NIH, Wellcome, and other major funders is the primary intended mechanism for the seed phase.

Cohort-specific costs

Costs specific to individual cohort onboarding — such as local harmonization work, consent review, data transfer, and cohort-level quality control — are expected to be borne partly by contributing cohorts (where capacity permits) and partly through central support mechanisms developed as the initiative matures. The initiative aims to reduce participation barriers, particularly for smaller or resource-limited cohorts, as a core equity commitment.

The long-term sustainability model is under development. Input from cohort leaders, participating institutions, and funders will shape how costs are distributed and how the initiative achieves durable financial support beyond the seed phase.

Common questions

Are data currently available for researcher access?

No. The ETAF Data Commons is in an early planning phase. No cohort data have been integrated and no researcher applications are being accepted. The resource does not yet exist as an operational data platform. Access will be announced when infrastructure and governance are in place.

Is the initiative currently accepting new cohorts?

No. The initiative welcomes expressions of interest and early conversations with cohort leaders, but formal cohort accession is not yet open. Broader participation is expected to begin only after seed funding has been secured and the seed-phase infrastructure, governance, and harmonization processes have been tested with a small number of initial cohorts.

What does it mean if a cohort has provided a letter of support?

A letter of support from a cohort leader expresses scientific interest in the initiative and willingness to participate should the initiative move forward. It is not a formal agreement to share data, a commitment to participate, or a guarantee that the cohort will ultimately meet participation criteria or consent requirements. Formal participation agreements will be executed separately, at a later stage, after governance and accession procedures are established.

How will data access work when the resource is operational?

Access is planned to be controlled and project-based, not open or publicly downloadable. Researchers will apply for access on a study-by-study basis, subject to data-use agreement review and approval by a data access committee. Analysis is expected to take place within a secure cloud environment. Cohort-specific consent constraints will be respected and encoded in the access system. Full details are under development and will be released as the governance framework is finalized.
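
As a purely illustrative sketch of what "encoded in the access system" could look like, the Python below pre-screens a project request against cohort consent records. It is not the initiative's design: the class names, fields, use categories, and cohort identifiers are all hypothetical, and no such schema has been defined. In practice a check like this would only inform, never replace, review by the data access committee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortConsent:
    """Hypothetical record of one cohort's consent and use constraints."""
    cohort_id: str
    permitted_uses: frozenset  # illustrative categories, e.g. {"health_research"}
    allows_commercial_use: bool = False


@dataclass(frozen=True)
class ProjectRequest:
    """Hypothetical project-level access request (one per study)."""
    project_id: str
    research_use: str
    commercial: bool
    requested_cohorts: tuple


def eligible_cohorts(request, consents):
    """Split requested cohorts into an eligible list and an excluded dict with reasons."""
    eligible, excluded = [], {}
    for cohort_id in request.requested_cohorts:
        consent = consents.get(cohort_id)
        if consent is None:
            excluded[cohort_id] = "no consent record on file"
        elif request.research_use not in consent.permitted_uses:
            excluded[cohort_id] = "research use not covered by consent"
        elif request.commercial and not consent.allows_commercial_use:
            excluded[cohort_id] = "commercial use not permitted"
        else:
            eligible.append(cohort_id)
    return eligible, excluded


# Example with made-up cohorts and use categories.
consents = {
    "cohort_a": CohortConsent("cohort_a", frozenset({"health_research"})),
    "cohort_b": CohortConsent(
        "cohort_b",
        frozenset({"health_research", "methods_development"}),
        allows_commercial_use=True,
    ),
}
request = ProjectRequest(
    project_id="proj-001",
    research_use="methods_development",
    commercial=False,
    requested_cohorts=("cohort_a", "cohort_b", "cohort_c"),
)
eligible, excluded = eligible_cohorts(request, consents)
print(eligible)
print(excluded)
```

Recording a reason for each excluded cohort, rather than silently dropping it, is one way such a system could support the transparency goals described above.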