Research Data Management & FAIR Principles

Evaluate research data management practices, including the FAIR principles (Findable, Accessible, Interoperable, Reusable), data integrity, and regulatory compliance.

Biotech | Healthcare | Research | 20 minutes | 20 questions

1. Data Management Planning

Do you have a Data Management Plan (DMP) for your research projects?*

NIH/NSF requirement: DMP covering data types, standards, sharing, preservation

Does your DMP specify data retention periods and archival responsibilities?*

NIH generally requires records to be kept at least 3 years after the final financial report is submitted; varies by funder and regulation

Are roles and responsibilities for data management clearly defined?*

Data stewards, custodians, and access control authorities

2. FAIR: Findable

Are datasets assigned globally unique and persistent identifiers (DOIs)?*

F1: Persistent identifiers enable long-term citation and retrieval
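
A persistent identifier is only useful if it is recorded in a consistent, resolvable form. The sketch below normalizes a DOI string and builds its `https://doi.org/` resolver URL. The function name `doi_to_url`, the regular expression, and the placeholder DOI are assumptions made for this example; the pattern is a syntax heuristic and does not confirm that a DOI is actually registered.

```python
import re

# Heuristic pattern for DOIs: prefix "10.", a registrant code, "/", then a suffix.
# This checks syntax only; it cannot tell whether the DOI exists.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(raw: str) -> str:
    """Normalize a DOI string and return its https://doi.org/ resolver URL."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a syntactically valid DOI: {raw!r}")
    return f"https://doi.org/{doi}"


# Placeholder DOI for illustration, not a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))  # → https://doi.org/10.1234/example-dataset
```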

Is data described with rich metadata including study design, methods, and variables?*

F2: Comprehensive metadata increases discoverability
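
As an illustration of what rich metadata might look like in machine-readable form, the sketch below builds a small record and reports any required fields that are missing. The field names (`identifier`, `title`, `study_design`, `methods`, `variables`) and the sample values are hypothetical choices for this example, not a schema prescribed by FAIR or by any particular repository.

```python
import json

# Hypothetical required fields for this sketch; real repositories define their own schemas.
REQUIRED_FIELDS = ("identifier", "title", "study_design", "methods", "variables")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


record = {
    "identifier": "doi:10.1234/example-dataset",  # placeholder DOI, not a real dataset
    "title": "Example cohort blood pressure measurements",
    "study_design": "prospective cohort",
    "methods": "Seated measurement, mean of three readings",
    "variables": [
        {"name": "sbp_mmhg", "description": "Systolic blood pressure", "unit": "mmHg"},
        {"name": "age_years", "description": "Age at enrollment", "unit": "years"},
    ],
}

print(missing_fields(record))        # → []
print(json.dumps(record, indent=2))  # serialized form to deposit alongside the data
```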

Are datasets registered or indexed in searchable repositories (e.g., Dryad, Zenodo, Figshare)?*

F4: Indexed in a searchable resource for discovery

3. FAIR: Accessible

Are data accessible via standardized protocols (HTTP, FTP, API)?*

A1.1: Open, free, and universally implementable protocols

Is metadata accessible even when data is no longer available?*

A2: Metadata persists to document what existed

Are access controls and authentication procedures clearly specified?*

A1.2: Authorization/authentication when required (e.g., controlled access for PHI)

4. FAIR: Interoperable

Do you use standardized vocabularies and ontologies (e.g., SNOMED, LOINC, MeSH)?*

I2: Controlled vocabularies enable integration across datasets

Are data stored in open, non-proprietary formats (CSV, JSON, XML)?*

I1: Machine-readable formats that don't require specialized software

Does metadata include references to related datasets and publications?*

I3: Qualified references link data to context

5. FAIR: Reusable

Are data released with clear usage licenses (CC0, CC-BY, custom DUA)?*

R1.1: Explicit license increases reuse confidence

Is detailed provenance information provided (collection methods, processing, quality)?*

R1.2: Data lineage and quality metrics enable appropriate reuse

Do you provide analysis code/scripts alongside datasets?*

Reproducibility: share processing pipelines and statistical code

6. Data Integrity & Security

Do you maintain audit trails of data creation, modification, and deletion?*

ALCOA+: Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available
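
One way to make an audit trail tamper-evident is to chain entries by hash, so that editing an earlier entry invalidates every later one. The following is a minimal in-memory sketch of that idea, not a validated or compliant system. The function names (`append_entry`, `verify`) and the entry fields are assumptions for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def append_entry(log: list, user: str, action: str, record_id: str) -> dict:
    """Append an entry whose hash covers its content and the previous entry's hash."""
    entry = {
        "user": user,            # who made the change (attributable)
        "action": action,        # e.g. "create", "modify", "delete"
        "record_id": record_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # recorded at the time of the event
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute each hash and check the chain links; False means the log was altered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if entry["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "alice", "create", "sample-001")
append_entry(audit_log, "bob", "modify", "sample-001")
print(verify(audit_log))  # → True

audit_log[0]["user"] = "mallory"  # simulate tampering with an earlier entry
print(verify(audit_log))  # → False
```

In practice the log would live in append-only storage; the in-memory list here only demonstrates the chaining logic.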

Are data backups performed regularly with tested recovery procedures?*

3-2-1 rule: 3 copies, 2 different media, 1 offsite
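
The 3-2-1 rule can be expressed as a simple check over an inventory of copies. The sketch below assumes the primary copy counts toward the three; the `BackupCopy` class, its fields, and the location labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str  # hypothetical label, e.g. "lab-nas"
    medium: str    # e.g. "disk", "tape", "cloud"
    offsite: bool


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """True if there are at least 3 copies, on at least 2 media, with at least 1 offsite."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)


plan = [
    BackupCopy("workstation", "disk", offsite=False),  # primary copy
    BackupCopy("lab-nas", "disk", offsite=False),
    BackupCopy("cloud-archive", "cloud", offsite=True),
]
print(satisfies_3_2_1(plan))      # → True
print(satisfies_3_2_1(plan[:2]))  # → False (two copies, one medium, none offsite)
```

A passing check says nothing about whether restores work; recovery still has to be tested separately.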

Is sensitive data (PHI, PII) encrypted at rest and in transit?*

HIPAA/GDPR: neither mandates a specific algorithm; AES-256 at rest and TLS 1.2+ in transit are common ways to meet the requirement
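
For the in-transit half of this item, a client can refuse connections that negotiate anything older than TLS 1.2. The sketch below uses Python's standard `ssl` module; the function name `make_tls_context` is an assumption for this example. Encryption at rest is not shown because it depends on the storage system or a third-party library.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Return a client context that verifies certificates and requires TLS 1.2 or newer."""
    context = ssl.create_default_context()  # certificate and hostname checks are on by default
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
print(context.minimum_version.name)  # → TLSv1_2
```

The resulting context can be passed to standard-library clients that accept one, for example `urllib.request.urlopen(url, context=context)`.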

Do you perform data quality checks and validation during collection/entry?*

Range checks, consistency checks, duplicate detection
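
The three kinds of check named above can be sketched as a single validation pass over entered records. The field names (`subject_id`, `sbp_mmhg`, `enrolled`, `visit_date`) and the plausibility limits are hypothetical and would come from the study's own data dictionary.

```python
from collections import Counter
from datetime import date

# Hypothetical plausibility limits for systolic blood pressure, in mmHg.
SBP_RANGE = (60, 260)


def validate(records: list[dict]) -> list[str]:
    """Return readable problems found by range, consistency, and duplicate checks."""
    problems = []
    for r in records:
        sid = r["subject_id"]
        # Range check: value must fall within plausible limits.
        if not SBP_RANGE[0] <= r["sbp_mmhg"] <= SBP_RANGE[1]:
            problems.append(f"{sid}: sbp_mmhg {r['sbp_mmhg']} out of range")
        # Consistency check: a visit cannot precede enrollment.
        if r["visit_date"] < r["enrolled"]:
            problems.append(f"{sid}: visit_date precedes enrolled")
    # Duplicate detection: each subject_id should appear once.
    for sid, n in Counter(r["subject_id"] for r in records).items():
        if n > 1:
            problems.append(f"{sid}: appears {n} times")
    return problems


records = [
    {"subject_id": "S1", "sbp_mmhg": 120, "enrolled": date(2024, 1, 10), "visit_date": date(2024, 2, 1)},
    {"subject_id": "S2", "sbp_mmhg": 400, "enrolled": date(2024, 1, 15), "visit_date": date(2024, 1, 5)},
    {"subject_id": "S1", "sbp_mmhg": 118, "enrolled": date(2024, 1, 10), "visit_date": date(2024, 3, 1)},
]
for problem in validate(records):
    print(problem)
# → S2: sbp_mmhg 400 out of range
# → S2: visit_date precedes enrolled
# → S1: appears 2 times
```

Running such checks at entry time, rather than at analysis time, lets errors be corrected while the source is still available.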

Are data retention and destruction policies documented and followed?*

Regulatory requirements, funder policies, institutional policies
