GEDFix Data Integrity Platform

GEDFix is a production-grade pipeline for cleaning, standardizing, and verifying GEDCOM 5.5.1 datasets. It preserves relationships while normalizing names, places, and dates, with explicit verification and rollback paths for zero-data-loss processing.

Core System

The platform combines a reusable Python package, a CLI, and workflow scripts for scan, fix, verification, and controlled export. Every transformation is traceable through report artifacts, which makes the platform usable in professional genealogy, archive migration, and compliance-sensitive family-history workloads.

Its operating principle is integrity-first: no destructive normalization without backup, diff visibility, and relationship validation gates.
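
To make the principle concrete, here is a minimal sketch of a backup, diff, and validation gate around a single transformation. It is not GEDFix's actual API: the function name apply_with_backup, the ".bak" suffix, and the transform and validate callables are all hypothetical, and the file is assumed to be UTF-8 encoded.

```python
import difflib
import shutil
from pathlib import Path


def apply_with_backup(path, transform, validate):
    """Rewrite a GEDCOM file only if the validation gate accepts the result.

    Hypothetical helper for illustration; not part of GEDFix itself.
    `transform` maps a list of lines to a new list of lines.
    `validate` receives (original, updated) and returns True to allow the write.
    Returns the unified diff as a list of lines.
    """
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # back up before anything is written

    # Assumption: UTF-8 input. GEDCOM 5.5.1 also allows other encodings.
    original = path.read_text(encoding="utf-8").splitlines()
    updated = transform(original)

    # Make the change visible before it is committed.
    diff = list(
        difflib.unified_diff(
            original, updated, fromfile="before", tofile="after", lineterm=""
        )
    )

    if not validate(original, updated):
        raise ValueError("validation gate rejected the change; file left untouched")

    path.write_text("\n".join(updated) + "\n", encoding="utf-8")
    return diff
```

Because the backup is written first and the original file is only replaced after the gate passes, a rejected change leaves both the original and its backup on disk, and the returned diff can be stored as a report artifact.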

Capabilities

  • Issue detection and prioritized diagnostics
  • Date, place, and identity normalization
  • Relationship-safe deduplication and merge
  • Verification scripts with rollback guarantees
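
As an illustration of the date normalization capability, the sketch below converts two common free-form inputs into the GEDCOM "D MMM YYYY" form with an uppercase three-letter month. The function name normalize_date and the set of accepted input patterns are assumptions made for this example; anything unrecognized returns None so the caller can leave the original value untouched.

```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
FULL = ["JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", "JULY",
        "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER"]

# Map both full English names and abbreviations to the GEDCOM abbreviation.
LOOKUP = dict(zip(FULL, MONTHS))
LOOKUP.update({abbr: abbr for abbr in MONTHS})

ISO = re.compile(r"^(\d{4})-(\d{1,2})-(\d{1,2})$")
DMY = re.compile(r"^(\d{1,2})\s+([A-Za-z]+)\.?\s+(\d{3,4})$")


def normalize_date(value):
    """Return 'D MMM YYYY', or None if the value is not recognized.

    Hypothetical helper: it handles only exact dates, not GEDCOM
    qualifiers such as ABT, BEF, or date ranges.
    """
    text = value.strip()

    match = ISO.match(text)
    if match:
        year, month, day = (int(g) for g in match.groups())
        if 1 <= month <= 12 and 1 <= day <= 31:
            return f"{day} {MONTHS[month - 1]} {year}"
        return None

    match = DMY.match(text)
    if match:
        day = int(match.group(1))
        abbr = LOOKUP.get(match.group(2).upper())
        if abbr and 1 <= day <= 31:
            return f"{day} {abbr} {int(match.group(3))}"

    return None
```

Returning None instead of guessing keeps the normalization non-destructive: ambiguous or qualified dates are reported for review rather than rewritten.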

Why It Matters

GEDCOM files accumulate errors across decades of software migrations and manual edits. GEDFix treats genealogical data as a trust asset: every correction is auditable, every merge preserves family structure, and every export meets archival standards without silent data loss.

Data Integrity

Every transformation is diffed, logged, and reversible, with no silent overwrites or orphaned records.

Relationship Safety

Deduplication and merge logic validates family linkages before committing any structural change.
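
One way such a linkage check could look is sketched below. It is an assumption about the general technique, not GEDFix's merge logic: the function name find_broken_links is hypothetical, and the check covers only level-1 FAMS, FAMC, HUSB, WIFE, and CHIL pointers. Each pointer must resolve to an existing record and be matched by the reciprocal pointer on the other side.

```python
LINK_TAGS = {"FAMS", "FAMC", "HUSB", "WIFE", "CHIL"}


def find_broken_links(lines):
    """Return a list of problem descriptions for family pointers.

    Hypothetical helper: run it on the merged result and refuse to
    commit the merge if the list is not empty.
    """
    records = {}  # xref id -> record tag, e.g. "@I1@" -> "INDI"
    links = []    # (owner xref, tag, target xref)
    current = None

    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level = parts[0]
        if level == "0":
            # Record lines look like "0 @I1@ INDI"; HEAD and TRLR have no xref.
            if len(parts) == 3 and parts[1].startswith("@"):
                current = parts[1]
                records[current] = parts[2].split(" ", 1)[0]
            else:
                current = None
        elif level == "1" and current and len(parts) == 3 and parts[1] in LINK_TAGS:
            links.append((current, parts[1], parts[2].strip()))

    link_set = set(links)
    problems = []
    for owner, tag, target in links:
        if target not in records:
            problems.append(f"{owner} {tag} -> {target}: target record missing")
            continue
        if tag == "FAMS":
            ok = ((target, "HUSB", owner) in link_set
                  or (target, "WIFE", owner) in link_set)
        elif tag == "FAMC":
            ok = (target, "CHIL", owner) in link_set
        elif tag in ("HUSB", "WIFE"):
            ok = (target, "FAMS", owner) in link_set
        else:  # CHIL
            ok = (target, "FAMC", owner) in link_set
        if not ok:
            problems.append(f"{owner} {tag} -> {target}: no reciprocal link")
    return problems
```

An empty result means every spouse and child pointer in the candidate output is resolvable and reciprocal, which is the kind of condition a merge can be gated on before any structural change is written.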

Compliance-Ready

Report artifacts and rollback paths satisfy archival and professional genealogy audit requirements.