Atlas Civica
Roadmap · v1

Transparency

Methodology & data dictionary

A full account of how the corpus is built: harvest sources and coverage definitions, the Atlas Civica Format, the FRBR versioning model, the ML labeling pipeline, and the normative scoring.

What’s coming

  • Coverage tiers: what “lit / pullable / trace / dark” mean, exactly
  • The Atlas Civica Format field-by-field, with stable identifier scheme
  • Versioning: Work / Expression / Manifestation + content-hash diffs
  • ML pipeline: llama-3.3-70b teacher labels → ModernBERT distillation plan
  • Normative dimensions & their limits: directional model estimates, not legal advice
  • How the function classes and normative scores are defined and validated

The underlying data already exists in the engine; this page is being built next.
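
Until the full write-up lands, the versioning item above (Work / Expression / Manifestation plus content-hash diffs) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the Atlas Civica Format or the engine's actual code: the class and field names, the choice of SHA-256, the example identifier, and the sample texts are all hypothetical.

```python
import difflib
import hashlib
from dataclasses import dataclass, field


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 text (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Manifestation:
    """One concrete rendering of an Expression, e.g. a PDF or HTML file."""
    media_type: str
    uri: str


@dataclass
class Expression:
    """One version of a Work's text, identified here by a version label."""
    version_label: str
    text: str
    manifestations: list[Manifestation] = field(default_factory=list)

    @property
    def digest(self) -> str:
        return content_hash(self.text)


@dataclass
class Work:
    """The abstract document that all of its versions belong to."""
    work_id: str  # hypothetical identifier, not the real scheme
    expressions: list[Expression] = field(default_factory=list)


def diff_expressions(old: Expression, new: Expression) -> list[str]:
    """Return unified-diff lines, or [] when the content hashes match."""
    if old.digest == new.digest:
        return []  # identical content: skip the line-level diff entirely
    return list(
        difflib.unified_diff(
            old.text.splitlines(),
            new.text.splitlines(),
            fromfile=old.version_label,
            tofile=new.version_label,
            lineterm="",
        )
    )


# Made-up sample data: two versions of one hypothetical ordinance.
v1 = Expression("2023-01-01", "Section 1. Fees are 10.\nSection 2. Appeals.")
v2 = Expression("2024-01-01", "Section 1. Fees are 12.\nSection 2. Appeals.")
work = Work("example/ordinance-1", [v1, v2])
changes = diff_expressions(v1, v2)
```

In this sketch the hash comparison acts as a cheap equality check, so the line-level diff only runs when two versions actually differ.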