Canonical characteristic value

canonical_characteristics.value is the jsonb column that stores the value for a (canonical product, characteristic definition) pair. The iter-1 value shape is deliberately minimal; Phase 2 LLM enrichment will extend it with richer provenance.

Schema (iter-1)

{
  "raw": <any JSON-compatible primitive or object>,
  "unit": "mm" | null,
  "origin": "manual" | "etm-auto" | "llm-enrichment"
}
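  • raw: supplier-native or operator-entered value. May be a string, number, boolean, or nested object.
  • unit: optional short code matching units.code; null when the value has no unit.
  • origin: string tag identifying the data source. The field is free-form; "manual", "etm-auto", and "llm-enrichment" are the values listed in the schema above.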

There is no explicit null handling: a missing value is represented by the absence of a canonical_characteristics row, not by a row whose raw is null.

Phase 2 additions (not implemented yet)

  • confidence: float in the range 0-1, the LLM-emitted confidence score.
  • provenance_trail: array holding the chain of enrichment sources.
  • review_status: admin approval flag.

These fields arrive additively; existing iter-1 values stay valid.

Insert path

Ingestion writes via catalog/canonical/app.CanonicalService.AssignCharacteristic (the admin path) or, in Phase 2, via normalization → chardict resolution → canonical characteristic upsert. Iter-1 normalization emits CharacteristicRaw values only; mapping them to a canonical characteristic_id is deferred to Phase 2 LLM enrichment.

References

  • ADR-0031 microkernel sub-modules
  • ADR-0003 event-sourced canonical products (deferred to Phase 2)
  • ADR-0034 canonical as read-model (iter-1 deviation)