Findings Ledger Schema (Sprint 120)

Owners: Findings Ledger Guild • Vuln Explorer Guild
Status: Draft schema delivered 2025-11-03 for LEDGER-29-001

1. Storage profile

ConcernDecisionNotes
EnginePostgreSQL 14+ with UTF-8, jsonb, and partitioning supportAligns with shared data plane; deterministic ordering enforced via primary keys.
TenancyRange/list partition on tenant_id for ledger + projection tablesSimplifies retention and cross-tenant anchoring.
Time zoneAll timestamps stored as timestamptz UTCCanonical JSON uses ISO-8601 (yyyy-MM-ddTHH:mm:ss.fffZ).
HashingSHA-256 (lower-case hex) over canonical JSONImplemented client-side and verified by DB constraint.
MigrationsSQL files under src/Findings/StellaOps.Findings.Ledger/migrationsApplied via DatabaseMigrator (part of platform toolchain).

2. Ledger event model

Events are immutable append-only records representing every workflow change. Records capture the original event payload, cryptographic hashes, and actor metadata.

2.1 ledger_events

ColumnTypeDescription
tenant_idtextTenant partition key.
chain_iduuidLogical chain grouping (per tenant/policy combination).
sequence_nobigintMonotonic sequence within a chain (gapless).
event_iduuidGlobally unique event identifier.
event_typeledger_event_typeEnumerated type (see §2.2).
policy_versiontextPolicy digest (e.g., SHA-256).
finding_idtextStable finding identity (artifactId + vulnId + policyVersion).
artifact_idtextAsset identifier (image digest, SBOM id, etc.).
source_run_iduuidPolicy run that produced the event (nullable).
actor_idtextOperator/service initiating the mutation.
actor_typetextsystem, operator, integration.
occurred_attimestamptzDomain timestamp supplied by source.
recorded_attimestamptzIngestion timestamp (defaults to now()).
event_bodyjsonbCanonical payload (see §2.3).
event_hashchar(64)SHA-256 over canonical payload envelope.
previous_hashchar(64)Hash of prior event in chain (all zeroes for first).
merkle_leaf_hashchar(64)Leaf hash used for Merkle anchoring (hash over `event_hash
evidence_bundle_reftextOptional reference to evaluation/job evidence bundle (DSSE or capsule id).

Constraints & indexes

PRIMARY KEY (tenant_id, chain_id, sequence_no);
UNIQUE (tenant_id, event_id);
UNIQUE (tenant_id, chain_id, event_hash);
CHECK (event_hash ~ '^[0-9a-f]{64}$');
CHECK (previous_hash ~ '^[0-9a-f]{64}$');
CREATE INDEX ix_ledger_events_finding ON ledger_events (tenant_id, finding_id, policy_version);
CREATE INDEX ix_ledger_events_type ON ledger_events (tenant_id, event_type, recorded_at DESC);
CREATE INDEX ix_ledger_events_finding_evidence_ref ON ledger_events (tenant_id, finding_id, recorded_at DESC) WHERE evidence_bundle_ref IS NOT NULL;

Partitions: top-level partitioned by tenant_id (list) with a default partition. Optional sub-partition by month on recorded_at for large tenants. PostgreSQL requires the partition key in unique constraints; global uniqueness for event_id is enforced as (tenant_id, event_id) with application-level guards maintaining cross-tenant uniqueness.

2.2 Event types

CREATE TYPE ledger_event_type AS ENUM (
  'finding.created',
  'finding.status_changed',
  'finding.severity_changed',
  'finding.tag_updated',
  'finding.comment_added',
  'finding.assignment_changed',
  'finding.accepted_risk',
  'finding.remediation_plan_added',
  'finding.attachment_added',
  'finding.closed'
);

Additional types can be appended via migrations; canonical JSON must include event_type key.

2.3 Canonical ledger JSON

Canonical payload envelope (before hashing):

{
  "event": {
    "id": "3ac1f4ef-3c26-4b0d-91d4-6a6d3a5bde10",
    "type": "finding.status_changed",
    "tenant": "tenant-a",
    "chainId": "5fa2b970-9da2-4ef4-9a63-463c5d98d3cc",
    "sequence": 42,
    "policyVersion": "sha256:5f38...",
    "finding": {
      "id": "artifact:sha256:abc|pkg:cpe:/o:vendor:product",
      "artifactId": "sha256:abc",
      "vulnId": "CVE-2025-1234"
    },
    "actor": {
      "id": "user:alice@tenant",
      "type": "operator"
    },
    "occurredAt": "2025-11-03T15:12:05.123Z",
    "payload": {
      "previousStatus": "affected",
      "status": "triaged",
      "justification": "Ticket SEC-1234 created",
      "ticket": {
        "id": "SEC-1234",
        "url": "https://tracker/sec-1234"
      }
    }
  }
}

Canonicalisation rules:

  1. Serialize using UTF-8, no BOM.
  2. Sort object keys lexicographically at every level.
  3. Represent enums/flags as lower-case strings.
  4. Timestamps formatted as yyyy-MM-ddTHH:mm:ss.fffZ (millisecond precision, UTC).
  5. Numbers use decimal notation; omit trailing zeros.
  6. Arrays maintain supplied order.

Hash pipeline:

canonical_json = CanonicalJsonSerializer.Serialize(envelope)
sha256_bytes = SHA256(canonical_json)
event_hash = HexLower(sha256_bytes)

merkle_leaf_hash = HexLower(SHA256(event_hash || '-' || sequence_no)).

3. Merkle anchoring

Anchoring batches events per tenant across fixed windows (default: 1,000 events or 15 minutes). Anchors are stored in ledger_merkle_roots.

ColumnTypeDescription
tenant_idtextTenant key.
anchor_iduuidAnchor identifier.
window_starttimestamptzInclusive start of batch.
window_endtimestamptzExclusive end.
sequence_startbigintFirst sequence included.
sequence_endbigintLast sequence included.
root_hashchar(64)Merkle root (SHA-256).
leaf_countintegerNumber of events aggregated.
anchored_attimestamptzTimestamp root stored/signed.
anchor_referencetextOptional reference to external ledger (e.g., Rekor UUID).

Indexes: PRIMARY KEY (tenant_id, anchor_id), UNIQUE (tenant_id, root_hash), INDEX ix_merkle_sequences ON ledger_merkle_roots (tenant_id, sequence_end DESC).

4. Projection tables

4.1 findings_projection

Stores the latest verdict/state per finding.

ColumnTypeDescription
tenant_idtextPartition key.
finding_idtextMatches ledger payload.
policy_versiontextActive policy digest.
statustexte.g., affected, triaged, accepted_risk, resolved.
severitynumeric(6,3)Normalised severity score (0-10).
risk_scorenumeric(6,3)Risk scoring result (0-10) from Risk Engine/Policy.
risk_severitytextRisk category (e.g., low, medium, high, critical).
risk_profile_versiontextRisk profile hash/version used for scoring.
risk_explanation_iduuidReference to risk explanation document.
labelsjsonbKey-value metadata (tags, KEV flag, runtime signals).
current_event_iduuidLedger event that produced this state.
explain_reftextReference to explain bundle or object storage key.
policy_rationalejsonbArray of policy rationale references (explain bundle IDs, remediation notes).
updated_attimestamptzLast projection update.
cycle_hashchar(64)Deterministic hash of projection record (used in export bundles).

Primary key: (tenant_id, finding_id, policy_version).

Indexes:

  • ix_projection_status on (tenant_id, status, severity DESC).
  • ix_projection_risk on (tenant_id, risk_severity, risk_score DESC).
  • ix_projection_labels_gin using labels GIN for KEV/runtime filters.

4.2 finding_history

Delta view derived from ledger events for quick UI queries.

ColumnTypeDescription
tenant_idtextPartition key.
finding_idtextFinding identity.
policy_versiontextPolicy digest.
event_iduuidLedger event ID.
statustextStatus after event.
severitynumeric(6,3)Severity after event (nullable).
actor_idtextActor performing change.
commenttextOptional summary/message.
occurred_attimestamptzDomain event timestamp.

Materialized view or table updated by projector. Indexed by (tenant_id, finding_id, occurred_at DESC).

4.3 triage_actions

Audit table for operator actions needing tailored queries.

ColumnTypeDescription
tenant_idtextPartition key.
action_iduuidPrimary key.
event_iduuidSource ledger event.
finding_idtextFinding identity.
action_typeledger_action_typee.g., assign, comment, attach_evidence, link_ticket.
payloadjsonbStructured action body (canonical stored separately).
created_attimestamptzTimestamp stored.
created_bytextActor ID.

ledger_action_type enum mirrors CLI/UX operations.

CREATE TYPE ledger_action_type AS ENUM (
  'assign',
  'comment',
  'attach_evidence',
  'link_ticket',
  'remediation_plan',
  'status_change',
  'accept_risk',
  'reopen',
  'close'
);

### 4.4 `ledger_projection_offsets`

Checkpoint store for the projection background worker. Ensures idempotent replays across restarts.

| Column | Type | Description |
|--------|------|-------------|
| `worker_id` | `text` | Logical worker identifier (defaults to `default`). |
| `last_recorded_at` | `timestamptz` | Timestamp of the last projected ledger event. |
| `last_event_id` | `uuid` | Event identifier paired with `last_recorded_at` for deterministic ordering. |
| `updated_at` | `timestamptz` | Last time the checkpoint was persisted. |

Seed row inserted on migration ensures catch-up from epoch (`1970-01-01T00:00:00Z` with empty GUID).

### 4.5 `ledger_attestations`

Deterministic view of DSSE verification results used by `/v1/ledger/attestations`. Rows are written by the provenance/verification pipeline and keyed per tenant.

| Column | Type | Description |
|--------|------|-------------|
| `tenant_id` | `text` | Partition key. |
| `attestation_id` | `uuid` | Primary key within tenant. |
| `artifact_id` | `text` | OCI digest or SBOM identifier verified. |
| `finding_id` | `text` | Optional finding linkage. |
| `verification_status` | `text` | `verified`, `failed`, or `unknown`. |
| `verification_time` | `timestamptz` | When verification completed. |
| `dsse_digest` | `text` | Lower-case SHA-256 of DSSE envelope. |
| `rekor_entry_id` | `text` | Optional transparency log UUID. |
| `evidence_bundle_ref` | `text` | Optional evidence bundle reference. |
| `ledger_event_id` | `uuid` | Ledger event that linked the attestation. |
| `recorded_at` | `timestamptz` | Ingestion timestamp used for paging. |
| `merkle_leaf_hash` | `text` | Leaf hash for anchoring proofs. |
| `root_hash` | `text` | Anchor root hash. |
| `cycle_hash` | `text` | Projection cycle hash for determinism. |
| `projection_version` | `text` | Projection version identifier. |

Ordering and pagination: `ORDER BY recorded_at ASC, attestation_id ASC` with cursor token `{recordedAt, attestationId, filtersHash}`. Indexes: PK `(tenant_id, attestation_id)`, paging index `(tenant_id, recorded_at, attestation_id)`, lookups on `(tenant_id, artifact_id, recorded_at DESC)` and `(tenant_id, verification_status, recorded_at DESC)`.

## 5. Hashing & verification

1. Canonical serialize the envelope (§2.3).
2. Compute `event_hash` and store along with `previous_hash`.
3. Build Merkle tree per anchoring window using leaf hash `SHA256(event_hash || '-' || sequence_no)`.
4. Persist root in `ledger_merkle_roots` and, when configured, submit to external transparency log (Rekor v2). Store receipt/UUID in `anchor_reference`.
5. Projection rows compute `cycle_hash = SHA256(canonical_projection_json)` where canonical projection includes fields `{tenant_id, finding_id, policy_version, status, severity, labels, current_event_id}` with sorted keys.

Verification flow for auditors:

- Fetch event, recompute canonical hash, validate `previous_hash` chain.
- Reconstruct Merkle path from stored leaf hash; verify matches recorded root.
- Cross-check projection `cycle_hash` matches ledger state derived from last event.

## 6. Fixtures & migrations

- Initial migration script: `src/Findings/StellaOps.Findings.Ledger/migrations/001_initial.sql`.
- Sample canonical event: `seed-data/findings-ledger/fixtures/ledger-event.sample.json` (includes pre-computed `eventHash`, `previousHash`, and `merkleLeafHash` values).
- Sample projection row: `seed-data/findings-ledger/fixtures/finding-projection.sample.json` (includes canonical `cycleHash` for replay validation).

Fixtures follow canonical key ordering and include precomputed hashes to validate tooling.

## 7. Projection worker

- `LedgerProjectionWorker` consumes ledger events via `PostgresLedgerEventStream`, applying deterministic reductions with `LedgerProjectionReducer`.
- Checkpoint state is stored in `ledger_projection_offsets`, allowing replay from any point in time.
- Batch processing is configurable via `findings:ledger:projection` (`batchSize`, `idleDelay`).
- Each event writes:
  - `findings_projection` (upserted current state with `cycle_hash`).
  - `finding_history` (timeline entry keyed by event ID).
  - `triage_actions` when applicable (status change, comment, assignment, remediation, attachment, accept-risk, close).

## 8. Next steps

- Integrate Policy Engine batch evaluation with the projector (`LEDGER-29-004`).
- Align Vulnerability Explorer queries with the new projection state and timeline endpoints.
- Externalise Merkle anchor publishing to transparency log once anchoring cadence is finalised.
|  |  | Array of policy rationale references (explain bundle IDs, remediation notes). |