Findings Ledger Schema (Sprint 120)
Owners: Findings Ledger Guild • Vuln Explorer Guild
Status: Draft schema delivered 2025-11-03 for LEDGER-29-001
1. Storage profile
| Concern | Decision | Notes |
|---|---|---|
| Engine | PostgreSQL 14+ with UTF-8, jsonb, and partitioning support | Aligns with shared data plane; deterministic ordering enforced via primary keys. |
| Tenancy | Range/list partition on tenant_id for ledger + projection tables | Simplifies retention and cross-tenant anchoring. |
| Time zone | All timestamps stored as timestamptz UTC | Canonical JSON uses ISO-8601 (yyyy-MM-ddTHH:mm:ss.fffZ). |
| Hashing | SHA-256 (lower-case hex) over canonical JSON | Implemented client-side and verified by DB constraint. |
| Migrations | SQL files under src/Findings/StellaOps.Findings.Ledger/migrations | Applied via DatabaseMigrator (part of platform toolchain). |
2. Ledger event model
Events are immutable append-only records representing every workflow change. Records capture the original event payload, cryptographic hashes, and actor metadata.
2.1 ledger_events
| Column | Type | Description |
|---|---|---|
tenant_id | text | Tenant partition key. |
chain_id | uuid | Logical chain grouping (per tenant/policy combination). |
sequence_no | bigint | Monotonic sequence within a chain (gapless). |
event_id | uuid | Globally unique event identifier. |
event_type | ledger_event_type | Enumerated type (see §2.2). |
policy_version | text | Policy digest (e.g., SHA-256). |
finding_id | text | Stable finding identity (artifactId + vulnId + policyVersion). |
artifact_id | text | Asset identifier (image digest, SBOM id, etc.). |
source_run_id | uuid | Policy run that produced the event (nullable). |
actor_id | text | Operator/service initiating the mutation. |
actor_type | text | system, operator, integration. |
occurred_at | timestamptz | Domain timestamp supplied by source. |
recorded_at | timestamptz | Ingestion timestamp (defaults to now()). |
event_body | jsonb | Canonical payload (see §2.3). |
event_hash | char(64) | SHA-256 over canonical payload envelope. |
previous_hash | char(64) | Hash of prior event in chain (all zeroes for first). |
merkle_leaf_hash | char(64) | Leaf hash used for Merkle anchoring (hash over `event_hash |
evidence_bundle_ref | text | Optional reference to evaluation/job evidence bundle (DSSE or capsule id). |
Constraints & indexes
PRIMARY KEY (tenant_id, chain_id, sequence_no);
UNIQUE (tenant_id, event_id);
UNIQUE (tenant_id, chain_id, event_hash);
CHECK (event_hash ~ '^[0-9a-f]{64}$');
CHECK (previous_hash ~ '^[0-9a-f]{64}$');
CREATE INDEX ix_ledger_events_finding ON ledger_events (tenant_id, finding_id, policy_version);
CREATE INDEX ix_ledger_events_type ON ledger_events (tenant_id, event_type, recorded_at DESC);
CREATE INDEX ix_ledger_events_finding_evidence_ref ON ledger_events (tenant_id, finding_id, recorded_at DESC) WHERE evidence_bundle_ref IS NOT NULL;
Partitions: top-level partitioned by tenant_id (list) with a default partition. Optional sub-partition by month on recorded_at for large tenants. PostgreSQL requires the partition key in unique constraints; global uniqueness for event_id is enforced as (tenant_id, event_id) with application-level guards maintaining cross-tenant uniqueness.
2.2 Event types
CREATE TYPE ledger_event_type AS ENUM (
'finding.created',
'finding.status_changed',
'finding.severity_changed',
'finding.tag_updated',
'finding.comment_added',
'finding.assignment_changed',
'finding.accepted_risk',
'finding.remediation_plan_added',
'finding.attachment_added',
'finding.closed'
);
Additional types can be appended via migrations; canonical JSON must include event_type key.
2.3 Canonical ledger JSON
Canonical payload envelope (before hashing):
{
"event": {
"id": "3ac1f4ef-3c26-4b0d-91d4-6a6d3a5bde10",
"type": "finding.status_changed",
"tenant": "tenant-a",
"chainId": "5fa2b970-9da2-4ef4-9a63-463c5d98d3cc",
"sequence": 42,
"policyVersion": "sha256:5f38...",
"finding": {
"id": "artifact:sha256:abc|pkg:cpe:/o:vendor:product",
"artifactId": "sha256:abc",
"vulnId": "CVE-2025-1234"
},
"actor": {
"id": "user:alice@tenant",
"type": "operator"
},
"occurredAt": "2025-11-03T15:12:05.123Z",
"payload": {
"previousStatus": "affected",
"status": "triaged",
"justification": "Ticket SEC-1234 created",
"ticket": {
"id": "SEC-1234",
"url": "https://tracker/sec-1234"
}
}
}
}
Canonicalisation rules:
- Serialize using UTF-8, no BOM.
- Sort object keys lexicographically at every level.
- Represent enums/flags as lower-case strings.
- Timestamps formatted as
yyyy-MM-ddTHH:mm:ss.fffZ(millisecond precision, UTC). - Numbers use decimal notation; omit trailing zeros.
- Arrays maintain supplied order.
Hash pipeline:
canonical_json = CanonicalJsonSerializer.Serialize(envelope)
sha256_bytes = SHA256(canonical_json)
event_hash = HexLower(sha256_bytes)
merkle_leaf_hash = HexLower(SHA256(event_hash || '-' || sequence_no)).
3. Merkle anchoring
Anchoring batches events per tenant across fixed windows (default: 1,000 events or 15 minutes). Anchors are stored in ledger_merkle_roots.
| Column | Type | Description |
|---|---|---|
tenant_id | text | Tenant key. |
anchor_id | uuid | Anchor identifier. |
window_start | timestamptz | Inclusive start of batch. |
window_end | timestamptz | Exclusive end. |
sequence_start | bigint | First sequence included. |
sequence_end | bigint | Last sequence included. |
root_hash | char(64) | Merkle root (SHA-256). |
leaf_count | integer | Number of events aggregated. |
anchored_at | timestamptz | Timestamp root stored/signed. |
anchor_reference | text | Optional reference to external ledger (e.g., Rekor UUID). |
Indexes: PRIMARY KEY (tenant_id, anchor_id), UNIQUE (tenant_id, root_hash), INDEX ix_merkle_sequences ON ledger_merkle_roots (tenant_id, sequence_end DESC).
4. Projection tables
4.1 findings_projection
Stores the latest verdict/state per finding.
| Column | Type | Description |
|---|---|---|
tenant_id | text | Partition key. |
finding_id | text | Matches ledger payload. |
policy_version | text | Active policy digest. |
status | text | e.g., affected, triaged, accepted_risk, resolved. |
severity | numeric(6,3) | Normalised severity score (0-10). |
risk_score | numeric(6,3) | Risk scoring result (0-10) from Risk Engine/Policy. |
risk_severity | text | Risk category (e.g., low, medium, high, critical). |
risk_profile_version | text | Risk profile hash/version used for scoring. |
risk_explanation_id | uuid | Reference to risk explanation document. |
labels | jsonb | Key-value metadata (tags, KEV flag, runtime signals). |
current_event_id | uuid | Ledger event that produced this state. |
explain_ref | text | Reference to explain bundle or object storage key. |
policy_rationale | jsonb | Array of policy rationale references (explain bundle IDs, remediation notes). |
updated_at | timestamptz | Last projection update. |
cycle_hash | char(64) | Deterministic hash of projection record (used in export bundles). |
Primary key: (tenant_id, finding_id, policy_version).
Indexes:
ix_projection_statuson(tenant_id, status, severity DESC).ix_projection_riskon(tenant_id, risk_severity, risk_score DESC).ix_projection_labels_ginusinglabelsGIN for KEV/runtime filters.
4.2 finding_history
Delta view derived from ledger events for quick UI queries.
| Column | Type | Description |
|---|---|---|
tenant_id | text | Partition key. |
finding_id | text | Finding identity. |
policy_version | text | Policy digest. |
event_id | uuid | Ledger event ID. |
status | text | Status after event. |
severity | numeric(6,3) | Severity after event (nullable). |
actor_id | text | Actor performing change. |
comment | text | Optional summary/message. |
occurred_at | timestamptz | Domain event timestamp. |
Materialized view or table updated by projector. Indexed by (tenant_id, finding_id, occurred_at DESC).
4.3 triage_actions
Audit table for operator actions needing tailored queries.
| Column | Type | Description |
|---|---|---|
tenant_id | text | Partition key. |
action_id | uuid | Primary key. |
event_id | uuid | Source ledger event. |
finding_id | text | Finding identity. |
action_type | ledger_action_type | e.g., assign, comment, attach_evidence, link_ticket. |
payload | jsonb | Structured action body (canonical stored separately). |
created_at | timestamptz | Timestamp stored. |
created_by | text | Actor ID. |
ledger_action_type enum mirrors CLI/UX operations.
CREATE TYPE ledger_action_type AS ENUM (
'assign',
'comment',
'attach_evidence',
'link_ticket',
'remediation_plan',
'status_change',
'accept_risk',
'reopen',
'close'
);
### 4.4 `ledger_projection_offsets`
Checkpoint store for the projection background worker. Ensures idempotent replays across restarts.
| Column | Type | Description |
|--------|------|-------------|
| `worker_id` | `text` | Logical worker identifier (defaults to `default`). |
| `last_recorded_at` | `timestamptz` | Timestamp of the last projected ledger event. |
| `last_event_id` | `uuid` | Event identifier paired with `last_recorded_at` for deterministic ordering. |
| `updated_at` | `timestamptz` | Last time the checkpoint was persisted. |
Seed row inserted on migration ensures catch-up from epoch (`1970-01-01T00:00:00Z` with empty GUID).
### 4.5 `ledger_attestations`
Deterministic view of DSSE verification results used by `/v1/ledger/attestations`. Rows are written by the provenance/verification pipeline and keyed per tenant.
| Column | Type | Description |
|--------|------|-------------|
| `tenant_id` | `text` | Partition key. |
| `attestation_id` | `uuid` | Primary key within tenant. |
| `artifact_id` | `text` | OCI digest or SBOM identifier verified. |
| `finding_id` | `text` | Optional finding linkage. |
| `verification_status` | `text` | `verified`, `failed`, or `unknown`. |
| `verification_time` | `timestamptz` | When verification completed. |
| `dsse_digest` | `text` | Lower-case SHA-256 of DSSE envelope. |
| `rekor_entry_id` | `text` | Optional transparency log UUID. |
| `evidence_bundle_ref` | `text` | Optional evidence bundle reference. |
| `ledger_event_id` | `uuid` | Ledger event that linked the attestation. |
| `recorded_at` | `timestamptz` | Ingestion timestamp used for paging. |
| `merkle_leaf_hash` | `text` | Leaf hash for anchoring proofs. |
| `root_hash` | `text` | Anchor root hash. |
| `cycle_hash` | `text` | Projection cycle hash for determinism. |
| `projection_version` | `text` | Projection version identifier. |
Ordering and pagination: `ORDER BY recorded_at ASC, attestation_id ASC` with cursor token `{recordedAt, attestationId, filtersHash}`. Indexes: PK `(tenant_id, attestation_id)`, paging index `(tenant_id, recorded_at, attestation_id)`, lookups on `(tenant_id, artifact_id, recorded_at DESC)` and `(tenant_id, verification_status, recorded_at DESC)`.
## 5. Hashing & verification
1. Canonical serialize the envelope (§2.3).
2. Compute `event_hash` and store along with `previous_hash`.
3. Build Merkle tree per anchoring window using leaf hash `SHA256(event_hash || '-' || sequence_no)`.
4. Persist root in `ledger_merkle_roots` and, when configured, submit to external transparency log (Rekor v2). Store receipt/UUID in `anchor_reference`.
5. Projection rows compute `cycle_hash = SHA256(canonical_projection_json)` where canonical projection includes fields `{tenant_id, finding_id, policy_version, status, severity, labels, current_event_id}` with sorted keys.
Verification flow for auditors:
- Fetch event, recompute canonical hash, validate `previous_hash` chain.
- Reconstruct Merkle path from stored leaf hash; verify matches recorded root.
- Cross-check projection `cycle_hash` matches ledger state derived from last event.
## 6. Fixtures & migrations
- Initial migration script: `src/Findings/StellaOps.Findings.Ledger/migrations/001_initial.sql`.
- Sample canonical event: `seed-data/findings-ledger/fixtures/ledger-event.sample.json` (includes pre-computed `eventHash`, `previousHash`, and `merkleLeafHash` values).
- Sample projection row: `seed-data/findings-ledger/fixtures/finding-projection.sample.json` (includes canonical `cycleHash` for replay validation).
Fixtures follow canonical key ordering and include precomputed hashes to validate tooling.
## 7. Projection worker
- `LedgerProjectionWorker` consumes ledger events via `PostgresLedgerEventStream`, applying deterministic reductions with `LedgerProjectionReducer`.
- Checkpoint state is stored in `ledger_projection_offsets`, allowing replay from any point in time.
- Batch processing is configurable via `findings:ledger:projection` (`batchSize`, `idleDelay`).
- Each event writes:
- `findings_projection` (upserted current state with `cycle_hash`).
- `finding_history` (timeline entry keyed by event ID).
- `triage_actions` when applicable (status change, comment, assignment, remediation, attachment, accept-risk, close).
## 8. Next steps
- Integrate Policy Engine batch evaluation with the projector (`LEDGER-29-004`).
- Align Vulnerability Explorer queries with the new projection state and timeline endpoints.
- Externalise Merkle anchor publishing to transparency log once anchoring cadence is finalised.
| | | Array of policy rationale references (explain bundle IDs, remediation notes). |