Epistemic model

riverbank's epistemic model goes beyond simple confidence scores. It tracks how much to trust each fact, why, and what's explicitly missing.

The nine epistemic status values

Every compiled fact carries a pgc:epistemicStatus property set to one of nine values. Together they cover the lifecycle from raw observation to validated knowledge:

Status	Meaning	Assigned when
`observed`	Raw observation from source text	Initial extraction before confidence scoring
`extracted`	Machine-extracted with confidence score	After LLM extraction with evidence
`inferred`	Derived via reasoning (OWL, Datalog)	pg-ripple inference engine produces new facts
`verified`	Human-reviewed and accepted	Label Studio reviewer accepts the extraction
`deprecated`	Previously valid, now superseded	Source updated; old fact invalidated
`normative`	Prescribed by policy or standard	Extracted from normative/regulatory text
`predicted`	Model prediction, not yet confirmed	Ensemble prediction below verification threshold
`disputed`	Multiple sources disagree	`explain-conflict` detects contradictions
`speculative`	Hypothetical or uncertain claim	Extracted from hedged language ("might", "could")

How statuses flow

stateDiagram-v2
    [*] --> observed
    observed --> extracted: LLM extraction + evidence
    extracted --> verified: Human review (accept)
    extracted --> disputed: Contradiction detected
    extracted --> deprecated: Source updated
    extracted --> inferred: Reasoning engine
    inferred --> verified: Human confirmation
    predicted --> verified: Confirmed by evidence
    verified --> deprecated: Source retracted
    disputed --> verified: Conflict resolved
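
The edges in the diagram above can be read as a lookup table. The following is a minimal sketch, not riverbank's internal implementation: the names `ALLOWED_TRANSITIONS`, `can_transition`, and `reachable_from` are hypothetical, and the table contains only the edges drawn in the diagram.

```python
# Hypothetical sketch: this table mirrors the edges of the state diagram above.
# Statuses without outgoing edges in the diagram (deprecated, normative,
# speculative) simply have no entry.
ALLOWED_TRANSITIONS = {
    "observed": {"extracted"},
    "extracted": {"verified", "disputed", "deprecated", "inferred"},
    "inferred": {"verified"},
    "predicted": {"verified"},
    "verified": {"deprecated"},
    "disputed": {"verified"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram contains an edge current --> target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def reachable_from(start: str) -> set[str]:
    """Collect every status reachable from `start` by following diagram edges."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        status = stack.pop()
        for nxt in ALLOWED_TRANSITIONS.get(status, set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

For example, `can_transition("extracted", "verified")` is true, while `can_transition("verified", "extracted")` is false because the diagram has no edge in that direction.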

Querying by status

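# Assumes the pgc: prefix is already declared for your endpoint.
# Returns every triple whose subject carries the "verified" status.
SELECT ?s ?p ?o ?status
WHERE {
  ?s ?p ?o .
  ?s pgc:epistemicStatus ?status .
  FILTER(?status = "verified")
}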
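
If the same query is needed for several statuses, it can be generated rather than hand-edited. The sketch below is illustrative only: `status_query` and `EPISTEMIC_STATUSES` are hypothetical names, and the `pgc:` namespace IRI must be passed in by the caller because this page does not state it.

```python
# The nine status values listed in the table on this page.
EPISTEMIC_STATUSES = frozenset({
    "observed", "extracted", "inferred", "verified", "deprecated",
    "normative", "predicted", "disputed", "speculative",
})


def status_query(status: str, pgc_prefix_iri: str) -> str:
    """Build the status query shown above for one status value.

    `pgc_prefix_iri` is supplied by the caller; no default namespace
    is assumed in this sketch.
    """
    # Rejecting unknown values also keeps arbitrary text out of the query.
    if status not in EPISTEMIC_STATUSES:
        raise ValueError(f"unknown epistemic status: {status!r}")
    return (
        f"PREFIX pgc: <{pgc_prefix_iri}>\n"
        "SELECT ?s ?p ?o ?status\n"
        "WHERE {\n"
        "  ?s ?p ?o .\n"
        "  ?s pgc:epistemicStatus ?status .\n"
        f'  FILTER(?status = "{status}")\n'
        "}"
    )
```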

Status in RAG context

When rag_context() formats graph facts into LLM prompts, it includes each fact's epistemic status, so downstream models can weight verified facts more heavily than speculative ones.
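
To make the idea concrete, here is a minimal sketch of status-tagged prompt lines. It does not reproduce the actual output format of rag_context(): the function `format_facts`, the bracketed tag layout, and the ordering in `STATUS_RANK` are all assumptions made for illustration.

```python
# Assumed trust ordering, most trusted first. This ranking is illustrative
# only; the page does not define how statuses should be weighted.
STATUS_RANK = [
    "verified", "normative", "inferred", "extracted", "observed",
    "predicted", "speculative", "disputed", "deprecated",
]


def format_facts(facts: list[tuple[str, str, str, str]]) -> str:
    """Render (subject, predicate, object, status) tuples as prompt lines.

    Lines are ordered by the assumed rank and prefixed with the status.
    An unknown status raises ValueError from list.index().
    """
    ordered = sorted(facts, key=lambda fact: STATUS_RANK.index(fact[3]))
    return "\n".join(f"[{status}] {s} {p} {o}" for s, p, o, status in ordered)
```

Putting the status at the start of each line keeps it visible to the model even when long facts are truncated.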

Negative knowledge

Recording what is explicitly not present is as important as recording what is. pgc:NegativeKnowledge represents a deliberate absence — "we looked for X and confirmed it does not exist in the source."

Why this matters

Without negative knowledge:

- "No error-handling path found" is indistinguishable from "we didn't look"
- A query for error-handling paths returns empty: is that correct or incomplete?

With negative knowledge:

- The absence is recorded explicitly with a reason
- Queries can distinguish "confirmed absent" from "not yet extracted"
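
The difference amounts to a three-way answer instead of a two-way one. The sketch below models it with in-memory sets; `Presence` and `lookup` are hypothetical names, and a real deployment would query the graph instead.

```python
from enum import Enum


class Presence(Enum):
    PRESENT = "present"
    CONFIRMED_ABSENT = "confirmed absent"
    NOT_YET_EXTRACTED = "not yet extracted"


def lookup(subject, predicate, facts, negative_knowledge):
    """Classify a (subject, predicate) pair.

    facts: set of (subject, predicate, object) triples
    negative_knowledge: set of (subject, negated_predicate) pairs
    """
    # A matching positive fact wins over any recorded absence.
    if any(s == subject and p == predicate for s, p, _ in facts):
        return Presence.PRESENT
    # No fact, but the absence was recorded explicitly.
    if (subject, predicate) in negative_knowledge:
        return Presence.CONFIRMED_ABSENT
    # Neither a fact nor a recorded absence: nothing is known yet.
    return Presence.NOT_YET_EXTRACTED
```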

Structure

_:nk1 a pgc:NegativeKnowledge ;
    pgc:aboutSubject <http://example.org/step/3> ;
    pgc:negatedPredicate <http://procedural-knowledge.example/hasErrorHandlingPath> ;
    pgc:reason "No error-handling path was found for this procedure step." ;
    pgc:fromFragment <http://example.org/fragment/runbook-deploy#step3> ;
    pgc:compiledAt "2024-12-01T10:30:00Z"^^xsd:dateTime .

Absence rules in profiles

The absence_rules field in a compiler profile automatically generates negative knowledge when a predicate is expected but not found:

absence_rules:
  - predicate: "http://procedural-knowledge.example/hasErrorHandlingPath"
    summary: "No error-handling path was found for this procedure step."
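
Conceptually, each rule is checked against the predicates that were actually found for a subject. The following sketch shows that check under stated assumptions: `apply_absence_rules` is a hypothetical helper, records are plain dictionaries, and copying the rule's `summary` into the record's `reason` is a guess at how the two fields relate, not documented behavior.

```python
from datetime import datetime, timezone


def apply_absence_rules(subject, found_predicates, absence_rules, fragment):
    """Return one negative-knowledge record per rule whose predicate is missing.

    Keys mirror the pgc:NegativeKnowledge properties shown in the Structure
    example. Mapping `summary` to `reason` is an assumption of this sketch.
    """
    compiled_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            "aboutSubject": subject,
            "negatedPredicate": rule["predicate"],
            "reason": rule["summary"],
            "fromFragment": fragment,
            "compiledAt": compiled_at,
        }
        for rule in absence_rules
        # Only expected-but-missing predicates produce a record.
        if rule["predicate"] not in found_predicates
    ]
```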

Argument graphs

pgc:ArgumentRecord captures structured arguments for or against a claim:

_:arg1 a pgc:ArgumentRecord ;
    pgc:claim <http://example.org/fact/acme-founded-1995> ;
    pgc:evidence <http://example.org/fragment/history#p3> ;
    pgc:objection "Company registry shows 1997" ;
    pgc:rebuttal "Registry date is incorporation, not founding" ;
    pgc:strength 0.75 .

Structure

Property	Purpose
`pgc:claim`	The fact being argued about
`pgc:evidence`	Fragment supporting the claim
`pgc:objection`	Counter-argument text
`pgc:rebuttal`	Response to the objection
`pgc:strength`	Argument strength, a number in the range `[0.0, 1.0]`

Argument graphs enable the riverbank explain-conflict command to show not just which facts contradict each other, but why they conflict and how the contradiction might be resolved.
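
As an illustration of how such records might be consumed, the sketch below models the five properties as a dataclass and lists the arguments about one claim, strongest first. It does not describe the real output of explain-conflict: the class layout and the `explain` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgumentRecord:
    claim: str
    evidence: str
    objection: str
    rebuttal: str
    strength: float

    def __post_init__(self):
        # Enforce the documented [0.0, 1.0] range for pgc:strength.
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must lie in [0.0, 1.0]")


def explain(claim: str, records: list[ArgumentRecord]) -> list[str]:
    """Render the arguments about `claim`, strongest first."""
    relevant = sorted(
        (r for r in records if r.claim == claim),
        key=lambda r: r.strength,
        reverse=True,
    )
    return [
        f"objection: {r.objection} | rebuttal: {r.rebuttal} | strength: {r.strength:.2f}"
        for r in relevant
    ]
```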

Assumption registry

pgc:AssumptionRecord records explicit assumptions made during compilation:

_:asn1 a pgc:AssumptionRecord ;
    pgc:assumptionText "All dates in source are UTC" ;
    pgc:scope <http://example.org/source/timezones.md> ;
    pgc:impact "Date comparisons may be incorrect if local times are used" ;
    pgc:status "active" .

Assumptions are tracked so that when an assumption is invalidated (e.g., you discover dates are actually local time), you can find all facts that depended on it.
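
A minimal sketch of that lookup follows. It assumes facts are linked to an assumption by sharing the source named in `pgc:scope`; the page does not say how the dependency is actually recorded, so treat the matching rule, the function names, and the `"invalidated"` status value as hypothetical.

```python
def facts_affected_by(assumption, facts):
    """Return facts whose source matches the assumption's scope.

    Matching on the source IRI is an assumption of this sketch.
    """
    return [fact for fact in facts if fact["source"] == assumption["scope"]]


def invalidate(assumption, facts):
    """Mark the assumption invalidated and return the facts to re-check.

    The "invalidated" status string is hypothetical; only "active" appears
    in the example above.
    """
    updated = {**assumption, "status": "invalidated"}
    return updated, facts_affected_by(assumption, facts)
```

Returning a new dictionary instead of mutating the input keeps the original record available for audit.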

Coverage maps

pgc:CoverageMap tracks which concepts have been extracted and which remain unaddressed:

- Compiled concepts with high confidence → covered
- Compiled concepts with low confidence → partially covered
- Concepts mentioned in competency questions but absent from the graph → uncovered
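
The three rules above can be sketched as a single classification function. The cut-off between "high" and "low" confidence is not given on this page, so the `threshold` default of 0.8 is an arbitrary placeholder, and `coverage_level` is a hypothetical name.

```python
def coverage_level(concept, compiled_confidence, threshold=0.8):
    """Classify one concept named in a competency question.

    compiled_confidence: dict mapping compiled concepts to their confidence.
    threshold: assumed cut-off between high and low confidence.
    """
    # Mentioned in a competency question but absent from the graph.
    if concept not in compiled_confidence:
        return "uncovered"
    if compiled_confidence[concept] >= threshold:
        return "covered"
    return "partially covered"
```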

To query coverage levels:

SELECT ?concept ?coverage
WHERE {
  ?cm a pgc:CoverageMap ;
      pgc:concept ?concept ;
      pgc:coverageLevel ?coverage .
}