Live calibration
Running accuracy and calibration on real submitted claims, as labeled by Veridi administrators. See also: baseline calibration on the 100-claim validation set.
User perception (submitter feedback) — Praxis
Submitter feedback on expectation match, reasoning, and evidence. These ratings reflect perception, not ground truth, and diverge from admin judgment in informative ways.
1/3 Praxis claims (33%) with submitter feedback
1 reasoning rating
1 evidence rating
Expectation match
| Response | Count | Share |
|---|---|---|
| Lower than my expectation | 0 | 0% |
| Matched my expectation | 0 | 0% |
| Higher than my expectation | 0 | 0% |
| I had no prior expectation | 1 | 100% |
Reasoning rating distribution
| Rating | Count | Share |
|---|---|---|
| 1 | 0 | 0% |
| 2 | 0 | 0% |
| 3 | 0 | 0% |
| 4 | 0 | 0% |
| 5 | 0 | 0% |
| 6 | 0 | 0% |
| 7 | 0 | 0% |
| 8 | 0 | 0% |
| 9 | 0 | 0% |
| 10 | 1 | 100% |
Evidence rating distribution
| Rating | Count | Share |
|---|---|---|
| 1 | 0 | 0% |
| 2 | 0 | 0% |
| 3 | 0 | 0% |
| 4 | 0 | 0% |
| 5 | 0 | 0% |
| 6 | 0 | 0% |
| 7 | 0 | 0% |
| 8 | 0 | 0% |
| 9 | 1 | 100% |
| 10 | 0 | 0% |
Calibration feedback loop
Brier-lite scoring on outcomes from the last 30 / 60 / 90 days. Predicted is the system's confidence at recommendation time; actual is the realized outcome (per the methodology's outcome → ground-truth map). Lower Brier = better-calibrated predictions.
Calibration loop not yet running for this methodology — the 90-day window has fewer than 5 resolvable outcomes.
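As a rough illustration of the scoring described above, here is a minimal sketch that assumes the standard Brier score (mean squared difference between predicted confidence and a binary outcome). How the "Brier-lite" variant differs is not stated here, so the formula is an assumption, as are the `Outcome` record, its field names, and the `brier_score` helper. The 5-outcome minimum and the 30 / 60 / 90 day windows come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

MIN_OUTCOMES = 5  # minimum resolvable outcomes per window, as stated above


@dataclass
class Outcome:
    """Hypothetical record; field names are illustrative only."""
    resolved_on: date
    predicted: float  # system confidence at recommendation time, assumed in [0, 1]
    actual: int       # assumed 1 or 0 after the outcome -> ground-truth mapping


def brier_score(outcomes: List[Outcome], as_of: date, window_days: int) -> Optional[float]:
    """Mean squared error of predicted vs. actual over the trailing window.

    Returns None when the window holds fewer than MIN_OUTCOMES outcomes,
    mirroring the "not yet running" state.
    """
    cutoff = as_of - timedelta(days=window_days)
    recent = [o for o in outcomes if cutoff <= o.resolved_on <= as_of]
    if len(recent) < MIN_OUTCOMES:
        return None
    return sum((o.predicted - o.actual) ** 2 for o in recent) / len(recent)


def brier_by_window(outcomes: List[Outcome], as_of: date) -> dict:
    """Score each of the 30 / 60 / 90 day windows independently."""
    return {days: brier_score(outcomes, as_of, days) for days in (30, 60, 90)}
```

Under these assumptions, a score of 0 means every prediction was fully confident and correct, while always predicting 0.5 yields 0.25 regardless of outcomes.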
Outcome submissions
User-reported outcomes for Praxis action plans: was the action taken and did it sustain?
At least 5 outcomes are needed for the distribution to be meaningful.
Not enough outcomes yet. Check back as more users opt in to outcome tracking and report back at the scheduled intervals.