Feature

Drill down to every verse with surgical precision.

The Bible Bench Explorer provides a hierarchical drill-down interface that takes you from a whole-Bible fidelity summary all the way to the individual character differences in a single verse — with diff visualization and normalization previews at every step.

Forensic visibility into model outputs

The Explorer is built for researchers who need to understand not just whether a model scored well, but exactly why it did or didn't.

Hierarchical Drill-Down

Navigate the full scripture hierarchy — Bible → Book → Chapter → Verse — with fidelity scores aggregated at every level so you can zero in on problem areas fast.
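
As a rough illustration of how scores might roll up the hierarchy, the sketch below averages per-verse scores into chapter, book, and whole-Bible figures. The data layout, the `aggregate` function, and the choice of an unweighted mean over verses are all assumptions made for the example; the Explorer's actual aggregation rule is not specified here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-verse fidelity scores keyed by (book, chapter, verse).
verse_scores = {
    ("Genesis", 1, 1): 100.0,
    ("Genesis", 1, 2): 80.0,
    ("Genesis", 2, 1): 60.0,
    ("John", 3, 16): 90.0,
}

def aggregate(scores):
    """Roll verse scores up to chapter, book, and whole-Bible means.

    Assumption: every verse counts equally at every level.
    """
    by_chapter = defaultdict(list)
    by_book = defaultdict(list)
    for (book, chapter, _verse), score in scores.items():
        by_chapter[(book, chapter)].append(score)
        by_book[book].append(score)
    return {
        "chapters": {key: mean(vals) for key, vals in by_chapter.items()},
        "books": {key: mean(vals) for key, vals in by_book.items()},
        "bible": mean(list(scores.values())),
    }

summary = aggregate(verse_scores)
```

With the sample data, Genesis 2 has the lowest chapter mean, which is the kind of signal that would prompt drilling down one more level.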

Verse-Level Comparison

View the canonical verse text and the model's response side by side, with character-level diff highlighting that marks every substitution, omission, and addition.
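
One way to produce this kind of character-level diff is Python's standard `difflib.SequenceMatcher`. The sketch below is illustrative only: the `char_diff` helper and its label names are hypothetical, and the Explorer may use a different diff algorithm.

```python
from difflib import SequenceMatcher

def char_diff(canonical, response):
    """List each difference as (kind, canonical_span, response_span)."""
    # autojunk=False keeps the matcher from ignoring frequent characters.
    matcher = SequenceMatcher(None, canonical, response, autojunk=False)
    labels = {
        "replace": "substitution",
        "delete": "omission",
        "insert": "addition",
    }
    return [
        (labels[tag], canonical[i1:i2], response[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# A model that drops one letter from "beginning" yields a single omission.
differences = char_diff("In the beginning", "In the begining")
```

A viewer could then map each entry to a highlighted span on the matching side of the comparison.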

Fidelity Score Visualization

Each verse, chapter, and book carries a computed fidelity score (0–100). Heatmaps and sorted lists surface the lowest-scoring verses across any model or campaign.
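
The scoring formula is not defined here, so the following sketch substitutes a stand-in metric: the `difflib` similarity ratio scaled to 0–100. The `fidelity_score` and `lowest_scoring` helpers are hypothetical names. The sketch is meant only to show how a sorted list of the weakest verses could be derived once per-verse scores exist.

```python
from difflib import SequenceMatcher

def fidelity_score(canonical, response):
    """Stand-in metric (an assumption): similarity ratio scaled to 0-100."""
    ratio = SequenceMatcher(None, canonical, response, autojunk=False).ratio()
    return round(100 * ratio, 1)

def lowest_scoring(verses, limit=10):
    """Rank (reference, canonical, response) triples, worst score first."""
    scored = [(ref, fidelity_score(canon, resp)) for ref, canon, resp in verses]
    return sorted(scored, key=lambda item: item[1])[:limit]

# Hypothetical sample: one exact recitation and one truncated one.
verses = [
    ("John 11:35", "Jesus wept.", "Jesus wept."),
    ("Genesis 1:3",
     "And God said, Let there be light: and there was light.",
     "And God said let there be light"),
]
ranked = lowest_scoring(verses)
```

Here the truncated Genesis 1:3 response sorts ahead of the exact John 11:35 response, so it would appear first in a lowest-scoring list.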

Raw vs. Processed Response

Toggle between the raw LLM response and the processed (normalized) form that was actually used for scoring — so you can audit exactly how normalization affects results.
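
Keeping both forms of a response side by side is what makes this audit possible. The sketch below shows one plausible shape for such a record. The `ResponseRecord` type and the specific normalization steps (Unicode NFKC, case folding, punctuation removal, whitespace collapsing) are assumptions for illustration, not the pipeline the Explorer actually runs.

```python
import re
import unicodedata
from dataclasses import dataclass

@dataclass(frozen=True)
class ResponseRecord:
    raw: str        # text exactly as the model returned it
    processed: str  # normalized text that scoring would use

def normalize(text):
    """Assumed normalization steps; the real pipeline may differ."""
    text = unicodedata.normalize("NFKC", text)  # unify compatibility forms
    text = text.casefold()                      # ignore letter case
    text = re.sub(r"[^\w\s]", "", text)         # drop punctuation
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

def make_record(raw):
    """Store the untouched response next to its normalized form."""
    return ResponseRecord(raw=raw, processed=normalize(raw))

record = make_record("  Jesus   wept. ")
```

Because `raw` is never overwritten, a reviewer can always compare what the model produced with what was scored.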

Client-Side Normalization Preview

Experiment with text normalization transforms in real time, previewing how different processing strategies would affect fidelity scores without persisting any changes.
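
The idea of a non-persistent preview can be sketched as a pure function that applies a chosen set of transforms and returns a score without writing anything back. The transform names, the `preview` function, and the similarity-ratio metric below are all hypothetical. An in-browser implementation would be written in a client-side language, but the logic would be similar.

```python
import re
from difflib import SequenceMatcher

# Hypothetical named transforms a user could toggle in the preview.
TRANSFORMS = {
    "lowercase": str.lower,
    "strip_punctuation": lambda s: re.sub(r"[^\w\s]", "", s),
    "collapse_whitespace": lambda s: " ".join(s.split()),
}

def preview(canonical, response, enabled):
    """Score a pair under the enabled transforms; nothing is stored."""
    for name in enabled:
        transform = TRANSFORMS[name]
        canonical, response = transform(canonical), transform(response)
    # Stand-in metric (an assumption): similarity ratio scaled to 0-100.
    ratio = SequenceMatcher(None, canonical, response, autojunk=False).ratio()
    return round(100 * ratio, 1)

canonical = "Jesus wept."
response = "jesus  wept"
score_raw = preview(canonical, response, [])
score_all = preview(canonical, response, list(TRANSFORMS))
```

Comparing `score_raw` with `score_all` shows how much of the gap comes from case, punctuation, and spacing rather than from the wording itself.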