
OpenGuard

A guardrail framework for safe and secure interaction with Large Language Models


OpenGuard is a guardrail framework that enforces user-defined rules to ensure safe and secure interactions with Large Language Models (LLMs). OpenGuard detects factual hallucinations, i.e., instances where generated outputs deviate from factual truth.

The OpenGuard hallucination detection checker integrates Beam Search Sampling (BSS) with Semantic Consistency Analysis to systematically identify hallucinations. BSS generates multiple candidate responses, capturing the model’s confidence distribution across different plausible answers. These responses are then clustered based on semantic similarity, followed by Natural Language Inference (NLI) to evaluate entailment and contradiction relationships.
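The pipeline described above can be illustrated with a minimal sketch. The model names, the similarity threshold, and the clustering call are illustrative assumptions, not the exact OpenGuard implementation:

```python
# Sketch of the BSS + semantic-consistency pipeline: beam-search candidates,
# semantic clustering, then NLI checks between cluster representatives.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# 1. Beam Search Sampling: generate several candidate answers for one prompt.
generator = pipeline("text-generation", model="gpt2")  # placeholder model
candidates = [
    out["generated_text"]
    for out in generator(
        "In which year did the Apollo 11 moon landing take place?",
        num_beams=5,
        num_return_sequences=5,
        max_new_tokens=30,
        do_sample=False,
    )
]

# 2. Semantic clustering: group candidates that express the same meaning.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(candidates, convert_to_tensor=True)
clusters = util.community_detection(embeddings, threshold=0.75, min_community_size=1)

# 3. NLI: test entailment/contradiction between cluster representatives.
nli = pipeline("text-classification", model="roberta-large-mnli")
representatives = [candidates[cluster[0]] for cluster in clusters]
for i, premise in enumerate(representatives):
    for hypothesis in representatives[i + 1:]:
        verdict = nli(f"{premise} </s></s> {hypothesis}")[0]
        print(verdict["label"], round(verdict["score"], 3))
```

Many mutually contradicting clusters suggest the model is not committed to a single factual answer, which is the signal the checker uses.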

To quantify hallucinations, we introduce a scoring mechanism that combines token probabilities with semantic similarity metrics, offering a more precise measure of factual consistency. In cases where Beam Search Sampling (BSS) produces only a single response, we employ a Chain-of-Verification (CoVe) mechanism to enhance self-consistency checks.
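A minimal sketch of such a score is shown below. The exact formula and weighting used by OpenGuard are not reproduced here; the combination of length-normalized sequence probabilities with pairwise semantic agreement, and the weight `alpha`, are illustrative assumptions.

```python
import numpy as np

def hallucination_score(seq_logprobs, similarity_matrix, alpha=0.5):
    """Illustrative score in [0, 1]; higher means more likely hallucination.

    seq_logprobs: per-candidate mean token log-probabilities from beam search.
    similarity_matrix: pairwise cosine similarities of candidate embeddings.
    alpha: assumed weight balancing model confidence against semantic agreement.
    """
    # Model confidence: average probability mass assigned to the candidates.
    confidence = float(np.mean(np.exp(seq_logprobs)))
    # Semantic agreement: mean off-diagonal similarity across candidates.
    n = similarity_matrix.shape[0]
    off_diag = similarity_matrix[~np.eye(n, dtype=bool)]
    agreement = float(off_diag.mean()) if off_diag.size else 1.0
    # Low confidence and low agreement both push the score toward 1.
    return 1.0 - (alpha * confidence + (1.0 - alpha) * agreement)
```

The Chain-of-Verification fallback for single-response cases is not shown; it replaces the cross-candidate agreement term with consistency checks against the model's own answers to verification questions.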

OpenGuard provides a structured and reliable methodology to enhance the trustworthiness of LLM-generated content, making it an essential tool for responsible LLM deployment.

Download

All available downloads can be found under DepAI/OpenGuard.

Documentation

The current documentation is available at DepAI/OpenGuard.


Your contact

Radouane Bouchekir

+49 89 3603522 262
bouchekir@fortiss.org