LLM Guardrails Tutorial: Measuring and Controlling Hallucination and Verbosity with Textstat + LangChain

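Core Concept: The Complexity Budget
1. Installation
2. API Token Setup (Google Colab)
3. Setting Up the LangChain Pipeline
4. Implementing the Guardrail Function
5. Example Run
Interpreting the Results and Limitations
Possible Extensions
References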

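Verbosity and hallucination are two core problems that LLM guardrails try to address. This tutorial uses the Textstat library to put a number on verbosity through a readability score, then builds a pipeline that automatically triggers a LangChain re-prompt whenever a response exceeds a complexity budget. Readability is treated here as a proxy signal; Textstat does not detect hallucinations directly. The example code runs as-is in Google Colab.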

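Core Concept: The Complexity Budget

ARI (Automated Readability Index) estimates the US grade level needed to understand a text, based on characters per word and words per sentence. For example, an ARI of 10.0 corresponds to roughly US 10th grade.

This score serves as the "complexity budget". If an LLM response goes over budget, the pipeline re-prompts the model to rewrite it more concisely. The working hypothesis is that cutting verbosity keeps the model focused on the key facts, which in turn lowers the risk of hallucination. This is a hypothesis, not something the ARI score itself verifies.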
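To make the ARI described above concrete, here is a minimal sketch of the standard formula, 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43. The function name ari_estimate and its simple regex tokenization are assumptions for illustration; Textstat uses its own counting rules, so its scores will not match this sketch exactly.

```python
import re


def ari_estimate(text: str) -> float:
    """Rough ARI: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not words:
        return 0.0
    # Count letters and digits only, as the ARI definition does.
    chars = sum(ch.isalnum() for word in words for ch in word)
    # Each run of ., ! or ? ends a sentence; text without any counts as one.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / sentences) - 21.43


simple = "The cat sat on the mat. It was warm."
dense = "Circumlocution frequently obfuscates foundational semantic payloads."

# Longer words and longer sentences both push the score up.
print(ari_estimate(simple) < ari_estimate(dense))
```

Short words in short sentences can even produce a negative score, which is why the budget is best read as a relative threshold rather than a literal school grade.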

1. Installation

pip install textstat transformers torch langchain-core langchain_huggingface langchain_community

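2. API Token Setup (Google Colab)

A Hugging Face API token is required for gated models and hosted inference. The distilgpt2 model used below is public and runs locally, so this step is optional for the example itself. Tokens can be created for free at huggingface.co/settings/tokens. In Colab, open the "Secrets" icon in the left sidebar and store the token under the name HF_TOKEN.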

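from google.colab import userdata

try:
    HF_TOKEN = userdata.get('HF_TOKEN')
except (userdata.SecretNotFoundError, userdata.NotebookAccessError):
    # userdata.get raises if the secret is missing or the notebook lacks access
    HF_TOKEN = None

if not HF_TOKEN:
    print("WARNING: HF_TOKEN not found.")
else:
    print("Hugging Face token loaded.")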

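3. Setting Up the LangChain Pipeline

The LangChain pipeline is built on distilgpt2, a lightweight model that runs locally. If you need stronger summarization, you can switch to google/flan-t5-small. Note that FLAN-T5 is a sequence-to-sequence model, so it loads through AutoModelForSeq2SeqLM with a matching pipeline task rather than the causal-LM setup shown here, and the larger FLAN-T5 variants need more resources.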

import textstat
import torch
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    return_full_text=False,  # score only the generated text, not the echoed prompt
    device=0 if torch.cuda.is_available() else -1,  # first GPU if present, else CPU
)

# Wrap the transformers pipeline so it can be composed into LangChain chains
llm = HuggingFacePipeline(pipeline=pipe)

4. Implementing the Guardrail Function

The safe_summarize function measures the ARI score of a summary and re-prompts once if the score exceeds the complexity budget.

def safe_summarize(text_input, complexity_budget=10.0):
    """Summarize text_input, re-prompting once if the ARI exceeds the budget."""
    # Step 1: generate the initial summary
    base_prompt = PromptTemplate.from_template(
        "Provide a comprehensive summary of the following: {text}"
    )
    chain = base_prompt | llm
    summary = chain.invoke({"text": text_input})

    # Step 2: measure readability
    ari_score = textstat.automated_readability_index(summary)
    print(f"Initial ARI score: {ari_score:.2f}")

    # Step 3: re-prompt if the complexity budget is exceeded
    if ari_score > complexity_budget:
        print("Complexity budget exceeded: running simplification guardrail...")
        simplification_prompt = PromptTemplate.from_template(
            "The following text is too verbose. Rewrite it concisely "
            "using simple vocabulary, stripping away flowery language:\n\n{text}"
        )
        simplify_chain = simplification_prompt | llm
        summary = simplify_chain.invoke({"text": summary})
        new_ari = textstat.automated_readability_index(summary)
        print(f"ARI score after rewrite: {new_ari:.2f}")
        if new_ari > complexity_budget:
            # Only one rewrite is attempted, so the result may still be over budget
            print("Still over budget after one rewrite; returning it anyway.")
    else:
        print("Within complexity budget: no re-prompt needed.")

    return summary
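
safe_summarize re-prompts only once. If you want a real loop, one possible shape is a bounded retry that keeps a rewrite only when it lowers the score. The sketch below is an assumption about how that could look, not part of the original tutorial: enforce_budget, fake_rewrite, and word_count are hypothetical names, and the stand-in functions replace the LLM call and the readability metric so the example runs without a model. In the tutorial's setup, rewrite would wrap simplify_chain.invoke and score would be textstat.automated_readability_index.

```python
from typing import Callable


def enforce_budget(
    draft: str,
    rewrite: Callable[[str], str],
    score: Callable[[str], float],
    budget: float = 10.0,
    max_retries: int = 3,
) -> tuple[str, float, int]:
    """Re-prompt until the score fits the budget or retries run out."""
    best_text, best_score = draft, score(draft)
    attempts = 0
    while best_score > budget and attempts < max_retries:
        attempts += 1
        candidate = rewrite(best_text)
        candidate_score = score(candidate)
        # Keep a rewrite only if it actually lowers the score.
        if candidate_score < best_score:
            best_text, best_score = candidate, candidate_score
    return best_text, best_score, attempts


# Hypothetical stand-in for the LLM: keeps the first half of the words.
def fake_rewrite(text: str) -> str:
    words = text.split()
    return " ".join(words[: max(1, len(words) // 2)])


# Hypothetical stand-in for the readability metric: plain word count.
def word_count(text: str) -> float:
    return float(len(text.split()))


text, final_score, tries = enforce_budget(
    "one two three four five six seven eight", fake_rewrite, word_count, budget=2.0
)
print(text, final_score, tries)
```

Capping retries matters because each attempt is another model call, and a weak model may never get under budget.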

5. Example Run

The pipeline is tested on a sample text written to be deliberately convoluted.

sample_text = """
The inextricably intertwined permutations of cognitive computational arrays within the 
realm of Large Language Models often precipitate a cascade of unnecessarily labyrinthine 
lexical structures. This propensity for circumlocution, whilst seemingly indicative of 
profound erudition, frequently obfuscates the foundational semantic payload, thereby 
rendering the generated discourse significantly less accessible to the quintessential layperson.
"""

final_output = safe_summarize(sample_text, complexity_budget=10.0)
print("\n--- Final output ---")
print(final_output)

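Interpreting the Results and Limitations

distilgpt2 is a small base model with no instruction tuning, so its summaries are weak and the ARI score may not drop much after re-prompting.
Switching to an instruction-tuned model such as google/flan-t5-small should show a more meaningful change.
This pattern is a reasonable starting point for applying the complexity budget idea in production. In a real deployment, pairing it with a stronger model such as GPT-4o or Claude should make the guardrail considerably more effective.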

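Possible Extensions

Combining multiple metrics: use Flesch-Kincaid, sentence count, word count, and similar measures alongside ARI to build a more refined guardrail
Adding hallucination detection: in a RAG pipeline, add a step that measures how much the response overlaps with the retrieved chunks
Asynchronous processing: run the re-prompting loop asynchronously to minimize latency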
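
As one illustration of the hallucination-detection idea in the list above, the sketch below computes the fraction of distinct response tokens that also appear in the retrieved chunks. The names token_set and grounding_ratio are hypothetical, and lexical overlap is only a crude proxy: it cannot tell a paraphrase from an unsupported claim, so treat any threshold you pick as an assumption to validate on your own data.

```python
import re


def token_set(text: str) -> set[str]:
    """Lowercased alphanumeric tokens, de-duplicated."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_ratio(response: str, chunks: list[str]) -> float:
    """Fraction of distinct response tokens found in any retrieved chunk."""
    response_tokens = token_set(response)
    if not response_tokens:
        return 0.0
    # Union of tokens across all chunks; empty when no chunks were retrieved.
    source_tokens = set().union(*(token_set(chunk) for chunk in chunks))
    return len(response_tokens & source_tokens) / len(response_tokens)


chunks = ["ARI estimates the US grade level of a text."]
grounded = "ARI estimates the grade level."
ungrounded = "ARI was invented in 1850."

print(grounding_ratio(grounded, chunks))
print(grounding_ratio(ungrounded, chunks))
```

A low ratio could trigger the same kind of re-prompt as an over-budget ARI score, so both checks can share one guardrail loop.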

References

Guardrails for LLMs: Measuring AI ‘Hallucination’ and Verbosity — KDnuggets (2026-05-11)

