BRIDGING THE BLACK-BOX GAP: EXPLAINABLE AI FOR LARGE LANGUAGE MODELS

Authors

  • Garige Anil Kumar

Keywords:

Large Language Models, Explainable AI, Model Interpretability, Faithful Explanations, Transformer Models, Trustworthy AI, Bias Detection

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language processing tasks, yet their transformer-based architectures operate as complex black-box systems that offer limited transparency into their internal reasoning. Although LLMs can generate coherent and seemingly logical explanations, these outputs are not guaranteed to be faithful representations of the model’s true decision pathways. This raises critical concerns about trust, accountability, bias propagation, hallucination, and regulatory compliance in high-stakes applications. This paper addresses the interpretability gap by proposing a structured Explainable AI (XAI) framework for modern LLMs. The proposed approach integrates intrinsic interpretability mechanisms with post-hoc attribution techniques to produce explanations that are human-understandable, verifiable, and aligned with internal model behavior. A multi-dimensional evaluation strategy is introduced that incorporates faithfulness assessment, robustness testing, explanation consistency analysis, and bias sensitivity measurement. Experimental validation on benchmark natural language tasks demonstrates that the proposed framework enhances explanation reliability without significantly degrading predictive performance. By advancing scalable and verifiable explainability mechanisms, this work contributes toward trustworthy, transparent, and ethically responsible LLMs suitable for real-world deployment in safety-critical domains.
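
The abstract does not specify how faithfulness is assessed. As a hedged illustration only, one common deletion-based check (often called comprehensiveness) removes the tokens an explanation ranks highest and measures how much the model's score drops. The sketch below is not the paper's method: the toy scorer, its weights, and the names toy_score and comprehensiveness are hypothetical stand-ins for an actual LLM and attribution technique.

```python
from typing import Callable, Dict, List

# Hypothetical toy classifier: a bag-of-words linear scorer standing in for an LLM.
# The weights are invented for illustration and do not come from the paper.
TOY_WEIGHTS: Dict[str, float] = {"excellent": 2.0, "good": 1.0, "bad": -1.5}


def toy_score(tokens: List[str]) -> float:
    """Return a positive-class score for a list of tokens (illustrative only)."""
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)


def comprehensiveness(
    score_fn: Callable[[List[str]], float],
    tokens: List[str],
    attributions: List[float],
    k: int,
) -> float:
    """Score drop after deleting the k tokens with the highest attribution.

    A larger drop suggests the explanation points at tokens the model
    actually relies on, i.e. the explanation is more faithful.
    """
    ranked = sorted(range(len(tokens)), key=lambda i: attributions[i], reverse=True)
    removed = set(ranked[:k])
    reduced = [t for i, t in enumerate(tokens) if i not in removed]
    return score_fn(tokens) - score_fn(reduced)


tokens = ["the", "plot", "was", "excellent", "and", "good"]

# An explanation that highlights the tokens the toy model really uses...
faithful_attr = [0.0, 0.0, 0.0, 2.0, 0.0, 1.0]
# ...and one that highlights irrelevant tokens instead.
unfaithful_attr = [2.0, 1.0, 0.0, 0.0, 0.0, 0.0]

drop_faithful = comprehensiveness(toy_score, tokens, faithful_attr, k=2)
drop_unfaithful = comprehensiveness(toy_score, tokens, unfaithful_attr, k=2)

print(f"faithful explanation drop:   {drop_faithful}")
print(f"unfaithful explanation drop: {drop_unfaithful}")
```

In this toy setting, the explanation that highlights the tokens the scorer depends on produces a larger score drop than the one that highlights irrelevant tokens. With a real LLM, score_fn would wrap a model call and the attributions would come from a post-hoc method, both of which are assumptions outside this sketch.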

Author Biography

  • Garige Anil Kumar

    Assistant Professor, Department of Computer Science and Engineering, Sree Chaitanya College of Engineering, Karimnagar.

Published

2026-03-22