BRIDGING THE BLACK-BOX GAP: EXPLAINABLE AI FOR LARGE LANGUAGE MODELS
Keywords:
Large Language Models, Explainable AI, Model Interpretability, Faithful Explanations, Transformer Models, Trustworthy AI, Bias Detection

Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language processing tasks; however, their transformer-based architectures operate as complex black-box systems with limited transparency into internal reasoning processes. Although LLMs can generate coherent and seemingly logical explanations, such outputs are not guaranteed to faithfully represent the model's true decision pathways, raising critical concerns about trust, accountability, bias propagation, hallucination, and regulatory compliance in high-stakes applications. This paper addresses this interpretability gap by proposing a structured Explainable AI (XAI) framework designed to bridge the black-box nature of modern LLMs. The proposed approach integrates intrinsic interpretability mechanisms with post-hoc attribution techniques to produce explanations that are human-understandable, verifiable, and aligned with internal model behavior. A multi-dimensional evaluation strategy is introduced, incorporating faithfulness assessment, robustness testing, explanation consistency analysis, and bias sensitivity measurement. Experimental validation on benchmark natural language tasks demonstrates that the proposed framework improves explanation reliability without significantly degrading predictive performance. By advancing scalable and verifiable explainability mechanisms, this work contributes toward the development of trustworthy, transparent, and ethically responsible Large Language Models suitable for real-world deployment in safety-critical domains.
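The faithfulness assessment mentioned above is commonly operationalized as a perturbation test: remove the tokens an explanation ranks as most important and measure how much the model's confidence drops. The sketch below illustrates that idea only; the scoring function and the attribution scores are toy stand-ins, not the paper's actual model or attribution method.

```python
# Hedged sketch of a perturbation-based faithfulness (comprehensiveness) check.
# `model_confidence` is a toy stand-in for a real classifier, and the
# attribution scores below are invented for illustration.

def model_confidence(tokens):
    """Toy 'model': confidence grows with the count of sentiment-bearing tokens."""
    positive = {"great", "excellent", "good"}
    hits = sum(1 for t in tokens if t in positive)
    return hits / max(len(tokens), 1)

def comprehensiveness(tokens, attributions, k):
    """Drop in confidence after deleting the k highest-attributed tokens.
    A faithful explanation should produce a large drop."""
    ranked = sorted(range(len(tokens)), key=lambda i: attributions[i], reverse=True)
    keep = set(ranked[k:])
    reduced = [t for i, t in enumerate(tokens) if i in keep]
    return model_confidence(tokens) - model_confidence(reduced)

tokens = ["the", "film", "was", "great", "and", "excellent"]
attributions = [0.01, 0.05, 0.02, 0.9, 0.03, 0.8]  # hypothetical attribution scores
print(round(comprehensiveness(tokens, attributions, k=2), 3))  # prints 0.333
```

Here the explanation correctly attributes the prediction to "great" and "excellent", so deleting them erases the toy model's entire confidence; an unfaithful explanation would rank uninformative tokens highly and yield a drop near zero.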
License
Copyright (c) 2026 Journal of Science and Technology Excellence

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All articles published in the Journal of Engineering Excellence (JEE) are licensed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Under this license, authors retain full copyright of their work while granting permission for anyone to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or author — provided that the original work is properly cited.
This open-access license ensures maximum dissemination and impact of the published research by allowing free and immediate access to scholarly work.
For more details, please refer to the official license page:
https://creativecommons.org/licenses/by/4.0/
