Paper Number
ECIS2026-1690
Paper Type
CRP
Abstract
Knowledge-enhanced clinical prediction models typically treat all knowledge sources uniformly, ignoring reliability differences between expert-curated ontologies such as the Unified Medical Language System (UMLS) and knowledge generated by Large Language Models (LLMs), which is susceptible to hallucinations. This risks erroneous predictions and overwhelming explanations, and deploying LLM-generated knowledge in safety-critical clinical settings without accounting for hallucination risk can compromise prediction reliability. To address this, we introduce a confidence-weighted framework that explicitly models source reliability during multi-source knowledge graph (KG) integration. We constructed hybrid knowledge graphs combining UMLS (33,541 triples) with GPT-4-generated relationships (51,418 triples), assigning confidence scores based on source provenance and frequency. We developed confidence-weighted graph neural networks with temporal attention and trained them on MIMIC-III for mortality and readmission prediction. Expert validation of the LLM-generated knowledge found 99.6% of the LLM-generated triples to be valid. Our approach achieved 13.89% AUPRC for mortality prediction and 73.26% for readmission. Confidence-weighted explanations reduce spurious relationships, producing focused clinical insights suitable for decision-making. Explicit confidence modelling enables robust heterogeneous knowledge integration while mitigating LLM hallucination risks, providing a pathway for deploying LLM-augmented medical AI in high-stakes clinical environments.
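The idea of assigning confidence scores from source provenance and frequency, and then letting those scores weight how knowledge is aggregated, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prior values in SOURCE_PRIOR, the max_boost parameter, the scoring formula, and the function names score_triples and weighted_mean are all assumptions chosen for illustration, and the weighted mean stands in for a single, simplified message-passing step rather than the graph neural network with temporal attention described in the abstract.

```python
from collections import defaultdict

# Hypothetical per-source reliability priors (the abstract gives no values).
SOURCE_PRIOR = {"umls": 1.0, "llm": 0.6}


def score_triples(observations, max_boost=0.3):
    """Assign a confidence in [0, 1] to each distinct triple.

    observations: iterable of (head, relation, tail, source) tuples.
    The score is the best source prior plus a frequency bonus, capped at 1.0.
    This formula is an illustrative assumption, not the paper's.
    """
    counts = defaultdict(int)
    sources = defaultdict(set)
    for head, relation, tail, source in observations:
        triple = (head, relation, tail)
        counts[triple] += 1
        sources[triple].add(source)

    top = max(counts.values(), default=1)
    scored = {}
    for triple, n in counts.items():
        prior = max(SOURCE_PRIOR[s] for s in sources[triple])
        boost = max_boost * (n / top)  # more frequent triples score higher
        scored[triple] = min(1.0, prior + boost)
    return scored


def weighted_mean(neighbor_values, confidences):
    """Aggregate neighbour features, weighting each edge by its confidence."""
    total = sum(confidences)
    if total == 0:
        return 0.0
    return sum(v * c for v, c in zip(neighbor_values, confidences)) / total


observations = [
    ("sepsis", "may_cause", "hypotension", "umls"),
    ("sepsis", "may_cause", "hypotension", "llm"),
    ("aspirin", "treats", "fever", "llm"),
]
scores = score_triples(observations)
```

In this sketch a triple confirmed by the curated source keeps full weight, while an LLM-only triple contributes less to the aggregate, so low-confidence edges have a smaller influence on both predictions and explanations.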
Recommended Citation
Putri, Diyah Utami Kusumaning; Quirchmayr, Gerald; and Weippl, Edgar, "Increasing Reliability Of Clinical Predictions With Confidence-Weighted Multi-Source Knowledge Graphs" (2026). ECIS 2026 Proceedings. 5.
https://aisel.aisnet.org/ecis2026/hit/hit/5
Increasing Reliability Of Clinical Predictions With Confidence-Weighted Multi-Source Knowledge Graphs