Paper Number

ECIS2026-1690

Paper Type

CRP

Abstract

Knowledge-enhanced clinical prediction models treat all knowledge sources uniformly, ignoring reliability differences between expert-curated ontologies such as the Unified Medical Language System (UMLS) and knowledge generated by Large Language Models (LLMs), which is susceptible to hallucinations. This risks erroneous predictions and overwhelming explanations, and deploying LLM-generated knowledge in safety-critical clinical settings without accounting for hallucination risk can compromise prediction reliability. To address this, we introduce a confidence-weighted framework that explicitly models source reliability during multi-source knowledge graph (KG) integration. We constructed hybrid knowledge graphs combining UMLS (33,541 triples) with GPT-4-generated relationships (51,418 triples), assigning confidence scores based on source provenance and frequency. We developed confidence-weighted graph neural networks with temporal attention and trained them on MIMIC-III for mortality and readmission prediction. Expert validation of the LLM-generated knowledge found that 99.6% of LLM-generated triples were valid. Our approach achieved an AUPRC of 13.89% for mortality prediction and 73.26% for readmission prediction. Confidence-weighted explanations reduce spurious relationships, producing focused clinical insights suitable for decision-making. Explicit confidence modelling enables robust integration of heterogeneous knowledge while mitigating LLM hallucination risks, providing a pathway for deploying LLM-augmented medical AI in high-stakes clinical environments.
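
The abstract states that confidence scores are assigned from source provenance and frequency but does not give the formula. The following is only a minimal sketch of one way such a score could be computed, not the authors' method. The names `SOURCE_PRIOR`, `Triple`, and `confidence_scores`, the prior values (1.0 for UMLS, 0.6 for LLM), the `saturation` parameter, and the 0.5 frequency weight are all hypothetical choices made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical base reliabilities per source; the values are illustrative only.
SOURCE_PRIOR = {"UMLS": 1.0, "LLM": 0.6}


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str
    source: str  # "UMLS" or "LLM"


def confidence_scores(triples, saturation=3):
    """Map each distinct (head, relation, tail) to a confidence in [0, 1].

    Assumed rule: start from the source prior, then let repeated
    occurrences close up to half of the remaining gap to 1.0.
    """
    counts = Counter((t.head, t.relation, t.tail) for t in triples)
    scores = {}
    for t in triples:
        key = (t.head, t.relation, t.tail)
        prior = SOURCE_PRIOR[t.source]
        # Frequency term saturates once a triple has been seen `saturation` times.
        freq = min(counts[key], saturation) / saturation
        score = prior + (1.0 - prior) * 0.5 * freq
        # If several sources assert the same triple, keep the highest score.
        scores[key] = max(scores.get(key, 0.0), score)
    return scores
```

Under these assumptions, an expert-curated triple always scores 1.0, while an LLM-generated triple starts near its prior and rises with repetition without ever reaching the curated level. Scores of this kind could then serve as edge weights during message passing in a graph neural network.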

Jun 14th, 12:00 AM

Increasing Reliability Of Clinical Predictions With Confidence-Weighted Multi-Source Knowledge Graphs
