Abstract

Adapting large language models (LLMs) to formal, low-resource domains, such as public procurement or regulatory writing, remains a significant challenge, particularly in non-English contexts. We present a lightweight hybrid framework that combines symbolic 3-gram Markov models with neural generation using DistilGPT2. The approach introduces symbolic guidance in two stages: domain-specific few-shot prompting and decoding-time probability adjustment. This enables domain-consistent generation without model retraining. Evaluated on Polish public procurement documents and deployed on CPU-only infrastructure, the method improves domain fidelity, structure, and semantics, as measured by BLEU, ROUGE-L, and BERTScore. The proposed framework offers a scalable, inference-only alternative to fine-tuning for generating formal texts under strict resource constraints.
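
As a concrete illustration of the decoding-time stage, the sketch below interpolates DistilGPT2's next-token distribution with a token-level 3-gram Markov model estimated from a small domain sample; the few-shot prompting stage would amount to prepending domain exemplars to the prompt. The toy corpus, the mixing weight lam, and the greedy decoding loop are illustrative assumptions, not the paper's exact SymbSteer implementation.

```python
# Minimal sketch (assumptions noted inline): blend a trigram Markov model
# estimated from domain text with DistilGPT2 logits at decoding time.
from collections import Counter, defaultdict

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("distilgpt2")
lm = AutoModelForCausalLM.from_pretrained("distilgpt2")  # CPU-only is fine
lm.eval()

# 1) Symbolic stage: estimate a 3-gram Markov model over domain tokens.
#    (Toy English stand-in corpus; the paper uses Polish procurement text.)
domain_corpus = [
    "The contracting authority shall publish the contract notice.",
    "The contracting authority shall reject the tender.",
]
trigram = defaultdict(Counter)  # (t1, t2) -> Counter of next-token counts
for doc in domain_corpus:
    ids = tok.encode(doc)
    for a, b, c in zip(ids, ids[1:], ids[2:]):
        trigram[(a, b)][c] += 1

lam = 0.3  # neural/symbolic mixing weight (assumed value)

@torch.no_grad()
def generate(prompt: str, max_new_tokens: int = 20) -> str:
    ids = tok.encode(prompt)
    for _ in range(max_new_tokens):
        logits = lm(torch.tensor([ids])).logits[0, -1]
        p_lm = torch.softmax(logits, dim=-1)
        # 2) Decoding-time adjustment: mix in the trigram distribution
        #    conditioned on the last two tokens, when that context was seen.
        counts = trigram.get(tuple(ids[-2:]))
        if counts:
            p_ng = torch.zeros_like(p_lm)
            total = sum(counts.values())
            for t, n in counts.items():
                p_ng[t] = n / total
            p_lm = (1 - lam) * p_lm + lam * p_ng
        nxt = int(torch.argmax(p_lm))  # greedy pick (assumed decoding rule)
        ids.append(nxt)
        if nxt == tok.eos_token_id:
            break
    return tok.decode(ids)

print(generate("The contracting authority shall"))
```

Because the adjustment happens entirely at inference time, no gradients or model updates are involved, which is what makes this style of steering viable on CPU-only infrastructure.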

Recommended Citation

Gontar, Z. & Gontar, B. (2025). Hybrid Symbolic-Neural Domain Adaptation via SymbSteer. Markov-Guided Prompting and Decoding for Resource-Efficient Language Model Steering. In I. Luković, S. Bjeladinović, B. Delibašić, D. Barać, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Empowering the Interdisciplinary Role of ISD in Addressing Contemporary Issues in Digital Transformation: How Data Science and Generative AI Contributes to ISD (ISD2025 Proceedings). Belgrade, Serbia: University of Gdańsk, Department of Business Informatics & University of Belgrade, Faculty of Organizational Sciences. ISBN: 978-83-972632-1-5. https://doi.org/10.62036/ISD.2025.60

Paper Type

Short Paper

DOI

10.62036/ISD.2025.60

