Abstract

Service providers increasingly use chatbots to extend their portfolios. However, out-of-the-box large language models (LLMs) have limited domain knowledge, which hinders their usage for particular use cases. Thus, service providers must rely on domain adaptation to implement more advanced chatbots. Recent LLM applications apply retrieval-augmented generation (RAG) to give chatbots access to domain knowledge, e.g., in the form of a vector search that complements the LLM generation process. We compare three RAG implementations and their implications for performance and economic costs. We find that default RAG implementations may lack accuracy when specific characteristics, e.g., a product category, must be considered in the vector search. Fine-tuning a lightweight transformer model for the structured extraction of information can increase RAG performance while limiting costs. In addition, it adds little execution time compared to other RAG implementations.
