Vignette studies combine survey research with example-based experiments to achieve higher external validity while maintaining experimental control (Atzmüller & Steiner, 2010). The method typically involves showing participants several examples and asking them to rate each one on scales of interest. In conversational agent and chatbot research, the technique involves showing users screenshots of conversations, ostensibly with either a chatbot or a human, and then asking them to rate aspects of the interaction such as anthropomorphism, humanness, or social presence. Our research aim is to determine whether the findings from vignette-based chatbot studies are congruent with the results obtained when participants actually interact with a chatbot. Historically, developing effective chatbots has been extremely challenging, requiring extensive natural language processing work to create even simple interactions. This difficulty made vignettes an attractive option for studying the impact of chatbot features and design elements on user perceptions (Seeger et al., 2021). Recent advances in large language models (LLMs) have made developing useful chatbots significantly faster, reducing the time needed to build highly capable interactive agents from months to hours. Using these new capabilities, we propose to replicate and extend several prior studies of human-chatbot interaction that were originally conducted using vignettes. In our own chatbot research, experimenting with live chatbots has yielded valuable insights. We hope that replicating vignette-based chatbot studies will lead to a deeper understanding of human-chatbot interaction. In our investigation, we will identify high-impact chatbot studies that employ vignettes and select several for which an LLM-based chatbot can be developed for use in a comparable, vignette-like study. 
Using the same manipulations as the original studies, we will create a scenario in which participants either view a vignette or interact with the chatbot for a brief time, then complete the original study’s measures. Our purpose is to assess the validity of vignettes for studying human-chatbot interaction. Future research should continue to explore the discrepancies between indirect and direct interaction with chatbots and how they affect user experience. Each form of experimentation may offer unique strengths that the other cannot provide.