Abstract

IS researchers have long relied on self-report instruments (surveys, interviews, and think-aloud protocols) to study technology interaction. These methods are susceptible to recall bias and social desirability effects, and they cannot capture real-time behavioral signals (Compeau et al., 2012; Venkatesh et al., 2013). This limitation is especially consequential in educational IS research, where how learners cognitively and affectively engage with learning management systems (LMS), intelligent tutoring systems, and AI-powered tools matters as much as adoption itself. We propose a framework that uses AI-powered video analysis, specifically facial expression recognition (FER), gaze tracking, and body posture estimation, as a complementary IS research method. Grounded in Cognitive Load Theory (CLT; Sweller, 1988) and multimodal learning analytics (Blikstein, 2013; Worsley & Blikstein, 2015), we argue that video-derived behavioral signals provide a more granular, temporally continuous window into technology interaction than self-report alone. CLT supplies the bridge: observable cues such as furrowed brows, gaze aversion, and postural shifts are established correlates of cognitive load, a construct central to educational IS use. We outline a multi-method study in which participants interact with an AI-based educational platform under webcam observation. Computer vision pipelines built on OpenFace and MediaPipe (Baltrusaitis et al., 2018; Lugaresi et al., 2019) will extract facial action units, gaze vectors, and body pose keypoints at the frame level. These signals will be mapped to IS constructs (cognitive load, engagement) and affective states (e.g., frustration), then triangulated against post-session surveys to establish convergent validity. The study is presently in the instrument design and IRB approval phase, with pilot data collection anticipated for Fall 2026.
This research-in-progress anticipates three contributions. First, it formalizes AI video analysis as a rigorous IS research method with a replicable protocol. Second, it advances construct validity by mapping video-derived behavioral signals to established IS constructs. Third, it surfaces ethical considerations (informed consent, data sensitivity, and IRB protocols) that the IS community must address proactively as AI video tools proliferate in research and organizational settings.
