Paper Type
Complete
Abstract
Generative AI is reshaping software development across the software development life cycle (SDLC), including planning, implementation, review, testing, delivery, and maintenance. By reducing the marginal cost of producing code, generative AI increases the relative importance of verification, integration, coordination, and risk management. This shift creates a measurement challenge: traditional activity proxies (e.g., lines of code, commits) can rise even when end-to-end delivery capacity is constrained by review throughput, quality assurance, and architectural fit. Building on multidimensional productivity theory (SPACE and DevEx) and empirical evidence that AI’s effects vary by task structure, developer experience, and workflow context, this study examines how organizations currently measure developer productivity in AI-assisted environments and how practitioners believe measurement should evolve. We report results from an anonymous mixed-methods survey conducted in February 2026, combining descriptive statistics with thematic coding of open-ended responses. Respondents expressed only moderate confidence that current metrics reflect performance under AI assistance and reported widespread reliance on easy-to-instrument volume measures. In contrast, they strongly preferred AI-era indicators that capture the effectiveness of AI usage, time saved, task complexity, and the verification and integration work that is increasingly central to developer contribution. A large majority favored portfolio-based evaluation to reduce gaming and to preserve the tradeoffs among speed, quality, and impact. We conclude that AI-era productivity should be assessed as system delivery under verification and coordination constraints, not as individual output volume.
Paper Number
1881
Recommended Citation
Luo, Elaine and Guo, Hong, "AI-Era Software Developer Productivity and Performance Metrics" (2026). AMCIS 2026 Proceedings. 17.
https://aisel.aisnet.org/amcis2026/ai_systdesign/ai_systdesign/17
AI-Era Software Developer Productivity and Performance Metrics
Comments
AI SYSTEM