The Nation's research enterprise faces a shortage of data scientists. Expanding the pipeline of data science students, particularly from underrepresented populations, requires educational institutions to increase awareness of data science and inspire a passion for data in students as they begin their academic careers. In this tutorial, we discuss the development and delivery of a free seminar designed to provide hands-on lessons in the use of both Apache Spark and Jupyter notebooks to students from any academic background in an approachable, no-risk environment. The tutorial includes an explanation of the seminar resources, exercises, and implementation guidelines, as well as lessons learned from several successful seminars held both in person and virtually at two institutions of higher education.
Jensen, Scott; Albert, Leslie J.; and Huerta, Esperanza, "Data Science for All: Apache Spark & Jupyter Notebooks" (2021). Proceedings of the 2021 Pre-ICIS SIGDSA Symposium. 6.