Title
Expert-quality Dataset Labeling via Gamified Crowdsourcing on Point-of-Care Lung Ultrasound Data
Location
Hilton Hawaiian Village, Honolulu, Hawaii
Event Website
https://hicss.hawaii.edu/
Start Date
January 3, 2024
End Date
January 6, 2024
Description
data interpretation. Building such tools requires labeled training datasets. We tested whether a gamified crowdsourcing approach can produce clinical expert-quality lung ultrasound clip labels. A total of 2,384 lung ultrasound clips were retrospectively collected. Six lung ultrasound experts classified 393 of these clips as having no B-lines, one or more discrete B-lines, or confluent B-lines to create two sets of reference-standard labels: a training set and a test set. The training set was used to train users on a gamified crowdsourcing platform, and the test set was used to compare the concordance of the resulting crowd labels and of individual experts against the reference standard. Over 8 days, 99,238 crowdsourced opinions were collected from 426 unique users. Mean labeling concordance of individual experts relative to the reference standard was 85.0% ± 2.0% (SEM), compared with 87.9% concordance for the crowdsourced labels (p = 0.15). Scalable, high-quality labeling approaches such as crowdsourcing may streamline training dataset creation for machine learning model development.
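The abstract compares the mean concordance of individual experts against a reference standard with the concordance of aggregated crowd labels. The Python sketch below is a minimal illustration of that kind of comparison, assuming per-clip labels stored as plain lists, majority voting as the crowd-aggregation rule, and simple percent agreement as the concordance metric; the toy data, function names, and aggregation choice are illustrative assumptions, not the study's actual pipeline.

# Illustrative sketch (not the study's pipeline): percent-agreement "concordance"
# of individual expert labels and majority-vote crowd labels against a
# reference standard for a three-class B-line task.
from collections import Counter
from statistics import mean, stdev
from math import sqrt

def concordance(labels, reference):
    # Fraction of clips where a rater's label matches the reference label.
    return sum(l == r for l, r in zip(labels, reference)) / len(reference)

def majority_vote(opinions_per_clip):
    # Collapse many crowd opinions on each clip into a single crowd label.
    return [Counter(opinions).most_common(1)[0][0] for opinions in opinions_per_clip]

# Hypothetical toy data: 4 clips, 2 experts, several crowd opinions per clip.
reference = ["no_b_lines", "discrete_b_lines", "confluent_b_lines", "no_b_lines"]
experts = [
    ["no_b_lines", "discrete_b_lines", "confluent_b_lines", "discrete_b_lines"],
    ["no_b_lines", "confluent_b_lines", "confluent_b_lines", "no_b_lines"],
]
crowd_opinions = [
    ["no_b_lines", "no_b_lines", "discrete_b_lines"],
    ["discrete_b_lines", "discrete_b_lines", "confluent_b_lines"],
    ["confluent_b_lines", "confluent_b_lines", "confluent_b_lines"],
    ["no_b_lines", "no_b_lines", "no_b_lines"],
]

expert_scores = [concordance(e, reference) for e in experts]
sem = stdev(expert_scores) / sqrt(len(expert_scores))  # standard error of the mean
crowd_score = concordance(majority_vote(crowd_opinions), reference)

print(f"expert concordance: {mean(expert_scores):.1%} ± {sem:.1%} (SEM)")
print(f"crowd concordance:  {crowd_score:.1%}")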
Recommended Citation
Duggan, Nicole M.; Jin, Mike; Duhaime, Erik; Kapur, Tina; Duran Mendicuti, Maria Alejandra; Hallisey, Stephen; Bernier, Denie; Selame, Lauren; Asgari-Targhi, Ameneh; Fischetti, Chanel; Lucassen, Ruben; and Samir, Anthony E., "Expert-quality Dataset Labeling via Gamified Crowdsourcing on Point-of-Care Lung Ultrasound Data" (2024). Hawaii International Conference on System Sciences 2024 (HICSS-57). 4.
https://aisel.aisnet.org/hicss-57/hc/emergency_care/4