Location
Online
Event Website
https://hicss.hawaii.edu/
Start Date
3 January 2023
End Date
7 January 2023
Description
High-quality labelled datasets are a cornerstone of the development of deep learning models for land use classification. In agriculture, the high cost of data collection, the errors inherent in data mapping efforts, the lack of local knowledge, and the spatial variability of the data all hinder the development of accurate and spatially transferable deep learning models. In this paper, we investigate the use of Isolation Forest (IF), an anomaly detection algorithm, to reduce noise in a large-scale, low-resolution alternative ground-truth dataset used to train deep learning models for land use classification. We use a modestly sized, high-resolution, high-fidelity, manually collected ground-truth dataset to calibrate the Isolation Forest parameters and to evaluate our approach, which highlights the relatively low cost of the methodology. Our data-centric methodology demonstrates the efficacy of deep learning methods coupled with IF for creating mid-resolution land-use models and map products for agriculture from an alternative ground-truth dataset. Moreover, we compare our deep learning approach with a traditional algorithm used in remote sensing and evaluate the spatial transferability of the resulting models. Finally, we reflect on the lessons learnt and on future work.
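To make the idea of filtering noisy labels with Isolation Forest concrete, the following is a minimal, standard-library-only sketch and not the authors' implementation. It scores the samples of each class by how quickly random splits isolate them and drops the most anomalous fraction. The two-dimensional features, the "grassland" label, the function names and the `drop_fraction` value are all hypothetical; this page does not state which features the paper uses or how the parameters are calibrated. In practice a library implementation such as scikit-learn's `IsolationForest` would normally be used instead of this toy version.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x):
    """Depth at which x lands in a leaf, plus a correction for unsplit leaf points."""
    depth = 0
    while tree[0] == "node":
        _, feature, split, left, right = tree
        tree = left if x[feature] < split else right
        depth += 1
    return depth + c_factor(tree[1])


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Isolation Forest score in (0, 1]; higher means easier to isolate."""
    if len(rows) < 2:
        return [0.0] * len(rows)
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    forest = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
              for _ in range(n_trees)]
    norm = c_factor(psi)
    return [2.0 ** (-(sum(path_length(t, x) for t in forest) / n_trees) / norm)
            for x in rows]


def filter_noisy_labels(features, labels, drop_fraction=0.05):
    """Return indices kept after dropping the most anomalous samples in each class."""
    kept = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        scores = anomaly_scores([features[i] for i in idx])
        n_drop = int(drop_fraction * len(idx))
        ranked = sorted(zip(scores, idx), reverse=True)
        dropped = {i for _, i in ranked[:n_drop]}
        kept.extend(i for i in idx if i not in dropped)
    return sorted(kept)


# Hypothetical data: 60 plausible "grassland" samples plus 3 implausible ones
# that carry the same label (indices 60, 61, 62).
data_rng = random.Random(42)
features = [(data_rng.gauss(0.3, 0.05), data_rng.gauss(0.6, 0.05)) for _ in range(60)]
features += [(5.0, 5.0), (-4.0, 6.0), (6.0, -5.0)]
labels = ["grassland"] * 63

kept = filter_noisy_labels(features, labels, drop_fraction=0.05)
print(len(kept), "of", len(labels), "samples kept")
```

In this sketch `drop_fraction` plays the role of the parameter one might tune against a small trusted ground-truth set, for example by choosing the value that gives the best downstream accuracy on that set. That tuning loop is an assumption for illustration and is not shown.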
Recommended Citation
García Pereira, Agustín; Porwol, Lukasz; and Ojo, Adegboyega, "Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models" (2023). Hawaii International Conference on System Sciences 2023 (HICSS-56). 2.
https://aisel.aisnet.org/hicss-56/li/data_analytics/2