Location
Hilton Hawaiian Village, Honolulu, Hawaii
Event Website
https://hicss.hawaii.edu/
Start Date
3-1-2024 12:00 AM
End Date
6-1-2024 12:00 AM
Description
Missing values in urban data can be caused by sensor or software failures, data quality issues, interference from weather events, incomplete data collection, or varying data use regulations; any missing data can render the entire dataset unusable for downstream applications. In our work, we adapt image inpainting techniques to impute large, irregular missing regions in urban settings characterized by temporal dependency and spatial skew. To incorporate temporal information, we adapt computer vision techniques for image inpainting to operate on 3D histograms (2D space + 1D time) commonly used for data exchange in urban settings. To combat the spatial skew of urban data (small dense regions surrounded by large sparse areas), we 1) train simultaneously in space and time, and 2) focus attention on dense regions by biasing the masks used for training toward the dense regions in the data. We evaluate the core model and these two extensions using the NYC taxi data and the NYC bikeshare data, simulating different conditions for missing data. We show that the core model is effective qualitatively and quantitatively, that biased masking during training reduces error, and that the number of timesteps during learning exhibits a tradeoff between model performance and resolution of transient events.
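To make the two data-side ideas in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a 3D histogram (2D space + 1D time) and a density-biased training mask. Everything named here is an assumption for illustration: the function names, the `bias` exponent, and the use of a square mask, which is a simplification of the irregular missing regions the abstract describes.

```python
import random


def build_histogram(events, nx, ny, nt):
    """Count (x, y, t) events into a nested list indexed as hist[t][y][x]."""
    hist = [[[0] * nx for _ in range(ny)] for _ in range(nt)]
    for x, y, t in events:
        hist[t][y][x] += 1
    return hist


def sample_biased_mask(hist, size, rng, bias=1.0):
    """Sample a square spatial mask whose centre is drawn in proportion
    to cell density. Hypothetical scheme: bias=0 gives uniform sampling,
    larger values concentrate masks on dense cells."""
    nt, ny, nx = len(hist), len(hist[0]), len(hist[0][0])
    # Collapse the time axis to get one total count per spatial cell.
    totals = [
        sum(hist[t][y][x] for t in range(nt))
        for y in range(ny)
        for x in range(nx)
    ]
    # A small epsilon keeps empty cells selectable with low probability.
    weights = [(c ** bias) + 1e-6 for c in totals]
    idx = rng.choices(range(ny * nx), weights=weights, k=1)[0]
    cy, cx = divmod(idx, nx)
    half = size // 2
    mask = [[False] * nx for _ in range(ny)]
    for y in range(max(0, cy - half), min(ny, cy + half + 1)):
        for x in range(max(0, cx - half), min(nx, cx + half + 1)):
            mask[y][x] = True
    return mask


def apply_mask(hist, mask):
    """Return a copy of hist with masked cells zeroed in every timestep."""
    return [
        [
            [0 if mask[y][x] else value for x, value in enumerate(row)]
            for y, row in enumerate(frame)
        ]
        for frame in hist
    ]
```

In a self-supervised setup of this kind, the masked copy would serve as the model input and the original histogram as the reconstruction target; the model and loss are outside the scope of this sketch.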
Recommended Citation
Han, Bin and Howe, Bill, "Geospatial Imputation of Urban Mobility Data with Self-Supervised Learning" (2024). Hawaii International Conference on System Sciences 2024 (HICSS-57). 3.
https://aisel.aisnet.org/hicss-57/li/data_analytics/3