Location

Hilton Waikoloa Village, Hawaii

Event Website

http://www.hicss.hawaii.edu

Start Date

1-4-2017

End Date

1-7-2017

Description

This paper investigates the use of data streaming analytics to better predict the presence of human factors in aviation incidents as new incident reports arrive. As new incident data become available, the fresh information can help not only to evaluate but also to improve existing models. First, we use four algorithms in batch learning to establish a baseline for comparison. These are NaiveBayes (NB), Cost Sensitive Classifier (CSC), Hoeffdingtree (VFDT), and OzabagADWIN (OBA). The traditional measure of classification accuracy is used to test their performance. The results show that, among the four, NB and CSC are the best classification algorithms. We then test the classifiers in a data stream setting, using two performance evaluation methods: Holdout and Interleaved Test-Then-Train (Prequential). The Kappa statistic charts of the Prequential measure with a sliding window show that NB performs best among the four algorithms. The two evaluation approaches, batch learning with 10-fold cross-validation and data stream learning with the Prequential measure, yield a consistent result. CSC is suitable for unbalanced data in batch learning, but it does not achieve the best Kappa statistic in the data stream setting. Valid incremental algorithms still need to be developed for data streams with unbalanced labels.
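
To make the Prequential (Interleaved Test-Then-Train) evaluation with a sliding-window Kappa statistic concrete, the following is a minimal sketch, not the authors' implementation. The names `cohen_kappa`, `MajorityClassLearner`, and `prequential_kappa`, the `window_size` parameter, and the toy stream are all assumptions introduced for illustration; the majority-class learner is only a hypothetical stand-in for an incremental classifier such as NB.

```python
from collections import deque


def cohen_kappa(pairs):
    """Cohen's kappa for a sequence of (true_label, predicted_label) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    labels = {t for t, _ in pairs} | {p for _, p in pairs}
    # Observed agreement: fraction of correct predictions.
    p_observed = sum(1 for t, p in pairs if t == p) / n
    # Chance agreement: product of true and predicted label frequencies.
    p_chance = sum(
        (sum(1 for t, _ in pairs if t == c) / n)
        * (sum(1 for _, p in pairs if p == c) / n)
        for c in labels
    )
    if p_chance == 1.0:
        return 0.0  # degenerate window: one label only, kappa undefined
    return (p_observed - p_chance) / (1.0 - p_chance)


class MajorityClassLearner:
    """Hypothetical stand-in for an incremental classifier."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # Predict the most frequent label seen so far (None before training).
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_kappa(stream, learner, window_size=100):
    """Test on each example first, then train on it; report windowed kappa."""
    window = deque(maxlen=window_size)
    kappas = []
    for x, y in stream:
        window.append((y, learner.predict(x)))  # test first
        learner.learn(x, y)                     # then train
        kappas.append(cohen_kappa(list(window)))
    return kappas


# Toy unbalanced stream: features are unused by the stand-in learner.
toy_stream = [(None, 0), (None, 0), (None, 1), (None, 0)]
toy_kappas = prequential_kappa(toy_stream, MajorityClassLearner(), window_size=3)
```

On unbalanced labels, a learner that always predicts the majority class can reach high accuracy while its Kappa stays at or below zero, which is one reason a Kappa chart is more informative than accuracy in this setting.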


Identification of Human Factors in Aviation Incidents Using a Data Stream Approach


http://aisel.aisnet.org/hicss-50/da/business_intelligence_case_studies/3