Paper Number

1039

Paper Type

Completed

Description

Machine learning techniques have been successful in optical character recognition tasks, particularly in recognizing handwriting. Recognizing Vietnamese handwriting, however, is challenging because of the presence of six additional distinctive tonal symbols and vowels. The challenge is amplified by the handwriting of health workers in an emergency care setting, where staff are under constant pressure to record the well-being of patients. In this study, we aim to digitize the handwriting of Vietnamese health workers. We develop a complete handwritten text recognition pipeline that receives scanned documents, detects and enhances the handwritten text areas of interest, transcribes the images into computer text, and finally auto-corrects invalid words and terms to achieve high accuracy. In experiments with medical documents written by 30 doctors and nurses from the Tetanus Emergency Care unit at the Hospital for Tropical Diseases, we obtain promising results: a Character Error Rate of 2% and a Word Error Rate of 12%.
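For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Character Error Rate and Word Error Rate are commonly computed, as Levenshtein edit distance divided by the length of the reference transcription. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `edit_distance`, `cer`, and `wer` are hypothetical, and the NFC Unicode normalization step is an assumption added here so that composed and decomposed Vietnamese diacritics compare equal.

```python
import unicodedata
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _normalize(text: str) -> str:
    # Assumption: normalize to NFC so that a vowel plus combining tone mark
    # counts as one character, however it was typed.
    return unicodedata.normalize("NFC", text)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits / reference length."""
    ref, hyp = _normalize(reference), _normalize(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits / number of reference words."""
    ref = _normalize(reference).split()
    hyp = _normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One missing tone mark: one wrong character, one wrong word.
    print(cer("uốn ván", "uôn ván"), wer("uốn ván", "uôn ván"))
```

The example shows why the two rates diverge: a single misread tone mark costs one character out of seven but invalidates one word out of two, so a Word Error Rate several times larger than the Character Error Rate is expected.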

Comments

16-HealthCare

Dec 11th, 12:00 AM

A Vietnamese Handwritten Text Recognition Pipeline for Tetanus Medical Records

