Thor: A Deep Learning Approach for Face Mask Detection to Prevent the COVID-19 Pandemic

Document Type

Conference Proceeding

Publication Date



With the rapid worldwide spread of the coronavirus disease (COVID-19), wearing face masks in public has become a necessity to mitigate the transmission of this and other pandemics. However, in the absence of automated on-ground prevention measures, depending on humans to enforce face mask-wearing policies in universities and other organizational buildings is very costly and time-consuming. Without addressing this challenge, mitigating highly transmissible airborne diseases will be impractical, and the time to react will continue to increase. Considering the high personnel traffic in buildings and the effectiveness of countermeasures, that is, detecting unmasked personnel and offering them surgical masks, our aim in this paper is to develop automated detection of unmasked personnel in public spaces and to respond by providing them with a surgical mask to promptly remedy the situation. Our approach consists of three key components. The first component utilizes a deep learning architecture that integrates deep residual learning (ResNet-50) with a Feature Pyramid Network (FPN) to detect the presence of human subjects in videos (or a video feed). The second component utilizes Multi-Task Convolutional Neural Networks (MT-CNN) to detect and extract human faces from these videos. For the third component, we construct and train a convolutional neural network classifier to distinguish masked from unmasked human subjects. Our techniques were implemented in a mobile robot, Thor, and evaluated using a dataset of videos collected by the robot from public spaces of an educational institute in the U.S. Our evaluation results show that Thor is very accurate, achieving an F1 score of 87.7% with a recall of 99.2% in a variety of situations, a reasonable accuracy given the challenging dataset and the problem domain.
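
The three components described above can be read as a cascade: find people, find faces, classify each face. The following is a minimal sketch of that control flow only, not the authors' implementation. The names `Detection`, `crop`, and `detect_unmasked` are hypothetical, the three stages are passed in as plain callables rather than trained networks, and running the face detector inside each person crop (rather than on the full frame) is an assumption that the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2), pixel coordinates
Frame = List[List[float]]         # stand-in for an image: rows of pixel values


@dataclass
class Detection:
    """Hypothetical record for one face found in a frame."""
    person_box: Box   # person location in the full frame
    face_box: Box     # face location in the full frame
    masked: bool      # classifier decision for this face


def crop(frame: Frame, box: Box) -> Frame:
    """Return the sub-image covered by box (x2 and y2 are exclusive)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def detect_unmasked(
    frame: Frame,
    find_people: Callable[[Frame], List[Box]],   # stage 1: person detector
    find_faces: Callable[[Frame], List[Box]],    # stage 2: face detector
    is_masked: Callable[[Frame], bool],          # stage 3: mask classifier
) -> List[Detection]:
    """Run the three stages in sequence and collect one record per face."""
    results: List[Detection] = []
    for person_box in find_people(frame):
        person = crop(frame, person_box)
        px, py = person_box[0], person_box[1]
        # Assumption: faces are searched for inside each person crop.
        for fx1, fy1, fx2, fy2 in find_faces(person):
            face = crop(person, (fx1, fy1, fx2, fy2))
            # Shift the face box back into full-frame coordinates.
            face_box = (fx1 + px, fy1 + py, fx2 + px, fy2 + py)
            results.append(Detection(person_box, face_box, is_masked(face)))
    return results


# Toy usage with stub stages: the left half of the frame is "bright"
# and the stub classifier treats bright faces as masked.
frame = [[1.0] * 4 + [0.0] * 4 for _ in range(8)]
stub_people = lambda f: [(0, 0, 4, 8), (4, 0, 8, 8)]
stub_faces = lambda p: [(1, 1, 3, 3)]
stub_masked = lambda face: sum(map(sum, face)) / (len(face) * len(face[0])) > 0.5

detections = detect_unmasked(frame, stub_people, stub_faces, stub_masked)
unmasked = [d for d in detections if not d.masked]
```

In a real system each stub would be replaced by a trained model, and the `unmasked` list is what a robot such as Thor would act on when deciding whom to approach with a surgical mask.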