
High-accuracy fine-tuned vision transformer model for diagnosing COVID-19 from chest X-ray images
  • Tianyi Chen
  • Ian Philippi
  • Quoc Bao Phan
  • Linh Nguyen
  • Carlo daCunha
  • Tuy Nguyen

All authors: School of Informatics, Computing, and Cyber Systems, Northern Arizona University

Corresponding Author: [email protected]


Abstract

This research investigates the application of machine learning to diagnosing COVID-19 from chest X-rays. We analyze several popular architectures, including efficient neural networks (EfficientNet), multiscale vision transformers (MViT), efficient vision transformers (EfficientViT), and vision transformers (ViT), on a dataset categorized into COVID, lung opacity, normal, and viral pneumonia classes. While the multiscale models show a tendency to overfit, our proposed fine-tuned ViT model achieves high accuracy, reaching 95.79% in four-class classification, 99.57% in a clinically relevant three-class grouping, and similarly strong performance in binary classification. Validation through quantitative metrics and visualization confirms the model's effectiveness, and comparative analysis demonstrates the advantage of our approach over the other architectures. Overall, these findings highlight the potential of ViT for accurate COVID-19 diagnosis, contributing to the advancement of medical image analysis.
Submitted to TechRxiv: 24 Dec 2023
Published in TechRxiv: 02 Jan 2024
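
For readers who want a sense of the general approach, the sketch below illustrates fine-tuning a pretrained ViT-B/16 for the four-class chest X-ray task described in the abstract. It is a minimal, assumption-laden example rather than the authors' published pipeline: the torchvision backbone, data directory layout, and hyperparameters (batch size, learning rate, epoch count) are placeholders.

```python
# Minimal sketch (not the authors' exact pipeline): fine-tuning a pretrained
# ViT-B/16 from torchvision for 4-class chest X-ray classification
# (COVID, lung opacity, normal, viral pneumonia). Dataset layout,
# hyperparameters, and training schedule are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_CLASSES = 4  # COVID, lung opacity, normal, viral pneumonia

# ImageNet-style preprocessing expected by the pretrained ViT backbone.
weights = ViT_B_16_Weights.IMAGENET1K_V1
preprocess = weights.transforms()

# Hypothetical dataset directory with one sub-folder per class.
train_set = datasets.ImageFolder("data/covid_xray/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

# Load the pretrained model and replace its ImageNet head with a 4-class head.
model = vit_b_16(weights=weights)
in_features = model.heads.head.in_features
model.heads.head = nn.Linear(in_features, NUM_CLASSES)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)

# Standard full-network fine-tuning loop.
model.train()
for epoch in range(10):  # epoch count is an assumption
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
    print(f"epoch {epoch + 1}: loss = {running_loss / len(train_set):.4f}")
```

The same skeleton adapts to the three-class and binary groupings mentioned in the abstract by merging or dropping class folders and changing NUM_CLASSES accordingly.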