Document Image Classification with Vision Transformers
Abstract
Document image classification has attracted considerable interest in business process automation, and it plays an important role in document image processing (DIP) systems, so an effective framework for the task is needed. Many methods for classifying document images have been proposed in the literature. In this paper, we propose an efficient document image classification approach that uses vision transformers (ViTs) and exploits the visual information in the document. Transformers were originally developed for natural language processing tasks. Owing to their high performance, their architectures have been adapted and applied to other problems; the ViT is one such model. ViTs have demonstrated impressive performance on computer vision tasks compared with baselines. A ViT scans the image and models the relations between image patches using multi-head self-attention. Experiments are conducted on a real-world dataset. Despite the limited amount of training data available, our method achieves acceptable performance on document image classification. © 2022, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
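As a rough illustration of the mechanism the abstract mentions (splitting an image into patches and relating them with multi-head self-attention), the NumPy sketch below may help. It is not the authors' implementation: the function names, the 32x32 input, the patch size of 8, the embedding width of 64, the 4 heads, and the random weights are all assumptions chosen for illustration, and a real ViT would also add position embeddings, a class token, layer normalization, and MLP blocks.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over patch tokens, split into heads."""
    n, d = tokens.shape
    head_dim = d // num_heads

    def split_heads(x):
        # (n, d) -> (num_heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)
    # Pairwise patch-to-patch scores per head: (num_heads, n, n)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    weights = softmax(scores, axis=-1)
    # Merge heads back to (n, d) and apply the output projection.
    context = (weights @ v).transpose(1, 0, 2).reshape(n, d)
    return context @ w_o, weights


# Hypothetical toy input standing in for a scanned document page.
rng = np.random.default_rng(0)
page = rng.random((32, 32, 3))
patches = image_to_patches(page, patch_size=8)

# Assumed sizes and random (untrained) weights, for illustration only.
embed_dim, num_heads = 64, 4
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed
w_q, w_k, w_v, w_o = (
    rng.normal(scale=0.02, size=(embed_dim, embed_dim)) for _ in range(4)
)
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads)
```

Each row of `attn` holds one patch's attention weights over all patches in a given head, which is the sense in which the model relates image patches to one another.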
Source
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Volume
436 LNICST
Related Items
Related items shown by title, author, curator, and subject.
Development of hybrid artificial intelligence based automatic sleep/awake detection
Bozkurt, Mehmet Recep; Uçar, Muhammed Kürşad; Bozkurt, Ferda; Bilgin, Cahit (Inst Engineering Technology-Iet, 2020)
Background and Objective: Obstructive Sleep Apnea is a disease that causes respiratory arrest in sleep and reduces sleep quality. The diagnosis of the disease is made by the physician in two stages by examining the patient ...

Automated classification of choroidal neovascularization, diabetic macular edema, and drusen from retinal OCT images using vision transformers: a comparative study
Akça, Said; Garip, Zeynep; Ekinci, Ekin; Atban, Furkan (Springer Science and Business Media Deutschland GmbH, 2024)
Classifying retinal diseases is a complex problem because the early problematic areas of retinal disorders are quite small and conservative. In recent years, Transformer architectures have been successfully applied to solve ...

Enhancing Prostate Cancer Classification by Leveraging Key Radiomics Features and Using the Fine-Tuned Linear SVM Algorithm
Varan, Metin; Azimjonov, Jahongir; Macal, Bilgen (Institute of Electrical and Electronics Engineers Inc., 2023)
This paper focuses on enhancing machine learning (ML)-based diagnosis and clinical decision-making by leveraging radiomics data, which provides a quantitative description of grayscale medical images such as MRI, CT, PET, ...