Document Image Classification with Vision Transformers

Sevim S.; Omurca S.İ.; Ekinci E.

dc.contributor.author	Sevim S.
dc.contributor.author	Omurca S.İ.
dc.contributor.author	Ekinci E.
dc.date.accessioned	2023-03-14T20:29:01Z
dc.date.available	2023-03-14T20:29:01Z
dc.date.issued	2022
dc.identifier.isbn	9.78303E+12
dc.identifier.issn	1867-8211
dc.identifier.uri	https://doi.org/10.1007/978-3-031-01984-5_6
dc.identifier.uri	https://hdl.handle.net/20.500.14002/1565
dc.description	1st International Congress of Electrical and Computer Engineering, ICECENG 2022 -- 9 February 2022 through 12 February 2022 -- -- 277759	en_US
dc.description.abstract	Document image classification has received considerable interest in business automation and plays an important role in document image processing (DIP) systems, so an effective framework for this task is needed. Many methods have been proposed in the literature for classifying document images. In this paper we propose an efficient document image classification method that uses vision transformers (ViTs) and benefits from the visual information of the document. Transformers are models developed for natural language processing tasks. Owing to their high performance, their structures have been modified and applied to other problems; ViT is one of these models. ViTs have demonstrated strong performance in computer vision tasks compared with baselines. A ViT scans the image and models the relations between image patches using multi-head self-attention. Experiments are conducted on a real-world dataset. Despite the limited size of the available training data, our method achieves acceptable performance on document image classification. © 2022, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.	en_US
dc.description.sponsorship	Kocaeli Üniversitesi: FBA-2020-2152	en_US
dc.description.sponsorship	Acknowledgments. This work has been supported by the Kocaeli University Scientific Research and Development Support Program (BAP) in Turkey under project number FBA-2020-2152.	en_US
dc.language.iso	eng	en_US
dc.publisher	Springer Science and Business Media Deutschland GmbH	en_US
dc.relation.ispartof	Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Deep learning	en_US
dc.subject	Document image classification	en_US
dc.subject	Transformers	en_US
dc.subject	Vision transformers	en_US
dc.subject	Classification (of information)	en_US
dc.subject	Information retrieval systems	en_US
dc.subject	Natural language processing systems	en_US
dc.subject	Automation process	en_US
dc.subject	Business automation	en_US
dc.subject	Document image processing	en_US
dc.subject	Document images	en_US
dc.subject	Images classification	en_US
dc.subject	Performance	en_US
dc.subject	Transformer	en_US
dc.subject	Vision transformer	en_US
dc.subject	Image classification	en_US
dc.title	Document Image Classification with Vision Transformers	en_US
dc.type	conferenceObject	en_US
dc.department	To be determined	en_US
dc.identifier.doi	10.1007/978-3-031-01984-5_6
dc.identifier.volume	436 LNICST	en_US
dc.identifier.startpage	68	en_US
dc.identifier.endpage	81	en_US
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
dc.authorscopusid	57219157474
dc.authorscopusid	55014691600
dc.authorscopusid	55293166200
dc.identifier.scopus	2-s2.0-85130276747	en_US

Files in this item:

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following collection(s).

Scopus Indexed Publications Collection [1179]

Related Items

Development of hybrid artificial intelligence based automatic sleep/awake detection

Automated classification of choroidal neovascularization, diabetic macular edema, and drusen from retinal OCT images using vision transformers: a comparative study

Enhancing Prostate Cancer Classification by Leveraging Key Radiomics Features and Using the Fine-Tuned Linear SVM Algorithm
