dc.contributor.author: Sevim S.
dc.contributor.author: Omurca S.İ.
dc.contributor.author: Ekinci E.
dc.date.accessioned: 2023-03-14T20:29:01Z
dc.date.available: 2023-03-14T20:29:01Z
dc.date.issued: 2022
dc.identifier.isbn: 9.78303E+12
dc.identifier.issn: 1867-8211
dc.identifier.uri: https://doi.org/10.1007/978-3-031-01984-5_6
dc.identifier.uri: https://hdl.handle.net/20.500.14002/1565
dc.description: 1st International Congress of Electrical and Computer Engineering, ICECENG 2022 -- 9 February 2022 through 12 February 2022 -- -- 277759 [en_US]
dc.description.abstract: Document image classification has attracted considerable interest in business automation and plays an important role in document image processing (DIP) systems, so an effective framework for this task is needed. Many methods have been proposed in the literature for classifying document images. In this paper we propose an efficient document image classification method that uses vision transformers (ViTs) and benefits from the visual information of the document. Transformers are models developed for natural language processing tasks. Owing to their high performance, their structures have been modified and they have been applied to other problems. ViT is one of these models. ViTs have demonstrated impressive performance in computer vision tasks compared with baselines. A ViT scans the image and models the relations between image patches using multi-head self-attention. Experiments are conducted on a real-world dataset. Despite the limited size of the available training data, our method achieves acceptable performance on document image classification. © 2022, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. [en_US]
dc.description.sponsorship: Kocaeli Üniversitesi: FBA-2020-2152 [en_US]
dc.description.sponsorship: Acknowledgments. This work has been supported by the Kocaeli University Scientific Research and Development Support Program (BAP) in Turkey under project number FBA-2020-2152. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Springer Science and Business Media Deutschland GmbH [en_US]
dc.relation.ispartof: Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Document image classification [en_US]
dc.subject: Transformers [en_US]
dc.subject: Vision transformers [en_US]
dc.subject: Classification (of information) [en_US]
dc.subject: Information retrieval systems [en_US]
dc.subject: Natural language processing systems [en_US]
dc.subject: Automation process [en_US]
dc.subject: Business automation [en_US]
dc.subject: Document image processing [en_US]
dc.subject: Document images [en_US]
dc.subject: Images classification [en_US]
dc.subject: Performance [en_US]
dc.subject: Transformer [en_US]
dc.subject: Vision transformer [en_US]
dc.subject: Image classification [en_US]
dc.title: Document Image Classification with Vision Transformers [en_US]
dc.type: conferenceObject [en_US]
dc.department: To be determined [en_US]
dc.identifier.doi: 10.1007/978-3-031-01984-5_6
dc.identifier.volume: 436 LNICST [en_US]
dc.identifier.startpage: 68 [en_US]
dc.identifier.endpage: 81 [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.authorscopusid: 57219157474
dc.authorscopusid: 55014691600
dc.authorscopusid: 55293166200
dc.identifier.scopus: 2-s2.0-85130276747 [en_US]


