Convolutional Neural Networks for Document Image Classification

Le Kang, Jayant Kumar, Peng Ye, Yi Li and David Doermann


This paper presents a Convolutional Neural Network (CNN) for document image classification. In particular, document image classes are defined by structural similarity. Previous approaches rely on hand-crafted features to capture structural information. In contrast, we propose to learn features from raw image pixels using a CNN. The use of a CNN is motivated by the hierarchical nature of document layout. Equipped with rectified linear units and trained with dropout, our CNN performs well even when document layouts present large intra-class variations. Experiments on challenging public datasets demonstrate the effectiveness of the proposed approach.

Reference: International Conference on Pattern Recognition (ICPR 2014), pp. 3168-3172, August 2014.
