Journal articles
Semantic information extraction from images of complex documents
Date
2012-12
Published in:
APPLIED INTELLIGENCE, DORDRECHT, v. 37, n. 4, suppl. 1, Part 1, pp. 543-557, DEC, 2012
0924-669X
10.1007/s10489-012-0348-x
Authors
Peanho, Claudio Antonio
Stagni, Henrique
Silva, Flavio Soares Correa da
Institution
Abstract
Even though the digital processing of documents is increasingly widespread in industry, printed documents are still widely used. To process the contents of printed documents electronically, information must be extracted from digital images of those documents. In complex documents, the contents of different regions and fields can be highly heterogeneous with respect to layout, printing quality, and the use of fonts and typing standards, which makes reconstructing the contents from digital images a difficult problem. In this article we present an efficient solution to this problem, in which the semantic contents of fields in a complex document are extracted from a digital image.