Search
Showing items 1-10 of 8384
Visualizing the document pre-processing effects in text mining process
(2018-01-01)
Text mining is an important step to categorize textual data by using data mining techniques. As most obtained textual data is unstructured, it needs to be processed before applying mining algorithms – that process is known ...
Unsupervised multi-language handwritten text line segmentation
(Journal of Intelligent & Fuzzy Systems, 2018)
Analysis of document pre-processing effects in text and opinion mining
(2018-04-20)
Typically, textual information is available as unstructured data, which require processing so that data mining algorithms can handle such data; this processing is known as the pre-processing step in the overall text mining ...
Document Image Processing
Document Image Processing allows systems like OCR, writer identification, writer recognition, check processing, historical document processing, etc., to extract useful information from document images. What we call a ...
Handwritten Arabic Documents Segmentation into Text Lines using Seam Carving
Inspired by human perception and by common text document characteristics based on readability constraints, an Arabic text line segmentation approach is proposed using seam carving. Taking the gray scale of the image as ...
Hybrid visualization approach to show documents similarity and content in a single view
(2018-05-23)
Multidimensional projection techniques can be employed to project datasets from a higher to a lower dimensional space (e.g., 2D space). These techniques can be used to present the relationships of dataset instances based ...
Detection of Text Lines of Handwritten Arabic Manuscripts using Markov Decision Processes
In character recognition systems, the segmentation phase is critical since the accuracy of the recognition depends strongly on it. In this paper we present an approach based on Markov Decision Processes to extract text ...
Segmentation of Arabic Handwritten Documents into Text Lines using Watershed Transform
A crucial task in character recognition systems is the segmentation of the document into text lines, especially if it is handwritten. When dealing with non-Latin documents such as Arabic, the challenge becomes greater ...
Document Level Emotion Tagging: Machine Learning and Resource Based Approach
(Revista Computación y Sistemas; Vol. 15 No. 2, 2011-12-13)
Abstract. The present task involves the identification of emotions from Bengali blog documents using two separate approaches. The first one is a machine learning approach that accumulates document level information from ...