Identificação de escritores usando dissimilaridade em bases multi-script

Noya, Guilherme Pereira

bachelorThesis

Date

2017-06-21

Record in:

NOYA, Guilherme Pereira. Identificação de escritores usando dissimilaridade em bases multi-script. 2017. 46 f. Trabalho de Conclusão de Curso (Graduação) - Universidade Tecnológica Federal do Paraná, Campo Mourão, 2017.

http://repositorio.utfpr.edu.br/jspui/handle/1/6035

https://repositorioslatinoamericanos.uchile.cl/handle/2250/5242764

Author

Noya, Guilherme Pereira

Institution

Universidade Tecnológica Federal do Paraná (Brasil)

Abstract

Context: In pattern recognition, single-script scenarios have been studied extensively. More recently, researchers have begun to evaluate multi-script problems. As new studies are published, this branch is proving to be more complex and challenging than single-script scenarios. Several variations of the writer-dependent and writer-independent approaches exist, and a recent study that applied dissimilarity (a writer-independent approach) to a multi-script problem showed promising results. Objective: To evaluate the performance of the dissimilarity approach in multi-script and single-script scenarios, and to evaluate the identification rate when the training set and the test set come from different datasets. Method: Four multi-script datasets are used. Textures are generated from the documents in these datasets and divided into blocks, and features are then extracted with the LBP, RLBP and LPQ texture descriptors. Dissimilarity vectors are computed from the feature vectors, and the experiments are then run in each of the desired configurations. Finally, the results are combined to reach a final decision on the classification of the writers. Results: With the writer-dependent approach, the single-script experiments performed better than the multi-script ones, especially with LPQ. Dissimilarity improved the results in every case, reaching 100% identification accuracy in some of them. LPQ also gave excellent results in transfer learning. Conclusions: The experiments showed variations among the approaches used. The identification rates indicate that a multi-script configuration is more complex, and the use of dissimilarity provided a large performance gain on most of the datasets. It was also shown that performance remains satisfactory when training on one dataset and testing on another. Some questions were raised that can motivate new studies.
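
The abstract does not say how the dissimilarity vectors are computed. The following is a minimal sketch of one common formulation, assuming the dissimilarity vector is the element-wise absolute difference between two feature vectors, and that a pair is labeled positive when both blocks come from the same writer and negative otherwise. The function names, the toy 3-bin histograms and the writer labels are hypothetical and stand in for real LBP, RLBP or LPQ features.

```python
from itertools import combinations


def dissimilarity(u, v):
    """Element-wise absolute difference between two feature vectors (assumed formulation)."""
    return [abs(a - b) for a, b in zip(u, v)]


def build_pairs(features_by_writer):
    """Return (dissimilarity_vector, label) samples: 1 = same writer, 0 = different writers."""
    samples = []
    writers = sorted(features_by_writer)
    # Positive samples: two texture blocks from the same writer.
    for w in writers:
        for u, v in combinations(features_by_writer[w], 2):
            samples.append((dissimilarity(u, v), 1))
    # Negative samples: one block from each of two different writers.
    for w1, w2 in combinations(writers, 2):
        for u in features_by_writer[w1]:
            for v in features_by_writer[w2]:
                samples.append((dissimilarity(u, v), 0))
    return samples


# Hypothetical toy histograms, two texture blocks per writer.
features = {
    "writer_a": [[0.20, 0.50, 0.30], [0.25, 0.45, 0.30]],
    "writer_b": [[0.60, 0.10, 0.30], [0.55, 0.15, 0.30]],
}
pairs = build_pairs(features)
```

Under this formulation the classifier learns a two-class problem (same writer versus different writers) instead of one class per writer, which is why the approach is writer-independent and can, in principle, be tested on writers or datasets not seen during training.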

Subjects

Sistemas de reconhecimento de padrões

Escrita - Identificação

Computação

Pattern recognition systems

Writing - Identification

Computer science
