Visual Words Dictionaries And Fusion Techniques For Searching People Through Textual And Visual Attributes

Searching for people by their personal traits is important in several application areas and has attracted growing attention from the scientific community in recent years. Practical applications in digital forensics and surveillance include locating a suspect or finding missing people in a public space. In this paper, we aim to assign describable visual attributes (e.g., white chubby male wearing glasses and with bangs) as labels to images to describe their appearance, and to perform visual searches without relying on image annotations during testing. To that end, we create mid-level representations for face images based on visual dictionaries that link visual properties in the images to describable attributes. In addition, we use machine learning techniques to combine different attributes when performing a query. First, we propose three methods for building the visual dictionaries. Method #1 uses sparse sampling to obtain low-level features and a clustering algorithm to build the visual dictionaries. Method #2 uses dense sampling to obtain low-level features and random selection to build the visual dictionaries. Method #3 uses dense sampling to obtain low-level features, followed by a clustering algorithm to build the visual dictionaries. We then train two-class classifiers for the describable visual attributes of interest; each classifier assigns every image a decision score, which is used to rank the images. For more complex queries (two or more attributes), we use three state-of-the-art approaches for combining the rankings: (1) product of probabilities, (2) rank aggregation and (3) rank position. To date, we have considered fifteen attribute classifiers and, consequently, their direct counterparts, which theoretically allows 2^15 = 32,768 different combined queries (the actual number is smaller, since some attributes are contradictory or mutually exclusive). Nevertheless, the method is easily extensible to new attributes. Experimental results show that Method #3 greatly improves retrieval precision for some attributes in comparison with other methods in the literature. Finally, for combined attributes, product of probabilities, rank aggregation and rank position yield complementary results for rank fusion and final decision making, suggesting interesting combinations for future work. © 2013 Elsevier B.V. All rights reserved.
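
To make the rank-fusion step for multi-attribute queries more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each attribute classifier outputs a probability-like score per image, with higher meaning a better match. The function names and the toy scores are hypothetical. The reciprocal-rank formula used here for "rank position" and the Borda count used as a simple stand-in for "rank aggregation" are also assumptions, since the abstract does not define either scheme.

```python
import numpy as np


def ranks_from_scores(scores):
    """Return 1-based rank positions (1 = best); a higher score is better."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks


def fuse_product(prob_matrix):
    """Product of probabilities: multiply per-attribute scores per image.

    prob_matrix has shape (n_attributes, n_images).
    Returns image indices ordered from best to worst.
    """
    fused = np.prod(np.asarray(prob_matrix, dtype=float), axis=0)
    return np.argsort(-fused, kind="stable")


def fuse_rank_position(prob_matrix):
    """Rank position, assumed here to be a reciprocal-rank combination.

    Each image gets 1 / sum_j (1 / rank_j); smaller fused values are better.
    """
    ranks = np.vstack([ranks_from_scores(row) for row in prob_matrix])
    fused = 1.0 / np.sum(1.0 / ranks, axis=0)
    return np.argsort(fused, kind="stable")


def fuse_borda(prob_matrix):
    """Borda count, used as an assumed stand-in for rank aggregation.

    An image at rank r among n images earns n - r points per attribute.
    """
    ranks = np.vstack([ranks_from_scores(row) for row in prob_matrix])
    n_images = ranks.shape[1]
    points = np.sum(n_images - ranks, axis=0)
    return np.argsort(-points, kind="stable")


# Hypothetical scores for a two-attribute query over four images.
toy_scores = [
    [0.9, 0.2, 0.6, 0.4],  # attribute A, e.g. "wearing glasses"
    [0.3, 0.8, 0.7, 0.1],  # attribute B, e.g. "with bangs"
]

print("product of probabilities:", fuse_product(toy_scores))
print("rank position:           ", fuse_rank_position(toy_scores))
print("Borda count:             ", fuse_borda(toy_scores))
```

On this toy input the three schemes order the images differently, which illustrates how fused rankings can disagree. It says nothing about which scheme performs better on real data.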

2010/05647-4; Microsoft Research

