Conference proceedings
BossaNova at ImageCLEF 2012 Flickr Photo Annotation Task
Published in:
CEUR Workshop Proceedings. CEUR-WS, v. 1178, 2012.
16130073
2-s2.0-84922031785
Authors
Avila S.
Thome N.
Cord M.
Valle E.
De Araujo A.
Institution
Abstract
We present the BossaNova scheme for the ImageCLEF 2012 Flickr Photo Annotation Task. BossaNova is a mid-level image representation, recently developed by our team, that enriches the Bag-of-Words representation by keeping a histogram of distances between the descriptors found in the image and those in the codebook. Our scheme has the advantage of being conceptually simple, non-parametric, and easily adaptable. Compared to other schemes in the literature that add information to the Bag-of-Words model, it leads to much more compact representations. Furthermore, it complements the cutting-edge Fisher Vector representations well, showing even better results when employed in combination with them. In our participation, we submitted four purely visual runs. Our best result (MiAP = 34.37%) ranked second by the MiAP measure among the 28 purely visual submissions and the 18 teams.
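The pooling idea described in the abstract, a histogram of distances between local descriptors and codebook entries, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `distance_histogram_pooling`, the parameters `n_bins` and `max_dist`, the use of Euclidean distance, and the normalization by the number of descriptors are all assumptions made for illustration. The published method may differ in details that the abstract does not state.

```python
import numpy as np

def distance_histogram_pooling(descriptors, codebook, n_bins=4, max_dist=None):
    """Illustrative sketch: one distance histogram per codeword.

    descriptors: (N, D) array of local descriptors from one image.
    codebook:    (K, D) array of codewords.
    Returns a flat vector of length K * n_bins.
    All parameter names and defaults here are hypothetical.
    """
    # Pairwise Euclidean distances between descriptors and codewords, shape (N, K).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    # Assumption: a single shared distance range for all codewords.
    if max_dist is None:
        max_dist = max(dists.max(), 1e-12)
    edges = np.linspace(0.0, max_dist, n_bins + 1)

    # For each codeword, count how many descriptors fall in each distance bin.
    hist = np.stack([
        np.histogram(dists[:, k], bins=edges)[0]
        for k in range(codebook.shape[0])
    ]).astype(float)

    # Assumption: normalize by the number of descriptors in the image.
    hist /= max(len(descriptors), 1)
    return hist.ravel()
```

With `K` codewords and `n_bins` bins the output has `K * n_bins` dimensions. A plain Bag-of-Words histogram would instead keep a single count per codeword, so the extra bins are where the additional distance information is stored.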