Transfer Learning For Human Action Recognition

Lopes A.P.B.; Da Santos E.R.S.; Do Valle Jr. E.A.; De Almeida J.M.; De Araujo A.A.

Conference proceedings

Registered in:

9780769545486

Proceedings - 24th SIBGRAPI Conference on Graphics, Patterns and Images, pp. 352-359, 2011.

10.1109/SIBGRAPI.2011.41

http://www.scopus.com/inward/record.url?eid=2-s2.0-84857184838&partnerID=40&md5=6012fe35a567119f86462f4214fe2c84

http://www.repositorio.unicamp.br/handle/REPOSIP/108156

http://repositorio.unicamp.br/jspui/handle/REPOSIP/108156

2-s2.0-84857184838

http://repositorioslatinoamericanos.uchile.cl/handle/2250/1254175

Author

Lopes A.P.B.

Da Santos E.R.S.

Do Valle Jr. E.A.

De Almeida J.M.

De Araujo A.A.

Institution

Universidade Estadual de Campinas (Brasil)

Abstract

Manually collecting action samples from realistic videos is a time-consuming and error-prone task. This is a serious bottleneck for research on video understanding, since the large intra-class variations of such videos demand training sets large enough to encompass those variations. Most authors dealing with this issue rely on (semi-)automated procedures to collect additional, generally noisy, examples. In this paper, we explore a different approach, based on a Transfer Learning (TL) technique, to address the target task of action recognition. More specifically, we propose a framework that transfers knowledge about concepts from a previously labeled still-image database to the target action video database. The assumption is that, once identified in the target action database, these concepts provide contextual clues to the action classifier. Our experiments with the Caltech256 and Hollywood2 databases indicate that (a) transfer learning techniques can successfully be used to detect concepts, and (b) action recognition can indeed be enhanced with the transferred knowledge of even a few concepts. In our case, only four concepts were enough to obtain statistically significant improvements for most actions. © 2011 IEEE.
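The framework outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify features or classifiers, so the logistic-regression detectors, the per-video averaging of frame scores, and all function names below are assumptions made for illustration. Synthetic vectors stand in for descriptors from the still-image source database and the action video target database.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit a binary logistic-regression concept detector by gradient descent.

    Assumption: any probabilistic classifier could play this role.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        grad = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def concept_scores(frames, detectors):
    """Average each detector's probability over the frames of one video."""
    return np.array([sigmoid(frames @ w + b).mean() for w, b in detectors])


def augment(action_features, frames, detectors):
    """Append transferred concept scores to the video's own action descriptor."""
    return np.concatenate([action_features, concept_scores(frames, detectors)])


rng = np.random.default_rng(0)
dim, n_concepts = 8, 4  # four concepts, echoing the abstract; dim is arbitrary

# Source domain: labeled still images (synthetic stand-ins), one detector per concept.
detectors = []
for _ in range(n_concepts):
    pos = rng.normal(loc=1.0, size=(50, dim))
    neg = rng.normal(loc=-1.0, size=(50, dim))
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(50), np.zeros(50)])
    detectors.append(train_logistic(X, y))

# Target domain: one video, described by frame descriptors and an action descriptor.
frames = rng.normal(loc=1.0, size=(10, dim))
action_features = rng.normal(size=16)

# The augmented vector is what an action classifier would be trained on.
augmented = augment(action_features, frames, detectors)
print(augmented.shape)
```

In this sketch the concept scores act as the contextual clues: they are learned entirely from the image domain and only applied, not retrained, on the video domain.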


