Conference proceedings
Transfer Learning For Human Action Recognition
Registered in:
9780769545486
Proceedings - 24th SIBGRAPI Conference on Graphics, Patterns and Images, pp. 352-359, 2011.
10.1109/SIBGRAPI.2011.41
2-s2.0-84857184838
Author
Lopes A.P.B.
Da Santos E.R.S.
Do Valle Jr. E.A.
De Almeida J.M.
De Araujo A.A.
Institution
Abstract
Manually collecting action samples from realistic videos is a time-consuming and error-prone task. This is a serious bottleneck for research on video understanding, since the large intra-class variations of such videos demand training sets large enough to properly encompass those variations. Most authors dealing with this issue rely on (semi-)automated procedures to collect additional, generally noisy, examples. In this paper, we take a different approach, based on a Transfer Learning (TL) technique, to address the target task of action recognition. More specifically, we propose a framework that transfers knowledge about concepts from a previously labeled still-image database to the target action video database. It is assumed that, once identified in the target action database, these concepts provide contextual clues to the action classifier. Our experiments with the Caltech256 and Hollywood2 databases indicate (a) that transfer learning techniques can successfully detect concepts and (b) that action recognition can indeed be enhanced with the transferred knowledge of even a few concepts. In our case, only four concepts were enough to obtain statistically significant improvements for most actions. © 2011 IEEE.
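The general idea in the abstract (concept detectors learned on still images supplying contextual features to a video action classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the data are synthetic, the least-squares scorer is a stand-in for whatever classifier the paper actually uses, and the names `fit_linear`, `score`, and `augment_with_concepts` are hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_linear(X, y):
    """Fit a least-squares linear scorer with a bias term (illustrative stand-in classifier)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def score(w, X):
    """Return one real-valued score per row of X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def augment_with_concepts(video_feats, frame_feats, concept_models):
    """Append one transferred concept score per concept to each video descriptor."""
    concept_scores = np.column_stack([score(w, frame_feats) for w in concept_models])
    return np.hstack([video_feats, concept_scores])

# Synthetic stand-ins for the two databases (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
n_images, n_videos, d_img, d_vid, n_concepts = 200, 60, 16, 8, 4

# Source domain: labeled still images, one binary (+1/-1) label per concept.
image_feats = rng.normal(size=(n_images, d_img))
concept_labels = np.sign(rng.normal(size=(n_images, n_concepts)))
concept_models = [fit_linear(image_feats, concept_labels[:, k])
                  for k in range(n_concepts)]

# Target domain: each video has a motion descriptor and a key-frame descriptor
# living in the same feature space as the source images.
frame_feats = rng.normal(size=(n_videos, d_img))
video_feats = rng.normal(size=(n_videos, d_vid))
action_labels = np.sign(rng.normal(size=n_videos))

# Transfer step: score the video frames with the image-trained concept models,
# then train the action classifier on the augmented descriptors.
augmented = augment_with_concepts(video_feats, frame_feats, concept_models)
action_model = fit_linear(augmented, action_labels)
predictions = np.sign(score(action_model, augmented))
```

With four concepts, each video descriptor grows by four dimensions. Whether those extra dimensions help depends on how well the concepts correlate with the actions, which random data like the above cannot show.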