LBP Advantages over CNN Face Detection Method on Facial Recognition System in NOVA Robot
Network-optimized virtual assistant (NOVA) is a robot developed by Bandung Techno Park (BTP) that can interact with humans for various purposes, for example as a receptionist. The NOVA robot is still in development, and one of the main focuses is adding face recognition so that the robot can actively greet and interact with humans. Therefore, we propose a face recognition and tracking system based on neural networks. The system uses the Google FaceNet feature extraction method. Previously, face detection in the NOVA robot was implemented with the multi-task cascaded convolutional networks (MTCNN) method, whereas face tracking was realized with a modification of the MOSSE object tracking method. However, we found that the MTCNN implementation in the NOVA robot cannot run faster than 30 fps. This paper therefore investigates conventional face detection methods that could outperform MTCNN in this regard. Tests conducted on the ChokePoint dataset demonstrate that the system with local binary patterns (LBP) achieves a frame rate of 30.44 fps with a precision of 95% and a recall of 83%. The test results show that LBP is not only better than MTCNN at identifying faces but also more efficient to compute.
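The abstract does not describe how LBP features are computed or how the detector consumes them. For orientation only, the basic 3×3 LBP operator can be sketched as below. This is a minimal pure-Python illustration and not the NOVA implementation. The function names, the clockwise bit ordering, and the `>=` comparison convention are assumptions made for the example.

```python
# Sketch of the basic 3x3 local binary pattern (LBP) operator.
# Assumptions (not from the paper): helper names, clockwise bit order
# starting at the top-left neighbour, and ">=" as the comparison rule.

# Neighbour offsets (row, col), clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit to 1.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_image(image):
    """Compute LBP codes for every interior pixel of a 2D grayscale image."""
    height, width = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, width - 1)]
            for r in range(1, height - 1)]


def lbp_histogram(codes):
    """Count occurrences of each of the 256 possible LBP codes."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


if __name__ == "__main__":
    # Tiny made-up patch, used only to exercise the functions.
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 3, 8]]
    print(lbp_image(patch))
```

Each code needs only eight comparisons and a few bit operations per pixel, which is consistent with the abstract's observation that LBP is cheap to compute. A face detector would typically feed such features, or histograms of them, to a trained classifier, a step this sketch leaves out.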
Copyright (c) 2020 Luqman Bramantyo Rahmadi, Kemas Muslim Lhaksmana
This work is licensed under a Creative Commons Attribution 4.0 International License.
- A manuscript submitted to IndoJC must be the original work of the author(s), must contain no element of plagiarism, and must not have been published or be under consideration for publication in other journals.
- Copyright on any article is retained by the author(s). Regarding copyright transfers please see below.
- Authors grant IndoJC a license to publish the article and identify itself as the original publisher.
- Authors grant IndoJC commercial rights to produce hardcopy volumes of the journal for sale to libraries and individuals.
- Authors grant any third party the right to use the article freely as long as its original authors and citation details are identified.
- The article and any associated published material are distributed under the Creative Commons Attribution 4.0 License.