Citation: W. Jia, W. Xia, Y. Zhao, H. Min, Y. X. Chen. 2D and 3D palmprint and palm vein recognition based on neural architecture search. International Journal of Automation and Computing. http://doi.org/10.1007/s11633-021-1292-1

2D and 3D Palmprint and Palm Vein Recognition Based on Neural Architecture Search

Author Biography:
  • Wei Jia received the B. Sc. degree in informatics from Central China Normal University, China in 1998, the M. Sc. degree in computer science from Hefei University of Technology, China in 2004, and the Ph. D. degree in pattern recognition and intelligence system from University of Science and Technology of China, China in 2008. He was a research associate professor at Hefei Institutes of Physical Sciences, Chinese Academy of Sciences, China from 2008 to 2016. He is currently an associate professor at the Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, and the School of Computer Science and Information Engineering, Hefei University of Technology, China. His research interests include computer vision, biometrics, pattern recognition, image processing and machine learning. E-mail: jiawei@hfut.edu.cn (Corresponding author) ORCID iD: 0000-0001-5628-6237

    Wei Xia received the B. Sc. degree in computer science from Anhui University of Science and Technology, China in 2018. He is a master student in School of Computer Science and Information Engineering, Hefei University of Technology, China. His research interests include biometrics, pattern recognition and image processing. E-mail: hewelxw@mail.hfut.edu.cn

    Yang Zhao received the B. Eng. degree in automation from University of Science and Technology of China, China in 2008, and the Ph. D. degree in pattern recognition and intelligence system from University of Science and Technology of China, China in 2013. From 2013 to 2015, he was a postdoctoral researcher at School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, China. Currently, he is an associate professor at School of Computer Science and Information Engineering, Hefei University of Technology, China. His research interests include image processing and computer vision. E-mail: yzhao@hfut.edu.cn

    Hai Min received the Ph. D. degree in pattern recognition and intelligence system from the University of Science and Technology of China, China in 2014. He is currently an associate professor in School of Computer Science and Information Engineering, Hefei University of Technology, China. His research interests include pattern recognition and image segmentation. E-mail: minhai361@aliyun.com

    Yan-Xiang Chen received the B. Sc. and M. Sc. degrees in electronic information engineering from Hefei University of Technology, China in 1993 and 1996, respectively, and the Ph. D. degree in signal and information processing from University of Science and Technology of China, China in 2004. She was a visiting scholar at the University of Illinois at Urbana-Champaign, USA from 2006 to 2008, and at the National University of Singapore, Singapore from 2012 to 2013. She is currently a professor at the School of Computer Science and Information Engineering, Hefei University of Technology, China. Her research interests include audio-visual signal processing, saliency and machine learning. E-mail: chenyx@hfut.edu.cn

  • Received: 2021-01-07
  • Accepted: 2021-03-05
  • Published Online: 2021-04-13
  • [1] A. Kong, D. Zhang, M. Kamel.  A survey of palmprint recognition[J]. Pattern Recognition, 2009, 42(7): 1408-1418. doi: 10.1016/j.patcog.2009.01.018
    [2] D. Zhang, W. M. Zuo, F. Yue.  A comparative study of palmprint recognition algorithms[J]. ACM Computing Surveys, 2012, 44(1): Article 2. doi: 10.1145/2071389.2071391
    [3] L. K. Fei, G. M. Lu, W. Jia, S. H. Teng, D. Zhang.  Feature extraction methods for palmprint recognition: A survey and evaluation[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019, 49(2): 346-363. doi: 10.1109/TSMC.2018.2795609
    [4] D. X. Zhong, X. F. Du, K. C. Zhong.  Decade progress of palmprint recognition: A brief survey[J]. Neurocomputing, 2019, 328: 16-28. doi: 10.1016/j.neucom.2018.03.081
    [5] B. Hu, J. C. Wang.  Deep learning based hand gesture recognition and UAV flight controls[J]. International Journal of Automation and Computing, 2020, 17(1): 17-29. doi: 10.1007/s11633-019-1194-7
    [6] V. K. Ha, J. C. Ren, X. Y. Xu, S. Zhao, G. Xie, V. Masero, A. Hussain.  Deep learning based single image super-resolution: A survey[J]. International Journal of Automation and Computing, 2019, 16(4): 413-426. doi: 10.1007/s11633-019-1183-x
    [7] C. L. Li, X. H. Wu, N. Zhao, X. C. Cao, J. Tang.  Fusing two-stream convolutional neural networks for RGB-T object tracking[J]. Neurocomputing, 2018, 281: 78-85. doi: 10.1016/j.neucom.2017.11.068
    [8] C. L. Li, X. Y. Liang, Y. J. Lu, N. Zhao, J. Tang.  RGB-T object tracking: Benchmark and baseline[J]. Pattern Recognition, 2019, 96: 106977. doi: 10.1016/j.patcog.2019.106977
    [9] K. Sundararajan, D. L. Woodard.  Deep learning for biometrics: A survey[J]. ACM Computing Surveys, 2018, 51(3): Article 65. doi: 10.1145/3190618
    [10] T. Elsken, J. H. Metzen, F. Hutter.  Neural architecture search: A survey[J]. Journal of Machine Learning Research, 2019, 20(55): 1-21.
    [11] M. Wistuba, A. Rawat, T. Pedapati. A survey on neural architecture search. [Online], Available: https://arxiv.org/abs/1905.01392, 2019.
    [12] P. Z. Ren, Y. Xiao, X. J. Chang, P. Y. Huang, Z. H. Li, X. J. Chen, X. Wang. A comprehensive survey of neural architecture search: Challenges and solutions. [Online], Available: https://arxiv.org/abs/2006.02903, 2020.
    [13] Y. Q. Hu, Y. Yu.  A technical view on neural architecture search[J]. International Journal of Machine Learning and Cybernetics, 2020, 11(4): 795-811. doi: 10.1007/s13042-020-01062-1
    [14] X. He, K. Y. Zhao, X. W. Chu.  AutoML: A survey of the state-of-the-art[J]. Knowledge-Based Systems, 2021, 212: 106622. doi: 10.1016/j.knosys.2020.106622
    [15] B. Zoph, Q. V. Le. Neural architecture search with reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [16] D. Zhang, W. K. Kong, J. You, M. Wong.  Online palmprint identification[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(9): 1041-1050. doi: 10.1109/TPAMI.2003.1227981
    [17] D. Zhang, Z. H. Guo, G. M. Lu, L. Zhang, W. M. Zuo.  An online system of multispectral palmprint verification[J]. IEEE Transactions on Instrumentation and Measurement, 2010, 59(2): 480-490. doi: 10.1109/TIM.2009.2028772
    [18] W. Jia, B. Zhang, J. T. Lu, Y. H. Zhu, Y. Zhao, W. M. Zuo, H. B. Ling.  Palmprint recognition based on complete direction representation[J]. IEEE Transactions on Image Processing, 2017, 26(9): 4483-4498. doi: 10.1109/TIP.2017.2705424
    [19] W. Jia, R. X. Hu, J. Gui, Y. Zhao, X. M. Ren.  Palmprint recognition across different devices[J]. Sensors, 2012, 12(6): 7938-7964. doi: 10.3390/s120607938
    [20] L. Zhang, L. D. Li, A. Q. Yang, Y. Shen, M. Yang.  Towards contactless palmprint recognition: A novel device, a new benchmark, and a collaborative representation based identification approach[J]. Pattern Recognition, 2017, 69: 199-212. doi: 10.1016/j.patcog.2017.04.016
    [21] W. Li, D. Zhang, L. Zhang, G. M. Lu, J. Q. Yan.  3-D palmprint recognition with joint line and orientation features[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2011, 41(2): 274-279. doi: 10.1109/TSMCC.2010.2055849
    [22] L. Zhang, Z. X. Cheng, Y. Shen, D. Q. Wang.  Palmprint and palmvein recognition based on DCNN and a new large-scale contactless palmvein dataset[J]. Symmetry, 2018, 10(4): 78. doi: 10.3390/sym10040078
    [23] L. K. Fei, B. Zhang, W. Jia, J. Wen, D. Zhang.  Feature extraction for 3-D palmprint recognition: A survey[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 69(3): 645-656. doi: 10.1109/TIM.2020.2964076
    [24] A. Genovese, V. Piuri, K. N. Plataniotis, F. Scotti.  PalmNet: Gabor-PCA convolutional networks for touchless palmprint recognition[J]. IEEE Transactions on Information Forensics and Security, 2019, 14(12): 3160-3174. doi: 10.1109/TIFS.2019.2911165
    [25] D. X. Zhong, J. S. Zhu.  Centralized large margin cosine loss for open-set deep palmprint recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(6): 1559-1568. doi: 10.1109/TCSVT.2019.2904283
    [26] W. M. Matkowski, T. T. Chai, A. W. K. Kong.  Palmprint recognition in uncontrolled and uncooperative environment[J]. IEEE Transactions on Information Forensics and Security, 2019, 15: 1601-1615. doi: 10.1109/TIFS.2019.2945183
    [27] S. P. Zhao, B. Zhang.  Deep discriminative representation for generic palmprint recognition[J]. Pattern Recognition, 2020, 98: 107071. doi: 10.1016/j.patcog.2019.107071
    [28] S. P. Zhao, B. Zhang.  Joint constrained least-square regression with deep convolutional feature for palmprint recognition[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020. doi: 10.1109/TSMC.2020.3003021
    [29] S. P. Zhao, B. Zhang, C. L. P. Chen.  Joint deep convolutional feature representation for hyperspectral palmprint recognition[J]. Information Sciences, 2019, 489: 167-181. doi: 10.1016/j.ins.2019.03.027
    [30] Y. Liu, A. Kumar.  Contactless palmprint identification using deeply learned residual features[J]. IEEE Transactions on Biometrics, Behavior, and Identity Science, 2020, 2(2): 172-181. doi: 10.1109/TBIOM.2020.2967073
    [31] S. Lefkovits, L. Lefkovits, L. Szilágyi. Applications of different CNN architectures for palm vein identification. In Proceedings of the 16th International Conference on Modeling Decisions for Artificial Intelligence, Springer, Milan, Italy, vol. 11676, pp. 295−306, 2019.
    [32] D. Thapar, G. Jaswal, A. Nigam, V. Kanhangad. PVSNet: Palm vein authentication siamese network trained using triplet loss and adaptive hard mining by learning enforced domain specific features. In Proceedings of the 5th IEEE International Conference on Identity, Security, and Behavior Analysis, IEEE, Hyderabad, India, pp. 1−8, 2019.
    [33] S. Chantaf, A. Hilal, R. Elsaleh. Palm vein biometric authentication using convolutional neural networks. In Proceedings of the 8th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications, Springer, Maghreb, Tunisia, vol. 146, pp. 352−363, 2020.
    [34] M. Stanuch, M. Wodzinski, A. Skalski.  Contact-free multispectral identity verification system using palm veins and deep neural network[J]. Sensors, 2020, 20(19): 5695. doi: 10.3390/s20195695
    [35] W. Jia, J. Gao, W. Xia, Y. Zhao, H. Min, J. T. Lu.  A performance evaluation of classic convolutional neural networks for 2D and 3D palmprint and palm vein recognition[J]. International Journal of Automation and Computing, 2021, 18(1): 18-44. doi: 10.1007/s11633-020-1257-9
    [36] B. Baker, O. Gupta, N. Naik, R. Raskar. Designing neural network architectures using reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [37] R. Shin, C. Packer, D. Song. Differentiable neural network architecture search. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [38] K. Kandasamy, W. Neiswanger, J. Schneider, B. Póczos, E. P. Xing. Neural architecture search with Bayesian optimisation and optimal transport. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2016−2025, 2018.
    [39] C. X. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L. J. Li, L. Fei-Fei, A. Yuille, J. Huang, K. Murphy. Progressive neural architecture search. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, vol. 11205, pp. 19−35, 2018.
    [40] R. Q. Luo, F. Tian, T. Qin, E. H. Chen, T. Y. Liu. Neural architecture optimization. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 7816−7827, 2018.
    [41] A. Brock, T. Lim, J. M. Ritchie, N. Weston. SMASH: One-shot model architecture search through hyperNetworks. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [42] G. Bender, P. J. Kindermans, B. Zoph, V. Vasudevan, Q. Le. Understanding and simplifying one-shot architecture search. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, pp. 883−893, 2018.
    [43] Z. Zhong, J. J. Yan, W. Wu, J. Shao, C. L. Liu. Practical block-wise neural network architecture generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 2423−2432, 2018.
    [44] B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le. Learning transferable architectures for scalable image recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8697−8710, 2018.
    [45] T. J. Yang, A. Howard, B. Chen, X. Zhang, A. Go, M. Sandler, V. Sze, H. Adam. NetAdapt: Platform-aware neural network adaptation for mobile applications. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, vol. 11214, pp. 289−304, 2018.
    [46] C. Ying, A. Klein, E. Christiansen, E. Real, K. Murphy, F. Hutter. NAS-Bench-101: Towards reproducible neural architecture search. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 12334−12348, 2019.
    [47] M. X. Tan, Q. V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 10691−10700, 2019.
    [48] X. X. Chu, B. Zhang, R. J. Xu, J. X. Li. FairNAS: Rethinking evaluation fairness of weight sharing neural architecture search. [Online], Available: https://arxiv.org/abs/1907.01845, 2019.
    [49] D. Ho, E. Liang, I. Stoica, P. Abbeel, X. Chen. Population based augmentation: Efficient learning of augmentation policy schedules. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 4843−4856, 2019.
    [50] E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, Q. V. Le. AutoAugment: Learning augmentation strategies from data. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 113−123, 2019.
    [51] E. Real, A. Aggarwal, Y. Huang, Q. V. Le. Regularized evolution for image classifier architecture search. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 31st Innovative Applications of Artificial Intelligence Conference and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI, Honolulu, USA, pp. 4780−4789, 2019.
    [52] V. Nekrasov, H. Chen, C. H. Shen, I. Reid. Fast neural architecture search of compact semantic segmentation models via auxiliary cells. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 9118−9127, 2019.
    [53] X. X. Chu, B. Zhang, J. X. Li, Q. Y. Li, R. J. Xu. ScarletNAS: Bridging the gap between scalability and fairness in neural architecture search. [Online], Available: https://arxiv.org/abs/1908.06022, 2019.
    [54] J. M. Pérez-Rúa, V. Vielzeuf, S. Pateux, M. Baccouche, F. Jurie. MFAS: Multimodal fusion architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 6959−6968, 2019.
    [55] M. X. Tan, B. Chen, R. M. Pang, V. Vasudevan, M. Sandler, A. Howard, Q. V. Le. MnasNet: Platform-aware neural architecture search for mobile. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2815−2823, 2019.
    [56] C. X. Liu, L. C. Chen, F. Schroff, H. Adam, W. Hua, A. L. Yuille, L. Fei-Fei. Auto-DeepLab: Hierarchical neural architecture search for semantic image segmentation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 82−92, 2019.
    [57] Y. K. Chen, G. F. Meng, Q. Zhang, S. M. Xiang, C. Huang, L. S. Mu, X. G. Wang. RENAS: Reinforced evolutionary neural architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4782−4791, 2019.
    [58] B. C. Wu, X. L. Dai, P. Z. Zhang, Y. H. Wang, F. Sun, Y. M. Wu, Y. D. Tian, P. Vajda, Y. Q. Jia, K. Keutzer. FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10726−10734, 2019.
    [59] X. Li, Y. M. Zhou, Z. Pan, J. S. Feng. Partial order pruning: For best speed/accuracy trade-off in neural architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 9137−9145, 2019.
    [60] X. Y. Dong, Y. Yang. Searching for a robust neural architecture in four GPU hours. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1761−1770, 2019.
    [61] H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, J. Dean. Efficient neural architecture search via parameter sharing. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, pp. 6522−6531, 2018.
    [62] S. R. Xie, H. H. Zheng, C. X. Liu, L. Lin. SNAS: Stochastic neural architecture search. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [63] T. Elsken, J. H. Metzen, F. Hutter. Efficient multi-objective neural architecture search via Lamarckian evolution. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [64] G. Ghiasi, T. Y. Lin, Q. V. Le. NAS-FPN: Learning scalable feature pyramid architecture for object detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 7029−7038, 2019. DOI: 10.1109/CVPR.2019.00720.
    [65] C. Zhang, M. Y. Ren, R. Urtasun. Graph HyperNetworks for neural architecture search. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [66] H. Cai, L. Zhu, S. Han. ProxylessNAS: Direct neural architecture search on target task and hardware. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [67] H. X. Liu, K. Simonyan, Y. M. Yang. DARTS: Differentiable architecture search. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [68] N. Nayman, A. Noy, T. Ridnik, I. Friedman, R. Jin, L. Zelnik-Manor. XNAS: Neural architecture search with expert advice. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 1975−1985, 2019.
    [69] J. R. Peng, M. Sun, Z. X. Zhang, T. N. Tan, J. J. Yan. Efficient neural architecture transformation search in channel-level for object detection. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 14290−14299, 2019.
    [70] H. Z. Hu, J. Langford, R. Caruana, S. Mukherjee, E. Horvitz, D. Dey. Efficient forward architecture search. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 10122−10131, 2019.
    [71] X. Y. Dong, Y. Yang. Network pruning via transformable architecture search. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 759−770, 2019.
    [72] Y. K. Chen, T. Yang, X. Y. Zhang, G. F. Meng, X. Y. Xiao, J. Sun. DetNAS: Backbone search for object detection. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 6638−6648, 2019.
    [73] M. Wortsman, A. Farhadi, M. Rastegari. Discovering neural wirings. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 2680−2690, 2019.
    [74] X. Y. Gong, S. Y. Chang, Y. F. Jiang, Z. Y. Wang. AutoGAN: Neural architecture search for generative adversarial networks. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 3223−3233, 2019.
    [75] X. Y. Dong, Y. Yang. One-shot neural architecture search via self-evaluated template network. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 3680−3689, 2019.
    [76] Y. Y. Xiong, R. Mehta, V. Singh. Resource constrained neural network architecture search: Will a submodularity assumption help? In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1901−1910, 2019.
    [77] A. Howard, M. Sandler, B. Chen, W. J. Wang, L. C. Chen, M. X. Tan, G. Chu, V. Vasudevan, Y. K. Zhu, R. M. Pang, H. Adam, Q. Le. Searching for MobileNetV3. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1314−1324, 2019.
    [78] X. W. Zheng, R. R. Ji, L. Tang, B. C. Zhang, J. Z. Liu, Q. Tian. Multinomial distribution learning for effective neural architecture search. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1304−1313, 2019.
    [79] R. Pasunuru, M. Bansal. Continual and multi-task architecture search. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, ACL, Florence, Italy, pp. 1911−1922, 2019.
    [80] Y. F. Jiang, C. Hu, T. Xiao, C. L. Zhang, J. B. Zhu. Improved differentiable architecture search for language modeling and named entity recognition. In Proceedings of Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Association for Computational Linguistics, Hong Kong, China, pp. 3585−3590, 2019.
    [81] L. Li, A. Talwalkar. Random search and reproducibility for neural architecture search. In Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence, Tel Aviv, Israel, pp. 367−377, 2019.
    [82] X. X. Chu, B. Zhang, R. J. Xu. MoGA: Searching beyond MobileNetV3. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Barcelona, Spain, pp. 4042−4046, 2020.
    [83] W. Y. Chen, X. Y. Gong, X. M. Liu, Q. Zhang, Y. Li, Z. Y. Wang. FasterSeg: Searching for faster real-time semantic segmentation. [Online], Available: https://arxiv.org/abs/1912.10917, 2020.
    [84] Y. H. Xu, L. X. Xie, X. P. Zhang, X. Chen, G. J. Qi, Q. Tian, H. K. Xiong. PC-DARTS: Partial channel connections for memory-efficient architecture search. [Online], Available: https://arxiv.org/abs/1907.05737, 2019.
    [85] J. R. Mei, Y. W. Li, X. C. Lian, X. J. Jin, L. J. Yang, A. Yuille, J. C. Yang. AtomNAS: Fine-grained end-to-end neural architecture search. [Online], Available: https://arxiv.org/abs/1912.09640, 2020.
    [86] X. Y. Dong, Y. Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. [Online], Available: https://arxiv.org/abs/2001.00326, 2020.
    [87] M. X. Tan, R. M. Pang, Q. V. Le. EfficientDet: Scalable and efficient object detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 10778−10787, 2020.
    [88] J. M. Fang, Y. Z. Sun, Q. Zhang, Y. Li, W. Y. Liu, X. G. Wang. Densely connected search space for more flexible neural architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 10625−10634, 2020.
    [89] M. Zhang, H. Q. Li, S. R. Pan, X. J. Chang, S. Su. Overcoming multi-model forgetting in one-shot NAS with diversity maximization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7806−7815, 2020.
    [90] C. L. Li, J. F. Peng, L. C. Yuan, G. R. Wang, X. D. Liang, L. Lin, X. J. Chang. Block-wisely supervised neural architecture search with knowledge distillation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 1986−1995, 2020.
    [91] M. H. Guo, Y. Z. Yang, R. Xu, Z. W. Liu, D. H. Lin. When NAS meets robustness: In search of robust architectures against adversarial attacks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 628−637, 2020.
    [92] C. Gao, Y. P. Chen, S. Liu, Z. X. Tan, S. C. Yan. AdversarialNAS: Adversarial neural architecture search for GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5679−5688, 2020.
    [93] A. Wan, X. L. Dai, P. Z. Zhang, Z. J. He, Y. D. Tian, S. Xie, B. C. Wu, M. Yu, T. Xu, K. Chen, P. Vajda, J. E. Gonzalez. FBNetV2: Differentiable neural architecture search for spatial and channel dimensions. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 12962−12971, 2020.
    [94] G. Bender, H. X. Liu, B. Chen, G. Chu, S. Y. Cheng, P. J. Kindermans, Q. V. Le. Can weight sharing outperform random architecture search? An investigation with TuNAS. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 14311−14320, 2020.
    [95] G. H. Li, G. C. Qian, I. C. Delgadillo, M. Müller, A. Thabet, B. Ghanem. SGAS: Sequential greedy architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 1617−1627, 2020.
    [96] X. W. Zheng, R. R. Ji, Q. Wang, Q. X. Ye, Z. G. Li, Y. H. Tian, Q. Tian. Rethinking performance estimation in neural architecture search. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 11353−11362, 2020.
    [97] H. Phan, Z. C. Liu, D. Huynh, M. Savvides, K. T. Cheng, Z. Q. Shen. Binarizing MobileNet via evolution-based searching. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 13417−13426, 2020.
    [98] C. Y. He, H. S. Ye, L. Shen, T. Zhang. MiLeNAS: Efficient neural architecture search via mixed-level reformulation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 11990−11999, 2020.
    [99] X. Y. Dai, D. D. Chen, M. C. Liu, Y. P. Chen, L. Yuan. DA-NAS: Data adapted pruning for efficient neural architecture search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12372, pp. 584−600, 2020.
    [100] Y. Tian, Q. Wang, Z. W. Huang, W. Li, D. X. Dai, M. H. Yang, J. Wang, O. Fink. Off-policy reinforcement learning for efficient and effective GAN architecture search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12352, pp. 175−192, 2020.
    [101] X. X. Chu, T. B. Zhou, B. Zhang, J. X. Li. Fair DARTS: Eliminating unfair advantages in differentiable architecture search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12360, pp. 465−480, 2020.
    [102] Y. B. Hu, X. Wu, R. He. TF-NAS: Rethinking three search freedoms of latency-constrained differentiable neural architecture search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12360, pp. 123−139, 2020.
    [103] Y. M. Hu, Y. D. Yang, Z. C. Guo, R. S. Wan, X. Y. Zhang, Y. C. Wei, Q. Y. Gu, J. Sun. Angle-based search space shrinking for neural architecture search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12364, pp. 119−134, 2020.
    [104] H. B. Yu, Q. Han, J. B. Li, J. P. Shi, G. L. Cheng, B. Fan. Search what you want: Barrier penalty NAS for mixed precision quantization. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12354, pp. 1−16, 2020.
    [105] X. F. Wang, X. H. Xiong, M. Neumann, A. J. Piergiovanni, M. S. Ryoo, A. Angelova, K. M. Kitani, W. Hua. AttentionNAS: Spatiotemporal attention cell search for video classification. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12353, pp. 449−465, 2020.
    [106] A. Bulat, B. Martinez, G. Tzimiropoulos. BATS: Binary ArchitecTure search. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12368, pp. 309−325, 2020.
    [107] J. H. Yu, P. C. Jin, H. X. Liu, G. Bender, P. J. Kindermans, M. X. Tan, T. Huang, X. D. Song, R. M. Pang, Q. Le. BigNAS: Scaling up neural architecture search with big single-stage models. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12352, pp. 702−717, 2020.
    [108] Z. C. Guo, X. Y. Zhang, H. Y. Mu, W. Heng, Z. C. Liu, Y. C. Wei, J. Sun. Single path one-shot neural architecture search with uniform sampling. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12361, pp. 544−560, 2020.
    [109] C. X. Liu, P. Dollár, K. M. He, R. Girshick, A. Yuille, S. N. Xie. Are labels necessary for neural architecture search? In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, vol. 12349, pp. 798−813, 2020.
    [110] C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi. Inception-v4, inception-ResNet and the impact of residual connections on learning. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, pp. 4278−4284, 2017.
    [111] M. Sandler, A. Howard, M. L. Zhu, A. Zhmoginov, L. C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 4510−4520, 2018.
    [112] Z. N. Sun, T. N. Tan, Y. H. Wang, S. Z. Li. Ordinal palmprint represention for personal identification. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 279−284, 2005.
    [113] W. Jia, D. S. Huang, D. Zhang.  Palmprint verification based on robust line orientation code[J]. Pattern Recognition, 2008, 41(5): 1504-1513. doi: 10.1016/j.patcog.2007.10.011
    [114] Y. T. Luo, L. Y. Zhao, B. Zhang, W. Jia, F. Xue, J. T. Lu, Y. H. Zhu, B. Q. Xu.  Local line directional pattern for palmprint recognition[J]. Pattern Recognition, 2016, 50: 26-44. doi: 10.1016/j.patcog.2015.08.025
    [115] S. N. Xie, R. Girshick, P. Dollár, Z. W. Tu, K. M. He. Aggregated residual transformations for deep neural networks. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5987−5995, 2017.


Abstract: Palmprint recognition and palm vein recognition are two emerging biometrics technologies. In the past two decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition and have achieved impressive results. In recent years, in the field of artificial intelligence, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance. Some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In order to overcome some shortcomings of manually designed CNN, neural architecture search (NAS) technology has become an important research direction of deep learning. The significance of NAS is to solve the deep learning model′s parameter adjustment problem, which is a cross-study combining optimization and machine learning. NAS technology represents the future development direction of deep learning. However, up to now, NAS technology has not been well studied for palmprint recognition and palm vein recognition. In this paper, in order to investigate the problem of NAS-based 2D and 3D palmprint recognition and palm vein recognition in-depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods can achieve promising recognition results. Remarkably, among different evaluated NAS methods, ProxylessNAS achieves the best recognition performance.

    • In the digital and intelligent society, more and more application scenarios require effective authentication of people's identities. Biometric technology is considered one of the most effective solutions for personal authentication. Biometrics refers to technology that identifies individuals from the human body's physical or behavioral characteristics through image processing, computer vision, pattern recognition and other techniques. Face recognition, fingerprint recognition and iris recognition are generally regarded as the three most successful biometric technologies and have been widely deployed. However, each biometric technology has its own advantages and disadvantages; no single technology can meet the needs of all personal authentication applications. Therefore, academia and industry continue to develop different biometric technologies to meet the requirements of different scenarios.

      In recent years, palmprint recognition and palm vein recognition have emerged as two new biometric technologies and have attracted great attention[1-4]. Palmprint recognition refers to the technology that performs personal authentication based on images of the palm skin. According to the resolution and data type of the palmprint image, palmprint recognition can be divided into 2D palmprint recognition and 3D palmprint recognition. Furthermore, 2D palmprint recognition can be subdivided into low-resolution and high-resolution palmprint recognition. High-resolution palmprint recognition is generally used for forensic purposes, while low-resolution palmprint recognition and 3D palmprint recognition are mainly used for civilian purposes. Palm vein recognition refers to the technology that uses palm vein images captured under near-infrared light for personal authentication; it is also mainly used for civilian purposes. Since palmprint and palm vein images are both collected from the palm, and their recognition methods are similar to some extent, some researchers study them simultaneously. In this paper, we focus only on the civilian use of biometrics, so we mainly study 2D low-resolution palmprint recognition, 3D palmprint recognition, and palm vein recognition. In the rest of this paper, for convenience, 2D low-resolution palmprint recognition is written simply as 2D palmprint recognition.

      Researchers have proposed many effective methods for 2D and 3D palmprint recognition and palm vein recognition, which can be divided into two groups, i.e., traditional methods and deep learning-based methods. Generally, traditional methods are based on hand-crafted features and traditional machine learning techniques. In contrast, deep learning can automatically learn features from images, videos or text. The highly flexible architecture of deep learning models allows them to learn directly from raw data, and their prediction accuracy improves as more data become available. Nowadays, deep learning has become one of the most important technologies in the field of artificial intelligence. In recent years, the explosive progress made in computer vision, speech recognition, natural language processing, robotics and other fields has depended almost entirely on deep learning[5-8].

      In the field of biometrics, especially face recognition, deep learning has become the mainstream technology[9]. The convolutional neural network (CNN) is one of the most important branches of deep learning, and for image-based biometrics it is the most commonly used deep learning technique[9]. To date, many classic CNNs have been proposed and have achieved impressive results on many recognition tasks. The success of these CNNs is mainly attributed to the automation of the feature engineering process: A layered feature extractor learns from data in an end-to-end manner. With this success, there is a growing demand for architecture engineering, and more and more complex neural architectures are designed by hand. That is, currently employed architectures have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In order to overcome some shortcomings of manually designed CNNs, neural architecture search (NAS) has become an important research direction of deep learning[10-14]. The core idea of NAS is to use a search algorithm to find the neural network architecture needed to solve a given problem. The significance of NAS is that it addresses the model design and parameter tuning problem of deep learning, a cross-disciplinary problem combining optimization and machine learning. The concept of NAS was first proposed by Zoph and Le[15] at the International Conference on Learning Representations (ICLR) in 2017, and has since become a fundamental and active research direction of deep learning.
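      To make the search loop concrete, the following minimal Python sketch shows NAS in its simplest form: random search over a hand-defined layer-wise space. All names (SEARCH_SPACE, evaluate_architecture) and the candidate operations are hypothetical placeholders, and the evaluation stub stands in for the training-and-validation step that a real NAS method would run.

import random

# Hypothetical search space: each of four layers chooses one candidate operation.
SEARCH_SPACE = {
    f"layer_{i}": ["conv3x3", "conv5x5", "depthwise3x3", "maxpool3x3", "skip"]
    for i in range(4)
}

def sample_architecture(rng):
    """Sample one candidate architecture from the search space."""
    return {layer: rng.choice(ops) for layer, ops in SEARCH_SPACE.items()}

def evaluate_architecture(arch, rng):
    """Placeholder evaluation: a real NAS method would build the network
    described by `arch`, train it (or a weight-sharing proxy), and return
    its validation accuracy. A random score keeps this sketch runnable."""
    return rng.random()

def random_search(n_trials=50, seed=0):
    """The simplest search strategy: sample, evaluate, keep the best."""
    rng = random.Random(seed)
    best_arch, best_score = None, -1.0
    for _ in range(n_trials):
        arch = sample_architecture(rng)
        score = evaluate_architecture(arch, rng)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

if __name__ == "__main__":
    print(random_search())

      Reinforcement learning-based, evolutionary, and gradient-based NAS methods replace the random sampler with a learned controller, a population, or a differentiable relaxation, respectively, but the overall loop stays the same.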

      With the continuous improvement of deep learning architectures and the increasing amount of data, the recognition accuracy of deep learning on different biometric tasks keeps increasing. For example, in face recognition, the accuracy of deep learning has far exceeded that of traditional hand-crafted algorithms; deep learning has thus successfully promoted the large-scale application of face recognition technology. However, in 2D and 3D palmprint recognition and palm vein recognition, research based on deep learning is still preliminary. Many researchers have used classic CNNs or manually designed CNNs for these tasks. Nevertheless, NAS technology has not yet been well studied for 2D and 3D palmprint recognition and palm vein recognition. Because NAS represents the future development direction of deep learning, it is important to systematically investigate the recognition performance of NAS methods on these tasks. To this end, we conduct a performance evaluation of NAS methods on 2D and 3D palmprint recognition and palm vein recognition in this paper. In particular, twenty representative NAS methods are selected for evaluation.

      The selected NAS methods are evaluated on five 2D palmprint databases, one 3D palmprint database and two palm vein databases, all of which are representative databases in 2D and 3D palmprint recognition and palm vein recognition. The five 2D palmprint databases are the Hong Kong Polytechnic University palmprint database II (PolyU II)[16], the blue band of the Hong Kong Polytechnic University multispectral palmprint database (PolyU M_B)[17], the Hefei University of Technology (HFUT) palmprint database[18], the Hefei University of Technology Cross Sensor (HFUT CS) palmprint database[19], and the Tongji University palmprint (TJU-P) database[20]. The 3D palmprint database is the Hong Kong Polytechnic University 3D palmprint database (PolyU 3D)[21]. The two palm vein databases are the near-infrared band of the Hong Kong Polytechnic University multispectral palmprint database (PolyU M_N)[17] and the Tongji University palm vein (TJU-PV) database[22].

      It should be noted that the samples in the above databases were captured in two different sessions separated by a certain time interval. If the training samples come only from the first session and the test samples come from the second session, we call this experimental setting the “separate data mode”. If the training samples come from both sessions, we call it the “mixed data mode”. In traditional recognition methods, some samples captured in the first session are usually used as the training set, while all the samples captured in the second session are used as the test set; therefore, experiments with traditional methods were usually conducted in the “separate data mode”. In existing deep learning-based palmprint and palm vein recognition methods, however, experiments were usually conducted in the “mixed data mode”, in which it is easy to obtain high recognition accuracy. In this paper, we conduct experiments in both the “separate data mode” and the “mixed data mode” to observe the recognition performance of representative NAS methods under these two settings.
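      The two experimental modes amount to different ways of splitting the per-session image lists into training and test sets. The sketch below is only an illustration of the idea, assuming each sample is an (image_path, subject_id) pair; the 50/50 split ratio used in the mixed data mode is an assumption, not necessarily the ratio used in the experiments.

import random

def separate_data_mode(session1, session2):
    """Separate data mode: train only on first-session samples,
    test only on second-session samples."""
    return list(session1), list(session2)

def mixed_data_mode(session1, session2, train_ratio=0.5, seed=0):
    """Mixed data mode: pool both sessions, then split them into training
    and test sets (the 50/50 ratio here is illustrative)."""
    pooled = list(session1) + list(session2)
    rng = random.Random(seed)
    rng.shuffle(pooled)
    n_train = int(len(pooled) * train_ratio)
    return pooled[:n_train], pooled[n_train:]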

      The main contributions of our work are as follows.

      1) We briefly summarize some important NAS methods, which can help the readers to better understand the development history of NAS technology.

      2) We conduct a performance evaluation of representative NAS methods for 2D and 3D palmprint and palm vein recognition. To the best of our knowledge, it is the first time such an evaluation has been conducted. Particularly in the field of biometrics, this is also the first work to evaluate the recognition performance of representative NAS methods.

      3) We evaluate the performance of representative NAS methods on the Hefei University of Technology Cross Sensor palmprint database. This is the first time the problem of palmprint recognition across different devices has been investigated using NAS technology.

      4) We investigate the recognition performance of NAS methods in both the “separate data mode” and the “mixed data mode”.

      The rest of this paper is organized as follows. Section 2 presents the related work. Section 3 briefly introduces NAS technology. Section 4 introduces the selected NAS methods in detail. Section 5 introduces the 2D and 3D palmprint and palm vein databases used for evaluation. Extensive experiments are conducted and reported in Section 6. Section 7 offers the concluding remarks.

    • For 2D palmprint recognition, researchers have proposed many traditional methods. Kong et al.[1], Zhang et al.[2], Fei et al.[3] and Zhong et al.[4] have published survey papers on traditional palmprint recognition methods. As shown in Fig. 1, these traditional methods can be classified into different subcategories, such as palm line-based, texture-based, orientation coding-based, correlation filter-based, and subspace learning-based methods. Palm lines are the primary feature of the palmprint, so some researchers have tried to extract palm lines for palmprint recognition; however, due to the complexity of palmprint images, it is still difficult to extract palm lines accurately. Palmprint images also contain obvious texture features, and researchers have therefore proposed many texture-based palmprint recognition methods. Texture-based methods usually exploit sparse descriptors, dense descriptors, or other texture representations, such as Gabor and wavelet features; thus, they can be further divided into three subtypes, i.e., Gabor and wavelet-based methods, dense texture descriptor-based methods, and sparse texture descriptor-based methods. Notably, some dense texture descriptors have achieved promising recognition results. As we know, the palmprint contains many palm lines, and these lines have their own orientations. Orientation features are insensitive to some variations such as illumination changes, so orientation is a robust feature of the palmprint. Many orientation coding-based methods have been proposed, offering high accuracy and fast matching speed. Generally, orientation coding-based methods first detect the orientation of each pixel, then encode the orientation number into a bit string, and finally use the Hamming distance for matching (a simplified sketch of this pipeline is given after Fig. 1). Recently, correlation filter-based methods have also been successfully used in biometrics, and they likewise offer high accuracy and fast matching speed. Subspace learning has long been an important technique for pattern recognition, and some subspace learning-based methods have been used for palmprint recognition; however, their recognition performance is sensitive to illumination changes and other image variations.

      Figure 1.  Subcategories of traditional 2D palmprint recognition methods
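      As an illustration of the orientation coding-based methods described above, the following sketch assigns each pixel the index of the most responsive Gabor orientation (a winner-take-all rule in the spirit of competitive coding) and matches two codes by the fraction of disagreeing pixels. The filter parameters and the number of orientations are illustrative assumptions, not the settings of any particular published method.

import numpy as np
import cv2  # OpenCV is used only for Gabor kernels and filtering

def orientation_code(roi, n_orient=6, ksize=35, sigma=5.0, lambd=10.0, gamma=0.5):
    """Encode each pixel of a palmprint ROI as the index of the Gabor
    orientation with the strongest (most negative) filter response."""
    responses = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma)
        responses.append(cv2.filter2D(roi.astype(np.float32), cv2.CV_32F, kernel))
    # Winner-take-all rule: pick the orientation with the minimum response.
    return np.argmin(np.stack(responses, axis=0), axis=0)

def code_distance(code_a, code_b):
    """Normalized matching distance: fraction of pixels whose orientation
    codes disagree. Practical systems encode orientations as bit strings and
    use bitwise Hamming distance with small translations to tolerate
    misalignment."""
    return float(np.mean(code_a != code_b))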

    • Fei et al.[23] surveyed traditional 3D palmprint recognition methods. Generally, 3D palmprint data preserve the depth information of the palm surface. The originally captured 3D palmprint data are small positive or negative floating-point values. For practical feature extraction, the original 3D palmprint data are usually transformed into curvature-based grey-level representations, which facilitates the design of recognition algorithms. The two most important curvatures are the mean curvature (MC) and the Gaussian curvature (GC), and their corresponding images are the mean curvature image (MCI) and the Gaussian curvature image (GCI). Based on GC and MC, two further grey-level representations have been proposed: the surface type (ST) and the compact ST (CST). Since MCI, GCI, ST and CST depict a 3D palmprint as a 2D grey-level image, 2D palmprint recognition methods can be applied to 3D palmprint recognition (a sketch of the curvature-image computation is given after Fig. 2). Fig. 2 shows examples of an original 3D palmprint, a 3D palmprint region of interest (ROI), and the four 2D representations, including MCI, GCI, ST and CST.

      Figure 2.  Samples of original 3D palmprint, 3D palmprint ROI, and four 2D representations including MCI, GCI, ST and CST
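      The mean and Gaussian curvatures mentioned above can be computed directly from a 3D palmprint depth map z = f(x, y) using the standard differential-geometry formulas; the sketch below does this with finite differences and then maps the curvature values to 8-bit grey levels to form MCI and GCI. The clipping range used in the normalization is an illustrative choice, not the one used in the cited work.

import numpy as np

def to_grey(curv, k=3.0):
    """Map curvature values to 8-bit grey levels by clipping to mean +/- k*std
    (an illustrative normalization) and rescaling to [0, 255]."""
    lo, hi = curv.mean() - k * curv.std(), curv.mean() + k * curv.std()
    scaled = np.clip((curv - lo) / (hi - lo + 1e-12), 0.0, 1.0)
    return (255.0 * scaled).astype(np.uint8)

def curvature_images(depth):
    """Compute mean-curvature (MCI) and Gaussian-curvature (GCI) grey-level
    images from a 3D palmprint depth map z = f(x, y)."""
    z = depth.astype(np.float64)
    fy, fx = np.gradient(z)      # first derivatives along rows (y) and columns (x)
    fxy, fxx = np.gradient(fx)   # second derivatives of fx
    fyy, _ = np.gradient(fy)     # second derivative of fy along y
    denom = 1.0 + fx ** 2 + fy ** 2
    mean_curv = ((1 + fx ** 2) * fyy - 2 * fx * fy * fxy
                 + (1 + fy ** 2) * fxx) / (2 * denom ** 1.5)
    gauss_curv = (fxx * fyy - fxy ** 2) / denom ** 2
    return to_grey(mean_curv), to_grey(gauss_curv)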

    • As shown in Fig. 3, traditional palm vein recognition methods can be divided into the following subcategories: structure-based, texture-based, orientation coding-based, and subspace learning-based methods. Structure-based methods usually first apply an image segmentation or line detection algorithm and then extract structural features of the palm vein, such as lines, skeletons, points, minutiae and graphs, for recognition. Thus, structure-based methods can be further divided into three subtypes, i.e., line/skeleton-based methods, point/minutiae-based methods, and graph-based methods. The texture-based, orientation coding-based, and subspace learning-based methods used for palm vein recognition are similar to those used for palmprint recognition.

      Figure 3.  Subcategories of traditional 2D palm vein recognition methods

    • Many researchers have studied 2D and 3D palmprint recognition and palm vein recognition based on deep learning.

      Some representative deep learning-based 2D palmprint recognition methods are as follows. Zhang et al.[22] proposed PalmRCNN, a modified version of Inception-ResNet-V1, for palmprint recognition. Genovese et al.[24] proposed PalmNet, a CNN that tunes palmprint-specific filters through an unsupervised procedure based on Gabor responses and principal component analysis (PCA). Zhong and Zhu[25] proposed an end-to-end method for open-set 2D palmprint recognition by applying a CNN with a novel loss function, i.e., the centralized large margin cosine loss (C-LMCL). To solve the problem of palmprint recognition in uncontrolled and uncooperative environments, Matkowski et al.[26] proposed the end-to-end palmprint recognition network (EE-PRnet), which consists of two main networks, i.e., the ROI localization and alignment network (ROI-LAnet) and the feature extraction and recognition network (FERnet). Zhao and Zhang[27] proposed a deep discriminative representation (DDR) for palmprint recognition; DDR uses several CNNs similar to VGG-F to extract deep features from global and local palmprint images and uses the collaborative representation-based classifier (CRC) for recognition. Zhao and Zhang[28] presented a joint constrained least-square regression (JCLSR) model with deep local convolutional features for palmprint recognition. Zhao et al.[29] also proposed a joint deep convolutional feature representation (JDCFR) methodology for hyperspectral palmprint recognition. Liu and Kumar[30] proposed a generalizable deep learning-based framework for contactless palmprint recognition, in which the network is a fully convolutional network that generates deeply learned residual features.

      Some representative deep learning-based palm vein recognition methods are as follows. Zhang et al.[22] released a new touchless palm vein database and used PalmRCNN for palm vein recognition. Lefkovits et al.[31] applied four CNNs, including AlexNet, VGG-16, ResNet-50, and SqueezeNet, to palm vein identification. Thapar et al.[32] proposed PVSNet, in which a Siamese network is trained using a triplet loss. Chantaf et al.[33] applied Inception-V3 and SmallerVGGNet to palm vein recognition. Stanuch et al.[34] proposed a contact-free multispectral palm vein recognition system using a designed CNN whose architecture comprises ten layers: five convolutional layers, four max-pooling layers, and one dense layer.
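      For reference, the sketch below shows one possible PyTorch layout of a ten-layer network of the kind described for [34] (five convolutional, four max-pooling and one dense layer); the channel widths, kernel sizes, input resolution and number of classes are our own assumptions and are not taken from the cited paper.

import torch
import torch.nn as nn

class PalmVeinCNN(nn.Module):
    """Hypothetical ten-layer palm vein CNN: 5 conv + 4 max-pool + 1 dense."""

    def __init__(self, n_classes=500, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
        )
        # Assumes 128 x 128 input images: four 2x2 poolings leave an 8 x 8 map.
        self.classifier = nn.Linear(256 * 8 * 8, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

# Example: a batch of four near-infrared palm vein ROIs.
logits = PalmVeinCNN()(torch.randn(4, 1, 128, 128))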

In our previous work[35], we systematically investigated the recognition performance of classic CNNs for 2D and 3D palmprint recognition and palm vein recognition. Seventeen representative and classic CNNs were exploited for performance evaluation, including AlexNet, VGG, Inception-V3, Inception-V4, ResNet, ResNeXt, Inception-ResNet-V2, DenseNet, Xception, MobileNet-V2, MobileNet-V3, ShuffleNet-V2, SENet, EfficientNet, GhostNet, RegNet and ResNeSt. We also conducted experiments in both the separate data mode and the mixed data mode. Among the classic CNNs, EfficientNet achieves the best recognition accuracy.

• NAS is a sub-field of automated machine learning (AutoML). The goal of NAS is to design a network architecture with the best possible performance using the least human intervention and limited computing resources. The papers [15] and [36] are considered the pioneering works of NAS. In [15], the network structure obtained by reinforcement learning (RL) achieves very promising accuracy on image classification tasks, which shows that the idea of automating network architecture design is feasible. NAS technology has since developed very rapidly and is being widely applied to various tasks, such as classification, object detection, semantic segmentation, language modeling, and data augmentation.

Despite the short development history of NAS, many papers have been published, including five survey papers[10-14]. In 2019, Elsken et al.[10] provided an overview of existing NAS methods and categorized them according to three dimensions: search space, search strategy, and performance estimation strategy. Wistuba et al.[11] provided a formalism that unifies the landscape of existing NAS methods. This formalism can be used to critically examine different approaches and understand the benefits of the components that contribute to the design and success of NAS. Wistuba et al.[11] also highlighted some common misconceptions and pitfalls in current NAS research. Ren et al.[12] provided a new perspective on NAS technology: starting with an overview of the characteristics of the earliest NAS algorithms, they summarized the problems in these early algorithms and then described the solutions given by subsequent work. Ren et al.[12] also conducted a detailed and comprehensive analysis, comparison and summary of existing NAS works and gave possible future research directions. Hu and Yu[13] surveyed NAS technology from a technical view. By summarizing previous NAS approaches, Hu and Yu[13] drew a picture of NAS from different aspects, including problem definition, search approaches, progress towards practical applications and possible future directions. He et al.[14] compared the performance and efficiency of existing NAS algorithms on the CIFAR-10 and ImageNet datasets and provided an in-depth discussion of different research directions on NAS, including one/two-stage NAS, one-shot NAS, and joint hyperparameter and architecture optimization.

      Almost all NAS methods are organized around three components: search space, optimization method, and evaluation method. Fig. 4 shows an abstract illustration of the NAS methods.

      Figure 4.  Abstract illustration of neural architecture search methods

1) Search space. The search space is the set of possible neural network architectures. It adopts different design concepts for different application scenarios, such as computer vision tasks and language modeling tasks. In this sense, NAS does not completely remove manual design; rather, it searches over and recombines network structures within a manually designed space, and the number of candidate architectures usually reaches a very large order of magnitude.

2) Optimization method. The optimization method determines how the search space is explored, and a good optimization method often plays a key role. Although many optimization methods exist, they share the same goal of finding a better network architecture, and most of them are built on traditional optimization techniques such as reinforcement learning, evolutionary search, gradient-based optimization, and Bayesian optimization.

3) Evaluation method. The evaluation method estimates the quality of candidate network structures. Common evaluation strategies include the full training mode, the partial training mode and NAS-specific evaluation methods. The full training mode is time-consuming, as it usually requires thorough training of every searched model, while the partial training mode stops the training process early, saving cost and time. Among NAS-specific evaluation methods, network morphism, weight sharing and hypernetworks are often used as heuristic quality assessment methods. In general, the partial training mode is typically an order of magnitude cheaper than full training, while NAS-specific evaluation methods are two to three orders of magnitude cheaper than full training. A minimal code sketch of how these three components interact is given below.
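To make the interplay of the three components concrete, the following Python sketch is our own illustration (not code from any cited NAS paper); the search space, the placeholder evaluation and the random-search loop stand in for the real components described above.

```python
import random

# Hypothetical toy search space: one candidate = one choice per decision.
SEARCH_SPACE = {
    "depth": [8, 14, 20],
    "width": [16, 32, 64],
    "op":    ["conv3x3", "conv5x5", "sep_conv3x3", "skip"],
}

def sample_architecture():
    # Search space component: draw one candidate architecture.
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    # Evaluation component: a real system would partially or fully train the
    # model and return its validation accuracy; here a placeholder is used.
    return (hash(str(arch)) % 1000) / 1000.0

def random_search(n_trials=20):
    # Optimization component: random search is the simplest baseline; RL,
    # evolution or gradient-based methods would replace this loop.
    best_arch, best_acc = None, -1.0
    for _ in range(n_trials):
        arch = sample_architecture()
        acc = evaluate(arch)
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch, best_acc

if __name__ == "__main__":
    print(random_search())
```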

Many NAS methods have been proposed. According to the survey papers on NAS and its recent developments, we selected a number of important NAS methods and list them in Tables 1-3 by publishing year. It can be seen that most NAS papers have been published after 2017, at top artificial intelligence conferences such as CVPR, ICML, ICCV, ECCV, ICLR, and NeurIPS. In fact, the number of papers on NAS is increasing rapidly. More comprehensive and up-to-date lists of NAS papers can be found on the websites https://github.com/D-X-Y/Awesome-AutoDL and https://www.automl.org.

Reference | Full name | Abbreviation | Year | Source | Optimization method | Application
Zoph and Le[15] | Neural architecture search | NAS | 2017 | ICML | Reinforcement learning | Classification
Baker et al.[36] | MetaQNN | MetaQNN | 2017 | ICML | Reinforcement learning | Classification
Shin et al.[37] | Differentiable neural network architecture search | DAS | 2018 | ICLR | Gradient based | Classification
Kandasamy et al.[38] | Neural architecture search | NASBOT | 2018 | NeurIPS | Other | Classification
Liu et al.[39] | Progressive neural architecture search | PNASNet | 2018 | ECCV | Performance prediction | Classification
Luo et al.[40] | Neural architecture optimization | NAONet | 2018 | NeurIPS | Gradient based | Classification
Brock et al.[41] | One-shot model architecture search | SMASH | 2018 | ICLR | Gradient based | Classification
Bender et al.[42] | Understanding and simplifying one-shot architecture search | One-shot | 2018 | ICML | Gradient based | Classification
Zhong et al.[43] | Block-wise neural network architecture | Block-QNN | 2018 | CVPR | Reinforcement learning | Classification
Zoph et al.[44] | NASNet architecture | NASNet | 2018 | CVPR | Reinforcement learning | Classification
Yang et al.[45] | Platform-aware neural network adaptation | NetAdapt | 2018 | ECCV | Other | Classification

      Table 1.  List of important NAS papers published in 2017 and 2018

Reference | Full name of methods | Abbreviation | Year | Source | Optimization method | Application
Ying et al.[46] | NAS-Bench-101 | NAS-Bench-101 | 2019 | ICML | Other | Classification
Tan and Le[47] | EfficientNet | EfficientNet | 2019 | ICML | Reinforcement learning | Classification
Chu et al.[48] | Fairness of weight sharing neural architecture search | FairNAS | 2019 | arXiv | Evolutionary algorithm | Classification
Ho et al.[49] | Population based augmentation | PBA | 2019 | ICML | Evolutionary algorithm | Augmentation
Cubuk et al.[50] | AutoAugment | AutoAugment | 2019 | CVPR | Reinforcement learning | Augmentation
Real et al.[51] | AmoebaNet | AmoebaNet | 2019 | AAAI | Evolutionary algorithm | Classification
Nekrasov et al.[52] | – | – | 2019 | CVPR | Reinforcement learning | Semantic segmentation
Chu et al.[53] | Bridging the gap between stability and scalability in weight-sharing neural architecture search | ScarletNAS | 2019 | arXiv | Evolutionary algorithm | Classification
Pérez-Rúa et al.[54] | Multimodal fusion architecture search | MFAS | 2019 | CVPR | Evolutionary algorithm | Classification
Tan et al.[55] | Mobile neural architecture search | MNASNet | 2019 | CVPR | Reinforcement learning | Classification
Liu et al.[56] | Auto-Deeplab | Auto-Deeplab | 2019 | CVPR | Gradient based | Semantic segmentation
Chen et al.[57] | Reinforced evolutionary neural architecture search | RENAS | 2019 | CVPR | Gradient based | Classification
Wu et al.[58] | Facebook-Berkeley-Nets | FBNet-V1 | 2019 | CVPR | Gradient based | Classification
Li et al.[59] | Dongfeng networks | DF | 2019 | CVPR | Evolutionary algorithm | Classification
Dong and Yang[60] | Differentiable architecture sampler | GDAS | 2019 | CVPR | Gradient based | Classification
Pham et al.[61] | Efficient neural architecture search | ENAS | 2019 | CVPR | Reinforcement learning | Classification
Xie et al.[62] | Stochastic neural architecture search | SNAS | 2019 | ICLR | Gradient based | Classification
Elsken et al.[63] | LEMONADE | LEMONADE | 2019 | ICLR | Evolutionary algorithm | Classification
Ghiasi et al.[64] | Feature pyramid architecture with neural architecture search | NAS-FPN | 2019 | CVPR | Reinforcement learning | Object detection
Zhang et al.[65] | Graph hypernetworks for neural architecture search | GHN | 2019 | ICLR | Gradient based | Classification
Cai et al.[66] | Direct neural architecture search | ProxylessNAS | 2019 | ICLR | Reinforcement learning, Gradient based | Classification
Liu et al.[67] | Differentiable architecture search | DARTS | 2019 | ICLR | Gradient based | Classification
Nayman et al.[68] | Neural architecture search with expert advice | XNAS | 2019 | NeurIPS | Gradient based | Classification
Peng et al.[69] | Neural architecture transformation | NATS | 2019 | NeurIPS | Gradient based | Object detection
Hu et al.[70] | Petridish | Petridish | 2019 | NeurIPS | Gradient based | Classification
Dong and Yang[71] | Transformable architecture search | TAS | 2019 | NeurIPS | Gradient based | Classification
Chen et al.[72] | Neural architecture search on object detection | DetNAS | 2019 | NeurIPS | Other | Object detection
Wortsman et al.[73] | Discovering neural wirings | DNW | 2019 | NeurIPS | Gradient based | Classification
Dong and Yang[74] | Neural architecture search for generative adversarial networks | AutoGAN | 2019 | ICCV | Reinforcement learning | GAN
Dong and Yang[75] | Self-evaluated template network | SETN | 2019 | ICCV | Gradient based | Classification
Xiong et al.[76] | Resource constrained neural network architecture search | RCNet | 2019 | ICCV | Evolutionary algorithm | Classification
Howard et al.[77] | MobileNet-V3 | MobileNet-V3 | 2019 | ICCV | Evolutionary algorithm | Classification
Zheng et al.[78] | Multinomial distribution learning for effective neural architecture search | MdeNAS | 2019 | ICCV | Other | Classification
Pasunuru and Bansal[79] | Continual architecture search | CAS | 2019 | ACL | Reinforcement learning | Video captioning
Jiang et al.[80] | Improved differentiable architecture search | I-DARTS | 2019 | EMNLP | Gradient based | NLP
Li and Talwalkar[81] | Random search and reproducibility for neural architecture search | Random search WS | 2019 | UAI | Gradient based | Classification

      Table 2.  List of important NAS papers published in 2019

Reference | Full name | Abbreviation | Year | Source | Optimization method | Application
Chu et al.[82] | Mobile GPU-aware neural architecture search | MoGA | 2020 | ICASSP | Evolutionary algorithm | Classification
Chen et al.[83] | Searching for faster real-time semantic segmentation | FasterSeg | 2020 | ICLR | Gradient based | Semantic segmentation
Xu et al.[84] | Partially-connected differentiable architecture search | PC-DARTS | 2020 | ICLR | Gradient based | Classification
Mei et al.[85] | Atomic blocks for neural architecture search | AtomNAS | 2020 | ICLR | Other | Classification
Dong and Yang[86] | NAS-Bench-201 | NAS-Bench-201 | 2020 | ICLR | Other | Classification
Tan et al.[87] | EfficientDet | EfficientDet | 2020 | CVPR | Reinforcement learning | Object detection
Fang et al.[88] | Densely connected search space for more flexible neural architecture search | DenseNAS | 2020 | CVPR | Gradient based | Classification
Zhang et al.[89] | Gradient-based sampling NAS-random sampling NAS | GDAS-NSAS | 2020 | CVPR | Gradient based | Classification
Li et al.[90] | Distill neural architecture | DNA | 2020 | CVPR | Gradient based | Classification
Guo et al.[91] | Robust architectures network | RobNet | 2020 | CVPR | Gradient based | Classification
Gao et al.[92] | Adversarial neural architecture search for GANs | AdversarialNAS | 2020 | CVPR | Gradient based | GAN
Wan et al.[93] | Differentiable neural architecture search for spatial and channel dimensions | FBNet-V2 | 2020 | CVPR | Gradient based | Classification
Bender et al.[94] | TuNAS | TuNAS | 2020 | CVPR | Reinforcement learning | Classification
Li et al.[95] | Sequential greedy architecture search | SGAS | 2020 | CVPR | Gradient based | Classification
Zheng et al.[96] | Budgeted performance estimation | BPE | 2020 | CVPR | Other | Classification
Phan et al.[97] | Binary neural network | BNN | 2020 | CVPR | Evolutionary algorithm | Classification
He et al.[98] | Efficient neural architecture search via mixed-level reformulation | MiLeNAS | 2020 | CVPR | Gradient based | Classification
Dai et al.[99] | Data adapted pruning for efficient neural architecture search | DA-NAS | 2020 | ECCV | Gradient based | Classification
Tian et al.[100] | Efficient and effective GAN architecture search | E2GAN | 2020 | ECCV | Reinforcement learning | GAN
Chu et al.[101] | Fair differentiable architecture search | FairDARTS | 2020 | ECCV | Gradient based | Classification
Hu et al.[102] | Three-freedom neural architecture search | TF-NAS | 2020 | ECCV | Gradient based | Classification
Hu et al.[103] | Angle-based search space shrinking | ABS | 2020 | ECCV | Other | Classification
Yu et al.[104] | Barrier penalty neural architecture search | BP-NAS | 2020 | ECCV | Other | Classification
Wang et al.[105] | Attention cell search for video classification | AttentionNAS | 2020 | ECCV | Other | Video classification
Bulat et al.[106] | Binary architecture search | BATS | 2020 | ECCV | Other | Classification
Yu et al.[107] | Neural architecture search with big single-stage models | BigNAS | 2020 | ECCV | Gradient based | Classification
Guo et al.[108] | Single path one-shot neural architecture search with uniform sampling | Single-Path-SuperNet | 2020 | ECCV | Evolutionary algorithm | Classification
Liu et al.[109] | Unsupervised neural architecture search | UnNAS | 2020 | ECCV | Gradient based | Classification

      Table 3.  List of important NAS papers published in 2020

    • The classification task is one of the important applications of NAS technology. As can be seen from Tables 1-3, most NAS methods are dedicated to finding a robust classification model. Fig. 5 shows the chronology of representative NAS methods for the classification task. These methods play an important role in the development history of NAS. Here, we briefly introduce them by year of publication.

      Figure 5.  Chronology of representative NAS methods for the classification task

In 2017, Zoph and Le[15] published the first paper that proposed the concept of NAS. Their work expresses the network structure as a variable-length string and learns a good structure through reinforcement learning (RL): a recurrent neural network (RNN) controller generates a description of the neural network model, and the RNN is trained to maximize the accuracy of the generated model.

In 2018, Zoph et al.[44] proposed the method of NASNet. NASNet changes the search space from whole-network hyperparameters to block and cell structures, and its accuracy reaches the state of the art (SOTA). Moreover, Zoph et al.[44] proposed to search on a small proxy dataset (such as CIFAR-10) and then transfer the found architecture to large datasets (such as ImageNet). Brock et al.[41] proposed the method of SMASH. SMASH uses an auxiliary network to generate the parameters of different candidate networks, avoiding retraining them from scratch, which greatly reduces the training time. Liu et al.[39] proposed the method of PNASNet to learn the structure of CNNs, which is more efficient than NAS methods based on reinforcement learning and evolutionary algorithms. In particular, a sequential model-based optimization strategy is used in PNASNet. Luo et al.[40] proposed the method of NAONet, a new way of optimizing network architectures that maps architectures into a continuous vector space. NAONet uses an encoder and a performance predictor to perform gradient-based optimization in this continuous space, finds an embedding with higher predicted accuracy, and decodes it back into a network architecture with a decoder.

In 2019, Xie et al.[62] proposed the method of SNAS. SNAS directly optimizes the objective function of the NAS task and further optimizes the expectation of both the network loss function and the network forward latency, so as to automatically generate hardware-friendly sparse networks. Real et al.[51] proposed the method of AmoebaNet. AmoebaNet adopts the search space of NASNet[44], and its network structure is similar to that of Inception[110]. In AmoebaNet, the ageing evolution algorithm is used to achieve better results. Pham et al.[61] proposed the method of ENAS, an economical automatic model design method. By forcing all sub-models to share weights, the huge computational cost of NAS is overcome and the GPU computing time is reduced by more than 1000 times. Cai et al.[66] proposed a NAS method without proxy tasks called ProxylessNAS. ProxylessNAS can directly search structures on large-scale target tasks, which alleviates the large GPU memory consumption and long computation time of earlier NAS methods. Liu et al.[67] proposed the method of DARTS for efficient structure search. Instead of searching in a discrete set of candidate structures, the search space of DARTS is relaxed to a continuous domain, so that the architecture can be optimized on the validation set by gradient descent. The method of FairNAS was proposed by Chu et al.[48] FairNAS inherits and develops the one-shot paradigm in the NAS community. FairNAS argues that fair sampling and training can exploit the potential of each candidate module, and therefore proposes a strict fairness constraint: in every single update of the hypernetwork, the parameters of each optional operation in each layer are trained. Dong and Yang[60] proposed the method of GDAS, which uses gradient descent to search the network structure efficiently. GDAS treats the search space as a directed acyclic graph, samples sub-structures with a differentiable sampler, and optimizes the sampler through the validation loss of the sampled structures. Howard et al.[77] proposed MobileNet-V3, a new lightweight network structure based on MobileNet-V2[111]. It is searched by MNASNet[55] and refined by NetAdapt[45]. MobileNet-V3 contains a MobileNet-V3-large version and a MobileNet-V3-small version to cope with different resource constraints. Moreover, MobileNet-V3 has been successfully used in object detection and semantic segmentation tasks. Wu et al.[58] proposed a differentiable neural architecture search framework called DNAS, which uses a gradient-based method to optimize convolutional network structures and avoids exhaustively and independently training each candidate structure. FBNets, the family of network structures generated by the DNAS framework, surpass manually designed and automatically generated state-of-the-art models. Tan et al.[55] proposed an automated mobile NAS method called MNASNet, which explicitly incorporates model latency into the main objective, so that the search can identify a model that achieves a good trade-off between accuracy and latency. Chu et al.[53] proposed the method of ScarletNAS, which supports scalable architectures and solves the fairness problem of training scalable hypernetworks in the one-shot route through a linearly equivalent transformation. Tan and Le[47] proposed the method of EfficientNet. They used NAS to search for a baseline network that balances accuracy and FLOPs, and then jointly scaled its depth, width and resolution to obtain a family of improved EfficientNets.

In 2020, Chu et al.[82] designed a mobile-GPU-aware model from a practical standpoint, which is called MoGA. The method of PC-DARTS was proposed by Xu et al.[84] PC-DARTS is an extension of DARTS that reduces memory consumption and computing time during the search through partial channel connections. Guo et al.[108] constructed a simplified hypernetwork called Single-Path-SuperNet, which is trained with a uniform path sampling method so that all substructures (and their weights) are fully and equally trained. Based on the trained hypernetwork, the optimal substructure can be quickly found by an evolutionary algorithm, without fine-tuning any substructure. Based on the idea of knowledge distillation, Li et al.[90] proposed the distill neural architecture (DNA), which introduces a teacher model to guide the direction of the network structure search. Using supervision signals from different depths of the teacher model, the original end-to-end search space is divided depth-wise into blocks, so that the independent blocks of the search space are trained with weight sharing, which significantly reduces the interference caused by weight sharing. Wan et al.[93] proposed FBNet-V2, which takes both memory and efficiency into account. FBNet-V2 uses a masking mechanism for feature map reuse and an effective shape propagation to obtain better accuracy. Guo et al.[91] studied the relation between network architecture and adversarial robustness and proposed the method of RobNet, which resists attacks from the perspective of network architecture. To obtain the large number of networks needed for this study, they used one-shot neural architecture search to train a supernet and then fine-tuned the sub-networks sampled from it.

For 2D and 3D palmprint recognition and palm vein recognition, we select twenty representative NAS methods for performance evaluation, including NASNet[44], SMASH[41], PNASNet[39], NAONet[40], SNAS[62], AmoebaNet[51], ENAS[61], ProxylessNAS[66], DARTS[67], FairNAS[48], GDAS[60], FBNet-V1[58], MNASNet[55], ScarletNAS[53], MoGA[82], PC-DARTS[84], Single-Path-SuperNet[108], DNA[90], FBNet-V2[93] and RobNet[91]. There are two main reasons for choosing these NAS methods. One reason is that they are often used as reference methods for performance comparison in the literature. The second reason is that their results are outstanding. It is worth noting that MobileNet-V3[77] and EfficientNet[47] are network architectures that were first obtained by existing NAS methods and then elaborately refined by hand. Therefore, MobileNet-V3 and EfficientNet can be viewed as semi-NAS methods and are not included in our selection. In our previous work[35], we classified MobileNet-V3 and EfficientNet as classic CNNs and evaluated their recognition performance.

      In this section, we introduce the selected NAS methods in detail as follows.

      1) NASNet

NASNet[44] was designed to make the searched structure transferable. The best cell structure of NASNet is found on the CIFAR-10 dataset, stacked several times, and then applied to the ImageNet dataset. In addition, a new regularization technique called scheduled drop path is proposed in this method. Fig. 6(a) shows the controller model architecture for recursively constructing one block of a convolutional cell. Each block requires selecting 5 discrete parameters, each of which corresponds to the output of a softmax layer. Fig. 6(b) shows the architecture of the best convolutional cells with B = 5 blocks identified on CIFAR-10. NASNet has different versions, including NASNet-A, NASNet-B, NASNet-C and NASNet-mobile. The recognition performance of NASNet-A is the best, and NASNet-mobile is a lightweight network. In this paper, NASNet-A and NASNet-mobile are used for performance evaluation.

      Figure 6.  A diagram of the top performing normal cell and reduction cell in NASNet[44]: (a) Controller model architecture for recursively constructing one block of a convolutional cell; (b) Architecture of the best convolutional cells with B = 5 blocks identified with CIFAR-10.

      2) SMASH

SMASH[41] trains an auxiliary hypernetwork that dynamically generates the weights of candidate main models with variable structures during the search. Although the generated weights are worse than those obtained by freely training a fixed network structure, the relative performance of different networks early in training provides meaningful guidance about their performance when fully trained. At the same time, a memory-bank based network representation mechanism is developed to define a variety of network structures.

      3) PNASNet

PNASNet[39] can learn a CNN that matches the previous SOTA while requiring up to five times fewer model evaluations during architecture search. The starting point of this work is the structured search space proposed by NASNet, in which the task of the search algorithm is to search for a suitable convolutional cell rather than a complete CNN. A cell contains B blocks; a block is a combination operator (such as addition) applied to two inputs (tensors), each of which can be transformed (e.g., by convolution) before combining. Then, according to the size of the training set and the running time of the CNN, the cell structure is stacked a certain number of times. This modular design also allows the architecture to be migrated easily from one dataset to another. Fig. 7(a) shows the best cell structure found by PNASNet, consisting of 5 blocks, and Fig. 7(b) shows the construction of CNNs from cells on CIFAR-10 and ImageNet.

      Figure 7.  Cell structure of PNASNet[39]: (a) The best cell structure found by PNASNet; (b) Employing a similar strategy as [44] when constructing CNNs from cells on CIFAR-10 and ImageNet.

      4) NAONet

Fig. 8 shows the general framework of NAONet[40]. NAONet consists of three parts: an encoder, a predictor and a decoder. Experimental results showed that the architecture found by NAONet performs well on both the CIFAR-10 image classification task and the Penn Treebank (PTB) language modeling task, and is better than or equal to the best previous architecture search methods while using significantly fewer computing resources.

      Figure 8.  General framework of NAONet[40]

      5) SNAS

Compared with ENAS, the search optimization of SNAS is differentiable and the search efficiency is higher[62]. Compared with other differentiable methods such as DARTS, SNAS directly optimizes the objective function of the NAS task, and the searched structure is more robust and efficient for multiple tasks. In addition, because SNAS retains stochasticity, Xie et al.[62] further proposed to optimize the expectation of both the network loss function and the network forward latency, so as to automatically generate hardware-friendly sparse networks.

      6) AmoebaNet

AmoebaNet[51] improves the tournament selection method of the genetic algorithm. Selection is changed into an age-based method, namely the ageing evolution algorithm, which makes the genetic algorithm prefer younger individuals. Experiments show that the algorithm searches faster than reinforcement learning and random search under the same hardware conditions. The ageing evolution algorithm has the following six steps, sketched in code below: i) P neural network structures are randomly initialized, trained and added to a queue to form the population; ii) S neural networks are sampled from the population; iii) among the S sampled networks, the one with the highest accuracy is selected as the parent; iv) a child network is generated by mutating the parent, trained, and added to the rightmost side of the queue; v) the "oldest" neural network in the population, i.e., the leftmost element of the queue, is removed; vi) return to step ii) and repeat for a certain number of cycles.
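The queue-based procedure above can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' code; `init_arch`, `mutate` and `train_and_eval` are hypothetical callables supplied by the user.

```python
import random
from collections import deque

def ageing_evolution(P, S, cycles, init_arch, mutate, train_and_eval):
    """Minimal sketch of ageing (regularized) evolution.
    P: population size, S: sample size, cycles: number of evolution steps."""
    population = deque()
    history = []
    # i) Randomly initialize, train and enqueue P architectures.
    for _ in range(P):
        arch = init_arch()
        acc = train_and_eval(arch)
        population.append((arch, acc))
        history.append((arch, acc))
    for _ in range(cycles):
        # ii)-iii) Sample S individuals; the most accurate one is the parent.
        sample = random.sample(list(population), S)
        parent = max(sample, key=lambda item: item[1])
        # iv) Mutate the parent, train the child, append it on the right.
        child = mutate(parent[0])
        child_acc = train_and_eval(child)
        population.append((child, child_acc))
        history.append((child, child_acc))
        # v) Remove the oldest individual (leftmost element of the queue).
        population.popleft()
    # vi) Return the best architecture seen during the whole search.
    return max(history, key=lambda item: item[1])
```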

      7) ENAS

ENAS[61] is an economical automatic model design method that goes beyond NASNet[44]. By forcing all sub-models to share weights, the efficiency of NAS is improved and the huge computational cost and long search time of NAS are overcome: the GPU computing time is reduced by more than 1000 times. On the CIFAR-10 dataset, the test error reaches 2.89%, which is similar to that of NASNet (2.65% test error). Fig. 9 shows the network architecture of ENAS.

Figure 9.  ENAS's discovered network from the macro search space for image classification[61]

8) ProxylessNAS

ProxylessNAS[66] is the first NAS algorithm that directly searches a large design space on the large-scale ImageNet dataset without any proxy, and the first to customize CNN architectures for specific hardware. Cai et al.[66] combine the idea of model compression (pruning and quantization) with NAS, reduce the computational cost (GPU time and GPU memory) of NAS to the same scale as conventional training while preserving a rich search space, and directly incorporate the hardware performance (latency, energy consumption) of the network structure into the optimization objective. Fig. 10 shows the efficient models optimized for different hardware. Figs. 10(a)-10(c) show the GPU, CPU and mobile models found by ProxylessNAS. The GPU model prefers a shallow and wide structure with early pooling, while the CPU model prefers a deep and narrow structure with late pooling. Layers followed by pooling prefer large and wide kernels; early layers prefer small kernels, while late layers prefer large kernels. In this paper, the GPU model and the mobile model of ProxylessNAS are used for evaluation.

      Figure 10.  Efficient models optimized for different hardware of the ProxylessNAS method [66]: (a) GPU model found by ProxylessNAS; (b) CPU model found by ProxylessNAS; (c) Mobile model found by ProxylessNAS.

      9) DARTS

Most network search algorithms use reinforcement learning or evolutionary algorithms to search for structures. The search space of such algorithms is discrete, and the search is extremely time-consuming. The search space of DARTS is continuous, and the search is performed with a gradient descent algorithm on the validation set. The computational cost of DARTS is several orders of magnitude smaller than that of ordinary network search algorithms, yet the searched architecture is still competitive with previous SOTA algorithms. Meanwhile, its generalization ability is also very good: it can be used not only for searching CNN structures but also for searching RNN structures. Fig. 11 shows an overview of DARTS.

      Figure 11.  An overview of DARTS[67]: (a) Operations on the edges are initially unknown; (b) Continuous relaxation of the search space by placing a mixture of candidate operations on each edge; (c) Joint optimization of the mixing probabilities and the network weights by solving a bilevel optimization problem; (d) Inducing the final architecture from the learned mixing probabilities.
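Concretely, DARTS relaxes the categorical choice of an operation on each edge $(i, j)$ into a softmax-weighted mixture over the candidate operation set $\mathcal{O}$:

$$\bar{o}^{(i,j)}(x) = \sum_{o \in \mathcal{O}} \frac{\exp\left(\alpha_{o}^{(i,j)}\right)}{\sum_{o' \in \mathcal{O}} \exp\left(\alpha_{o'}^{(i,j)}\right)}\, o(x)$$

where the architecture parameters $\alpha^{(i,j)}$ are optimized by gradient descent on the validation loss jointly with the network weights (a bilevel optimization), and the operation with the largest $\alpha_{o}^{(i,j)}$ on each edge is kept when the final architecture is derived.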

      10) FairNAS

FairNAS[48] is a one-shot method in the field of NAS, and it advocates weight sharing. It trains a supernet from beginning to end (only one supernet is trained completely, which is the meaning of one-shot), and each candidate model is a sub-path sampled from the supernet. The advantage is that each candidate model does not need time-consuming separate training to estimate its representation ability; one-shot search therefore greatly improves the efficiency of NAS and has become a mainstream approach. Nevertheless, the premise of one-shot search is that weight sharing is effective, i.e., that model quality can be verified quickly and accurately in this way. Otherwise, a Matthew-effect-like situation may arise: operations that happen to receive more training are favored and trained even more, while poorly trained operations fall into a circular dilemma, making the ranking of candidate models unreliable. FairNAS believes that fair sampling and training can give full play to the potential of each module. After training, a sampled model can quickly reuse the weights in the supernet to obtain a relatively stable performance index on the validation set. This fair algorithm can almost completely preserve the ranking of the models: the models sampled from the supernet and the models trained separately eventually have almost the same ranking. FairNAS has three versions, FairNAS-A, FairNAS-B and FairNAS-C, which are obtained from different search spaces. Fig. 12 shows the architectures of FairNAS-A, B and C.

      Figure 12.  Architectures of FairNAS-A, B and C (from top to bottom)[48]
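The strict fairness constraint described above can be sketched as follows. This is our own simplified illustration (not the official FairNAS code): in one supernet update step, as many single-path models are sampled as there are candidate operations, so that every operation of every layer is activated exactly once before the weights are updated.

```python
import random

def strict_fairness_step(num_layers, num_choices):
    """Sample num_choices single-path models for one supernet update so that
    each candidate operation of each layer appears in exactly one path."""
    # Shuffle the candidate operation indices independently for each layer.
    columns = [random.sample(range(num_choices), num_choices)
               for _ in range(num_layers)]
    # The k-th sampled model takes the k-th shuffled choice of every layer.
    return [[columns[layer][k] for layer in range(num_layers)]
            for k in range(num_choices)]

# Example: a 5-layer supernet with 3 candidate operations per layer.
for path in strict_fairness_step(num_layers=5, num_choices=3):
    print(path)  # gradients of all paths are accumulated, then one update
```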

      11) GDAS

GDAS[60] uses gradient descent to search the network structure effectively, and the search space is represented by a directed acyclic graph (DAG). This DAG may have millions of sub-graphs, each of which is a neural network structure. To avoid traversing so many sub-graphs, Dong and Yang[60] use a differentiable sampler to sample structures and optimize the sampler through the validation loss of the sampled structures. GDAS can search a robust neural network structure in 4 hours on a single V100 GPU. GDAS is similar to DARTS, but there are two differences between them: i) How the search space is made differentiable: DARTS applies a softmax over the operation weights and, after joint optimization, keeps the operation with the maximum probability on each edge; Dong and Yang[60] instead use the Gumbel-max trick, selecting the transformation function between nodes with an argmax in forward propagation and differentiating through a softmax relaxation of the one-hot vector in backward propagation. ii) DARTS jointly searches all operations, which leads to antagonism between operations, as the learned weights may offset each other and make optimization difficult; besides, jointly searching the normal cell and the reduction cell greatly enlarges the search space. In [60], the reduction cell is fixed and only the normal cell is searched, which takes only 4 hours on a V100 GPU, and only the functions between the sampled nodes are updated at each step. Fig. 13 shows the search space of a neural cell represented as a DAG. Different operations (colored arrows) transform one node (square) into its intermediate features (small circles), and each node is the sum of the intermediate features transformed from the previous nodes. As indicated by the solid connections, the neural cell in GDAS is a sampled sub-graph of this DAG. Specifically, among the intermediate features between every two nodes, GDAS samples one feature in a differentiable way.

      Figure 13.  Using a DAG to represent the search space of a neural cell[60]
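The Gumbel trick mentioned above can be sketched in PyTorch as follows; this is a generic straight-through Gumbel-softmax illustration under our own assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0, hard=True):
    """Sample one candidate operation per edge in a differentiable way.
    Forward pass uses the argmax (one-hot selection); backward pass uses the
    softmax relaxation, i.e., the straight-through estimator."""
    gumbels = -torch.empty_like(logits).exponential_().log()  # Gumbel(0, 1)
    y_soft = F.softmax((logits + gumbels) / tau, dim=-1)
    if not hard:
        return y_soft
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(logits).scatter_(-1, index, 1.0)
    # One-hot in the forward pass, soft gradients in the backward pass.
    return y_hard - y_soft.detach() + y_soft

# Example: architecture logits for one edge with 8 candidate operations.
alpha = torch.randn(8, requires_grad=True)
weights = gumbel_softmax_sample(alpha, tau=0.5)
print(weights)  # a one-hot-like vector used to select the operation
```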

      12) FBNet-V1

In the method of FBNet-V1[58], a differentiable neural architecture search (DNAS) framework is used to find hardware-aware lightweight convolutional networks. The DNAS method represents the whole search space as a hypernetwork and transforms the search for the optimal network structure into learning the optimal distribution over candidate blocks; the block distribution is trained by gradient descent, and a different block can be selected for each layer of the network. To estimate the network latency better, the actual latency of each candidate block is measured and recorded in advance, so that the latency of a candidate network can be obtained by directly accumulating the latencies of its blocks. Fig. 14 shows a visualization of some of the searched architectures.

      Figure 14.  Visualization of some of the searched architectures of FBNet-V1[58]
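The latency estimation described above amounts to a pre-measured lookup table whose entries are summed along the sampled architecture. The following is a minimal sketch under our own assumptions (layer/block counts and latencies are illustrative), not the FBNet code.

```python
import torch

def expected_latency(block_probs, latency_table):
    """block_probs: per-layer probability vectors over candidate blocks.
    latency_table: per-layer measured latency vectors (same shapes).
    The result is differentiable w.r.t. the block distribution."""
    return sum((p * lat).sum() for p, lat in zip(block_probs, latency_table))

# Example: 22 searchable layers, 9 candidate blocks per layer.
probs = [torch.softmax(torch.randn(9), dim=0) for _ in range(22)]
table = [torch.rand(9) * 5.0 for _ in range(22)]  # pre-measured ms per block
print(expected_latency(probs, table))  # expected latency of the distribution
```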

      13) MNASNet

MNASNet[55] is an automated mobile neural architecture search method. MNASNet explicitly takes the model inference latency as one of the main optimization objectives, so as to search for a network structure that balances latency and accuracy. In previous work, latency was measured indirectly with inaccurate proxies such as FLOPs (the number of floating-point operations). MNASNet deploys models on mobile devices and directly measures the real-world inference latency. In addition, a hierarchical search space is proposed to determine the network structure. The first inspiration of [55] comes from the observation that although MobileNet and NASNet have similar FLOPs (575M vs. 564M), their latencies are quite different (113 ms vs. 183 ms). Second, Tan et al.[55] observed that previous automated methods mainly search for a few types of cells and then repeatedly stack the same cells through the network; this simple search mechanism limits layer diversity. The first observation motivates the multi-objective optimization of latency and accuracy; the second motivates the hierarchical decomposition of the search space, which allows layers to be architecturally different while still striking the right balance between flexibility and search space size. Fig. 15 shows an overview of MNASNet.

      Figure 15.  An overview of MNASNet
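The accuracy-latency trade-off described above is commonly written as a single reward that weights accuracy by a soft latency penalty, as formulated in the MnasNet paper:

$$\max_{m}\; ACC(m)\times\left[\frac{LAT(m)}{TAR}\right]^{w}$$

where $ACC(m)$ and $LAT(m)$ are the accuracy and measured latency of model $m$, $TAR$ is the target latency, and $w$ is an exponent (negative in practice) that controls how strongly deviations from the target latency are penalized.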

      14) ScarletNAS

In the method of ScarletNAS[53], an automatic neural architecture search with scalability is proposed. The problem of fairness when training a scalable hypernetwork in the one-shot route is solved by a linearly equivalent transformation. ScarletNAS uses conv1×1 (without bias/ReLU) + conv to replace identity + conv when training the supernet, which solves the convergence problem of training scalable networks; the introduced conv1×1 (without bias/ReLU) + conv is a linear transformation equivalent to identity + conv. On the ImageNet-1K classification task, it achieves 76.9% top-1 accuracy and is currently the SOTA at the < 390M FLOPs level. ScarletNAS has three versions, ScarletNAS-A, ScarletNAS-B and ScarletNAS-C, whose network complexity is gradually reduced; ScarletNAS-A usually obtains the best results. Fig. 16 shows the architectures of ScarletNAS-A, B and C.

Figure 16.  Architectures of ScarletNAS-A, B and C (from top to bottom)[53]

      15) MoGA

MoGA[82] considers the use of mobile GPUs in real scenes, so that the searched model can directly serve mobile vision products. The first novelty of MoGA is being mobile GPU aware, i.e., designing mobile-GPU-sensitive models from the perspective of practical use. The second novelty comes from an analysis of the MobileNet series: from MobileNet-V1 to MobileNet-V3, the accuracy keeps improving but the number of model parameters keeps increasing, so how model parameters are handled is worth studying. In addition to top-1 accuracy, MoGA regards the on-device running time, rather than the number of multiply-add operations, as the key indicator for measuring a model, so multiply-adds are removed from the optimization target. Moreover, previous methods tried to compress the number of parameters, which is disadvantageous for multi-objective optimization: on the Pareto boundary, improving one objective requires giving up another. MoGA regards the parameter quantity as a representation of model capability, so models with more parameters but low latency can be obtained by encouraging larger parameter counts instead of enlarging the search range. MoGA has three versions, MoGA-A, MoGA-B and MoGA-C, which use different search layers. Fig. 17 shows the architectures of MoGA-A, B and C.

      Figure 17.  Architectures of MoGA-A, B and C (from top to bottom)[82]

      16) PC-DARTS

PC-DARTS[84] introduces an effective channel sampling scheme in which only a subset of the channels is sent to the mixed (multi-choice) operation. Channel sampling alleviates the "overfitting" phenomenon of the hypernetwork and greatly reduces its memory consumption, so the speed and stability of the structure search can be improved by increasing the batch size during training. However, channel sampling leads to inconsistency in the edge selection of the hypernetwork, which increases the disturbance caused by the random approximation. To solve this problem, an edge regularization method is proposed, which uses a set of additional edge weight parameters to reduce the uncertainty in the search. With these two improvements, the search is faster, the performance is more stable, and the accuracy is higher. Fig. 18 shows the overall framework of PC-DARTS.

Figure 18.  Overall framework of PC-DARTS: The upper part is the partial channel connection and the lower part is the edge regularization[84].
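The partial channel connection can be sketched as follows; this is a simplified illustration under our own assumptions (and omits details such as edge regularization), not the official PC-DARTS code.

```python
import torch

def partial_channel_mixed_op(x, ops, alphas, k=4):
    """Only 1/k of the channels pass through the softmax-weighted mixed
    operation; the remaining channels bypass it and are concatenated back.
    In the original method a channel shuffle follows the concatenation."""
    c = x.size(1)
    x_active, x_bypass = x[:, :c // k], x[:, c // k:]
    weights = torch.softmax(alphas, dim=-1)
    mixed = sum(w * op(x_active) for w, op in zip(weights, ops))
    return torch.cat([mixed, x_bypass], dim=1)

# Toy usage: two channel-preserving candidate operations on a 1/4 channel slice.
ops = [torch.nn.Identity(), torch.nn.ReLU()]
alphas = torch.zeros(len(ops), requires_grad=True)
x = torch.randn(2, 8, 16, 16)
print(partial_channel_mixed_op(x, ops, alphas, k=4).shape)  # (2, 8, 16, 16)
```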

      17) Single-Path-SuperNet

One-shot NAS is a powerful framework, but its training is relatively complex, and it is not easy to obtain competitive results on large datasets (such as ImageNet). Guo et al.[108] proposed a single-path one-shot model called Single-Path-SuperNet to address the main challenges in the training process. The core idea of Single-Path-SuperNet is to construct a simplified supernet that is trained with a uniform path sampling method, so that all substructures (and their weights) are fully and equally trained. Based on the trained hypernetwork, the optimal substructure can be quickly found by an evolutionary algorithm, without fine-tuning any substructure.

      18) DNA

DNA[90] is a method that addresses two problems at once: efficiency and effectiveness. It differs from existing neural architecture search approaches such as RL-based methods, DARTS and one-shot methods. Based on the idea of knowledge distillation, Li et al.[90] introduce a teacher model to guide the direction of the network structure search. Using supervision signals from different depths of the teacher model, the original end-to-end search space is divided depth-wise into blocks, so that the independent blocks of the search space are trained with weight sharing, which greatly reduces the interference caused by weight sharing. At the same time, it ensures accurate evaluation of candidate sub-models without sacrificing the efficiency of weight sharing, and the algorithm can traverse all candidate structures in the search space. DNA has four versions, i.e., DNA-a, DNA-b, DNA-c and DNA-d, which are searched with different parameters. Fig. 19 shows an illustration of DNA. The teacher's previous feature map is used as the input of both the teacher block and the student block. Each cell of the supernet is trained independently to mimic the behavior of the corresponding teacher block by minimizing the l2-distance between their output feature maps. The dotted lines indicate randomly sampled paths in a cell.

      Figure 19.  Illustration of the DNA method[90]
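The block-wise distillation objective can be sketched as follows; this is our simplified illustration, where `student_block` and `teacher_block` are hypothetical PyTorch modules for one depth-wise block.

```python
import torch
import torch.nn.functional as F

def block_distill_loss(student_block, teacher_block, teacher_prev_feat):
    """DNA-style block-wise distillation sketch: both the sampled student
    block and the teacher block receive the teacher's previous feature map,
    and the student is trained to mimic the teacher's output feature map."""
    with torch.no_grad():
        target = teacher_block(teacher_prev_feat)   # teacher supervision
    pred = student_block(teacher_prev_feat)         # sampled student path
    return F.mse_loss(pred, target)                 # l2-distance of features
```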

      19) FBNet-V2

Several classic NAS methods, such as DARTS and ProxylessNAS, had been proposed before FBNet-V2, but they have their own defects: i) memory cost limits the size of the search space; ii) memory cost grows linearly with the number of operations per layer; iii) ProxylessNAS can effectively reduce memory cost by binarizing path weights during training, but in a large search space the memory cost is still not bounded and the convergence rate is very slow. FBNet-V2[93] can greatly expand the search space without increasing memory cost and can maintain high-speed search in a large search space. The main contributions of FBNet-V2 are as follows: i) a NAS method that is both memory-efficient and computationally efficient; ii) a masking mechanism and an effective shape propagation for feature map reuse; iii) searched networks with very high accuracy.

      20) RobNet

To improve the robustness of deep neural networks, existing work focuses on adversarial training algorithms or loss functions that enhance network robustness. From the perspective of network architecture, RobNet[91] studies which neural network structures can better resist adversarial attacks. To obtain the large number of networks needed for this study, one-shot neural architecture search is used to train a supernet, and the sub-networks sampled from it are then adversarially fine-tuned. The sampled network structures and their robust accuracies provide a rich basis for the study.

• In this paper, five 2D palmprint databases, one 3D palmprint database and two palm vein databases are exploited for performance evaluation, including PolyU II, PolyU M_B, HFUT, HFUT CS, TJU-P, PolyU 3D, PolyU M_N and TJU-PV. After preprocessing, the ROI sub-images were cropped; the ROI size for all databases is 128×128. The detailed descriptions of these databases are listed in Table 4. Figs. 20-23 depict some ROI images of four 2D palmprint databases; in each of these figures, the three images in the first row were captured in the first session and the three images in the second row were captured in the second session. Fig. 24 shows three original palmprints of the HFUT CS database and their corresponding ROI images. Figs. 25 and 26 depict some ROI images of the two palm vein databases, organized in the same way. Fig. 2 shows one original 3D palmprint sample and four different 2D representations derived from it, including MCI, GCI, ST and CST.

Database | Type | Touch? | Individual number | Palm number | Session number | Session interval | Image number of each palm | Total image number
PolyU II | 2D palmprint | Yes | 193 | 386 | 2 | 2 months | 10×2 | 7752
PolyU M_B | 2D palmprint | Yes | 250 | 500 | 2 | 9 days | 6×2 | 6000
HFUT | 2D palmprint | Yes | 400 | 800 | 2 | 10 days | 10×2 | 16000
HFUT CS | 2D palmprint | No | 100 | 200 | 2 | 10 days | 10×2×3 | 12000
TJU-P | 2D palmprint | No | 300 | 600 | 2 | 61 days | 10×2 | 12000
PolyU 3D | 3D palmprint | Yes | 200 | 400 | 2 | 1 month | 10×2 | 8000
PolyU M_N | Palm vein | Yes | 250 | 500 | 2 | 9 days | 6×2 | 6000
TJU-PV | Palm vein | No | 300 | 600 | 2 | 61 days | 10×2 | 12000

      Table 4.  Details of 2D palmprint, 3D palmprint and palm vein databases

      Figure 20.  Six palmprint ROI images of PolyU II database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

      Figure 21.  Six palmprint ROI images of the PolyU M_B database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

      Figure 22.  Six palmprint ROI images of the HFUT database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

      Figure 23.  Six palmprint ROI images of the TJU-P database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

      Figure 24.  Three original palmprint and ROI images of the HFUT CS database. The three images in the first row are original palmprint images. The three images in the second row are the corresponding ROI images.

      Figure 25.  Six palm vein ROI images of the PolyU M_N database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

      Figure 26.  Six palm vein ROI images of the TJU-PV database. The three images in the first row were captured in the first session. The three images in the second row were captured in the second session.

PolyU II is a challenging palmprint database because the illumination changes noticeably between the first and second sessions. HFUT CS is also a challenging palmprint database: as can be seen from Fig. 24, there are some differences between the palmprints captured by different devices.

• We selected twenty NAS methods for performance evaluation. As some of the selected methods have different versions, we evaluated several versions, including (NASNet-A, NASNet-mobile), (ProxylessNAS, ProxylessNAS-mobile), (FairNAS-A, B and C), (ScarletNAS-A, B and C), (MoGA-A, B and C) and (DNA-a, b, c and d).

Here, we introduce the default configuration of the experiments, including the hyperparameters and the hardware configuration. Since different networks require different input sizes, the palmprint and palm vein ROI images are up-sampled to a suitable size before being fed into the network. To enhance the stability of the network, we also added a random flip operation (only during the training phase); that is, a training image is flipped horizontally with a certain probability before being input into the network. We did not initialize the model parameters randomly; instead, we initialized them with the parameters of a model pre-trained on the ImageNet or CIFAR dataset. When an official model pre-trained on ImageNet is available, we prefer to use it; otherwise, we use the model pre-trained on CIFAR. The palmprint and palm vein ROI images are usually grayscale images with a single channel, while the models expect RGB inputs, so the grayscale channel is replicated three times to form a 3-channel image, as sketched below.
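A minimal torchvision preprocessing pipeline matching this description might look as follows; the target input size of 224 and the default flip probability are illustrative assumptions, since the actual size depends on the network being evaluated.

```python
from torchvision import transforms

# Up-sample the 128x128 ROI to the network-dependent input size, flip only
# during training, and replicate the grayscale channel to form 3 channels.
train_transform = transforms.Compose([
    transforms.Resize(224),                        # network-dependent size
    transforms.RandomHorizontalFlip(),             # training-phase only
    transforms.Grayscale(num_output_channels=3),   # 1 channel -> 3 channels
    transforms.ToTensor(),
])

test_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),
])
```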

The system configuration is as follows: an Intel i7-8700 CPU at 3.20 GHz, an NVIDIA GTX 2080 GPU, 16 GB of memory and the Windows 10 operating system. All evaluation experiments were performed with PyTorch. The cross-entropy loss function and the Adam optimizer were used by default. The batch size was set to 4, and the learning rate to 5×10−5.

    • In this paper, both identification and verification experiments are conducted.

Identification is a one-to-many comparison, which answers the question "who is the person?". In this paper, close-set identification is conducted; that is, every test identity is known to be enrolled in the training set. To obtain identification accuracy, the rank-1 identification rate is used: a test image is matched against all templates in the training set, and the label of the most similar template is assigned to the test image. For simplicity, we refer to the rank-1 identification rate as the accuracy recognition rate (ARR).

Verification is a one-to-one comparison, which answers the question "is the person who they claim to be?". In the verification experiments, the equal error rate (EER) is adopted to evaluate the performance of different methods, as sketched below.
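For completeness, the two metrics can be computed as in the following sketch. This is a generic NumPy illustration under our own assumptions (higher score means more similar), not the evaluation code used in the experiments.

```python
import numpy as np

def rank1_identification_rate(score_matrix, gallery_labels, probe_labels):
    """ARR: each probe is assigned the label of its most similar gallery
    template; the rank-1 rate is the fraction of correct assignments.
    score_matrix has shape (num_probes, num_gallery)."""
    predictions = gallery_labels[np.argmax(score_matrix, axis=1)]
    return np.mean(predictions == probe_labels)

def equal_error_rate(genuine_scores, impostor_scores):
    """EER: the operating point where the false accept rate (FAR) equals
    the false reject rate (FRR)."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor_scores >= t)   # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)     # genuine pairs wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```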

• We first conduct evaluation experiments in the separate data mode. That is, in all databases, all samples captured in the first session are used for training, and all samples captured in the second session are used for testing. We conduct the experiments using all selected NAS methods on all databases. The values of ARR and EER of the selected NAS methods on the 2D palmprint and palm vein databases are listed in Tables 5 and 6, respectively. The values of ARR and EER obtained on the four 2D representations (CST, ST, MCI and GCI) of the 3D palmprint database are listed in Table 7.

Method | PolyU II | PolyU M_B | HFUT I | HFUT CS | TJU-P | PolyU M_N | TJU-PV
NASNet | 94.46 | 99.10 | 94.47 | 97.73 | 97.68 | 99.47 | 94.25
NASNet-Mobile | 93.04 | 97.97 | 93.09 | 99.02 | 97.77 | 99.23 | 94.77
SMASH | 87.50 | 95.68 | 92.44 | 81.29 | 93.16 | 96.29 | 91.03
PNASNet | 95.62 | 99.37 | 97.91 | 99.48 | 99.25 | 99.63 | 97.58
NAONet | 79.99 | 94.56 | 88.53 | 71.21 | 85.33 | 92.72 | 84.07
SNAS | 88.77 | 96.17 | 90.66 | 94.03 | 93.78 | 95.55 | 92.12
AmoebaNet | 72.58 | 93.93 | 81.85 | 85.17 | 77.75 | 95.47 | 85.33
ENAS | 90.77 | 98.48 | 90.35 | 86.75 | 91.63 | 97.46 | 90.28
ProxylessNAS | 98.63 | 100 | 99.68 | 99.78 | 99.75 | 99.89 | 99.67
ProxylessNAS-Mobile | 98.56 | 99.86 | 98.98 | 99.35 | 99.30 | 99.86 | 99.53
DARTS | 91.55 | 98.75 | 93.74 | 92.07 | 97.47 | 98.36 | 96.18
FairNAS-A | 96.53 | 100 | 98.85 | 99.39 | 99.55 | 100 | 99.23
FairNAS-B | 97.71 | 99.95 | 98.59 | 97.19 | 99.22 | 99.90 | 99.10
FairNAS-C | 97.09 | 99.93 | 98.86 | 96.88 | 99.28 | 99.97 | 99.27
GDAS | 67.45 | 87.30 | 84.00 | 83.81 | 87.83 | 88.58 | 89.50
FBNet-V1 | 90.31 | 97.80 | 93.34 | 96.83 | 90.08 | 96.45 | 89.78
MNASNet | 94.05 | 99.87 | 95.38 | 98.81 | 96.07 | 99.47 | 95.62
Scarlet-A | 98.18 | 99.70 | 98.74 | 98.70 | 99.69 | 99.70 | 99.56
Scarlet-B | 98.41 | 99.93 | 98.66 | 98.01 | 99.58 | 99.83 | 99.01
Scarlet-C | 97.25 | 99.95 | 98.68 | 97.59 | 99.42 | 99.93 | 98.95
MoGA-A | 94.19 | 99.43 | 95.85 | 96.54 | 97.22 | 99.83 | 96.27
MoGA-B | 95.33 | 99.60 | 95.77 | 98.29 | 97.98 | 99.77 | 97.03
MoGA-C | 94.09 | 98.83 | 95.99 | 97.00 | 97.28 | 99.50 | 96.89
PC-DARTS | 76.39 | 94.30 | 83.45 | 67.88 | 82.48 | 87.83 | 86.15
Single-Path-SuperNet | 92.39 | 99.30 | 91.35 | 93.55 | 94.05 | 99.20 | 94.68
DNA-a | 93.50 | 98.90 | 92.20 | 94.96 | 94.30 | 98.33 | 93.57
DNA-b | 94.53 | 99.33 | 92.33 | 98.4 | 94.47 | 98.50 | 96.02
DNA-c | 94.76 | 99.27 | 92.97 | 98.55 | 94.88 | 98.67 | 93.58
DNA-d | 95.27 | 99.43 | 93.41 | 99.19 | 95.14 | 99.37 | 94.45
FBNet-V2 | 92.60 | 99.77 | 95.04 | 97.21 | 98.52 | 97.53 | 94.17
RobNet | 83.86 | 95.57 | 88.89 | 91.27 | 90.48 | 94.71 | 91.95

      Table 5.  ARR (%) of different NAS methods on different 2D palmprint and palm vein databases under the separate data mode

Method | PolyU II | PolyU M_B | HFUT I | HFUT CS | TJU-P | PolyU M_N | TJU-PV
NASNet | 2.4393 | 0.0558 | 2.9329 | 0.2657 | 0.2505 | 0.1038 | 2.6386
NASNet-Mobile | 2.6560 | 0.1431 | 2.9308 | 0.1563 | 0.0907 | 0.1145 | 2.6503
SMASH | 6.6443 | 1.6573 | 3.1170 | 8.5876 | 2.6291 | 1.8302 | 3.1176
PNASNet | 1.0746 | 0.0727 | 0.4575 | 0.1080 | 0.0235 | 0.0206 | 0.1771
NAONet | 10.2844 | 2.6406 | 5.7351 | 16.7765 | 7.8473 | 3.6784 | 7.0421
SNAS | 5.1592 | 1.3052 | 4.3988 | 2.5905 | 2.5364 | 1.4453 | 3.0628
AmoebaNet | 15.9082 | 2.6850 | 8.7804 | 7.5894 | 11.3171 | 1.8966 | 7.5680
ENAS | 4.8364 | 0.1057 | 4.5407 | 6.3340 | 3.8721 | 0.1582 | 4.2093
ProxylessNAS | 0.1728 | 3.34×10−5 | 0.0407 | 0.0217 | 0.0187 | 0.0011 | 0.0704
ProxylessNAS-Mobile | 0.1182 | 0.0110 | 0.1120 | 0.0753 | 0.0593 | 0.0012 | 0.0690
DARTS | 3.5745 | 0.1069 | 2.0348 | 3.4483 | 0.2468 | 0.1323 | 1.3347
FairNAS-A | 1.4542 | 0.0001 | 0.1057 | 0.0482 | 0.0132 | 0.0001 | 0.0647
FairNAS-B | 0.1876 | 0.0003 | 0.2152 | 0.4693 | 0.0408 | 0.0222 | 0.1819
FairNAS-C | 0.6065 | 0.0008 | 0.2584 | 1.3567 | 0.0139 | 0.0015 | 0.0720
GDAS | 17.5771 | 6.0725 | 7.4186 | 8.4344 | 6.4193 | 5.3644 | 6.4783
FBNet-V1 | 4.7339 | 0.1951 | 2.8029 | 1.3782 | 4.1382 | 1.7890 | 5.3434
MNASNet | 2.4992 | 0.0004 | 1.5063 | 0.2034 | 1.2818 | 0.0195 | 1.3711
Scarlet-A | 0.1323 | 0.0027 | 0.1719 | 0.1297 | 0.0025 | 0.0211 | 0.0719
Scarlet-B | 0.1146 | 0.0009 | 0.1838 | 0.4238 | 0.0171 | 0.0192 | 0.1023
Scarlet-C | 0.7056 | 0.0001 | 0.2386 | 0.3518 | 0.0166 | 0.0008 | 0.0809
MoGA-A | 2.1592 | 0.0203 | 1.7924 | 1.4479 | 0.2343 | 0.0005 | 1.4563
MoGA-B | 1.1068 | 0.0181 | 1.5810 | 0.2093 | 0.1778 | 0.0189 | 0.1983
MoGA-C | 2.5657 | 0.1239 | 1.5204 | 0.5642 | 0.1846 | 0.0210 | 1.2376
PC-DARTS | 12.7490 | 2.6334 | 7.5461 | 20.5904 | 8.4242 | 6.5682 | 6.4458
Single-Path-SuperNet | 3.7465 | 0.0217 | 3.2822 | 2.1098 | 2.5289 | 0.0697 | 2.4697
DNA-a | 2.2619 | 0.0685 | 3.1303 | 2.4471 | 2.5782 | 0.2821 | 2.3198
DNA-b | 2.8474 | 0.0378 | 3.2603 | 0.2131 | 2.5203 | 0.2322 | 1.4756
DNA-c | 2.3042 | 0.0391 | 3.0239 | 0.2200 | 2.4205 | 0.1768 | 2.2239
DNA-d | 1.7603 | 0.0529 | 3.9518 | 0.1518 | 1.3457 | 0.0861 | 2.2149
FBNet-V2 | 3.7699 | 0.0013 | 1.6656 | 0.2294 | 0.0724 | 0.2929 | 2.1560
RobNet | 9.8797 | 1.3961 | 5.9439 | 3.6902 | 4.0245 | 2.0895 | 3.5789

      Table 6.  EER (%) of different NAS methods on different 2D palmprint and palm vein databases under the separate data mode

Method | CST ARR | CST EER | ST ARR | ST EER | MCI ARR | MCI EER | GCI ARR | GCI EER
NASNet | 97.62 | 1.0876 | 97.88 | 1.0674 | 97.50 | 1.5698 | 90.76 | 4.9980
NASNet-Mobile | 97.78 | 1.0034 | 98.40 | 0.4927 | 98.98 | 0.3545 | 91.65 | 4.3674
SMASH | 96.76 | 1.7894 | 97.44 | 1.5665 | 96.36 | 1.8913 | 93.28 | 3.0001
PNASNet | 97.88 | 1.0021 | 99.95 | 0.0018 | 99.25 | 0.0393 | 97.30 | 1.5289
NAONet | 73.76 | 15.4438 | 68.54 | 18.4450 | 79.33 | 10.7756 | 69.38 | 16.2980
SNAS | 95.67 | 2.5879 | 94.78 | 2.8905 | 93.25 | 3.2655 | 91.55 | 4.3881
AmoebaNet | 88.05 | 5.6890 | 92.70 | 3.7620 | 91.88 | 4.1978 | 86.55 | 6.8345
ENAS | 97.33 | 1.3357 | 97.67 | 1.3025 | 93.54 | 3.7780 | 92.10 | 3.0186
ProxylessNAS | 99.90 | 0.0028 | 99.92 | 0.0020 | 100 | 0.0057 | 99.45 | 0.1193
ProxylessNAS-Mobile | 99.08 | 0.0544 | 99.62 | 0.0158 | 99.92 | 0.0204 | 98.97 | 0.1256
DARTS | 97.17 | 1.1564 | 96.07 | 1.8703 | 95.73 | 2.6654 | 91.25 | 4.4035
FairNAS-A | 99.58 | 0.0436 | 99.40 | 0.0492 | 99.25 | 0.0587 | 94.05 | 3.2790
FairNAS-B | 99.62 | 0.0336 | 99.90 | 1.3225 | 99.52 | 0.0234 | 94.53 | 2.4674
FairNAS-C | 98.25 | 0.1069 | 98.98 | 0.0632 | 99.75 | 0.0037 | 96.27 | 1.7345
GDAS | 93.12 | 3.9788 | 85.20 | 7.5887 | 83.75 | 8.6655 | 82.02 | 8.7798
FBNet-V1 | 96.03 | 1.3789 | 96.60 | 1.2237 | 95.40 | 2.7877 | 89.76 | 5.6678
MNASNet | 98.75 | 0.1882 | 99.12 | 0.0430 | 97.95 | 1.2092 | 95.23 | 2.7489
Scarlet-A | 99.30 | 0.0530 | 99.78 | 0.0023 | 99.88 | 0.0026 | 97.20 | 1.2018
Scarlet-B | 98.97 | 0.0864 | 99.00 | 0.0456 | 98.90 | 0.1751 | 98.47 | 0.1159
Scarlet-C | 97.78 | 1.0010 | 99.22 | 0.0747 | 98.97 | 0.1055 | 93.56 | 3.2766
MoGA-A | 98.90 | 0.0720 | 97.60 | 0.0837 | 99.35 | 0.1451 | 94.30 | 2.9876
MoGA-B | 99.12 | 0.0208 | 99.00 | 0.0624 | 98.52 | 0.7016 | 94.15 | 2.9915
MoGA-C | 97.88 | 1.0003 | 98.20 | 0.1089 | 96.70 | 1.5890 | 91.67 | 4.2876
PC-DARTS | 91.45 | 4.8658 | 92.48 | 3.5644 | 90.67 | 4.1717 | 90.22 | 4.7780
Single-Path-SuperNet | 94.75 | 2.8890 | 94.53 | 2.9067 | 89.55 | 5.3890 | 89.85 | 5.7723
DNA-a | 93.53 | 3.6761 | 94.64 | 2.2876 | 92.07 | 3.8801 | 86.45 | 6.8990
DNA-b | 93.88 | 3.5488 | 95.58 | 2.0567 | 93.03 | 3.3809 | 87.07 | 6.7865
DNA-c | 95.50 | 2.8650 | 96.47 | 1.4479 | 93.53 | 3.0002 | 87.86 | 6.3987
DNA-d | 94.33 | 2.9914 | 96.62 | 1.3875 | 94.67 | 2.8550 | 88.33 | 5.6868
FBNet-V2 | 97.05 | 1.7877 | 98.65 | 0.1231 | 97.47 | 1.0102 | 90.33 | 4.3550
RobNet | 91.44 | 4.3569 | 90.46 | 4.1390 | 87.35 | 6.8903 | 86.89 | 6.7743

      Table 7.  ARR (%) and EER (%) of different NAS methods on four 2D representations of 3D palmprint databases under the separate data mode

      From Tables 5-7, we have the following observations:

      1) ProxylessNAS achieves the best recognition results on most databases. On the PolyU M_N palm vein database, FairNAS-A achieves the best recognition result, whose ARR and EER are 100% and 0.000 1%, respectively.

      2) ProxylessNAS was proposed in 2019, but its recognition performance is better than that of the methods proposed in 2020. This shows that the recognition performance of the latest NAS methods is not necessarily better than that of older NAS methods. For example, the recognition performance of NASNet, proposed in 2016, is better than that of some methods proposed in 2019 and 2020, such as DARTS, ENAS, and RobNet.

      3) PolyU II is a challenging database because the samples captured in the first session and the second session have some noticeable variations, such as illumination changes. On this database, the highest ARR is obtained by the ProxylessNAS method, which is 98.63%. However, this is an unsatisfactory recognition result. It also shows that it is necessary to further study new NAS methods to improve the results of 2D palmprint recognition.

      4) HFUT CS is a cross-sensor database and is also a challenging database. On this database, the highest ARR is obtained by the ProxylessNAS method, which is 99.78%. This is a promising result, and it also shows that cross-sensor palmprint recognition based on NAS technology is worthy of attention in the future.

      5) For 3D palmprint recognition, the ProxylessNAS method achieves 100% ARR and 0.005 7% EER on the MCI representation, which is a very encouraging result. This result shows that NAS technology is very promising for 3D palmprint recognition and deserves further study. Among the four 2D representations of 3D palmprint, MCI is the most suitable for 3D palmprint recognition based on NAS technology.

    • In this section, we conduct evaluation experiments under the mixed data mode. The first image captured in the second session is added to the training data; that is, the training set of each palm contains all images captured in the first session plus the first image captured in the second session. A toy sketch of how the two data modes partition each database is given below. The values of ARR and EER of the selected NAS methods obtained on the 2D palmprint and palm vein databases are listed in Tables 8 and 9, respectively. The values of ARR and EER obtained on the 3D palmprint databases are listed in Table 10.
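      As a rough illustration of the two protocols, the following Python sketch shows how the training and test sets could be assembled for one database. The directory layout, the "s1_/s2_" file naming and the build_splits helper are hypothetical; they only indicate the session-wise partitioning, not the actual organization of the databases used in this paper.

          from pathlib import Path

          def build_splits(root, mode="separate"):
              """Assemble training/test image lists for one database.

              Assumed (hypothetical) layout: root/<palm_id>/<session>_<index>.bmp,
              where <session> is 's1' (first session) or 's2' (second session).
              """
              train, test = [], []
              for palm_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
                  s1 = sorted(palm_dir.glob("s1_*.bmp"))  # first-session images
                  s2 = sorted(palm_dir.glob("s2_*.bmp"))  # second-session images
                  if mode == "separate":
                      # Separate data mode: train on session 1, test on session 2.
                      train += s1
                      test += s2
                  else:
                      # Mixed data mode: session 1 plus the first image of session 2
                      # for training; the remaining session-2 images for testing.
                      train += s1 + s2[:1]
                      test += s2[1:]
              return train, test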

      Method | PolyU II | PolyU M_B | HFUT I | HFUT CS | TJU-P | PolyU M_N | TJU-PV
      NASNet | 98.26 | 100 | 99.13 | 100 | 100 | 100 | 98.55
      NASNet-Mobile | 98.11 | 100 | 98.11 | 100 | 100 | 100 | 98.68
      SMASH | 93.76 | 99.35 | 97.77 | 87.77 | 98.77 | 99.93 | 96.93
      PNASNet | 99.63 | 100 | 100 | 100 | 100 | 100 | 100
      NAONet | 86.35 | 98.78 | 93.08 | 78.95 | 91.22 | 97.88 | 89.89
      SNAS | 94.58 | 99.76 | 96.02 | 99.44 | 98.37 | 99.90 | 97.68
      AmoebaNet | 78.43 | 98.56 | 86.34 | 90.17 | 84.50 | 99.79 | 90.77
      ENAS | 96.98 | 100 | 96.01 | 92.37 | 96.68 | 100 | 95.87
      ProxylessNAS | 100 | 100 | 100 | 100 | 100 | 100 | 100
      ProxylessNAS-Mobile | 100 | 100 | 100 | 100 | 100 | 100 | 100
      DARTS | 97.25 | 100 | 98.67 | 97.23 | 100 | 100 | 99.97
      FairNAS-A | 99.76 | 100 | 100 | 100 | 100 | 100 | 100
      FairNAS-B | 100 | 100 | 100 | 100 | 100 | 100 | 100
      FairNAS-C | 100 | 100 | 100 | 99.95 | 100 | 100 | 100
      GDAS | 74.88 | 94.21 | 89.67 | 89.02 | 93.32 | 93.43 | 94.78
      FBNet-V1 | 96.75 | 100 | 98.34 | 99.33 | 96.77 | 99.86 | 95.49
      MNASNet | 98.87 | 100 | 99.22 | 100 | 99.58 | 100 | 99.80
      Scarlet-A | 100 | 100 | 100 | 100 | 100 | 100 | 100
      Scarlet-B | 100 | 100 | 100 | 100 | 100 | 100 | 100
      Scarlet-C | 100 | 100 | 100 | 100 | 100 | 100 | 100
      MoGA-A | 98.66 | 100 | 99.45 | 99.86 | 100 | 100 | 100
      MoGA-B | 99.41 | 100 | 99.76 | 100 | 100 | 100 | 100
      MoGA-C | 98.38 | 100 | 99.86 | 100 | 100 | 100 | 99.98
      PC-DARTS | 83.75 | 98.80 | 89.62 | 73.46 | 88.40 | 93.45 | 92.35
      Single-Path-SuperNet | 97.60 | 100 | 96.47 | 98.90 | 99.08 | 100 | 98.98
      DNA-a | 98.22 | 100 | 98.36 | 99.88 | 99.07 | 100 | 98.12
      DNA-b | 99.26 | 100 | 98.55 | 100 | 99.14 | 100 | 99.85
      DNA-c | 99.31 | 100 | 98.67 | 100 | 99.54 | 100 | 98.54
      DNA-d | 99.79 | 100 | 98.90 | 100 | 99.87 | 100 | 98.69
      FBNet-V2 | 97.44 | 100 | 99.66 | 100 | 100 | 100 | 98.72
      RobNet | 90.25 | 99.10 | 94.33 | 96.94 | 95.60 | 98.86 | 96.80

      Table 8.  ARR (%) of different NAS methods on different 2D palmprint and palm vein databases under the mixed data mode

      Method | PolyU II | PolyU M_B | HFUT I | HFUT CS | TJU-P | PolyU M_N | TJU-PV
      NASNet | 0.1325 | 0.0013 | 0.8573 | 0.0101 | 0.0078 | 0.0060 | 0.1543
      NASNet-Mobile | 0.1478 | 0.0018 | 0.1476 | 0.0096 | 0.0084 | 0.0054 | 0.1448
      SMASH | 2.6721 | 0.0479 | 0.9760 | 6.1725 | 0.1321 | 0.0144 | 1.4550
      PNASNet | 0.0322 | 0.0025 | 0.0088 | 0.0046 | 0.0033 | 0.0021 | 0.0077
      NAONet | 6.6438 | 0.5788 | 2.4021 | 12.4452 | 3.2768 | 0.7094 | 5.2336
      SNAS | 2.2876 | 0.0386 | 1.4893 | 0.0378 | 0.1768 | 0.0177 | 0.7446
      AmoebaNet | 11.6548 | 0.7236 | 7.4876 | 4.5276 | 9.3462 | 0.0233 | 4.5573
      ENAS | 1.3685 | 0.0026 | 1.5876 | 3.5426 | 1.3765 | 0.0037 | 1.6630
      ProxylessNAS | 0.0110 | 2.26×10^−5 | 0.0065 | 0.0008 | 0.0010 | 0.0004 | 0.0020
      ProxylessNAS-Mobile | 0.0135 | 0.0001 | 0.0093 | 0.0018 | 0.0023 | 0.0008 | 0.0036
      DARTS | 0.8734 | 0.0052 | 0.1125 | 0.8128 | 0.0091 | 0.0077 | 0.0133
      FairNAS-A | 0.0452 | 5.58×10^−5 | 0.0086 | 0.0012 | 0.0016 | 0.0001 | 0.0064
      FairNAS-B | 0.0197 | 0.0001 | 0.0103 | 0.0035 | 0.0019 | 0.0018 | 0.0078
      FairNAS-C | 0.0125 | 0.0004 | 0.0122 | 0.0202 | 0.0028 | 0.0014 | 0.0045
      GDAS | 13.2558 | 2.1260 | 5.3012 | 5.0982 | 2.5768 | 2.6510 | 2.2765
      FBNet-V1 | 1.4223 | 0.0007 | 0.1329 | 0.0437 | 1.3425 | 0.0189 | 1.7890
      MNASNet | 0.1075 | 0.0003 | 0.8170 | 0.0088 | 0.0376 | 0.0034 | 0.0208
      Scarlet-A | 0.0098 | 0.0018 | 0.0086 | 0.0034 | 0.0068 | 0.0024 | 0.0053
      Scarlet-B | 0.0102 | 0.0023 | 0.0091 | 0.0078 | 0.0073 | 0.0056 | 0.0064
      Scarlet-C | 0.0093 | 0.0035 | 0.0098 | 0.0066 | 0.0070 | 0.0007 | 0.0058
      MoGA-A | 0.1765 | 0.0038 | 0.0422 | 0.0232 | 0.0024 | 0.0003 | 0.0093
      MoGA-B | 0.0422 | 0.0033 | 0.0408 | 0.0075 | 0.0037 | 0.0055 | 0.0077
      MoGA-C | 0.2210 | 0.0017 | 0.0386 | 0.0056 | 0.0042 | 0.0039 | 0.0098
      PC-DARTS | 9.7683 | 0.5564 | 5.5673 | 14.6792 | 5.9806 | 2.6481 | 3.5470
      Single-Path-SuperNet | 0.7460 | 0.0044 | 1.4426 | 0.0956 | 0.0567 | 0.0055 | 0.1386
      DNA-a | 0.1878 | 0.0074 | 0.1777 | 0.0110 | 0.0564 | 0.0035 | 0.1745
      DNA-b | 0.0385 | 0.0065 | 0.1744 | 0.0064 | 0.0498 | 0.0028 | 0.0200
      DNA-c | 0.0322 | 0.0058 | 0.1659 | 0.0051 | 0.0422 | 0.0024 | 0.1537
      DNA-d | 0.0280 | 0.0034 | 0.1322 | 0.0048 | 0.0357 | 0.0016 | 0.1422
      FBNet-V2 | 0.8212 | 0.0008 | 0.0376 | 0.0035 | 0.0025 | 0.0074 | 0.1444
      RobNet | 4.5870 | 0.0552 | 2.2554 | 1.2369 | 1.6654 | 0.1565 | 1.4987

      Table 9.  EER (%) of different NAS methods on different 2D palmprint and palm vein databases under the mixed data mode

      Method | CST (ARR, EER) | ST (ARR, EER) | MCI (ARR, EER) | GCI (ARR, EER)
      NASNet | 99.87, 0.0244 | 99.93, 0.0118 | 99.46, 0.0386 | 98.32, 0.1221
      NASNet-Mobile | 99.91, 0.0213 | 100, 0.0101 | 99.58, 0.0349 | 98.87, 0.1168
      SMASH | 99.38, 0.0352 | 99.87, 0.0135 | 99.14, 0.0311 | 98.25, 0.1288
      PNASNet | 100, 0.1007 | 100, 0.0022 | 100, 0.0087 | 100, 0.1018
      NAONet | 79.46, 10.3765 | 74.55, 12.3320 | 86.29, 6.8669 | 76.87, 11.0005
      SNAS | 98.76, 0.1553 | 97.42, 0.6360 | 97.18, 0.9114 | 96.33, 1.4582
      AmoebaNet | 95.65, 1.7621 | 97.77, 0.7890 | 97.21, 0.8278 | 92.46, 3.4640
      ENAS | 99.55, 0.0335 | 99.93, 0.0110 | 98.67, 0.1432 | 97.88, 0.8126
      ProxylessNAS | 100, 0.0006 | 100, 0.0004 | 100, 0.0015 | 100, 0.0076
      ProxylessNAS-Mobile | 100, 0.0014 | 100, 0.0012 | 100, 0.0032 | 100, 0.0087
      DARTS | 99.45, 0.0342 | 99.11, 0.0412 | 98.79, 0.1225 | 98.90, 0.1155
      FairNAS-A | 100, 0.0028 | 100, 0.0030 | 100, 0.0041 | 98.75, 0.1119
      FairNAS-B | 100, 0.0019 | 100, 0.0058 | 100, 0.0010 | 99.00, 0.1034
      FairNAS-C | 99.98, 0.0097 | 100, 0.0085 | 100, 0.0050 | 99.65, 0.0108
      GDAS | 98.33, 0.1650 | 91.26, 3.2844 | 90.59, 4.3445 | 88.43, 5.9870
      FBNet-V1 | 99.22, 0.0375 | 99.64, 0.0322 | 99.69, 0.0218 | 98.19, 0.1276
      MNASNet | 100, 0.0085 | 100, 0.0034 | 100, 0.0101 | 99.56, 0.0183
      Scarlet-A | 100, 0.0074 | 100, 0.0018 | 100, 0.0020 | 99.66, 0.0098
      Scarlet-B | 100, 0.0083 | 100, 0.0068 | 100, 0.0095 | 100, 0.0076
      Scarlet-C | 99.87, 0.0110 | 100, 0.0077 | 100, 0.0085 | 98.13, 0.1513
      MoGA-A | 100, 0.0056 | 99.76, 0.0162 | 100, 0.0078 | 98.78, 0.1114
      MoGA-B | 100, 0.0048 | 100, 0.0055 | 100, 0.0098 | 98.44, 0.1130
      MoGA-C | 99.94, 0.0102 | 100, 0.0073 | 99.67, 0.0225 | 97.35, 0.7554
      PC-DARTS | 96.31, 1.5443 | 97.44, 0.8564 | 96.10, 1.4122 | 95.87, 1.6347
      Single-Path-SuperNet | 99.66, 0.0324 | 99.35, 0.0236 | 95.87, 1.6760 | 95.95, 1.6220
      DNA-a | 98.55, 0.1590 | 99.12, 0.0320 | 97.38, 0.7326 | 91.30, 3.2625
      DNA-b | 98.97, 0.1353 | 99.74, 0.0277 | 98.11, 0.1735 | 92.33, 3.5870
      DNA-c | 99.54, 0.1108 | 99.90, 0.0155 | 99.62, 0.0270 | 92.98, 3.3218
      DNA-d | 99.02, 0.1254 | 99.99, 0.0106 | 99.15, 0.1128 | 93.47, 2.5538
      FBNet-V2 | 100, 0.1014 | 100, 0.0092 | 100, 0.0068 | 95.09, 1.7944
      RobNet | 97.34, 0.8211 | 96.55, 1.4521 | 93.29, 2.5449 | 93.06, 2.5718

      Table 10.  ARR (%) and EER (%) of different NAS methods on four 2D representations of 3D palmprint databases under the mixed data mode

      From Tables 8-10, we have the following observations:

      1) The recognition results of all the NAS methods obtained in the “mixed data mode” are better than those obtained in the “separate data mode”. Although we only added one image from the second session to the training set, the recognition results of all NAS methods are still greatly improved. This experiment shows that the prediction accuracy of deep learning-based methods improves as more data become available. We can infer that if the networks were trained with data collected from multiple stages, the recognition results of all methods could be significantly improved.

      2) In the “mixed data mode”, the three methods with the best recognition performance are ProxylessNAS, ProxylessNAS-Mobile and Scarlet-A. The ProxylessNAS method achieves the best recognition results on most databases: on all databases, its ARR is 100% and its EER is very low.

      3) In the “mixed data mode”, the convergence speed of the neural networks is usually faster, and the number of training epochs is reduced by nearly half.

    • For 2D and 3D palmprint and palm vein recognition, we compare the performance of NAS methods with that of other methods, including four traditional methods and four deep learning methods.

      For the NAS methods, we select ProxylessNAS and FairNAS-A for performance comparison. Among the different NAS methods, the overall performance of ProxylessNAS is the best, and FairNAS-A achieves the best recognition performance on the PolyU M_N database.

      Four traditional and representative palmprint recognition methods are selected for performance comparison, including competitive code (CompC)[16], ordinal code (OrdinalC)[112], robust line orientation code (RLOC)[113] and local line directional pattern (LLDP)[114].
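      The traditional methods above are orientation-coding methods. As a simplified sketch of the winner-take-all idea behind competitive code (the exact filters, parameters and matching rule of [16] differ from this illustration), one may filter the palmprint ROI with several oriented, zero-mean Gabor kernels, keep the index of the strongest line response at each pixel, and compare two codes with an angular distance:

          import numpy as np
          from scipy.ndimage import convolve

          def gabor_kernel(theta, ksize=35, sigma=5.0, wavelength=10.0):
              """Real, zero-mean Gabor kernel at orientation theta (illustrative parameters)."""
              half = ksize // 2
              y, x = np.mgrid[-half:half + 1, -half:half + 1]
              xr = x * np.cos(theta) + y * np.sin(theta)
              g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
              return g - g.mean()

          def competitive_code(roi, n_orient=6):
              """Winner-take-all orientation code: index of the orientation with the
              strongest (most negative) response, since palm lines are dark."""
              responses = np.stack([convolve(roi.astype(float), gabor_kernel(k * np.pi / n_orient))
                                    for k in range(n_orient)])
              return np.argmin(responses, axis=0)

          def angular_distance(code1, code2, n_orient=6):
              """Normalized matching distance between two orientation maps."""
              diff = np.abs(code1 - code2)
              diff = np.minimum(diff, n_orient - diff)  # orientations wrap around
              return diff.mean() / (n_orient // 2)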

      Four deep learning methods are selected for performance comparison, including PalmNet[24], ResNet[115], MobileNet-V3, and EfficientNet. PalmNet is a deep learning method specially designed for palmprint recognition and has excellent recognition performance. ResNet is a very famous CNN and a typical representative of manually designed CNNs. As mentioned above, MobileNet-V3 and EfficientNet are two semi-NAS methods with excellent recognition performance. The recognition performance of MobileNet-V3 and EfficientNet was evaluated in our previous work.

    • In the “separate data mode”, for all databases, the traditional methods use four images captured in the first session as the training data and the images collected in the second session as the test data. For the deep learning-based methods, including PalmNet, ResNet, MobileNet-V3, EfficientNet, ProxylessNAS and FairNAS-A, all images collected in the first session are used as the training data, and the images collected in the second session are used as the test data. The comparison results on the 2D palmprint and palm vein databases are listed in Table 11, and the comparison results on the 3D palmprint database are listed in Table 12.
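      Since all of the comparisons below are reported in terms of ARR and EER, a minimal sketch of how these two metrics are typically computed is given here; the function names are illustrative, ARR is assumed to be the rank-one correct identification rate, and the paper's own matching pipeline may differ.

          import numpy as np

          def compute_arr(predicted_ids, true_ids):
              """Rank-one correct identification rate (assumed meaning of ARR)."""
              return float(np.mean(np.asarray(predicted_ids) == np.asarray(true_ids)))

          def compute_eer(genuine_scores, impostor_scores):
              """Equal error rate from genuine/impostor similarity scores
              (higher score means more similar)."""
              genuine = np.asarray(genuine_scores, dtype=float)
              impostor = np.asarray(impostor_scores, dtype=float)
              best_gap, eer = np.inf, 1.0
              # Sweep every observed score as a candidate decision threshold.
              for t in np.unique(np.concatenate([genuine, impostor])):
                  far = np.mean(impostor >= t)  # false acceptance rate
                  frr = np.mean(genuine < t)    # false rejection rate
                  if abs(far - frr) < best_gap:
                      best_gap, eer = abs(far - frr), (far + frr) / 2.0
              return float(eer)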

      Method | PolyU II (ARR, EER) | PolyU M_B (ARR, EER) | HFUT I (ARR, EER) | HFUT CS (ARR, EER) | TJU-P (ARR, EER) | PolyU M_N (ARR, EER) | TJU-PV (ARR, EER)
      CompC | 100, 0.0259 | 100, 0.0044 | 99.64, 0.2874 | 99.45, 0.6016 | 99.87, 0.2500 | 99.97, 0.1000 | 99.32, 0.6500
      OrdinalC | 100, 0.0518 | 100, 0.0059 | 99.60, 0.4255 | 99.67, 0.5851 | 99.95, 0.2671 | 100, 0.0034 | 99.55, 0.5946
      RLOC | 100, 0.0268 | 100, 0.0334 | 99.75, 0.2541 | 99.36, 0.7027 | 99.63, 0.4529 | 100, 0.0001 | 100, 0.5814
      LLDP | 100, 0.0521 | 100, 0.0030 | 99.89, 0.5011 | 99.40, 0.1994 | 99.50, 0.5166 | 100, 0.0314 | 98.93, 1.0771
      PalmNet | 100, 0.8380 | 100, 0.0047 | 100, 0.0655 | 92.45, 4.0974 | 100, 0.0115 | 99.02, 1.4009 | 99.61, 0.3281
      ResNet | 97.66, 0.5333 | 100, 0.0002 | 98.51, 0.2754 | 95.20, 0.6677 | 99.25, 0.0263 | 100, 0.0007 | 98.58, 0.3147
      MobileNet-V3 | 97.35, 0.5741 | 100, 0.0011 | 98.67, 0.1734 | 96.55, 0.4326 | 99.37, 0.0552 | 100, 0.0003 | 98.67, 0.2678
      EfficientNet | 97.39, 0.3552 | 100, 0.0002 | 99.41, 0.0507 | 95.31, 0.5858 | 99.89, 0.0022 | 100, 0.0002 | 99.00, 0.0774
      ProxylessNAS | 98.63, 0.1728 | 100, 3.34×10^−5 | 99.68, 0.0407 | 99.78, 0.0217 | 99.75, 0.0187 | 99.89, 0.0011 | 99.67, 0.0704
      FairNAS-A | 96.53, 1.4542 | 100, 0.0001 | 98.85, 0.1057 | 99.39, 0.0482 | 99.55, 0.0132 | 100, 0.0001 | 99.23, 0.0647

      Table 11.  2D palmprint and palm vein recognition: ARR (%) and EER (%) performance comparison between NAS and other methods under the separate data mode

      Method | CST (ARR, EER) | ST (ARR, EER) | MCI (ARR, EER) | GCI (ARR, EER)
      CompC | 97.75, 1.1335 | 98.50, 0.7759 | 93.00, 4.0892 | 71.23, 19.0440
      OrdinalC | 98.68, 0.9201 | 99.03, 0.6381 | 91.43, 3.7215 | 97.95, 1.5548
      RLOC | 96.98, 1.3733 | 98.60, 0.9357 | 96.45, 2.0469 | 96.70, 1.7160
      LLDP | 91.05, 3.5177 | 96.50, 1.6173 | 89.10, 4.7266 | 93.80, 2.9386
      PalmNet | 97.28, 1.1957 | 98.70, 0.8928 | 96.90, 1.8157 | 96.92, 1.6901
      ResNet | 97.58, 0.5488 | 99.12, 0.0734 | 99.35, 0.0566 | 93.65, 1.9025
      MobileNet-V3 | 97.53, 0.6025 | 98.47, 0.3216 | 97.32, 0.7125 | 94.16, 1.7122
      EfficientNet | 97.81, 0.4418 | 99.37, 0.0510 | 99.88, 0.0049 | 99.66, 0.0411
      ProxylessNAS | 99.90, 0.0028 | 99.92, 0.0020 | 100, 0.0057 | 99.45, 0.1193
      FairNAS-A | 99.58, 0.0436 | 99.40, 0.0492 | 99.25, 0.0587 | 94.05, 3.2790

      Table 12.  3D palmprint recognition: ARR (%) and EER (%) performance comparison between NAS and other methods under the separate data mode

      From Tables 11 and 12, we have the following observations:

      1) On the PolyU II palmprint database, the four traditional methods (CompC, OrdinalC, RLOC, and LLDP) and one manually designed deep learning method (PalmNet) achieve better recognition performance than the NAS methods. On the PolyU M_B palmprint database, all methods achieve 100% ARR, and the ProxylessNAS method achieves the lowest EER, which is 3.34×10^−5%. On the HFUT I palmprint database, PalmNet achieves the highest ARR (100%), and ProxylessNAS achieves the lowest EER (0.040 7%). On the HFUT CS palmprint database, ProxylessNAS achieves both the highest ARR (99.78%) and the lowest EER (0.021 7%). On the PolyU M_N palm vein database, most methods achieve 100% ARR, and RLOC and FairNAS-A achieve the lowest EER (0.000 1%). On the TJU-PV palm vein database, RLOC achieves the highest ARR (100%), and FairNAS-A achieves the lowest EER (0.064 7%).

      2) For 2D palmprint and palm vein recognition, the overall recognition performance of NAS methods is close to that of the traditional methods. On some databases, traditional recognition methods have better recognition performance. On other databases, NAS methods have better recognition performance.

      3) For 2D palmprint and palm vein recognition, the overall recognition performance of the NAS methods is close to that of the deep learning-based method PalmNet, which is specially designed for palmprint recognition. On the PolyU II, PolyU M_B, HFUT I and TJU-P databases, PalmNet has better recognition performance than the NAS methods. However, on the HFUT CS database, the recognition performance of PalmNet is very poor.

      4) For 2D palmprint and palm vein recognition, the overall recognition performance of the NAS methods is considerably better than that of one representative manually designed CNN, i.e., ResNet.

      5) For 2D palmprint and palm vein recognition, the overall recognition performance of one pure NAS method, i.e., ProxylessNAS, is slightly better than that of EfficientNet and MobileNet-V3.

      6) For 3D palmprint recognition, the method of ProxylessNAS achieves 100% ARR on the MCI representation, which is better than other methods.

    • In the “mixed data mode”, for the traditional methods, the four images collected in the first session plus the first image captured in the second session are used as the training data. For ProxylessNAS and FairNAS-A, all the images collected in the first session plus the first image captured in the second session are used as the training data. In both cases, the remaining images collected in the second session are used as the test data. The comparison results on the 2D palmprint and palm vein databases are listed in Table 13, and the comparison results on the 3D palmprint database are listed in Table 14.

      Method | PolyU II (ARR, EER) | PolyU M_B (ARR, EER) | HFUT I (ARR, EER) | HFUT CS (ARR, EER) | TJU-P (ARR, EER) | PolyU M_N (ARR, EER) | TJU-PV (ARR, EER)
      CompC | 100, 0.0004 | 100, 8.02×10^−5 | 99.98, 0.0164 | 99.96, 0.5383 | 100, 0.0009 | 100, 0.0015 | 99.87, 0.1497
      OrdinalC | 100, 0.0040 | 100, 0.0031 | 99.98, 0.0385 | 100, 0.2911 | 100, 1.55×10^−5 | 100, 0 | 99.87, 0.0406
      RLOC | 100, 7.48×10^−5 | 100, 0.0003 | 100, 0.0405 | 100, 0.2708 | 100, 0.0179 | 100, 0 | 100, 0.1314
      LLDP | 100, 0.0058 | 100, 0 | 99.93, 0.0417 | 100, 0.3324 | 100, 0.0185 | 100, 0.0010 | 99.96, 0.1671
      PalmNet | 100, 0.0065 | 100, 5.32×10^−5 | 100, 0.0278 | 100, 0.3330 | 100, 0.0186 | 100, 0.0012 | 99.91, 0.1668
      ResNet | 100, 0.0032 | 100, 0.0001 | 100, 0.0088 | 99.84, 0.0322 | 100, 0.0025 | 100, 0.0007 | 99.92, 0.0110
      MobileNet-V3 | 99.69, 0.0534 | 100, 0.0003 | 100, 0.0082 | 99.49, 0.0725 | 100, 0.0022 | 100, 0.0002 | 100, 0.0094
      EfficientNet | 100, 0.0130 | 100, 6.34×10^−5 | 100, 0.0072 | 99.57, 0.0688 | 100, 0.0012 | 100, 0.0001 | 100, 0.0085
      ProxylessNAS | 100, 0.0110 | 100, 2.26×10^−5 | 100, 0.0065 | 100, 0.0008 | 100, 0.0010 | 100, 0.0004 | 100, 0.0020
      FairNAS-A | 99.76, 0.0452 | 100, 5.58×10^−5 | 100, 0.0086 | 100, 0.0012 | 100, 0.0016 | 100, 0.0001 | 100, 0.0064

      Table 13.  2D palmprint and palm vein recognition: ARR (%) and EER (%) performance comparison between NAS method and other methods under the mixed data mode

      Method | CST (ARR, EER) | ST (ARR, EER) | MCI (ARR, EER) | GCI (ARR, EER)
      CompC | 99.17, 0.5020 | 99.47, 0.3847 | 97.61, 2.1564 | 72.64, 18.6315
      OrdinalC | 99.75, 0.3388 | 99.81, 0.1659 | 97.86, 1.4220 | 97.97, 1.4462
      RLOC | 99.31, 0.4036 | 99.83, 0.2819 | 99.33, 0.7187 | 96.89, 1.7188
      LLDP | 95.86, 1.9521 | 98.14, 0.8679 | 95.84, 2.4376 | 94.58, 2.7588
      PalmNet | 99.28, 0.4349 | 99.89, 0.1904 | 99.39, 0.5791 | 97.33, 1.4931
      ResNet | 99.50, 0.1033 | 99.94, 0.0026 | 100, 0.0006 | 98.56, 0.2856
      MobileNet-V3 | 98.64, 0.2558 | 99.67, 0.0328 | 99.67, 0.0385 | 97.19, 0.8335
      EfficientNet | 98.54, 0.3015 | 99.88, 0.0045 | 99.94, 0.0023 | 97.43, 0.8021
      ProxylessNAS | 100, 0.0006 | 100, 0.0004 | 100, 0.0015 | 100, 0.0076
      FairNAS-A | 100, 0.0028 | 100, 0.0030 | 100, 0.0041 | 98.75, 0.1119

      Table 14.  3D palmprint recognition: ARR (%) and EER (%) performance comparison between NAS and other methods under the mixed data mode

      From Tables 13 and 14, we have the following observations:

      1) In the mixed data mode, all methods have achieved outstanding recognition performance. Almost all methods can obtain 100% ARR and very low EER. That is to say, for various methods, it is very easy to obtain high recognition performance by using the mixed data mode.

      2) In the mixed data mode, for 2D palmprint and palm vein recognition, the overall recognition performance of NAS methods is close to that of traditional methods and other deep learning-based methods.

      3) In the mixed data mode, for 3D palmprint recognition, the method of ProxylessNAS achieves the best performance on the MCI representation.

    • This paper systematically investigated the recognition performance of representative NAS methods for 2D and 3D palmprint recognition and palm vein recognition. Twenty representative NAS methods were exploited for performance evaluation, including NASNet, SMASH, PNASNet, NAONet, SNAS, AmoebaNet, ENAS, ProxylessNAS, DARTS, FairNAS, GDAS, FBNet-V1, MNASNet, ScarletNAS, MoGA, PC-DARTS, Single-Path-SuperNet, DNA, FBNet-V2 and RobNet. Five 2D palmprint databases, one 3D palmprint database and two palm vein databases were exploited for performance evaluation, i.e., PolyU II, PolyU M_B, HFUT, HFUT CS, TJU-P, PolyU 3D, PolyU M_N and TJU-PV. These databases are very representative. For example, the PolyU II, PolyU M_B, PolyU M_N and HFUT databases were collected in a contact manner, while HFUT CS, TJU-P and TJU-PV were captured in a contactless manner. All databases were collected in two different sessions. In particular, HFUT CS is a rather challenging database because it was collected in two different sessions, in a contactless manner and across three different sensors.

      We conducted evaluation experiments on both separate data mode and mixed data mode. Experimental results showed that, among different NAS methods, ProxylessNAS achieved the best recognition accuracy. In other words, ProxylessNAS is a very suitable NAS method for 2D and 3D palmprint recognition and palm vein recognition.

      In the “separate data mode”, for 2D palmprint recognition and palm vein recognition, the overall recognition performance of ProxylessNAS is close to that of the traditional methods, including CompC, OrdinalC, RLOC and LLDP. It is also close to that of the deep learning-based method PalmNet, which is specially designed for palmprint recognition. The overall recognition performance of ProxylessNAS is considerably better than that of one representative manually designed CNN, i.e., ResNet, and is slightly better than that of EfficientNet and MobileNet-V3. For 3D palmprint recognition, ProxylessNAS achieved 100% ARR on the MCI representation, which is better than the other methods.

      In the “mixed data mode”, almost all methods can obtain 100% ARR and very low EER. For 2D palmprint and palm vein recognition, the overall recognition performance of ProxylessNAS is close to that of the traditional methods and other deep learning-based methods. For 3D palmprint recognition, ProxylessNAS achieves the best performance on the MCI representation.

      To the best of our knowledge, this work is the first to conduct a performance evaluation of representative NAS methods for 2D and 3D palmprint and palm vein recognition. Experimental results showed that NAS is a very promising technology for 2D and 3D palmprint and palm vein recognition. In our future work, based on NAS technology, we will try to design new methods to further improve the recognition performance of 2D and 3D palmprint recognition and palm vein recognition.

    • This work was supported by National Natural Science Foundation of China (Nos. 62076086, 61673157, 61972129, 61972127 and 61702154), and Key Research and Development Program in Anhui Province (Nos. 202004d07020008 and 201904d07020010).

    • This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

      The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

      To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
