I. INTRODUCTION
In recent years, biometric recognition has attracted widespread attention and been widely deployed. Biometric recognition covers palmprint [1], face [2-3], gesture [4], and other modalities. Palmprint recognition is a non-invasive method with stable recognition performance, low equipment cost [5], high user-friendliness, and strong privacy, so it is suitable for a wide range of applications [6]. The palmprint is representative and offers a variety of feature types, many of which also apply to other biometric modalities. Therefore, this paper studies presentation attacks on palmprint recognition systems.
Presentation attacks are common in the physical world. An attacker only needs to place a forged artifact in front of the sensor to deceive the biometric identification system, typically a printed photo, an electronic display, or a rubber mold. Presentation attacks usually rely on original biometric images, but fake images produced by reconstruction attacks and adversarial attacks can also be used to mount them.
In reconstruction attacks, attackers reconstruct a user's biometric image from information leaked by the identification system, such as feature templates stored in databases or the system's decision information. The reconstructed images can impersonate the target (victim) user and fool the recognition system.
In recent years, models trained with deep learning have achieved good results in computer vision [7]. However, researchers have found that, despite their high accuracy, deep learning models are fragile and vulnerable to adversarial samples. An adversarial attack adds to a clean image a minor perturbation that the human visual system cannot detect, yet it can alter the model's classification result, produce false results, and even control which result the model outputs.
Reconstruction attacks and adversarial attacks both achieve high success rates, but they are fed to the attacked system as digital images and occur behind the sensor. A presentation attack is carried out in front of the sensor, requires a lower level of authority, and goes through the complete identification pipeline, posing a huge threat to the security of the identification system. If reconstruction attacks and adversarial attacks can likewise be carried out in front of the sensor, their threat to biometric recognition systems deserves much more attention.
This paper analyzes the threat of presentation attacks to palmprint recognition systems using original images, adversarial attack images, and reconstruction attack images. The main contributions are as follows:
- Six palmprint presentation attack datasets were produced from original images, adversarial attack images, and reconstruction attack images, using two physical display carriers (photo and monitor) and re-imaging them in front of the acquisition device.
- The success rate of presentation attacks using stolen original palmprint images was analyzed experimentally.
- The threat posed to the palmprint recognition system by reconstruction attacks and adversarial attacks was analyzed experimentally.
The rest of this paper is organized as follows. Section 2 revisits related work. Section 3 specifies the methodology. Section 4 presents the experiments and discussion. Finally, conclusions are drawn in Section 5.
II. RELATED WORKS
The presentation attack is a common attack method. Its steps are relatively simple: biometric traits are presented on different carriers and a new presentation attack dataset is established. Research on presentation attacks focuses on defenses such as liveness detection. Reconstruction attacks and adversarial attacks can generate fake biometric images, and these fake images may be used in a manner similar to a presentation attack, i.e. fed into the system through the sensor. This section reviews the research status of reconstruction attacks, adversarial attacks, and palmprint recognition.
A reconstruction attack means that the attacker reconstructs the biometric image of the attacked user. Some palmprint recognition systems protect the templates in their databases; two common approaches are biometric template protection [8-9] and cancelable biometrics [10-11]. However, reconstruction attacks can still generate reconstructed images from the matching scores of the recognition system.
For the fingerprint modality, Uludag et al. [12] proposed reconstructing a fingerprint minutia template with a hill-climbing algorithm, dividing the minutiae image into a grid to avoid overly dense minutiae in the reconstructed image.
For the face modality, Andy et al. [13] repeatedly superimposed face feature images onto a face image to modify its features until it was verified by the recognition system. Galbally et al. [14] reconstructed face images with a Bayesian hill-climbing algorithm. Andy et al. [15] proved, using a hill-climbing algorithm, that quantized matching scores cannot defend against reconstruction attacks. Marta et al. [16] used the uphill simplex method to generate reconstructed images more efficiently.
For the iris modality, Rathgeb et al. [17] optimized the hill-climbing algorithm by modifying all pixels within a block simultaneously. The block size depends on the size of the recognition system's filter, which reduces the number of modifications and generates reconstructed images faster. Galbally et al. [18] reconstructed iris images with a genetic algorithm: many iris images are synthesized by an iris synthesizer as the initial population, each individual is divided into blocks that serve as its genes, and new offspring are produced continuously until one individual is verified by the recognition system. The verified individual is blended into a real iris image in a small proportion to improve image quality.
For the palmprint modality, Wang et al. [19] attacked the palmprint recognition system by brute force: DCGAN generates a large number of palmprint images, which are continuously fed into the recognition system for verification until an image that passes is found. Sun et al. [20] proposed two reconstruction attack methods based on the hill-climbing algorithm, Modified Constraint within Neighborhood (MCwN) and Batch Member Selection (BMS), which can quickly generate high-quality reconstructed images.
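The score-based hill-climbing strategy shared by these methods can be summarized in a short sketch. The following is a minimal illustration under stated assumptions, not the MCwN or BMS algorithm itself: `match_score` stands for a hypothetical black-box interface to the attacked matcher, and the block size, step size, and iteration budget are illustrative choices.

```python
import numpy as np

def hill_climb_reconstruct(match_score, shape=(128, 128), block=8,
                           step=16.0, iters=5000, threshold=0.85):
    """Minimal score-based hill-climbing sketch.

    match_score: hypothetical black-box function image -> similarity
                 score in [0, 1] returned by the attacked matcher.
    One random block is perturbed per iteration; the change is kept
    only if the matching score improves (cf. Rathgeb et al. [17]).
    """
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=shape).astype(np.float32)  # random start
    best = match_score(img)
    for _ in range(iters):
        y = rng.integers(0, shape[0] - block)
        x = rng.integers(0, shape[1] - block)
        candidate = img.copy()
        candidate[y:y + block, x:x + block] += rng.uniform(-step, step)
        candidate = np.clip(candidate, 0, 255)
        score = match_score(candidate)
        if score > best:            # keep only improving modifications
            img, best = candidate, score
        if best >= threshold:       # accepted by the recognition system
            break
    return img, best
```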
Models trained with deep neural networks perform excellently on many computer vision tasks, but Szegedy et al. [21] first discovered the fragility of neural networks. White-box adversarial attacks can be divided into three categories: gradient-based, optimization-based, and GAN-based methods.
Goodfellow et al. [22] argued that the high-dimensional linearity of neural networks leads to the existence of adversarial samples and, based on this assumption, proposed the Fast Gradient Sign Method (FGSM). Kurakin et al. [23] extended FGSM and proposed the Basic Iterative Method (BIM), also known as I-FGSM (Iterative FGSM).
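For reference, FGSM and its iterative variant can be written in a few lines of PyTorch. This is a generic sketch of the published formulas, not the exact code used in this paper; `model` and `loss_fn` are assumed to be a classifier and a cross-entropy loss, and images are assumed to lie in [0, 1].

```python
import torch

def fgsm(model, loss_fn, x, y, eps):
    """One-step FGSM [22]: x_adv = x + eps * sign(grad_x loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return torch.clamp(x_adv, 0, 1).detach()

def bim(model, loss_fn, x, y, eps, alpha, steps):
    """BIM / I-FGSM [23]: iterate the FGSM step with step size alpha,
    projecting the total perturbation back into the eps-ball around x."""
    x = x.clone().detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # stay in eps-ball
        x_adv = torch.clamp(x_adv, 0, 1)               # keep a valid image
    return x_adv.detach()
```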
Carlini et al. [24] constrained the L0, L2, and L∞ norms to make the adversarial perturbation smaller and harder to detect, and their attack can break through the protection offered by defensive distillation. Moosavi et al. [25] proposed DeepFool, which assumes a classification decision boundary and iteratively computes a minimum-norm adversarial perturbation, gradually pushing the image out of the classification boundary until it is misclassified.
Palmprint recognition is a promising and representative biometric modality. Palmprint recognition methods can be roughly categorized into subspace-based [28-29], statistical-based [30-31], deep-learning-based [32], and coding-based [33-34] methods. Deep-learning-based and coding-based methods are the most popular.
In recent years, deep learning has developed tremendously and achieved remarkable results in various fields of computer vision [35-37]. Zhong and Zhu [38] designed a new loss function for palmprint recognition that makes the intra-class distance distribution more concentrated and the inter-class distance distribution more dispersed. Matkowski et al. [39] proposed a palmprint recognition method for low-constraint scenarios, which uses a cascaded structure of two sub-networks to perform ROI segmentation and feature extraction, respectively. Liang et al. [40] proposed CompNet, which uses a CNN to learn the parameters of Gabor filters and effectively exploits the orientation information of the palmprint through special Softmax and channel-wise convolution operations. CompNet achieves a lower equal error rate than existing methods and has fewer parameters, so it is easy to train. Wu et al. [41] realized multi-spectral palmprint fusion and reduced the variance between intra-class and inter-class scores [42]; moreover, their method saves storage space and matching computation. Xu et al. [43] combined soft biometrics to improve model accuracy.
Coding-based palmprint recognition methods use hand-designed filters to extract palmprint features. Compared with deep-learning-based methods, they offer faster matching, smaller storage, and require no training. Their feature extraction can be divided into cooperative and competitive schemes.
The cooperative scheme usually fuses multiple feature templates extracted from the palmprint at the feature level or score level. Related works include PalmCode [44], BOCV [45], OrdinalCode [46], and FusionCode [47].
The competitive scheme usually selects the index of the maximum/minimum filter response as the final template, as illustrated by the sketch below. Related works include CompCode [48], RLOC [49], DOC [50], and DRCC [51].
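To make the competitive scheme concrete, the following sketch extracts a CompCode-style orientation code: each pixel is filtered with a small bank of Gabor filters at several orientations, and the index of the minimum (most negative) response is kept as the feature. The filter parameters here are illustrative assumptions, not the tuned values of [48].

```python
import cv2
import numpy as np

def competitive_code(roi, n_orient=6):
    """CompCode-style competitive coding sketch.

    roi: 8-bit grayscale palmprint ROI (e.g. 128x128).
    Returns a map holding, per pixel, the index of the orientation
    whose real Gabor response is minimal (the "winning" orientation).
    """
    roi = roi.astype(np.float32) / 255.0
    responses = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        # Illustrative Gabor parameters; published methods tune these.
        kernel = cv2.getGaborKernel(ksize=(17, 17), sigma=3.0, theta=theta,
                                    lambd=8.0, gamma=0.5, psi=np.pi)
        responses.append(cv2.filter2D(roi, cv2.CV_32F, kernel))
    stack = np.stack(responses, axis=0)               # (n_orient, H, W)
    return np.argmin(stack, axis=0).astype(np.uint8)  # winning index

def angular_distance(code1, code2, n_orient=6):
    """Normalized angular (wrap-around) distance between two codes."""
    d = np.abs(code1.astype(int) - code2.astype(int))
    d = np.minimum(d, n_orient - d)
    return d.mean() / (n_orient // 2)
```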
III. METHODOLOGY
The monitor presentation attack displays palmprint images on a monitor and recaptures them with a camera. The recaptured image is fed into the palmprint recognition system to test whether it passes verification. This simulates an attacker who displays a stolen palmprint image on a monitor to impersonate the victim, and tests the security of the palmprint recognition system in this scenario.
To improve the efficiency of the experiment, the screen displays 15 palmprint images at a time. The monitor used in the experiment is a 24-inch DELL U2419HS with a resolution of 1,920×1,080. The original ROI of each palmprint is 128×128, so displaying 15 palmprint images still fully shows the details of each image. The camera is an iPhone XS, positioned parallel to the screen and shooting under indoor light. The shooting scene is shown in Fig. 1 and the captured image in Fig. 2. Due to the interaction between the camera and the display, the recaptured palmprint images exhibit moiré fringes.
Next, each palmprint image is cropped from the captured photo and reduced to the input size expected by the recognition system. Since the relative position of the camera and the monitor does not change, the palmprint images can be cropped quickly. The camera is kept parallel to the display, which avoids image distortion. Because the captured images are RGB while most palmprint recognition systems operate on grayscale images, each palmprint image is also converted to grayscale. Fig. 3 compares the original palmprint image with the image displayed on the monitor.
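Because the camera-display geometry is fixed, this preprocessing reduces to a fixed crop, a resize, and grayscale conversion per image. A minimal OpenCV sketch follows; the crop rectangles are hypothetical placeholders that would be measured once from the fixed setup.

```python
import cv2

def extract_palmprints(photo_path, boxes, out_size=(128, 128)):
    """Crop each displayed palmprint from a recaptured photo.

    photo_path: recaptured image of the monitor or paper.
    boxes: list of (x, y, w, h) crop rectangles, measured once from
           the fixed camera-display geometry (placeholders below).
    Returns grayscale ROIs resized to the recognizer's input size.
    """
    photo = cv2.imread(photo_path)
    rois = []
    for (x, y, w, h) in boxes:
        crop = photo[y:y + h, x:x + w]
        crop = cv2.resize(crop, out_size, interpolation=cv2.INTER_AREA)
        rois.append(cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY))
    return rois

# Example: 15 images per screen -> 15 fixed boxes (hypothetical values)
# rois = extract_palmprints("shot_001.jpg", [(100, 120, 300, 300), ...])
```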
The paper presentation attack prints palmprint images with a printer; one sheet of paper holds 9 palmprint images. The printed sheet is placed on a desktop with the camera parallel to the paper. The printer is a Lenovo M101DW and the camera is an iPhone XS. The captured image is shown in Fig. 4. The palmprint images for the presentation attack are obtained by cropping, shrinking, and grayscale-converting the captured photo. Fig. 5 compares the original palmprint image with the image rendered on paper.
IV. EXPERIMENTS
The experimental hardware environment is as follows: Intel Xeon(R) W-2145 CPU @ 3.70 GHz ×16, GeForce GTX 1080 Ti, 64 GB memory. The programming languages used are MATLAB and Python. The presentation attack datasets were built from three palmprint datasets: a real palmprint dataset, a reconstructed image dataset, and an adversarial sample dataset. The real palmprint dataset is PolyU, which contains 600 images. The reconstructed image dataset contains 300 images generated by BMS. The adversarial sample dataset contains 400 adversarial samples generated by FGSM against CompNet [40]. Two kinds of presentation attack datasets were made from each of these three datasets, a monitor presentation attack dataset and a paper presentation attack dataset, giving six presentation attack datasets in total: Reconstruction_Monitor, Reconstruction_Paper, Adversarial_Monitor, Adversarial_Paper, PolyU_Monitor, and PolyU_Paper.
PolyU_Monitor and PolyU_Paper are used to attack coding-based palmprint recognition methods. The simulated scenario is an attacker who displays a stolen original palmprint image on a monitor or paper, feeds it into the palmprint recognition system, and tries to impersonate a legitimate user. The distance distribution between each presentation attack dataset and the attacked palmprint images (the originals from which the dataset was made) was calculated.
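The distributions reported in Figs. 6 and 7 can be produced by matching every presentation-attack image against its target's original image and histogramming the resulting distances. A minimal sketch, where `extract_code` and `distance` are the chosen coding method's feature extractor and matcher (for instance the `competitive_code` and `angular_distance` sketches above):

```python
import numpy as np

def attack_distance_distribution(attack_images, target_images,
                                 extract_code, distance, bins=50):
    """Distances between presentation-attack images and their targets.

    attack_images / target_images: aligned lists (attack i targets
    original i). Returns a normalized histogram comparable to the
    attack-distance curves in Fig. 6.
    """
    dists = [distance(extract_code(a), extract_code(t))
             for a, t in zip(attack_images, target_images)]
    hist, edges = np.histogram(dists, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum(), edges
```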
Fig. 6 shows the presentation attack on 8 coding-based palmprint recognition methods using PolyU_Monitor. The red and green lines are the intra-class and inter-class distance distributions, respectively. The blue lines are the distance distribution between images in PolyU_Monitor and the corresponding target users' palmprint images in PolyU. A small peak appears on the left of the blue line because each PolyU_Monitor image is matched against its own original in PolyU; the original has only been displayed and recaptured, so the matching distance between them is very small.
Fig. 7 shows the presentation attack on 8 coding-based palmprint recognition methods using PolyU_Paper. The blue line again shifts slightly to the right of the red line, but the small peak on the left is barely visible and a small bump appears behind it. This is because the printer's resolution is lower than the monitor's, so the printed palmprint details are blurry. In particular, images with high brightness are difficult to render clearly on white paper, which reduces the occurrence of small matching distances. In addition, some sheets were slightly bent during shooting, deforming the palmprint images and producing some large matching distances. Overall, the overlap between the blue and red distributions is very large, and the paper presentation attack has a very high success rate.
BMS makes minor modifications to original palmprint images to generate reconstructed images. After the reconstructed images are displayed on a monitor or paper, the camera recaptures them and they are fed into the recognition system to test whether they can be authenticated. The experiment calculated the matching distance between images in the presentation attack datasets and the corresponding target images, and then counted the proportion in each matching distance interval. The results are shown in Fig. 8; the blue line shows the distribution of attack matching distances. The overall pattern of Reconstruction_Monitor and Reconstruction_Paper is similar, and the blue line essentially coincides with the green line. This means that after a reconstructed image is recaptured from the monitor or paper, the modifications made by BMS are destroyed, and the image is no longer capable of a presentation attack.
CompNet uses a neural network to learn Gabor filter parameters and establishes a competitive mechanism to efficiently exploit the orientation information of the palmprint. Since most neural network models use accuracy (ACC) as the evaluation standard, accuracy is also used to describe recognition performance in the experiments.
Attacks on CompNet using PolyU_Monitor achieved 100% accuracy, while PolyU_Paper was slightly lower at 97.4%. The accuracy of the identification system is also the success rate of the attack, which indicates that both the monitor and the paper presentation attack succeed against CompNet at a very high rate.
Adversarial_Monitor and Adversarial_Paper are built from adversarial samples generated on CompNet to simulate adversarial attacks staged in front of the camera. These are targeted adversarial samples generated by FGSM; they can be classified into specific categories to impersonate legitimate users. Adversarial_Monitor and Adversarial_Paper each contain 400 images, divided into 4 groups with different ε values, 100 images each. The higher the ε value, the better the attack performance, but the more obvious the forgery traces. Adversarial samples with different ε values are shown in Fig. 9.
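Targeted FGSM differs from the untargeted form only in the sign of the step: the image is moved down the loss gradient of the target class, pulling the classifier toward the impersonated user. A minimal PyTorch sketch, assuming `model` outputs class logits over user identities:

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target, eps):
    """Targeted FGSM: step *against* the gradient of the target-class
    loss, so the prediction moves toward the impersonated class."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target)  # loss w.r.t. target label
    loss.backward()
    x_adv = x - eps * x.grad.sign()           # note the minus sign
    return torch.clamp(x_adv, 0, 1).detach()

# A larger eps survives recapture better but leaves more visible
# traces, matching the trade-off described above.
```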
The results of attacking CompNet with Adversarial_Monitor and Adversarial_Paper are shown in Fig. 10. The overall attack success rate is not high, because targeted adversarial samples require the model not merely to misclassify but to classify into the specified category. When ε is low, the attack success rate is very low, but it increases with ε. This is because a small ε produces a small, high-frequency perturbation, which is easily destroyed during monitor or paper presentation and camera recapture. As ε increases, the perturbation becomes more pronounced and harder to destroy. Comparing the two carriers, the monitor adversarial attack has a higher success rate than the paper one.
V. CONCLUSIONS
In this paper, six presentation attack datasets are produced, and presentation attack experiments are carried out against coding-based palmprint recognition methods and CompNet. The results show that presentation attacks enabled by palmprint image leakage have a high success rate and pose a great threat to palmprint recognition systems. Palmprint images presented on a display exhibit moiré fringes after recapture, but this has little influence on the presentation attack. Because printer resolution is generally limited, palmprint images presented on paper are slightly blurred and lose fine detail; moreover, paper is soft and bends easily, deforming the printed palmprint image. The experiments show that the monitor presentation attack has a higher success rate than the paper presentation attack. Adversarial attacks and reconstruction attacks have low success rates when conducted in front of the camera sensor. BMS cannot mount a presentation attack on PalmCode, because BMS changes the initial image too little and these changes are easily destroyed. Similarly, an adversarial attack poses little threat to CompNet when ε is low.