KR20220100812A

KR20220100812A - Facial biometric detection method, device, electronics and storage media

Info

Publication number: KR20220100812A
Application number: KR1020220079655A
Authority: KR
Inventors: 케야오 왕
Original assignee: 베이징 바이두 넷컴 사이언스 테크놀로지 컴퍼니 리미티드
Priority date: 2021-07-21
Filing date: 2022-06-29
Publication date: 2022-07-18
Also published as: JP2022133378A; CN113435408A

Abstract

The present disclosure provides a facial biometric detection method, apparatus, electronic device, and storage medium. It relates to the field of artificial intelligence technology, in particular computer vision and deep learning, and can be applied to scenarios such as facial recognition. The method comprises: obtaining a facial color image to be detected; inputting the facial color image into a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively; and inputting the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into a pre-trained multimodal detection network model to obtain a biometric detection result. Because sensitivity to lighting is reduced, detection accuracy improves; and because the generalization ability of the network is improved, the defense against planar attacks such as photos and videos is strengthened.

Description

FACIAL BIOMETRIC DETECTION METHOD, DEVICE, ELECTRONICS AND STORAGE MEDIA

The present disclosure relates to the field of artificial intelligence technology, in particular computer vision and deep learning, and may be applied to scenarios such as facial recognition.

With the development of technologies such as e-commerce, face-based identity authentication has become widely used and is implemented mainly through facial recognition technology. While facial recognition greatly improves everyday convenience, its security problems are also gradually being exposed. For example, a printed photo or a photo displayed on a screen can be passed off as a real face in order to pass authentication.

Facial recognition therefore needs facial biometric detection technology to determine whether a facial image was obtained by photographing a living face.

The present disclosure provides a facial biometric detection method, apparatus, device, and storage medium.

According to one aspect of the present disclosure, a facial biometric detection method is provided, the method comprising:

acquiring a facial color image to be detected;

inputting the facial color image into a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model, respectively, to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively; and

inputting the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into a pre-trained multimodal detection network model to obtain a biometric detection result.

According to another aspect of the present disclosure, a facial biometric detection apparatus is provided, the apparatus comprising:

an acquisition module configured to acquire a facial color image to be detected;

a reconstruction module configured to input the facial color image into a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model, respectively, to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively; and

a detection module configured to input the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into a pre-trained multimodal detection network model to obtain a biometric detection result.

According to yet another aspect of the present disclosure, an electronic device is provided, the electronic device comprising:

at least one processor; and

a memory communicatively connected to the at least one processor,

wherein the memory stores instructions executable by the at least one processor, and when the instructions are executed by the at least one processor, the at least one processor performs the facial biometric detection method.

According to yet another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to perform the facial biometric detection method.

According to yet another aspect of the present disclosure, a computer program stored in a computer-readable storage medium is provided, and when instructions in the computer program are executed by a processor, the facial biometric detection method is performed.

It should be understood that the content described in this section is not intended to identify key or critical features of the embodiments of the present disclosure, nor does it limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.

The present disclosure reduces sensitivity to lighting, which improves detection accuracy, and improves the generalization ability of the network, which strengthens the defense against planar attacks such as photos and videos.

The accompanying drawings are provided for a better understanding of the technical solution and do not limit the present disclosure.
FIG. 1 is a schematic flowchart of a facial biometric detection method provided in an embodiment of the present disclosure.
FIG. 2 is a schematic flowchart of acquiring facial sample images provided in an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of a facial biometric detection method provided in an embodiment of the present disclosure.
FIG. 4 is a block diagram of an apparatus implementing the facial biometric detection method of an embodiment of the present disclosure.
FIG. 5 is a block diagram of an electronic device implementing the facial biometric detection method of an embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. The description includes various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Those skilled in the art should therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.

Facial biometric detection is one of the basic technologies in the face-related field and can be applied to many scenarios such as attendance tracking and access control. It is now widely used in many services.

At present, facial biometric detection is generally performed using a convolutional neural network whose input is a facial color image. However, facial biometric detection based on color images alone is sensitive to lighting, so its detection accuracy is low, and its defense against planar attacks such as photos and videos is poor.

To solve the above technical problem, the present disclosure provides a facial biometric detection method, apparatus, electronic device, and storage medium.

In one embodiment of the present disclosure, a facial biometric detection method is provided, the method comprising:

inputting a facial color image into a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model, respectively, to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively; and

inputting the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into a pre-trained multimodal detection network model to obtain a biometric detection result.

Thus, two codec reconstruction models are trained on sample image sets: the first codec reconstruction model learns the image features of the facial infrared image that correspond to the image features of the facial color image, and the second codec reconstruction model learns the image features of the facial depth image that correspond to the image features of the facial color image. A facial infrared image and a facial depth image are then reconstructed from the facial color image to be detected, and the facial color image together with the reconstructed facial infrared image and facial depth image is input into the multimodal network model, which fuses the color, infrared, and depth image features of the face. Compared with facial biometric detection based on color images alone, this reduces sensitivity to lighting, so detection accuracy is greatly improved. It also improves the generalization ability of the network, so the defense against planar attacks such as photos and videos is improved, as is the defense against unknown attack samples.

In addition, during detection, multimodal fusion detection of a living face can be performed using only the facial color image. That is, multimodal facial biometric detection can be performed from a single facial color image, so facial infrared images and facial depth images no longer need to be captured.
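The data flow described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the three models are passed in as plain callables, and the names `detect_liveness`, `reconstruct_ir`, `reconstruct_depth`, `multimodal_detector`, and the 0.5 decision threshold are assumptions introduced here.

```python
from typing import Callable, List

# Hypothetical image type for illustration: a 2D grid of pixel values.
Image = List[List[float]]


def detect_liveness(
    color_image: Image,
    reconstruct_ir: Callable[[Image], Image],
    reconstruct_depth: Callable[[Image], Image],
    multimodal_detector: Callable[[Image, Image, Image], float],
    threshold: float = 0.5,  # assumed decision threshold, not given in the source
) -> bool:
    """Return True if the face in color_image is judged to be a living face."""
    # Only the color image is captured; the other two modalities are reconstructed.
    ir_image = reconstruct_ir(color_image)
    depth_image = reconstruct_depth(color_image)
    # The detector is assumed to return a score in [0, 1], higher meaning "living".
    score = multimodal_detector(color_image, ir_image, depth_image)
    return score >= threshold
```

Passing the models as callables keeps the sketch independent of any particular deep learning framework; in practice each callable would wrap a trained network.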

The facial biometric detection method, apparatus, electronic device, and storage medium provided in the embodiments of the present disclosure are each described in detail below.

FIG. 1 is a schematic flowchart of a facial biometric detection method provided in an embodiment of the present disclosure. As shown in FIG. 1, the method may include steps S101 to S103.

S101: Acquire a facial color image to be detected.

In an embodiment of the present disclosure, when facial biometric detection is to be performed, a facial color image to be detected is acquired. Facial biometric detection can be understood as detecting whether a facial image was obtained by photographing a living face. The color image may be an RGB (red-green-blue, the three primary colors) image.

Embodiments of the present disclosure do not limit the manner in which the facial color image is acquired.

S102: Input the facial color image into the pre-trained first codec reconstruction model and the pre-trained second codec reconstruction model, respectively, to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively.

In an embodiment of the present disclosure, the first codec reconstruction model is trained on a plurality of first sample image sets, each of which includes a facial color sample image and a facial infrared sample image matched with each other; the second codec reconstruction model is trained on a plurality of second sample image sets, each of which includes a facial color sample image and a facial depth sample image matched with each other.

In an embodiment of the present disclosure, the first codec reconstruction model and the second codec reconstruction model are both models of the Encoder-Decoder framework.

In an embodiment of the present disclosure, the first codec reconstruction model may be pre-trained on a plurality of first sample image sets, each of which includes a facial color sample image and a facial infrared sample image matched with each other.

That the facial color sample image and the facial infrared sample image are matched with each other means that the two images have the same image size and number of pixels, cover the same actual region of the same face from the same shooting angle, and that their pixels correspond one-to-one.

Correspondingly, that the facial color sample image and the facial depth sample image are matched with each other means that the two images have the same image size and number of pixels, cover the same actual region of the same face from the same shooting angle, and that their pixels correspond one-to-one.

For example, a multi-camera device including an RGB camera, an NIR (Near Infrared) camera, and a depth camera is used to photograph a living face simultaneously, yielding a facial color sample image, a facial infrared sample image, and a facial depth sample image, respectively.

After a large number of sample image sets have been acquired, the first codec reconstruction model and the second codec reconstruction model are trained separately. Taking the first codec reconstruction model as an example: during training, a facial color sample image is input, and the output is a feature map of the same size as the facial color sample image; the stored facial infrared sample image is then used as the target for L1-supervised training of the reconstruction model. For example, a loss function is defined, a loss value is calculated from the output feature map and the facial infrared sample image, and the model parameters of the first codec reconstruction model are adjusted according to the loss value. After iterative training, the first codec reconstruction model has learned the features of the facial infrared image. Accordingly, once training is complete, a facial color image can be input into the trained first codec reconstruction model, which outputs the reconstructed facial infrared image corresponding to that facial color image.
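As a small illustration of the L1 supervision mentioned above, the following sketch computes an L1 loss between an output feature map and the matched infrared sample image. The function name `l1_loss` is hypothetical, and averaging over pixels (rather than summing) is an assumption; the source only states that L1 supervision is used.

```python
def l1_loss(predicted, target):
    """Mean absolute difference between two equally sized 2D maps.

    predicted: output feature map of the reconstruction model (list of rows).
    target: matched facial infrared sample image of the same size.
    """
    if len(predicted) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(predicted, target)
    ):
        raise ValueError("predicted and target must have the same size")
    total = 0.0
    count = 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += abs(p - t)  # per-pixel absolute error
            count += 1
    return total / count
```

The same loss would apply unchanged to the second codec reconstruction model, with the matched depth sample image as the target.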

Correspondingly, the second codec reconstruction model is trained on the same principle; once training is complete, it can reconstruct the corresponding facial depth image from an input facial color image.

In an embodiment of the present disclosure, the facial color image is input into the trained first codec reconstruction model and the trained second codec reconstruction model, respectively, to obtain a reconstructed facial infrared image and a reconstructed facial depth image, respectively.

S103: Input the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into the pre-trained multimodal detection network model to obtain a biometric detection result.

In an embodiment of the present disclosure, the multimodal detection network model is trained on at least one of a plurality of living sample image sets and a plurality of non-living sample image sets. Each living sample image set includes a living facial color image, a living facial infrared image, and a living facial depth image matched with one another; each non-living sample image set includes a non-living facial color image, a non-living facial infrared image, and a non-living facial depth image matched with one another.

In an embodiment of the present disclosure, the multimodal detection network model may be pre-trained; multimodal detection is detection based on multimodal features.

The biometric detection result is a binary classification result, so a large number of positive and negative samples can be collected and used to train the network model. A positive sample is a living sample image set, which specifically includes a living facial color image, a living facial infrared image, and a living facial depth image matched with one another. That is, every image in a living sample image set was obtained by photographing a living face. For the meaning of "matched with one another", refer to the description above.

A negative sample is a non-living sample image set, which specifically includes a non-living facial color image, a non-living facial infrared image, and a non-living facial depth image matched with one another. That is, none of the images in a non-living sample image set was obtained by photographing a living face; they are obtained, for example, by photographing a photo or the screen of an electronic device.

As an example, a multi-camera device including an RGB camera, an NIR camera, and a depth camera is used to photograph the facial region in a photo simultaneously, yielding a non-living facial color image, a non-living facial infrared image, and a non-living facial depth image, respectively.

The label of a positive sample is "living" and the label of a negative sample is "non-living"; the multimodal detection network model can be trained on the positive samples, the negative samples, and their corresponding labels. Specifically, a positive or negative sample is input into a deep learning neural network model to obtain an output result, a loss value is calculated from the output result and the ground-truth label, and the model parameters of the deep learning neural network model are adjusted based on the loss value. Training is complete when the loss value reaches a preset threshold or the number of iterations reaches a preset number. The trained deep learning neural network is the multimodal detection network model.
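The two stopping conditions described above can be sketched as a small training driver. This is only an illustration under assumptions: `train_until_converged` and `step_fn` are hypothetical names, `step_fn` stands for one parameter update that returns the current loss, and "reaches a preset threshold" is read here as the loss falling to or below that threshold.

```python
def train_until_converged(step_fn, loss_threshold, max_iterations):
    """Run training steps until the loss is small enough or the budget is used up.

    step_fn: callable performing one parameter update and returning the loss.
    Returns (iterations_run, last_loss).
    """
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        loss = step_fn()
        # Stop early once the loss has reached the preset threshold.
        if loss <= loss_threshold:
            return iteration, loss
    # Otherwise stop because the preset number of iterations was reached.
    return max_iterations, loss
```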

In an embodiment of the present disclosure, the multimodal detection network model may include a convolutional layer, an attention mechanism module, a global average pooling layer, and a fully connected layer. The convolutional layer includes a first sub-convolutional layer, a second sub-convolutional layer, and a third sub-convolutional layer arranged in parallel. Correspondingly, inputting the facial color image, the reconstructed facial infrared image, and the reconstructed facial depth image into the pre-trained multimodal detection network model specifically means inputting them into the first, second, and third sub-convolutional layers of the multimodal detection network model, respectively.

For each sub-convolutional layer, an appropriate neural network structure, number of layers, and number of output feature maps can be selected.

As an example, MobileNet is used as the neural network structure of the sub-convolutional layers. The last layer of the first sub-convolutional layer, which extracts color image features, outputs 256 feature maps; the last layer of the second sub-convolutional layer, which extracts infrared image features, outputs 128 feature maps; and the last layer of the third sub-convolutional layer, which extracts depth image features, outputs 128 feature maps. Next, the feature maps of the three sub-convolutional layers are concatenated to obtain 512 feature maps (256 + 128 + 128), after which an SE (Squeeze-and-Excitation) attention module, a global average pooling layer, and a fully connected layer are connected in sequence.
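The fusion head in this example (concatenation, SE attention, global average pooling, fully connected layer) can be sketched in plain Python. This is an illustrative sketch, not the disclosed network: the function and parameter names are hypothetical, the weights are supplied by the caller, the two-layer excitation follows the usual Squeeze-and-Excitation form, and the final sigmoid that turns the fully connected output into a score is an assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_mean(channel):
    """Global average of one H x W feature map."""
    return sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))


def fuse_and_classify(rgb_maps, ir_maps, depth_maps,
                      w_reduce, w_expand, fc_weights, fc_bias):
    """Return (number of fused channels, liveness score in (0, 1)).

    Each *_maps argument is a list of H x W feature maps (one per channel).
    w_reduce: r x C matrix, w_expand: C x r matrix, fc_weights: length C.
    """
    # Concatenate along the channel axis, e.g. 256 + 128 + 128 = 512.
    channels = rgb_maps + ir_maps + depth_maps
    # Squeeze: one scalar descriptor per channel.
    squeezed = [channel_mean(ch) for ch in channels]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Reweight every channel by its gate, then apply global average pooling.
    pooled = [
        channel_mean([[g * v for v in row] for row in ch])
        for g, ch in zip(gates, channels)
    ]
    # Fully connected layer producing a single score (sigmoid is assumed here).
    logit = sum(w * p for w, p in zip(fc_weights, pooled)) + fc_bias
    return len(channels), sigmoid(logit)
```

The gates let the network weight the color, infrared, and depth channels differently per input, which is one way to read the "fusion through the attention module" described in the text.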

In an embodiment of the present disclosure, the facial color image, the facial infrared image, and the facial depth image can be understood as three input data streams. A multimodal detection network model with three input data streams can thus extract multimodal features and fuse them through the attention module to obtain the final facial biometric detection result.

Thus, in an embodiment of the present disclosure, two codec reconstruction models are trained on sample image sets: the first codec reconstruction model learns the image features of the facial infrared image that correspond to the image features of the facial color image, and the second codec reconstruction model learns the image features of the facial depth image that correspond to the image features of the facial color image. A facial infrared image and a facial depth image are then reconstructed from the facial color image to be detected, and the facial color image together with the reconstructed facial infrared image and facial depth image is input into the multimodal network model, which fuses the color, infrared, and depth image features of the face. Compared with facial biometric detection based on color images alone, this reduces sensitivity to lighting, so detection accuracy is greatly improved. It also improves the generalization ability of the network, so the defense against planar attacks such as photos and videos is improved, as is the defense against unknown attack samples.

In addition, because multimodal feature information is beneficial to model learning, the model convergence speed is significantly improved.

In an embodiment of the present disclosure, after step S101 and before step S102, the method may further include: performing facial keypoint detection on the facial color image, performing facial image correction based on the facial keypoint detection result, and performing normalization on the corrected image.

Specifically, after the facial color image is acquired, facial region detection may be performed first to obtain the approximate location of the face. For example, the facial color image is input into a facial region detection model to obtain the region in which the face is located.

Next, a facial keypoint detection model is applied to the region in which the face is located to obtain the coordinates of the facial keypoints. The facial keypoints are predefined positions, for example the left side of the nose, below the nostrils, the pupils, and below the lips.

As an example, if 72 facial keypoints are defined, the facial keypoint detection model may output 72 coordinates: …

After the facial keypoints are obtained, facial image correction can be performed based on the facial keypoint coordinates. Facial image correction, also called face alignment, can be implemented through an affine transformation. Specifically, the affine matrices R and T of the affine transformation are calculated from the detected facial keypoints and preset virtual frontal-face keypoints, the affine matrices are then used to map the facial image to a frontal view, and the affine-transformed facial region is cropped. In other words, facial image correction maps a facial image captured at a skewed angle to a facial image at the correct, frontal angle.
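One common way to obtain such an affine transformation from keypoint correspondences is a least-squares fit. The sketch below is an assumption about how R and T could be computed, since the source does not specify the estimation method; `solve3`, `estimate_affine`, and `apply_affine` are hypothetical helper names, and R is taken to be the 2x2 linear part and T the translation, so that a template point is approximately R times the detected point plus T.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("keypoints are degenerate (for example, collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def estimate_affine(detected, template):
    """Least-squares affine map sending detected keypoints to template keypoints.

    Returns (R, T): R is a 2x2 matrix, T a length-2 translation.
    """
    rows = [(x, y, 1.0) for x, y in detected]
    # Normal equations A^T A p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * t[0] for r, t in zip(rows, template)) for i in range(3)]
    aty = [sum(r[i] * t[1] for r, t in zip(rows, template)) for i in range(3)]
    a, b, tx = solve3(ata, atx)
    c, d, ty = solve3(ata, aty)
    return [[a, b], [c, d]], [tx, ty]


def apply_affine(R, T, point):
    """Map one (x, y) point with the estimated transformation."""
    x, y = point
    return (R[0][0] * x + R[0][1] * y + T[0], R[1][0] * x + R[1][1] * y + T[1])
```

With 72 keypoints the system is overdetermined, which is why a least-squares fit is used here rather than an exact solve from three points; warping and cropping the image itself would be done by an image library and is not shown.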

In an embodiment of the present disclosure, to improve the robustness of facial biometric detection, normalization may be performed on the corrected image. Specifically, each pixel of the corrected facial image is normalized by subtracting 128 from its pixel value and dividing the result by 256, so that every pixel value lies within [-0.5, 0.5].
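The normalization rule above is simple enough to state directly in code. The sketch assumes 8-bit pixel values in the range 0 to 255 and a single-channel image stored as a list of rows; the function name `normalize_pixels` is hypothetical.

```python
def normalize_pixels(image):
    """Map 8-bit pixel values to [-0.5, 0.5] via (p - 128) / 256."""
    return [[(pixel - 128) / 256 for pixel in row] for row in image]
```

For example, pixel values 0, 128, and 255 map to -0.5, 0.0, and 0.49609375, so the upper bound 0.5 is approached but not reached for 8-bit input.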

Thus, in an embodiment of the present disclosure, facial region detection, facial keypoint detection, facial image correction, and normalization are performed on the facial color image before it is used as the input to the first codec reconstruction model, the second codec reconstruction model, and the multimodal detection network model, which can further improve the accuracy of facial biometric detection.

In an embodiment of the present disclosure, the facial color sample image, the facial infrared sample image, and the facial deep sample image in a sample image set may also be obtained through facial region detection, facial image correction, and normalization.

Specifically, referring to FIG. 2, FIG. 2 is a schematic flowchart of acquiring facial sample images provided in an embodiment of the present disclosure. As shown in FIG. 2, the facial color sample image, the facial infrared sample image, and the facial deep sample image may be acquired in the following manner.

S201, acquire an initial facial color image, an initial facial infrared image, and an initial facial deep image that are registered with each other.

For the meaning of being registered with each other, refer to the description above.

As an example, a multi-camera setup including an RGB camera, an NIR camera, and a deep (depth) camera is used to photograph a living face simultaneously, obtaining the initial facial color image, the initial facial infrared image, and the initial facial deep image, respectively.

S202, perform facial keypoint detection on the initial facial color image, perform facial image correction based on the facial keypoint detection result, and perform normalization on the corrected image to obtain the facial color sample image.

For the facial keypoint detection, facial image correction, and normalization processes, refer to the description above; they are not repeated here.

S203, based on the facial keypoint detection result of the initial facial color image, perform facial image correction on the initial facial infrared image and the initial facial deep image, respectively, and perform normalization on each corrected image to obtain the facial infrared sample image and the facial deep sample image.

In an embodiment of the present disclosure, the initial facial color image, the initial facial infrared image, and the initial facial deep image that are registered with each other have the same image size and the same number of pixels, and their pixels correspond one to one. Facial image correction can therefore be performed directly on the initial facial infrared image and the initial facial deep image according to the facial keypoint detection result of the initial facial color image. That is, the facial keypoint detection result of the initial facial color image can also serve as the facial keypoint detection result of the initial facial infrared image and the initial facial deep image, so the same affine matrix can be used to apply the affine transformation to all three registered images and thereby implement facial image correction.
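The reuse of a single affine matrix across the three registered images can be illustrated with the sketch below. It assumes single-channel images stored as nested lists, an inverse matrix that maps output pixel coordinates back to source coordinates, and nearest-neighbour sampling; none of these details come from the disclosure, and the names `warp_affine` and `align_registered` are hypothetical.

```python
def warp_affine(image, inv_matrix, out_h, out_w, fill=0):
    """Nearest-neighbour warp; inv_matrix maps output (x, y) to source (x, y)."""
    (a, b, c), (d, e, f) = inv_matrix
    src_h, src_w = len(image), len(image[0])
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            sx = round(a * x + b * y + c)
            sy = round(d * x + e * y + f)
            inside = 0 <= sx < src_w and 0 <= sy < src_h
            row.append(image[sy][sx] if inside else fill)  # fill outside the source
        out.append(row)
    return out


def align_registered(color, infrared, deep, inv_matrix, out_h, out_w):
    """Apply one matrix, derived from the color image only, to all three images."""
    return tuple(
        warp_affine(img, inv_matrix, out_h, out_w)
        for img in (color, infrared, deep)
    )
```

Because the keypoints are detected only once, on the color image, the pixel correspondence between the three modalities is preserved after alignment.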

After the facial color sample image, the facial infrared sample image, and the facial deep sample image are obtained, random data augmentation may be applied to the images, for example random cropping, flipping, contrast adjustment, and brightness adjustment, so as to obtain more sample images, train a better model, and improve the generalization ability of the model.
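A minimal sketch of random cropping and flipping is given below. It assumes that the same randomly chosen crop window and flip decision are shared by all registered images so that they stay pixel-aligned; the disclosure does not state this, and the function name `augment_registered` and its parameters are hypothetical. Contrast and brightness adjustment are omitted.

```python
import random


def augment_registered(images, rng, crop):
    """Apply one random square crop and one random horizontal flip to every image."""
    height, width = len(images[0]), len(images[0][0])
    top = rng.randint(0, height - crop)    # shared crop offset (rows)
    left = rng.randint(0, width - crop)    # shared crop offset (columns)
    flip = rng.random() < 0.5              # shared flip decision
    augmented = []
    for img in images:
        patch = [row[left:left + crop] for row in img[top:top + crop]]
        if flip:
            patch = [row[::-1] for row in patch]
        augmented.append(patch)
    return augmented
```

Passing an explicit `random.Random` instance as `rng` keeps the augmentation reproducible when a fixed seed is used.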

Thus, in an embodiment of the present disclosure, facial region detection, facial keypoint detection, facial image correction, and normalization are performed sequentially on the initial facial images before they are used as training samples, which makes it easier for the models to extract effective image features and further improves the accuracy of facial biometric detection.

To aid understanding, the facial biometric detection method provided in the embodiments of the present disclosure is further described below with reference to FIG. 3. FIG. 3 is a schematic diagram of the facial biometric detection method provided in an embodiment of the present disclosure.

As shown in FIG. 3, facial region detection, facial image correction, and image preprocessing are performed sequentially on the facial color image to be detected. The image preprocessing may be normalization. The preprocessed facial color image is input to the first codec reconstruction model and the second codec reconstruction model, respectively, to obtain a facial reconstruction infrared image and a facial reconstruction deep image, respectively. Next, the preprocessed facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image are input to the respective MobileNet convolutional layers of the multimodal detection network model and then pass sequentially through the SE attention mechanism module, the global average pooling layer, and the fully connected layer to obtain the facial biometric detection result.
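The data flow of FIG. 3 can be summarized by the following sketch, in which the two reconstruction models and the multimodal detection network are passed in as plain callables. The function name `detect_liveness` and its parameters are hypothetical, and the trained models themselves are outside the scope of the example.

```python
def detect_liveness(color_image, ir_model, deep_model, detector,
                    preprocess=lambda image: image):
    """Run the inference flow: preprocess, reconstruct two modalities, detect."""
    x = preprocess(color_image)           # e.g. alignment followed by normalization
    reconstructed_ir = ir_model(x)        # first codec reconstruction model
    reconstructed_deep = deep_model(x)    # second codec reconstruction model
    # The detector receives all three modalities, but only the color image was captured.
    return detector(x, reconstructed_ir, reconstructed_deep)
```

Only a color camera is needed at inference time in this flow, since the infrared and deep inputs are reconstructed from the color image.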

In addition, the convergence speed of network training can be increased, and the generalization performance and precision of the facial biometric detection algorithm in real-world scenarios can be improved. The resulting improvement in facial biometric detection performance helps improve the effect and user experience of the many applications built on facial biometric detection technology and facilitates the further promotion of business projects.

Referring to FIG. 4, FIG. 4 is a block diagram of a device implementing the facial biometric detection method of an embodiment of the present disclosure. As shown in FIG. 4, the device may include:

an acquisition module 401 configured to acquire a facial color image to be detected;

a reconstruction module 402 configured to input the facial color image to a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model, respectively, to obtain a facial reconstruction infrared image and a facial reconstruction deep image, respectively; and

a detection module 403 configured to input the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to a pre-trained multimodal detection network model to obtain a biometric detection result.

In an embodiment of the present disclosure, the first codec reconstruction model is trained on a plurality of first sample image sets, each first sample image set including a facial color sample image and a facial infrared sample image that are registered with each other; the second codec reconstruction model is trained on a plurality of second sample image sets, each second sample image set including a facial color sample image and a facial deep sample image that are registered with each other; and the multimodal detection network model is trained on at least one of a plurality of living sample image sets and a plurality of non-living sample image sets. Each living sample image set includes a living facial color image, a living facial infrared image, and a living facial deep image that are registered with each other; each non-living sample image set includes a non-living facial color image, a non-living facial infrared image, and a non-living facial deep image that are registered with each other.

In an embodiment of the present disclosure, on the basis of the device shown in FIG. 4, the device may further include:

a preprocessing module configured to, before the facial color image is input to the pre-trained first codec reconstruction model and second codec reconstruction model, respectively, perform facial keypoint detection on the facial color image, perform facial image correction based on the facial keypoint detection result, and perform normalization on the corrected image.

The device may further include a sample image acquisition module for acquiring the facial color sample image, the facial infrared sample image, and the facial deep sample image, wherein the sample image acquisition module is configured to:

acquire an initial facial color image, an initial facial infrared image, and an initial facial deep image that are registered with each other;

perform facial keypoint detection on the initial facial color image, perform facial image correction based on the facial keypoint detection result, and perform normalization on the corrected image to obtain the facial color sample image; and

perform, based on the facial keypoint detection result of the initial facial color image, facial image correction on the initial facial infrared image and the initial facial deep image, respectively, and perform normalization on each corrected image to obtain the facial infrared sample image and the facial deep sample image.

In an embodiment of the present disclosure, the multimodal detection network model includes:

a convolutional layer, an attention mechanism module, a global average pooling layer, and a fully connected layer, wherein the convolutional layer includes a first sub-convolutional layer, a second sub-convolutional layer, and a third sub-convolutional layer arranged in parallel.

The detection module may specifically be used to input the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to the first sub-convolutional layer, the second sub-convolutional layer, and the third sub-convolutional layer of the multimodal detection network model, respectively.
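The stages that follow the three parallel sub-convolutional layers can be sketched as below. The sketch assumes that the three branch outputs are concatenated along the channel axis, that the attention module is a squeeze-and-excitation block with one hidden dense layer, and that the fully connected layer produces a single sigmoid score; these are illustrative assumptions rather than details stated in the disclosure. All names and weight shapes are hypothetical, and the convolutional branches themselves are omitted.

```python
import math


def channel_means(channels):
    """Global average over the spatial positions of each channel."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]


def dense(weights, bias, x):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def se_attention(channels, w1, b1, w2, b2):
    """Squeeze-and-excitation: rescale each channel by a learned gate in (0, 1)."""
    squeezed = channel_means(channels)
    hidden = [max(0.0, h) for h in dense(w1, b1, squeezed)]   # ReLU
    gates = [sigmoid(g) for g in dense(w2, b2, hidden)]
    return [[[g * p for p in row] for row in ch] for ch, g in zip(channels, gates)]


def fusion_head(color_feat, ir_feat, deep_feat, se_params, fc_w, fc_b):
    """Concatenate branch features, apply attention, pool, and score."""
    fused = color_feat + ir_feat + deep_feat      # channel-wise concatenation (assumed)
    attended = se_attention(fused, *se_params)
    pooled = channel_means(attended)              # global average pooling
    logit = dense([fc_w], [fc_b], pooled)[0]      # fully connected layer
    return sigmoid(logit)                         # score in (0, 1)
```

Each feature argument is a list of channels, and each channel is a two-dimensional list of activations produced by the corresponding sub-convolutional layer.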

In the facial biometric detection device provided in the above embodiments, two codec reconstruction models are trained on sample image sets: the first codec reconstruction model learns the image features of the facial infrared image that correspond to the image features of the facial color image, and the second codec reconstruction model learns the image features of the facial deep image that correspond to the image features of the facial color image. A facial infrared image and a facial deep image are therefore reconstructed from the facial color image to be detected, and the facial color image, the reconstructed facial infrared image, and the reconstructed facial deep image are input to the multimodal network model, which fuses the color, infrared, and deep image features of the face. Compared with facial biometric detection based on color images alone, sensitivity to lighting is reduced, so detection accuracy is greatly improved; and because the generalization ability of the network is improved, the defense against planar attacks such as photos and videos is improved, as is the defense against unknown attack samples.

In the technical solutions of the present disclosure, the acquisition, storage, and use of the user personal information involved all comply with relevant laws and regulations and do not violate public order and good morals.

According to embodiments of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program.

The present disclosure provides an electronic device, including:

at least one processor; and

The present disclosure provides a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the facial biometric detection method.

The present disclosure provides a computer program stored in a computer-readable storage medium, wherein the facial biometric detection method is performed when instructions in the computer program are executed by a processor.

FIG. 5 is a schematic block diagram of an exemplary electronic device 500 for implementing embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

As shown in FIG. 5, the device 500 includes a computing unit 501, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

A plurality of components of the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or a mouse; an output unit 507, such as various types of displays and speakers; a storage unit 508, such as a magnetic disk or an optical disk; and a communication unit 509, such as a network card, a modem, or a wireless communication transceiver. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.

The computing unit 501 may be any of various general-purpose and/or dedicated processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 performs the methods and processing described above, such as the facial biometric detection method. For example, in some embodiments, the facial biometric detection method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the facial biometric detection method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured in any other suitable manner (for example, by means of firmware) to perform the facial biometric detection method.

Various implementations of the systems and techniques described herein may be realized in at least one of a digital electronic circuit system, an integrated circuit system, a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), an application-specific standard product (ASSP), a system on chip (SOC), a complex programmable logic device (CPLD), computer hardware, firmware, software, and combinations thereof. These implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a dedicated or general-purpose programmable processor that receives data and instructions from a storage system, at least one input device, and at least one output device, and transmits data and instructions to the storage system, the at least one input device, and the at least one output device.

The program code used to implement the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a dedicated computer, or another programmable data processing device, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented. The program code may be executed entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include, but are not limited to, an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other types of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic, voice, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), a computing system that includes a middleware component (for example, an application server), a computing system that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include a client and a server. The client and the server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud computing server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that steps may be reordered, added, or deleted in the various forms of flows described above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the result intended by the technical solutions disclosed herein can be achieved; no limitation is imposed in this respect.

The specific implementations described above do not limit the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims

A facial biometric detection method, comprising:
acquiring a facial color image to be detected;
inputting the facial color image to a pre-trained first codec reconstruction model and a pre-trained second codec reconstruction model, respectively, to obtain a facial reconstruction infrared image and a facial reconstruction deep image, respectively; and
inputting the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to a pre-trained multimodal detection network model to obtain a biometric detection result.

The method according to claim 1, wherein
the first codec reconstruction model is trained on a plurality of first sample image sets, each first sample image set including a facial color sample image and a facial infrared sample image that are registered with each other; the second codec reconstruction model is trained on a plurality of second sample image sets, each second sample image set including a facial color sample image and a facial deep sample image that are registered with each other; and
the multimodal detection network model is trained on at least one of a plurality of living sample image sets and a plurality of non-living sample image sets, each living sample image set including a living facial color image, a living facial infrared image, and a living facial deep image that are registered with each other, and each non-living sample image set including a non-living facial color image, a non-living facial infrared image, and a non-living facial deep image that are registered with each other.

According to claim 1,
Before inputting the facial color image to a pre-trained first codec reconstruction model and a second codec reconstruction model, respectively,
Performing facial keypoint detection on the facial color image, performing facial image correction based on the facial keypoint detection result, and performing normalization processing on the corrected image, further comprising:
Facial biometric detection method, characterized in that.
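
The correction and normalization steps can be illustrated at the level of keypoints and pixel values. The sketch below makes several assumptions: correction is taken to mean rotating the face so that the line between the two eye keypoints becomes horizontal, and normalization is taken to mean mapping 8-bit values with (v - 128) / 256. Neither choice is stated in the claim, and the function names are hypothetical. Warping the actual image is omitted.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def eye_alignment_angle(left_eye: Point, right_eye: Point) -> float:
    """Angle in radians by which the eye line deviates from horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)


def rotate_point(p: Point, center: Point, angle: float) -> Point:
    """Rotate p about center by -angle, which levels the eye line."""
    c, s = math.cos(-angle), math.sin(-angle)
    x, y = p[0] - center[0], p[1] - center[1]
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)


def normalize_pixels(pixels: List[int]) -> List[float]:
    """Map 8-bit values into [-0.5, 0.5) (assumed convention)."""
    return [(v - 128) / 256 for v in pixels]


left, right = (30.0, 40.0), (70.0, 60.0)
center = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
angle = eye_alignment_angle(left, right)
left_aligned = rotate_point(left, center, angle)
right_aligned = rotate_point(right, center, angle)
normalized = normalize_pixels([0, 128, 255])
```

After the rotation both eye keypoints share the same vertical coordinate, which is the property an alignment step is meant to establish before the image reaches the reconstruction models.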

According to claim 2,
acquiring an initial facial color image, an initial facial infrared image, and an initial facial deep image that are registered with each other;
performing facial keypoint detection on the initial facial color image, performing facial image correction based on the facial keypoint detection result, and performing normalization processing on the corrected image to obtain the facial color sample image;
performing facial image correction on the initial facial infrared image and the initial facial deep image, respectively, based on the facial keypoint detection result of the initial facial color image, and performing normalization processing on each corrected image to obtain the facial infrared sample image and the facial deep sample image; the facial color sample image, the facial infrared sample image, and the facial deep sample image being acquired through the foregoing steps,
Facial biometric detection method, characterized in that.
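
The point of this claim is that keypoints are detected once, on the color image, and the resulting correction is reused for the registered infrared and deep images. The sketch below illustrates that reuse under simplifying assumptions: a crop around the keypoints stands in for the full correction, the (v - 128) / 256 normalization is an assumed convention, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # top, left, bottom, right (exclusive ends)


def box_from_keypoints(
    keypoints: Sequence[Tuple[int, int]], margin: int, height: int, width: int
) -> Box:
    """Bounding box of (x, y) keypoints, padded by margin and clipped."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (
        max(min(ys) - margin, 0),
        max(min(xs) - margin, 0),
        min(max(ys) + margin + 1, height),
        min(max(xs) + margin + 1, width),
    )


def crop(img: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in img[top:bottom]]


def normalize(img: Image) -> List[List[float]]:
    return [[(v - 128) / 256 for v in row] for row in img]


def build_sample(
    color: Image, infrared: Image, deep: Image,
    color_keypoints: Sequence[Tuple[int, int]], margin: int = 0,
) -> Dict[str, List[List[float]]]:
    """Apply one box, derived from the color keypoints, to all three images."""
    box = box_from_keypoints(color_keypoints, margin, len(color), len(color[0]))
    modalities = {"color": color, "infrared": infrared, "deep": deep}
    return {name: normalize(crop(img, box)) for name, img in modalities.items()}


color = [[10 * (r * 5 + c) for c in range(5)] for r in range(5)]
infrared = [[r + c for c in range(5)] for r in range(5)]
deep = [[255 - 10 * r for _ in range(5)] for r in range(5)]
sample = build_sample(color, infrared, deep, [(1, 1), (3, 2)])
```

Because the three initial images are registered with each other, a box computed from one of them selects the same facial region in all three.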

According to claim 1,
the multimodal detection network model comprises a convolutional layer, an attention mechanism module, a global average pooling layer, and a fully connected layer, wherein the convolutional layer includes a first sub-convolutional layer, a second sub-convolutional layer, and a third sub-convolutional layer arranged in parallel;
the step of inputting the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to the pre-trained multimodal detection network model comprises:
inputting the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to the first sub-convolutional layer, the second sub-convolutional layer, and the third sub-convolutional layer of the multimodal detection network model, respectively,
Facial biometric detection method, characterized in that.
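
The layer order in this claim (three parallel sub-convolutional layers, an attention module, global average pooling, a fully connected layer) can be traced with a toy forward pass. This is a sketch under heavy assumptions, not the claimed network: each branch is reduced to a single-channel 1x1 convolution with ReLU, the attention module is a simple per-channel sigmoid gate, the output is squashed with a sigmoid, and all parameters are hand-picked rather than trained.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sub_conv(img: Image, weight: float, bias: float) -> Image:
    """Toy branch: 1x1 convolution with one output channel, then ReLU."""
    return [[max(weight * p + bias, 0.0) for p in row] for row in img]


def global_average_pool(fmap: Image) -> float:
    return sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))


def channel_attention(fmaps: List[Image]) -> List[Image]:
    """Rescale each channel by the sigmoid of its mean (assumed gating)."""
    gates = [sigmoid(global_average_pool(f)) for f in fmaps]
    return [[[g * p for p in row] for row in f] for g, f in zip(gates, fmaps)]


def multimodal_forward(
    color: Image, infrared: Image, deep: Image,
    branch_params: Sequence[Tuple[float, float]],
    fc_weights: Sequence[float], fc_bias: float,
) -> float:
    """Parallel branches, then attention, pooling, and a fully connected layer."""
    inputs = (color, infrared, deep)
    branches = [sub_conv(img, w, b) for img, (w, b) in zip(inputs, branch_params)]
    attended = channel_attention(branches)
    pooled = [global_average_pool(f) for f in attended]
    logit = sum(w * v for w, v in zip(fc_weights, pooled)) + fc_bias
    return sigmoid(logit)


zeros = [[0.0, 0.0], [0.0, 0.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]
params = [(1.0, 0.0)] * 3
neutral_score = multimodal_forward(zeros, zeros, zeros, params, [1.0, 1.0, 1.0], 0.0)
active_score = multimodal_forward(ones, ones, ones, params, [1.0, 1.0, 1.0], 0.0)
```

Each modality gets its own branch before the features are combined, so the three inputs are never mixed at the pixel level.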

In the facial biometric detection device,
an acquisition module for acquiring a facial color image to be detected;
a reconstruction module for respectively inputting the facial color image to a pre-trained first codec reconstruction model and a second codec reconstruction model to obtain a facial reconstruction infrared image and a facial reconstruction deep image, respectively; and
a detection module for obtaining a biometric detection result by inputting the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to a pre-trained multimodal detection network model; including the foregoing modules,
Facial biometric detection device, characterized in that.

According to claim 6,
the first codec reconstruction model is trained according to a plurality of first sample image sets, each first sample image set including a facial color sample image and a facial infrared sample image matched with each other; the second codec reconstruction model is trained according to a plurality of second sample image sets, each second sample image set including a facial color sample image and a facial deep sample image matched with each other;
the multimodal detection network model is trained according to at least one of a plurality of living-body sample image sets and a plurality of non-living-body sample image sets, wherein each living-body sample image set includes a living-body facial color image, a living-body facial infrared image, and a living-body facial deep image that are matched with each other, and each non-living-body sample image set includes a non-living-body facial color image, a non-living-body facial infrared image, and a non-living-body facial deep image that are matched with each other,
Facial biometric detection device, characterized in that.

According to claim 6,
further comprising a pre-processing module for, before the facial color image is input to the pre-trained first codec reconstruction model and second codec reconstruction model, respectively, performing facial keypoint detection on the facial color image, performing facial image correction based on the facial keypoint detection result, and performing normalization processing on the corrected image,
Facial biometric detection device, characterized in that.

According to claim 7 or 8,
acquiring an initial facial color image, an initial facial infrared image, and an initial facial deep image that are registered with each other;
performing facial keypoint detection on the initial facial color image, performing facial image correction based on the facial keypoint detection result, and performing normalization processing on the corrected image to obtain the facial color sample image;
performing facial image correction on the initial facial infrared image and the initial facial deep image, respectively, based on the facial keypoint detection result of the initial facial color image, and performing normalization processing on each corrected image to obtain the facial infrared sample image and the facial deep sample image; further comprising a sample image acquisition module for acquiring the facial color sample image, the facial infrared sample image, and the facial deep sample image through the foregoing steps,
Facial biometric detection device, characterized in that.

According to claim 6,
the multimodal detection network model comprises a convolutional layer, an attention mechanism module, a global average pooling layer, and a fully connected layer, wherein the convolutional layer includes a first sub-convolutional layer, a second sub-convolutional layer, and a third sub-convolutional layer arranged in parallel;
the detection module is used to input the facial color image, the facial reconstruction infrared image, and the facial reconstruction deep image to the first sub-convolutional layer, the second sub-convolutional layer, and the third sub-convolutional layer of the multimodal detection network model, respectively,
Facial biometric detection device, characterized in that.

In an electronic device,
at least one processor; and
a memory communicatively connected to the at least one processor;
wherein instructions executable by the at least one processor are stored in the memory, and when the instructions are executed by the at least one processor, the at least one processor performs the facial biometric detection method according to any one of claims 1 to 5,
Electronic device, characterized in that.

In a non-transitory computer-readable storage medium having computer instructions stored thereon,
the computer instructions are used to cause a computer to perform the facial biometric detection method according to any one of claims 1 to 5,
Non-transitory computer-readable storage medium having computer instructions stored thereon, characterized in that.

In a computer program stored in a computer-readable storage medium,
When the instructions in the computer program are executed by a processor, the facial biometric detection method according to any one of claims 1 to 5 is implemented,
A computer program stored in a computer-readable storage medium, characterized in that.