KR102530240B1

KR102530240B1 - Face fake detection method based on remote photoplethysmography signal and analysis apparatus

Info

Publication number: KR102530240B1
Application number: KR1020220031301A
Authority: KR
Inventors: 남운성; 남용한
Original assignee: (주)씨유박스
Priority date: 2021-11-24
Filing date: 2022-03-14
Publication date: 2023-05-10
Also published as: WO2023096032A1

Abstract

Provided is a non-contact photoplethysmography (PPG)-based facial image forgery determination method, which includes the steps of: an analysis device receiving a facial image; the analysis device detecting a region of interest in the facial image; the analysis device detecting a PPG signal based on a change in skin color in the region of interest; the analysis device detecting the heart rate based on the PPG signal; and the analysis device inputting at least one of the PPG and the heart rate into a pre-trained learning model to calculate a value indicating whether the facial image has been forged or altered.

Description

Non-contact PPG-based face image forgery detection method and analysis device

The technology described below relates to a technique for determining whether a face image has been forged or altered.

Biometric authentication is used in a variety of fields. Face authentication is an image-based authentication method used in devices such as smartphones, laptops, and access control systems. Meanwhile, spoofing attacks on face authentication have become an issue. Representative spoofing attacks use printed photos, replayed videos, 3D masks, and the like.

Korean Patent Registration No. 10-2215557

The technology described below is intended to provide a technique for determining whether a face image has been forged or altered by extracting a photoplethysmography (PPG) signal from the face image.

The non-contact PPG-based face image forgery determination method includes: receiving, by an analysis device, a face image; detecting, by the analysis device, a region of interest in the face image; detecting, by the analysis device, a PPG signal based on a change in skin color in the region of interest; detecting, by the analysis device, a heart rate based on the PPG signal; and calculating, by the analysis device, a probability value indicating whether the face image has been forged or altered by inputting at least one of the PPG signal and the heart rate into a pre-trained learning model.

An analysis device that classifies face image forgery based on non-contact PPG includes: an input device that receives a face image; a storage device that stores a learning model that classifies image forgery based on a PPG signal and a heart rate; and an arithmetic device that calculates a probability value indicating whether the face image has been forged or altered by inputting, into the learning model, the PPG signal and the heart rate detected based on a change in skin color in a region of interest of the face image.

The technology described below can readily determine face image forgery using non-contact PPG-related information extracted from the face image. Because it can determine face image forgery even on a device with limited performance, such as a smartphone, it can be used in biometric authentication applications, the fintech field, and the like.

FIG. 1 is an example of a non-contact PPG-based face image forgery determination system.
FIG. 2 is an example of a flowchart of a face image forgery determination process.
FIG. 3 is an example of a process of extracting a PPG signal and a heart rate from a face image.
FIG. 4 is an example of a process of determining face image forgery using a neural network model.
FIG. 5 is an example of an analysis device that determines face image forgery.

Because the technology described below may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technology to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms, which are used only to distinguish one component from another. For example, without departing from the scope of the technology described below, a first component may be termed a second component, and similarly a second component may be termed a first component. The term "and/or" includes a combination of a plurality of related listed items or any one of a plurality of related listed items.

As used in this specification, singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "comprises" mean that the described features, numbers, steps, operations, components, parts, or combinations thereof are present, and should be understood not to exclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the components in this specification are divided merely according to the main function of each component. That is, two or more components described below may be combined into one component, or one component may be divided into two or more components by more finely divided function. Each component described below may, in addition to its own main function, perform some or all of the functions of other components, and some of the main functions of each component may of course be carried out exclusively by another component.

In addition, in performing a method or an operating method, the steps constituting the method may occur in an order different from the stated order unless the context clearly specifies a particular order. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The technology described below determines whether a face image has been forged or altered based on non-contact PPG.

Non-contact PPG means a PPG signal calculated from video. Non-contact PPG is also called remote PPG (rPPG).

The analysis device analyzes a face image of a subject to determine whether the image has been forged or altered. The analysis device may be implemented as any of various devices capable of image and data processing, for example a PC, a server on a network, a smart device, or a chipset in which a dedicated program is embedded. The analysis device may also be an authentication device that performs face authentication. However, because the technology described below determines face image forgery before a user is authenticated by face, the description focuses on forgery determination.

The analysis device may determine face image forgery using a learning model.

The learning model means a machine learning model. Machine learning models include decision trees, random forests, K-nearest neighbors (KNN), naive Bayes, support vector machines (SVM), and artificial neural networks (ANN).

An ANN is a statistical learning algorithm that mimics biological neural networks. Various neural network models are being studied. Like a general artificial neural network, a deep neural network (DNN) can model complex non-linear relationships. Various types of DNN models have been studied, for example the convolutional neural network (CNN), recurrent neural network (RNN), restricted Boltzmann machine (RBM), deep belief network (DBN), generative adversarial network (GAN), and relation network (RL).

FIG. 1 is an example of a non-contact PPG-based face image forgery determination system.

FIG. 1(A) is an example of a system 100 that determines face image forgery. FIG. 1(A) shows an example in which a single device determines face image forgery using a built-in algorithm or model. In FIG. 1(A), the analysis device may be a terminal device 110 such as a smartphone, a laptop, a kiosk, an access control device (entrance gate), or an automated teller machine (ATM).

The terminal device 110 acquires an image of the face of user A using a camera. The terminal device 110 extracts a face region from the face image and extracts a non-contact PPG signal and related data. The terminal device 110 determines whether the current face image has been forged or altered based on the extracted information. The forgery determination process is described in detail later.

Furthermore, the terminal device 110 may transmit the forgery determination result for the face image to a server that performs face image-based authentication, a server that requires face image forgery information, or the like.

FIG. 1(B) is another example of a system 200 that determines face image forgery. The system 200 includes a terminal device 210 and a server 220. In FIG. 1(B), the analysis device is the server 220.

The terminal device 210 may be a device such as a smartphone, a laptop, a kiosk, an access control device, or an ATM. The terminal device 210 acquires an image of the face of user A using a camera. The terminal device 210 transmits the acquired image to the server 220.

The server 220 extracts a face region from the face image and extracts a non-contact PPG signal and related data. The server 220 determines whether the current face image has been forged or altered based on the extracted information. The server 220 may transmit the forgery determination result (forgery information) to the terminal device 210.

Meanwhile, in some cases the terminal device 210 may perform part of the face image forgery determination process and transmit the result to the server 220. In this case, the server 220 may perform the rest of the process. For example, the part performed by the terminal device 210 may be at least one of image pre-processing, face region detection, PPG signal detection, and heart rate detection.

FIG. 2 is an example of a flowchart of a face image forgery determination process 300.

The analysis device receives a face image (310). The face image may be acquired directly by the analysis device with a camera or received from another object. For PPG signal detection, the face image must be video covering a certain period of time (a plurality of frames). The frames may be two or more consecutive frames, or two or more frames separated by a fixed time interval.

The analysis device detects a face region in the input image (320). The analysis device may use the entire face region. Furthermore, the analysis device may identify a region of interest (ROI) within the face region. The region of interest may be a facial skin region in which changes in blood flow are easy to observe, for example the cheeks, nose, ears, or forehead. Meanwhile, the region of interest itself may be the entire face region, excluding the background and other body parts.

The analysis device may detect the face region or region of interest using conventional image processing techniques. For example, the analysis device may detect the face region or region of interest from landmarks based on image features.

Furthermore, the analysis device may use a segmentation model that detects the face region or region of interest in the input image. The segmentation model may be a deep learning-based model, for example U-net or a fully convolutional network (FCN). The segmentation model must be trained in advance using face images.

Before extracting the PPG signal from the face region or region of interest, the analysis device may detect in advance whether the current image is a normal image or a kind of forgery attempt (330). The attack scenarios detected at this stage include bending attacks, slight shaking, and face occlusion. In other words, these are cases in which the face image itself does not meet the quality or specification required for authentication. The analysis device may detect image features using image processing techniques to check whether the current image shows an attempt such as bending, shaking, or face occlusion. If the analysis device determines that such a forgery attempt is present in the current image (YES at 330), it may terminate authentication (340). Alternatively, the analysis device may raise an alarm that the current input image is abnormal and acquire a face image again.

If there is no physical forgery attempt and the pre-detection step is passed (NO at 330), the analysis device may determine the non-contact PPG and heart rate based on changes in skin color in the face region or region of interest (350). The heart rate can be detected using the PPG signal. The PPG and heart rate can therefore be called PPG-related information.

The analysis device may determine face image forgery by inputting the PPG-related information extracted from the image to a pre-trained learning model (360). The analysis device may determine whether the current face image has been forged or altered based on the probability value output by the learning model.
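
The flow of FIG. 2 can be sketched as the following control skeleton. It is a minimal illustration, not the implementation of the analysis device: the function name `run_forgery_check` and the stage functions (`detect_roi`, `looks_tampered`, `extract_ppg`, `estimate_bpm`, `model`) are hypothetical callables supplied by the caller, and returning `None` stands in for terminating authentication at step 340.

```python
from typing import Callable, Optional, Sequence


def run_forgery_check(
    frames: Sequence,
    detect_roi: Callable,
    looks_tampered: Callable,
    extract_ppg: Callable,
    estimate_bpm: Callable,
    model: Callable,
) -> Optional[float]:
    """Return the model's probability value, or None if pre-detection stops the check.

    All stage functions are hypothetical placeholders for steps 320 to 360.
    """
    # Step 310: PPG detection needs video, i.e. a plurality of frames.
    if len(frames) < 2:
        raise ValueError("at least two frames are required")

    # Step 320: detect the face region or region of interest in every frame.
    rois = [detect_roi(frame) for frame in frames]

    # Steps 330/340: stop early on bending, shaking, or occlusion attempts.
    if looks_tampered(rois):
        return None

    # Step 350: rPPG signal from skin color change, then heart rate from it.
    ppg = extract_ppg(rois)
    bpm = estimate_bpm(ppg)

    # Step 360: the pre-trained model returns a probability value.
    return model(ppg, bpm)
```

Passing the stages in as callables keeps the skeleton independent of whether each stage runs on the terminal device or on the server.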

FIG. 3 is an example of a process 400 of extracting a PPG signal and a heart rate from a face image.

The analysis device detects a face region in the input face image (410).

The analysis device may detect, within the face region, a skin region in which to observe changes in skin color (420). Here, the skin region may be the region of interest described above.

The analysis device detects a non-contact PPG (rPPG) signal using the skin region (430). The PPG signal can be estimated from changes in skin color. That is, the rPPG signal is calculated by analyzing a series of consecutive images. The analysis device may detect non-contact PPG using image processing techniques as follows.

Because blood absorbs more light than the surrounding tissue, more light is absorbed and less light is reflected when blood passes through a blood vessel. The analysis device can therefore detect non-contact PPG based on periodic changes in skin color.

The analysis device may perform skin pixel filtering using a skin color range in YCbCr for the face region, and then obtain the non-contact PPG signal using a chrominance-based method. Skin color change analysis based on RGB images can be distorted by environmental factors. Accordingly, the analysis device may convert RGB frames to YCbCr and perform skin pixel clustering in the Cb-Cr plane.
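
The color conversion and skin pixel filtering can be sketched as follows. This is a simplified illustration under stated assumptions: the conversion uses the full-range ITU-R BT.601 coefficients, the fixed Cb and Cr ranges are a commonly used skin heuristic rather than values given in this document, and the mean of the accepted pixels stands in for the cluster center that the clustering step would produce. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (ITU-R BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Assumed skin ranges in the Cb-Cr plane (a common heuristic, not from the patent).
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)


def skin_center(pixels):
    """Return (mean Cb, mean Cr) over skin-colored pixels, or None if there are none."""
    cbs, crs = [], []
    for r, g, b in pixels:
        _, cb, cr = rgb_to_ycbcr(r, g, b)
        if CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]:
            cbs.append(cb)
            crs.append(cr)
    if not cbs:
        return None
    return sum(cbs) / len(cbs), sum(crs) / len(crs)


def chrominance_trace(frames):
    """One (Cb, Cr) center per frame; each frame is a list of (R, G, B) ROI pixels."""
    return [skin_center(frame) for frame in frames]
```

Tracking the per-frame center over time yields a chrominance signal whose periodic component reflects the pulse; the luma channel Y, which is most affected by lighting changes, is discarded.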

The analysis device may then expand the distribution of fine skin pixels by randomly expanding the Cb and Cr components n times from the cluster center P (Cb center value, Cr center value). The analysis device may detect the heart rate using the expanded Cb and Cr signals (440). The analysis device may apply a window of a fixed size to the PPG signal to determine the number of heartbeats over a given time (beats per minute, bpm). Because this bpm is also obtained in a non-contact manner, it may be called non-contact bpm or rbpm (remote bpm).
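
One way to turn a windowed PPG signal into bpm is to pick the strongest spectral peak in a plausible heart-rate band. The sketch below uses a plain discrete Fourier transform for that purpose; this is a substituted, generic technique, since the document does not specify how the window is evaluated. The band limits (0.7 to 4.0 Hz), the 10-second window, and the function names are assumptions.

```python
import cmath
import math


def estimate_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency of `signal` within [lo_hz, hi_hz], in bpm."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant offset

    best_k, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n
        if not lo_hz <= freq <= hi_hz:
            continue
        # DFT coefficient for frequency bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power

    if best_k is None:
        raise ValueError("window too short for the requested frequency band")
    return 60.0 * best_k * fps / n


def windowed_bpm(signal, fps, window_s=10.0, step_s=1.0):
    """Slide a fixed-size window over the PPG signal and estimate bpm in each one."""
    size = int(window_s * fps)
    step = max(1, int(step_s * fps))
    return [
        estimate_bpm(signal[start:start + size], fps)
        for start in range(0, len(signal) - size + 1, step)
    ]
```

The frequency resolution is `fps / n` Hz per bin, so a 10-second window resolves the heart rate in steps of 6 bpm; a longer window gives finer steps at the cost of latency.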

Meanwhile, to improve signal quality, the analysis device may remove the individual's respiratory trend and remove noise using a Butterworth filter. That is, the analysis device may perform any additional signal processing required in the face image forgery determination process.
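
The filtering step can be sketched with two second-order Butterworth sections: a high-pass section that suppresses the slow respiratory trend and a low-pass section that suppresses high-frequency noise. The document names only the filter type, so the cascade structure, the cutoff frequencies (0.7 Hz and 4.0 Hz), and the function names are assumptions. The coefficients follow the standard bilinear-transform biquad formulas with Q = 1/sqrt(2), which gives a Butterworth response; a library routine such as SciPy's `scipy.signal.butter` could be used instead.

```python
import math


def butterworth_biquad(kind, cutoff_hz, fs):
    """Second-order Butterworth low-pass or high-pass section (bilinear transform)."""
    w0 = 2.0 * math.pi * cutoff_hz / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    if kind == "low":
        b = [(1.0 - cos_w0) / 2.0, 1.0 - cos_w0, (1.0 - cos_w0) / 2.0]
    elif kind == "high":
        b = [(1.0 + cos_w0) / 2.0, -(1.0 + cos_w0), (1.0 + cos_w0) / 2.0]
    else:
        raise ValueError("kind must be 'low' or 'high'")
    a0 = 1.0 + alpha
    # Normalize so that the leading denominator coefficient is 1.
    return [x / a0 for x in b], [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]


def apply_biquad(b, a, samples):
    """Run the difference equation of one biquad section over `samples`."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out


def clean_rppg(signal, fs, lo_hz=0.7, hi_hz=4.0):
    """High-pass to drop the respiratory trend, then low-pass to drop noise."""
    detrended = apply_biquad(*butterworth_biquad("high", lo_hz, fs), signal)
    return apply_biquad(*butterworth_biquad("low", hi_hz, fs), detrended)
```

With these cutoffs a pulse near 1.2 Hz passes almost unchanged, while a constant offset is removed entirely and slow drift well below 0.7 Hz is strongly attenuated.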

FIG. 3 shows an example in which the analysis device detects non-contact PPG (rPPG) and heart rate using traditional image processing techniques. Alternatively, the analysis device may use a deep learning model that extracts the PPG or heart rate based on feature values extracted from the input image. In that case, the analysis device may determine the PPG or heart rate based on the value obtained by inputting the face image to the deep learning model.

The analysis device may then determine whether the current face image has been forged or altered using the PPG-related information. For this purpose the analysis device may use a pre-trained learning model. FIG. 4 is an example of a process 500 of determining face image forgery using a neural network model.

The analysis device extracts, from the face image frames, the input data to be fed to the learning model (550).

Data that the analysis device can extract from the face image includes rPPG and heart rate (bpm). The analysis device may also use raw data, that is, the pixel values of the face image itself or of the region of interest, as input data.

The input data may be a combination of various data. For example, the input data may be any one of (i) rPPG + heart rate, (ii) rPPG + raw data, (iii) heart rate + raw data, and (iv) rPPG + heart rate + raw data.

Meanwhile, the analysis device may convert certain values in the input data into vector values by one-hot encoding. The analysis device may process the various types of input data into a single matrix and input it to the learning model.
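
As an illustration of input type (i), the sketch below one-hot encodes the heart rate and joins it with the rPPG samples into a single matrix. The document does not say which values are one-hot encoded or how the matrix is laid out, so the choice of binning the bpm, the bin edges in `BPM_BINS`, and the row layout are all assumptions made for the example.

```python
# Hypothetical bin edges for heart rate in bpm; not taken from the patent.
BPM_BINS = (40, 60, 80, 100, 120)


def one_hot(index, size):
    """Return a vector of length `size` with a single 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def bpm_to_bin(bpm):
    """Map a heart rate to a bin index in the range 0..len(BPM_BINS)."""
    return sum(1 for edge in BPM_BINS if bpm >= edge)


def build_input_matrix(rppg, bpm):
    """One row per rPPG sample: [sample, one-hot heart-rate code...]."""
    code = one_hot(bpm_to_bin(bpm), len(BPM_BINS) + 1)
    return [[sample] + code for sample in rppg]
```

For example, `build_input_matrix([0.1, -0.2], 72)` places 72 bpm in the 60 to 80 bin (index 2) and yields two rows of seven values each.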

The analysis device may classify face image forgery by inputting the input data to a pre-trained learning model (520). The analysis device may classify face image forgery based on the probability value output by the learning model. For example, the analysis device may determine that the image is forged if the value output by the learning model is close to 0, and that it is an original image if the value is close to 1.
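
The decision rule can be written as a simple threshold on the model output. The document only says "close to 0" and "close to 1", so the cut-off of 0.5 and the function name are assumptions; a deployment would likely tune the threshold to trade false accepts against false rejects.

```python
def classify(probability, threshold=0.5):
    """Map the model output to a label: near 0 is forged, near 1 is original.

    The 0.5 threshold is an assumed default, not a value from the patent.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return "original" if probability >= threshold else "forged"
```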

The learning model may be a deep learning model that receives images and vectors and calculates a classification value.

In FIG. 4, (A) at the bottom illustrates a CNN-based model. The CNN-based model may extract features from the input data and output a forgery probability value based on the feature values. Here, the input data may be at least one of the PPG, heart rate, and raw data measured at a specific point in time.

In FIG. 4, (B) at the bottom is an example of a deep learning model that processes time-series data. For example, the deep learning model may be an RNN-based model. In this case, the deep learning model may receive and process input data that is continuous over time and finally output a forgery probability value.
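
The time-series case can be illustrated with the forward pass of a minimal recurrent network. This sketch assumes a single-layer Elman RNN with a tanh hidden state and a sigmoid output; the document does not specify the architecture, and the weights passed in are placeholders that would come from training.

```python
import math


def rnn_probability(sequence, w_in, w_rec, b_h, w_out, b_out):
    """Forward pass of a single-layer Elman RNN ending in a sigmoid output.

    sequence: list of feature vectors, one per time step (e.g. [rPPG sample, bpm]).
    w_in[j][i], w_rec[j][k], b_h[j]: hidden-layer weights (hypothetical, pre-trained).
    w_out[j], b_out: output-layer weights (hypothetical, pre-trained).
    """
    hidden = [0.0] * len(b_h)
    for x in sequence:
        # The new hidden state depends on the current input and the previous state.
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + b_h[j]
            )
            for j in range(len(b_h))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # probability value in (0, 1)
```

Only the hidden state after the last time step feeds the output, which matches the description of processing the sequence and outputting a single probability value at the end.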

The learning model or deep learning model of FIG. 4 must be trained in advance. The training data consists of input data extracted from videos and the label of each video (forged or not). Training corresponds to the process of optimizing the parameters of the model's layers.
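
The idea of optimizing parameters against labeled data can be shown on the smallest possible model. The sketch below trains a logistic regression classifier by batch gradient descent; it is a deliberately simplified stand-in for the deep learning model, and the feature values, the label convention (1 for original, 0 for forged, matching the output convention above), the learning rate, and the epoch count are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability value that the sample is an original (label 1)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(samples, labels, epochs=2000, lr=1.0):
    """Fit weights and bias by batch gradient descent on the cross-entropy loss."""
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y  # derivative of the loss w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias
```

A deep model is trained on the same principle, with the gradient propagated back through every layer rather than through a single linear unit.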

The researcher built a deep learning model that takes rPPG, heart rate, and raw data extracted from face images as input data. The researcher built the model using AlexNet. The researcher trained the model and then verified whether it could determine, in real time, whether an input video captured with a mobile camera showed a real person or was forged. Table 1 below shows the classification accuracy of the model. The researcher evaluated accuracy by mobile device (Galaxy, iPhone) and by attack type (printed photo, replayed video). Table 1 shows that the model determines face image forgery with high accuracy, with an overall average of 97.67.

Table 1. Classification accuracy (%)

Device              Original video   Printed photo   Replayed video   Average
Galaxy (Samsung)    99.12            99.19           95.72            98.01
iPhone (Apple)      93.98            98.04           100              97.34
Total               96.55            95.61           97.86            97.67

도 5는 얼굴 영상의 위변조를 판별하는 분석장치(600)에 대한 예이다. 분석장치(600)는 도 1의 분석장치(110 및 220)에 해당하는 장치이다. 분석장치(600)는 물리적으로 다양한 형태로 구현될 수 있다. 예컨대, 분석장치(600)는 PC와 같은 컴퓨터 장치, 네트워크의 서버, 데이터 처리 전용 칩셋 등의 형태를 가질 수 있다. 5 is an example of an analysis device 600 that determines forgery or alteration of a face image. The analyzer 600 is a device corresponding to the analyzers 110 and 220 of FIG. 1 . The analysis device 600 may be physically implemented in various forms. For example, the analysis device 600 may have a form of a computer device such as a PC, a network server, and a data processing dedicated chipset.

분석장치(600)는 저장장치(610), 메모리(620), 연산장치(630), 인터페이스 장치(640) 및 통신장치(650)를 포함할 수 있다. 도 5는 얼굴 영상을 캡쳐하는 카메라를 도시하지 않았지만, 분석장치(600)는 카메라를 포함할 수 있다. 다만, 도 5는 얼굴 영상을 캡쳐한 이후의 동작을 중심으로 설명한다.The analysis device 600 may include a storage device 610, a memory 620, an arithmetic device 630, an interface device 640, and a communication device 650. Although FIG. 5 does not show a camera that captures a face image, the analysis device 600 may include a camera. However, FIG. 5 will mainly describe operations after capturing a face image.

저장장치(610)는 입력되는 얼굴 영상을 저장할 수 있다.The storage device 610 may store an input face image.

저장장치(610)는 영상 처리에 사용되는 영상 처리 프로그램 내지 코드를 저장할 수 있다.The storage device 610 may store image processing programs or codes used for image processing.

저장장치(610)는 얼굴 영상 위변조를 위한 학습모델을 저장할 수 있다.The storage device 610 may store a learning model for face image forgery.

저장장치(610)는 얼굴 영상에 대한 위변조 판별 결과를 저장할 수 있다.The storage device 610 may store a result of forgery and falsification determination for a face image.

메모리(620)는 분석장치(600)가 얼굴 영상 위변조 여부를 분석하는 과정에서 생성되는 데이터 및 정보 등을 저장할 수 있다.The memory 620 may store data and information generated in the process of analyzing whether the analysis device 600 has forged or falsified the face image.

인터페이스 장치(640)는 외부로부터 일정한 명령 및 데이터를 입력받는 장치이다. 인터페이스 장치(640)는 물리적으로 연결된 입력 장치, 외부 저장장치로부터 얼굴 영상을 입력받을 수 있다. 인터페이스 장치(640)는 물리적으로 연결된 입력 장치, 외부 저장장치로부터 얼굴 영상 위변조 분류를 위한 학습모델을 입력받을 수 있다. 인터페이스 장치(640)는 얼굴 영상 위변조 분석 결과를 외부 객체에 전달할 수도 있다. The interface device 640 is a device that receives certain commands and data from the outside. The interface device 640 may receive a face image from a physically connected input device or external storage device. The interface device 640 may receive a learning model for falsification classification of a facial image from a physically connected input device or an external storage device. The interface device 640 may transmit the facial image forgery analysis result to an external object.

통신장치(650)는 유선 또는 무선 네트워크를 통해 일정한 정보를 수신하고 전송하는 구성을 의미한다. 통신장치(650)는 외부 객체로부터 얼굴 영상을 수신할 수 있다. 통신장치(650)는 얼굴 영상 위변조 분류를 위한 학습모델을 수신할 수도 있다. 통신장치(650)는 얼굴 영상 위변조 분석 결과를 외부 객체에 송신할 수도 있다.The communication device 650 refers to a component that receives and transmits certain information through a wired or wireless network. The communication device 650 may receive a face image from an external object. The communication device 650 may receive a learning model for classifying forgery or alteration of a face image. The communication device 650 may transmit the face image forgery analysis result to an external object.

인터페이스 장치(640) 및 통신장치(650)는 사용자 또는 다른 물리적 객체로부터 일정한 데이터를 주고받는 구성이므로, 포괄적으로 입출력장치라고도 명명할 수 있다. 필요한 데이터를 입력받는 기능에 한정하면 인터페이스 장치(640) 및 통신장치(650)는 입력장치라고 할 수도 있다. Since the interface device 640 and the communication device 650 are configured to send and receive certain data from a user or other physical object, they can also be collectively referred to as input/output devices. If the function of receiving necessary data is limited, the interface device 640 and the communication device 650 may be referred to as input devices.

The arithmetic device 630 may analyze whether a face image has been forged or altered using the learning model stored in the storage device 610.

The arithmetic device 630 may detect a face region or a region of interest in the input image using an image processing technique. Alternatively, the arithmetic device 630 may detect the face region or the region of interest in the input image using a separate segmentation model.

The arithmetic device 630 may detect in advance whether the input image contains a forgery attempt (bending, shaking, face covering, etc.).

The arithmetic device 630 may detect a PPG signal based on the change in skin color in the face region or the region of interest. The detection process is as described above. Alternatively, the arithmetic device 630 may calculate the PPG signal for the face region or the region of interest using a separate deep learning model.
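The skin-color-based detection can be illustrated with a minimal sketch. It assumes, purely for illustration, that the PPG trace is taken as the per-frame mean of the green channel inside the region of interest with a moving average subtracted to remove slow drift. The green channel, the function names `roi_green_mean` and `extract_ppg`, and the `window` parameter are assumptions that do not appear in the specification, and this is not presented as the detection process the specification describes.

```python
from statistics import mean


def roi_green_mean(frame, roi):
    """Average green value inside roi = (top, left, bottom, right).

    frame is a list of rows, and each pixel is an (R, G, B) tuple.
    Using the green channel is an assumption made for this sketch.
    """
    top, left, bottom, right = roi
    return mean(
        frame[y][x][1]
        for y in range(top, bottom)
        for x in range(left, right)
    )


def extract_ppg(frames, roi, window=5):
    """Per-frame ROI green mean minus a centered moving average.

    Subtracting the moving average removes slow drift (lighting, motion)
    so that only the short-term color variation remains.
    """
    raw = [roi_green_mean(frame, roi) for frame in frames]
    half = window // 2
    ppg = []
    for i, value in enumerate(raw):
        lo = max(0, i - half)
        hi = min(len(raw), i + half + 1)
        ppg.append(value - mean(raw[lo:hi]))
    return ppg
```

A printed photograph or a replayed still image would be expected to give a nearly flat trace under this sketch, which is the kind of cue a downstream model could use.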

The arithmetic device 630 may calculate a heart rate (bpm) from the time-series PPG signal.
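One common way to obtain a heart rate from a time-series PPG signal is to find the dominant frequency in a physiologically plausible band. The sketch below does this with a direct discrete Fourier transform. The band limits of 0.7 to 4.0 Hz, the function name `estimate_bpm`, and the frame-rate parameter `fps` are assumptions for illustration, since the specification does not state how the heart rate is calculated.

```python
import cmath
import math


def estimate_bpm(ppg, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency of ppg within [lo_hz, hi_hz] in bpm.

    Returns 0.0 when no frequency bin falls inside the band, which
    happens when the signal is too short for the chosen band.
    """
    n = len(ppg)
    avg = sum(ppg) / n
    centered = [v - avg for v in ppg]  # remove the DC component

    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n  # frequency of bin k in Hz
        if not lo_hz <= freq <= hi_hz:
            continue
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0
```

The frequency resolution is `fps / n` Hz, so a 10-second clip resolves the heart rate to about 6 bpm. Longer windows give finer resolution at the cost of a slower response.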

The arithmetic device 630 may determine whether the face image has been forged or altered by inputting PPG-related information and/or raw data (pixel values of the face region or the region of interest) to the learning model.

The arithmetic device 630 may input any one of various types of input data to the learning model. The types of input data are as described above. In this process, the arithmetic device 630 may prepare the input data through one-hot encoding. The learning model calculates a probability value indicating forgery or alteration of the face image.

The arithmetic device 630 may calculate the probability value by inputting, to the learning model, an image (raw data) of the region of interest at a certain point in time, the PPG signal detected at the same point in time, and the heart rate detected at the same point in time.

The arithmetic device 630 may also calculate the probability value by inputting, to the learning model, images (raw data) of the region of interest acquired in time series over a certain time interval, the PPG signal detected in the same time interval, and the heart rate detected in the same time interval.
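The two input arrangements, a single point in time and a time interval, can be sketched as one data structure filled in two ways. The sketch assumes that the region-of-interest images, the PPG samples, and the heart-rate values are stored as per-frame sequences of equal length. The names `ModelInput`, `single_instant`, and `time_window` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ModelInput:
    """The three inputs handed to the learning model, aligned by frame."""
    roi_frames: List[object]
    ppg: List[float]
    heart_rate: List[float]


def single_instant(roi_frames: Sequence, ppg: Sequence[float],
                   heart_rate: Sequence[float], t: int) -> ModelInput:
    """Input for one point in time: one ROI image, one PPG sample, one heart rate."""
    return ModelInput([roi_frames[t]], [ppg[t]], [heart_rate[t]])


def time_window(roi_frames: Sequence, ppg: Sequence[float],
                heart_rate: Sequence[float], start: int, length: int) -> ModelInput:
    """Input for a time interval: the same slice of all three sequences."""
    end = start + length
    if start < 0 or end > len(roi_frames):
        raise ValueError("window exceeds the available frames")
    return ModelInput(
        list(roi_frames[start:end]),
        list(ppg[start:end]),
        list(heart_rate[start:end]),
    )
```

Slicing all three sequences with the same indices keeps the image, PPG, and heart-rate inputs aligned to the same point in time or the same interval.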

The arithmetic device 630 may determine whether the face image has been forged or altered based on the value output by the learning model.
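The overall decision flow can be sketched as follows, with a preliminary check first and the learning model consulted only when that check finds no forgery attempt. The function `analyze`, the callables `precheck` and `model`, and the threshold of 0.5 are hypothetical. The specification states only that the decision is based on the value output by the learning model, so the comparison against a threshold is an assumption.

```python
from typing import Callable


def analyze(face_video: object,
            precheck: Callable[[object], bool],
            model: Callable[[object], float],
            threshold: float = 0.5) -> bool:
    """Return True when the face video is judged forged or altered.

    precheck returns True when a forgery attempt (bending, shaking,
    face covering) is found. model returns the probability of forgery.
    Both are stand-ins supplied by the caller.
    """
    if precheck(face_video):
        # Attempt found in advance, so the PPG-based steps are skipped.
        return True

    probability = model(face_video)
    if not 0.0 <= probability <= 1.0:
        raise ValueError("model output must be a probability in [0, 1]")

    # The threshold value is an assumption, not taken from the specification.
    return probability >= threshold
```

For example, `analyze(video, lambda v: False, lambda v: 0.8)` returns `True`, while a model output of `0.2` under the same call returns `False`.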

The arithmetic device 630 may be a device that processes data and performs certain operations, such as a processor, an AP, or a chip with an embedded program.

In addition, the face image forgery determination method described above may be implemented as a program (or application) that includes an executable algorithm that can be run on a computer. The program may be stored and provided in a transitory or non-transitory computer readable medium.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only briefly, such as a register, cache, or memory. Specifically, the various applications or programs described above may be stored and provided in a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, ROM (read-only memory), PROM (programmable read-only memory), EPROM (erasable PROM), EEPROM (electrically erasable PROM), or flash memory.

A transitory readable medium refers to various types of RAM, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synclink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).

This embodiment and the drawings attached to this specification merely illustrate a part of the technical idea included in the technology described above. It is evident that all modifications and specific embodiments that a person skilled in the art can readily infer within the scope of the technical idea included in the specification and drawings fall within the scope of rights of the technology described above.

Claims

A non-contact PPG-based face image forgery determination method comprising:
receiving, by an analysis device, a face image;
determining in advance, by the analysis device, whether the face image contains forgery by any one of a bending attack, a fine shaking attack, and a face covering attack;
detecting, by the analysis device, a region of interest in the face image when the preliminary determination finds no forgery;
detecting, by the analysis device, a photoplethysmography (PPG) signal based on a change in skin color in the region of interest;
detecting, by the analysis device, a heart rate based on the PPG signal; and
calculating, by the analysis device, a probability value indicating whether the face image has been forged or altered by inputting the PPG signal, the heart rate, and an image of the region of interest to a pre-trained learning model.

delete

The method of claim 1,
wherein the analysis device calculates the probability value by inputting, to the learning model, the image of the region of interest at a certain point in time, the PPG signal detected at that point in time, and the heart rate detected at that point in time.

The method of claim 1,
wherein the analysis device calculates the probability value by inputting, to the learning model, images of the region of interest acquired in time series over a certain time interval, the PPG signal detected in that time interval, and the heart rate detected in that time interval.

delete

The method of claim 1,
wherein the analysis device clusters skin pixels in the Cb-Cr plane of the YCbCr color space, expands the Cb and Cr components n times about the center value of the resulting cluster, and then detects the heart rate.

An analysis device for classifying face image forgery based on non-contact PPG, comprising:
an input device that receives a face image;
a storage device that stores a learning model for classifying image forgery based on a photoplethysmography (PPG) signal and a heart rate; and
an arithmetic device that determines in advance whether the face image contains forgery by any one of a bending attack, a fine shaking attack, and a face covering attack and, when the preliminary determination finds no forgery, calculates a probability value indicating whether the face image has been forged or altered by inputting, to the learning model, a PPG signal detected based on a change in skin color in a region of interest of the face image, a heart rate detected based on the detected PPG signal, and an image of the region of interest.

The analysis device of claim 7,
wherein the arithmetic device calculates the probability value by inputting, to the learning model, the image of the region of interest at a certain point in time, the PPG signal detected at that point in time, and the heart rate detected at that point in time.

The analysis device of claim 7,
wherein the arithmetic device calculates the probability value by inputting, to the learning model, images of the region of interest acquired in time series over a certain time interval, the PPG signal detected in that time interval, and the heart rate detected in that time interval.

delete