KR102548770B1

KR102548770B1 - System and method of calculating face similarity using matrix calculation

Info

Publication number: KR102548770B1
Application number: KR1020200144338A
Authority: KR
Inventors: 이동열
Original assignee: 주식회사 카카오뱅크
Priority date: 2020-11-02
Filing date: 2020-11-02
Publication date: 2023-06-28
Also published as: KR20220059121A

Abstract

The present invention discloses a system and method for calculating facial similarity using matrix operations. The facial similarity calculation method includes: collecting a plurality of video call data containing customers' faces; deriving facial images from the collected video call data and attributing them to each customer's account; determining, from among a plurality of customers, the inspection targets for facial similarity determination; calculating vector values of the facial images attributed to the accounts of the customers included in the inspection targets; generating, for each customer, a matrix using that customer's number of video calls and the vector values of the facial images corresponding to the respective video calls; converting the matrices of all customers to the same size; and calculating facial similarity using operations between the converted matrices.

Description

System and method of calculating face similarity using matrix calculation

The present invention relates to a system and method for calculating facial similarity using matrix operations. Specifically, it relates to a system and method for quickly calculating the facial similarity of video calls by using a matrix that contains vector values of facial images extracted from the video calls.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

With the recent advancement of smart devices and networks and the development of various network services, many tasks that used to be handled face-to-face, including banking, have shifted to non-face-to-face processing over online/wireless channels.

For non-face-to-face processing, financial companies conduct video calls with customers when necessary and calculate facial similarity to determine whether the customer on the video call is the actual account holder.

Specifically, by calculating the facial similarity between a customer's video call and other video calls or reference images, a financial company can determine whether the customer resembles a person on a blacklist, whether the faces in a specific customer's multiple video calls match one another, and whether the face in a specific customer's video call matches the face in another customer's video call.

In this way, the financial company can periodically determine whether a customer's identity has been stolen, prevent financial incidents, and secure the stability of its financial services.

However, when the facial similarity calculation described above is performed periodically, the system resources used for the facial similarity determination grow as the volume of video calls accumulated on the server increases, and the time required to calculate facial similarity also increases sharply.

Accordingly, there has been a need for a facial similarity calculation method that can perform the same operation in a short time with fewer resources.

An object of the present invention is to provide a facial similarity calculation system and method that can quickly process the facial similarity between facial images extracted from a large volume of video calls.

Another object of the present invention is to provide a system and method that can quickly calculate facial similarity for a large number of facial images by calculating vector values for the facial images using a deep learning module and using a matrix that contains vector values corresponding to the maximum number of video calls.

Another object of the present invention is to provide a facial similarity calculation system and method that can improve the accuracy of facial similarity determination by comparing the facial similarity with a reference value to determine whether an anomaly exists and retraining the deep learning module using the cases in which an anomaly was identified.

The objects of the present invention are not limited to those mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

A facial similarity calculation method according to an embodiment of the present invention includes: collecting a plurality of video call data containing customers' faces; deriving facial images from the collected video call data and attributing them to each customer's account; determining, from among a plurality of customers, the inspection targets for facial similarity determination; calculating vector values of the facial images attributed to the accounts of the customers included in the inspection targets; generating, for each customer, a matrix using that customer's number of video calls and the vector values of the facial images corresponding to the respective video calls; converting the matrices of all customers to the same size; and calculating facial similarity using operations between the converted matrices.

In addition, the matrix is expressed as the product of a first matrix relating to the number of video calls and a second matrix relating to the vector values of the facial images corresponding to the respective video calls. The first matrix includes one or more entries (elements) corresponding to the number of the customer's video calls, and each entry or element may be a positive integer.

In addition, the step of converting the matrices to the same size may include: setting, based on the numbers of video calls of the customers belonging to the inspection targets, the number of video calls of the customer with the largest call history as the 'maximum number of video calls'; and converting, based on the maximum number of video calls, the first matrices constituting each customer's matrix so that they all have the same size.
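
The size-conversion step can be illustrated with a minimal sketch. It assumes that each customer's matrix holds one embedding row per video call and that shorter matrices are filled with zero rows up to the maximum number of video calls; the text above does not state the padding scheme, and the names `pad_to_max_calls` and `matrices` are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]  # one row per video call, one column per embedding dimension


def pad_to_max_calls(customer_matrices: Dict[str, Matrix]) -> Dict[str, Matrix]:
    """Zero-pad every customer's matrix to the largest call count (assumed scheme)."""
    # The customer with the most calls defines the 'maximum number of video calls'.
    max_calls = max(len(rows) for rows in customer_matrices.values())
    # Assumes every customer has at least one call and all rows share one dimension.
    dim = len(next(iter(customer_matrices.values()))[0])
    padded = {}
    for customer, rows in customer_matrices.items():
        filler = [[0.0] * dim for _ in range(max_calls - len(rows))]
        padded[customer] = [list(r) for r in rows] + filler
    return padded


matrices = {
    "customer_a": [[0.1, 0.9], [0.2, 0.8], [0.1, 0.7]],  # three video calls
    "customer_b": [[0.9, 0.1]],                          # one video call
}
same_size = pad_to_max_calls(matrices)
```

With equal shapes, the per-customer matrices can be stacked and compared in batch rather than pair by pair, which is presumably the point of the conversion.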

In addition, the step of calculating the vector values may include: detecting landmarks in the facial image; aligning the facial image using the detected landmarks; and calculating the vector value for the aligned facial image using a deep learning module.
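
As an illustration of the alignment step only, the sketch below levels a face by rotating about the midpoint between the two eye landmarks. This is one common alignment approach and is an assumption here; the text does not say which landmarks or which transform are used, and landmark detection and the deep learning module are not modelled. The function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def eye_line_angle(left_eye: Point, right_eye: Point) -> float:
    """Angle in degrees between the line through the eyes and the horizontal axis."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(p: Point, center: Point, angle_deg: float) -> Point:
    """Rotate p about center by -angle_deg, which makes the eye line horizontal."""
    theta = math.radians(-angle_deg)
    x, y = p[0] - center[0], p[1] - center[1]
    return (center[0] + x * math.cos(theta) - y * math.sin(theta),
            center[1] + x * math.sin(theta) + y * math.cos(theta))


left, right = (30.0, 40.0), (70.0, 60.0)  # hypothetical detected eye landmarks
center = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
angle = eye_line_angle(left, right)
aligned_left = rotate_point(left, center, angle)
aligned_right = rotate_point(right, center, angle)
```

After the rotation both eye landmarks share the same vertical coordinate, so every face reaches the embedding model in a comparable pose.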

In addition, the method may further include: determining whether an anomaly exists by comparing the calculated facial similarity with a preset reference value; when an anomaly is found, reconfirming whether the persons are the same based on the result of a visual inspection of the case; and, when the anomaly determination turns out to be erroneous, retraining the deep learning module using the images of that case.
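
The three steps above can be sketched as follows. The sketch assumes that a similarity at or above the reference value means "same person", that an anomaly is a thresholded result that contradicts the account-based expectation, and that a case is queued for retraining when the visual inspection contradicts the model. The reference value `0.6` and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Comparison:
    case_id: str
    similarity: float     # score produced by the matrix operation
    expected_same: bool   # True if both images are expected to show one person


def flag_anomalies(cases: List[Comparison], threshold: float) -> List[Comparison]:
    """Flag cases whose thresholded similarity contradicts the expectation."""
    return [c for c in cases if (c.similarity >= threshold) != c.expected_same]


def retraining_cases(flagged: List[Comparison], reviewed_same: Dict[str, bool],
                     threshold: float) -> List[Comparison]:
    """Keep flagged cases where the visual inspection contradicts the model."""
    return [c for c in flagged
            if reviewed_same[c.case_id] != (c.similarity >= threshold)]


THRESHOLD = 0.6  # hypothetical reference value
cases = [
    Comparison("A", 0.35, expected_same=True),   # same account, low similarity
    Comparison("B", 0.92, expected_same=True),   # same account, high similarity
    Comparison("C", 0.88, expected_same=False),  # different accounts, high similarity
]
flagged = flag_anomalies(cases, THRESHOLD)
review = {"A": True, "C": True}  # reviewer: both pairs show the same person
to_retrain = retraining_cases(flagged, review, THRESHOLD)
```

In this example case A was a model error (the reviewer confirms the same person), so it is kept for retraining, while case C is a genuine anomaly that the model judged correctly.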

In addition, in the step of calculating the vector values, the vector values are calculated using a deep learning module trained in advance so that the same vector value is output for images of the same person. The deep learning module includes an input layer whose input node is the facial image, an output layer whose output node is the facial similarity, and one or more hidden layers disposed between the input layer and the output layer, and the weights of the nodes and edges between the input node and the output node may be updated through the training process of the deep learning module.

In addition, in the step of calculating the vector values, the vector values are calculated using a deep learning module trained in advance so that the same vector value is output for images of the same person. The deep learning module may include a plurality of neural network modules that receive different images, a similarity determination module that calculates the respective similarities between the values output from the plurality of neural network modules, a weight module that adjusts the weight applied to each calculated similarity, and a feedback module that passes feedback on the error of the result output from the weight module to the plurality of neural network modules.

In addition, the step of determining the inspection targets includes receiving a selection of one of a plurality of inspection methods and determining the inspection targets based on the selected inspection method. The inspection methods may include a first inspection method that determines the facial similarity between a customer and reference images registered in a blacklist, a second inspection method that determines the facial similarity among the facial images of a plurality of video calls attributed to a specific customer, and a third inspection method that determines the facial similarity between the facial images of video calls of different customers.
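
How the selected inspection method changes the set of image pairs to compare can be sketched as below. The enum, the function `inspection_pairs`, and the identifier strings are hypothetical; images are represented by IDs only.

```python
from enum import Enum
from itertools import combinations, product
from typing import Dict, List, Tuple


class InspectionMethod(Enum):
    BLACKLIST = 1       # first method: customer images vs. blacklisted references
    SAME_CUSTOMER = 2   # second method: one customer's own video calls
    CROSS_CUSTOMER = 3  # third method: video calls of different customers


def inspection_pairs(method: InspectionMethod,
                     calls_by_customer: Dict[str, List[str]],
                     blacklist_refs: List[str]) -> List[Tuple[str, str]]:
    """Return the image-ID pairs whose facial similarity must be calculated."""
    if method is InspectionMethod.BLACKLIST:
        all_calls = [c for calls in calls_by_customer.values() for c in calls]
        return list(product(all_calls, blacklist_refs))
    if method is InspectionMethod.SAME_CUSTOMER:
        return [p for calls in calls_by_customer.values()
                for p in combinations(calls, 2)]
    pairs: List[Tuple[str, str]] = []
    for (_, calls_x), (_, calls_y) in combinations(calls_by_customer.items(), 2):
        pairs.extend(product(calls_x, calls_y))
    return pairs


calls = {"a": ["a1", "a2"], "b": ["b1"]}
blacklist = ["x1"]
first = inspection_pairs(InspectionMethod.BLACKLIST, calls, blacklist)
second = inspection_pairs(InspectionMethod.SAME_CUSTOMER, calls, blacklist)
third = inspection_pairs(InspectionMethod.CROSS_CUSTOMER, calls, blacklist)
```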

In addition, the second inspection method and the third inspection method may determine facial similarity using the matrices that have been converted to the same size for every customer.

In addition, whether to perform the first inspection method may be decided based on whether the blacklist has been updated or whether the customer is a new customer.

In addition, a facial similarity calculation method according to another embodiment of the present invention includes: (a) collecting a plurality of video call data containing a customer's face; (b) deriving facial images from the collected video call data; (c) calculating the vector values of the derived facial images and attributing them to the customer's account; (d) repeating steps (a) to (c) for a plurality of customers; (e) generating, for each customer, a matrix using that customer's number of video calls and the vector values of the facial images corresponding to the respective video calls; (f) converting the matrices of all customers to the same size; and (g) calculating facial similarity using operations between the converted matrices.

Meanwhile, a facial similarity calculation system according to some embodiments of the present invention includes a data collection unit that collects a plurality of video call data containing customers' faces, an inspection target determination unit that determines, from among a plurality of customers, the inspection targets for facial similarity determination, a facial image derivation unit that derives facial images from the plurality of video call data of the inspection targets, and a computation unit that calculates the facial similarity for the facial images. The computation unit includes a feature calculation unit that calculates the vector values of the facial images attributed to the accounts of the customers included in the inspection targets, a matrix generation unit that generates, for each customer, a matrix using that customer's number of video calls and the vector values of the facial images corresponding to the respective video calls, a matrix conversion unit that converts the matrices of all customers to the same size, and a facial similarity calculation unit that calculates facial similarity using operations between the converted matrices.

In addition, the matrix is expressed as the product of a first matrix relating to the number of video calls and a second matrix relating to the vector values of the facial images corresponding to the respective video calls, and the first matrix includes one or more entries (elements) corresponding to the number of the customer's video calls. The matrix conversion unit may set, based on the numbers of video calls of the customers belonging to the inspection targets, the number of video calls of the customer with the largest call history as the 'maximum number of video calls' and, based on the maximum number of video calls, convert the first matrices constituting each customer's matrix to the same size.

In addition, the feature calculation unit calculates the vector values using a deep learning module trained in advance so that the same vector value is output for images of the same person. The deep learning module includes an input layer whose input node is the facial image, an output layer whose output node is the facial similarity, and one or more hidden layers disposed between the input layer and the output layer, and the weights of the nodes and edges between the input node and the output node may be updated through the training process of the deep learning module.

In addition, the inspection target determination unit receives a selection of one of a plurality of inspection methods and determines the inspection targets based on the selected inspection method. The inspection methods may include a first inspection method that determines the facial similarity between a customer and reference images registered in a blacklist, a second inspection method that determines the facial similarity among the facial images of a plurality of video calls attributed to a specific customer, and a third inspection method that determines the facial similarity between the facial images of video calls of different customers.

The facial similarity calculation system and method of the present invention can quickly calculate the facial similarity between facial images extracted from a large volume of video calls by calculating vector values for those facial images using a deep learning module and using a matrix that contains vector values corresponding to the maximum number of video calls. The present invention can thereby shorten the time required to calculate the similarity among a large number of video calls and speed up the determination of whether an anomalous customer exists.

In addition, the facial similarity calculation system and method of the present invention can improve the accuracy of facial similarity determination by comparing the facial similarity with a preset reference value to determine whether an anomaly exists and by using the cases in which an anomaly was identified to retrain the deep learning module that outputs the vector values needed to calculate facial similarity. The present invention can thereby improve the accuracy of similarity determination across a plurality of video calls, prevent financial incidents, and improve the stability of financial service provision.

In addition to the above, specific effects of the present invention are described below together with the specific details for carrying out the invention.

FIG. 1 is a conceptual diagram illustrating a system that performs a facial similarity calculation method according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the configuration of the financial company server of FIG. 1.
FIG. 3 is a diagram illustrating a facial similarity calculation method according to some embodiments of the present invention.
FIG. 4 is a flowchart illustrating a facial similarity calculation method according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating details of the steps of FIG. 3 of deriving a facial image and calculating its vector value.
FIG. 6 is a diagram illustrating an example of calculating facial similarity among a plurality of video calls in FIG. 5.
FIG. 7 is a diagram illustrating an example of calculating facial similarity between a video call and a reference image in FIG. 5.
FIG. 8 is a block diagram illustrating the deep learning module of step S140 of FIG. 3.
FIG. 9 is a block diagram illustrating an example of the deep learning module of FIG. 8.
FIG. 10 is a block diagram illustrating the training stage of the deep learning module of FIG. 8.
FIG. 11 is a diagram illustrating the process of determining the inspection targets for facial similarity determination in S120 of FIG. 3.
FIG. 12 is a flowchart illustrating the process of retraining the deep learning module according to the inspection method in some embodiments of the present invention.
FIG. 13 is a diagram illustrating a hardware implementation of a system that performs the facial similarity calculation method according to some embodiments of the present invention.

The terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. Based on the principle that an inventor may define the concepts of terms and words in order to describe his or her invention in the best way, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention. In addition, the embodiments described in this specification and the configurations shown in the drawings are merely one embodiment in which the present invention is realized and do not represent the entire technical idea of the present invention, so it should be understood that there may be various equivalents, modifications, and applicable examples that could replace them at the time of filing.

Terms such as first, second, A, and B used in this specification and the claims may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component. The term 'and/or' includes a combination of a plurality of related listed items or any one of the plurality of related listed items.

The terms used in this specification and the claims are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" should be understood as not precluding the presence or possible addition of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless explicitly defined in this application, are not to be interpreted in an idealized or overly formal sense.

In addition, the configurations, procedures, processes, or methods included in each embodiment of the present invention may be shared among embodiments to the extent that they do not technically contradict one another.

Hereinafter, a facial similarity calculation method according to an embodiment of the present invention and a system that performs it are described in detail with reference to FIGS. 1 to 13.

FIG. 1 is a conceptual diagram illustrating a system that performs a facial similarity calculation method according to an embodiment of the present invention.

Referring to FIG. 1, the system according to an embodiment of the present invention includes a financial company server 100, a customer terminal 200, and a counselor terminal 300.

The financial company server 100 (hereinafter, the server) may provide various non-face-to-face financial services to the customer terminal 200. Before providing a financial service to a customer, the server 100 may also perform identity verification or personal authentication to confirm that the customer is who they claim to be.

The server 100 may use a video call as a means of verifying or authenticating the customer's identity. To this end, the server 100 relays the video call between the customer terminal 200 and the counselor terminal 300 and may perform identity verification or personal authentication of the customer based on the collected video call data.

Here, the server 100 may use the facial similarity calculation method to extract the customer's facial image from the video call data and perform identity verification or personal authentication using the extracted facial image. Once identity verification or personal authentication is complete, the server 100 may provide the customer terminal 200 with the financial service requested by the customer (or a financial service customized for the customer).

However, the facial similarity calculation method performed by the server 100 is not limited to the operations above and can be applied in various other embodiments. For convenience, the following description uses the example of authenticating a customer with video call data obtained during a video call with that customer.

The server 100 may operate as the entity that performs the facial similarity calculation method. The server 100 may receive video call data from the customer terminal 200 and perform the facial similarity calculation method based on that data.

Specifically, the facial similarity calculation method performed by the server 100 may operate with different inspection targets depending on the inspection method.

For example, the server 100 may be used for a first inspection method that determines the similarity between a customer and a pre-registered blacklist, a second inspection method that determines whether the faces in a specific customer's multiple video calls match one another, and a third inspection method that determines whether the face in a specific customer's video call matches the face in another customer's video call.

However, these are only some examples of the inspection methods performed by the server 100, and the present invention is not limited to them. For convenience, the following description covers only the first to third inspection methods.

The server 100 may select and perform the second or third inspection method at every preset period. In addition, when a new customer has signed up or the blacklist is updated, the server 100 may select and perform any one of the first to third inspection methods.
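
The selection rule in the preceding paragraph can be written as a small decision function. This is only a sketch: the function name and boolean flags are hypothetical, the methods are identified by their ordinal numbers, and how the server picks one method from the candidate set is left open because the text does not specify it.

```python
from typing import Set


def candidate_methods(period_elapsed: bool,
                      new_customer: bool,
                      blacklist_updated: bool) -> Set[int]:
    """Return the inspection methods (1, 2, 3) the server may choose from."""
    candidates: Set[int] = set()
    if period_elapsed:
        # Periodic runs choose between the second and third methods.
        candidates |= {2, 3}
    if new_customer or blacklist_updated:
        # A new sign-up or a blacklist update allows any of the three methods.
        candidates |= {1, 2, 3}
    return candidates


periodic_only = candidate_methods(True, False, False)
on_signup = candidate_methods(False, True, False)
idle = candidate_methods(False, False, False)
```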

In performing the first to third inspection methods, the server 100 must extract facial images from a large amount of video call data and determine the facial similarity for various combinations of the extracted facial images. Therefore, as the number of video calls stored in the server 100 increases, the resources required to calculate the facial similarity increase, and the time required for the calculation may also increase sharply. To address this problem, the server 100 may generate a matrix for each customer and use it to compute the facial similarity.

Here, the matrix may include the video call history of the corresponding customer and the vector value of the facial image corresponding to each video call. For example, the matrix may be expressed as the product of a first matrix related to the number of video calls attributed to the customer and a second matrix related to the vector values of the facial images corresponding to each of those video calls. The first matrix and the second matrix may each include one or more entries (or elements) corresponding to the number of video calls, and each entry or element may be a positive integer.

Additionally, the vector values of the facial images included in the matrix may be generated by a pre-trained deep learning module. The deep learning module of the present invention is described in detail below with reference to FIGS. 8 to 10.

However, to calculate the facial similarity using matrix operations, the matrix for every customer must be composed of component matrices of the same size. Because the number of video calls differs from customer to customer, the sizes of the component matrices may differ between customers.

Accordingly, to enable uniform matrix operations, the server 100 may set the number of video calls of the customer with the most call history among the customers under inspection as the 'maximum number of video calls'. The server 100 may then convert the first matrices constituting each customer's matrix to the same size based on the 'maximum number of video calls'.

For example, when a specific customer has 3 video calls, the customer's matrix may consist of a 1x3 first matrix and a 3x1 second matrix. If the customer with the most video calls among the customers under inspection has 5 video calls (that is, the maximum number of video calls is 5), the server 100 may convert the customer's first matrix to size 1x5 and the second matrix to size 5x1, adding scalar values or vector values of magnitude 0 to each matrix in the process. However, this is only one example, and the present invention is not limited thereto.
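As a worked illustration of the padding in this example, the product form can be written out as below. The symbols $c_i$ (the scalar entry for the i-th video call) and $\mathbf{v}_i$ (the facial-image vector of the i-th video call) are notation assumed here for illustration and do not appear in the source.

```latex
\begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix}
\begin{bmatrix} \mathbf{v}_1 \\ \mathbf{v}_2 \\ \mathbf{v}_3 \end{bmatrix}
\;\longrightarrow\;
\begin{bmatrix} c_1 & c_2 & c_3 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{v}_1 \\ \mathbf{v}_2 \\ \mathbf{v}_3 \\ \mathbf{0} \\ \mathbf{0} \end{bmatrix}
```

Under this reading, both products evaluate to $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$, so the zero entries change only the sizes (1x3 and 3x1 become 1x5 and 5x1), not the value.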

In this way, each customer's matrix is converted to the same size (or the same dimension), and the server 100 can quickly determine the facial similarity among a plurality of video calls using matrix operations. This is described in detail below.

Meanwhile, in the present invention, the server 100 and the customer terminal 200 may be implemented as a server-client system.

Specifically, the server 100 may constitute an electronic document management system (EDMS) that stores and manages various information and data about customers. The server 100 may operate each customer's account in the form of a database.

In this way, the server 100 may classify, store, and manage, for each customer account, video call data and previously received facial images (for example, an ID card image or a previously detected facial image), and may provide various services related to financial information, video calls, and the like through a terminal application installed on the customer terminal 200.

The terminal application may be a dedicated application for receiving video call data or providing financial services, or it may be a web browsing application. The dedicated application may be an application built into the customer terminal 200 or an application downloaded from an application distribution server and installed on the customer terminal 200.

The customer terminal 200 refers to a communication terminal capable of running an application in a wired or wireless communication environment. In FIG. 1 the customer terminal 200 is shown as a smartphone, a type of portable terminal, but the present invention is not limited thereto and may be applied without limitation to any device capable of running a financial application as described above. For example, the customer terminal 200 may include various types of electronic devices such as a personal computer (PC), a laptop, a tablet, a mobile phone, a smartphone, and a wearable device (for example, a watch-type terminal).

In addition, although only one customer terminal 200 is shown in the drawing, the present invention is not limited thereto, and the server 100 may operate in conjunction with a plurality of customer terminals 200.

Additionally, the customer terminal 200 may include an input unit that receives the customer's input, a display unit that displays visual information, a communication unit that transmits and receives signals to and from the outside, a camera unit that photographs the customer's face, a microphone unit that converts the customer's voice into digital data, and a control unit that processes data, controls each unit inside the customer terminal 200, and controls data transmission and reception between the units. Hereinafter, operations that the control unit performs inside the customer terminal 200 in response to the customer's commands are collectively described as being performed by the customer terminal 200.

Meanwhile, the counselor terminal 300 operates in cooperation with the server 100 and may be the counterpart of a video call with the customer terminal 200. Although not explicitly shown in the drawing, the server 100 operates in cooperation with a plurality of counselor terminals 300, and when a video call request is received from a customer terminal 200, the server 100 may select one of the plurality of counselor terminals 300 and match it with the customer terminal 200 that requested the video call.

The server 100 relays the connection so that the matched customer terminal 200 and counselor terminal 300 can conduct a video call with each other. The server 100 may store and manage the records of video calls between the customer terminal 200 and the counselor terminal 300.

Meanwhile, the communication network 400 connects the server 100, the customer terminal 200, and the counselor terminal 300. That is, the communication network 400 refers to a network that provides an access path through which the customer terminal 200 or the counselor terminal 300 can connect to the server 100 and then transmit and receive data. The communication network 400 may encompass, for example, wired networks such as LANs (Local Area Networks), WANs (Wide Area Networks), MANs (Metropolitan Area Networks), and ISDNs (Integrated Services Digital Networks), and wireless networks such as wireless LANs, CDMA, Bluetooth, and satellite communication, but the scope of the present invention is not limited thereto.

Hereinafter, the specific configuration of the server 100 of the present invention is described.

FIG. 2 is a block diagram illustrating the configuration of the financial institution server of FIG. 1.

Referring to FIG. 2, the server 100 includes a data collection unit 110, an inspection target determination unit 120, a facial image derivation unit 130, and a calculation unit 140.

First, the data collection unit 110 receives data on video calls between the customer terminal 200 and the counselor terminal 300 and stores it in a database (not shown). The database (not shown) may classify, store, and manage the data by customer account. The data collection unit 110 may also receive information on whether the blacklist has been updated and whether new customers have joined.

The inspection target determination unit 120 determines the inspection target for facial similarity determination from among the plurality of customers stored in the database. The inspection target determination unit 120 may receive a selection of one of a plurality of inspection methods and determine the inspection target based on the selected inspection method.

For example, the inspection target determination unit 120 may receive a selection among the first to third inspection methods described above. The inspection target of the first inspection method may be a specific customer and the blacklist, the inspection target of the second inspection method may be a plurality of video calls attributed to a specific customer's account, and the inspection target of the third inspection method may be video calls attributed to the accounts of a plurality of customers. That is, the inspection target may be defined differently depending on the inspection method.

In addition, whether an inspection method is performed may be determined automatically depending on, for example, whether a preset condition is satisfied, whether a preset time has arrived, or whether the number of accumulated video calls has reached a preset reference value.

The facial image derivation unit 130 derives facial images from the plurality of video call data of the inspection target. The method of deriving the facial images is described later with reference to FIGS. 5 to 7.

The calculation unit 140 calculates the facial similarity for the derived facial images. The calculation unit 140 may include a feature calculation unit 141, a matrix generation unit 143, a matrix conversion unit 145, and a facial similarity calculation unit 147.

The feature calculation unit 141 calculates the vector values of the facial images attributed to the accounts of the customers included in the inspection target. The vector values of the facial images may be calculated using a pre-trained deep learning module.

Next, the matrix generation unit 143 generates a matrix for each customer using the customer's number of video calls and the vector value of the facial image corresponding to each video call.

Next, the matrix conversion unit 145 converts the matrices of the customers to the same size.

Next, the facial similarity calculation unit 147 calculates the facial similarity using operations between the converted matrices.

The operation of each module of the calculation unit 140 is described in detail together with the facial similarity calculation method. Hereinafter, the facial similarity calculation method performed by the server 100 of the present invention is described in detail.

FIG. 3 is a diagram illustrating a facial similarity calculation method according to some embodiments of the present invention. FIG. 4 is a flowchart illustrating a facial similarity calculation method according to an embodiment of the present invention.

First, referring to FIG. 3, in the facial similarity calculation method according to some embodiments of the present invention, the server 100 collects video call data for each customer through video calls with the customer terminal 200 (S110). The collected video call data may be attributed to the customer's account, and data related to a plurality of video calls may be stored in each customer's account.

Next, the server 100 determines the inspection target for facial similarity determination (S120). The inspection target may be defined differently depending on the inspection method, and whether an inspection method is performed may be determined depending on, for example, whether a preset condition is satisfied, whether a preset time has arrived, or whether the number of accumulated video calls has reached a preset reference value.

For example, when a new customer signs up or the blacklist is updated, the first inspection method described above may be performed. When a preset time arrives or the number of accumulated video calls reaches a preset reference value, the second inspection method or the third inspection method may be performed.
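The trigger conditions in this example can be sketched as a small selection function. This is a minimal sketch under stated assumptions: the function name, its parameters, and the string labels are hypothetical, and because the source leaves open which of the second or third method is run, both are returned as candidates.

```python
from __future__ import annotations


def select_inspection_methods(
    new_customer: bool,
    blacklist_updated: bool,
    period_elapsed: bool,
    accumulated_calls: int,
    call_threshold: int,
) -> list[str]:
    """Return the inspection methods whose example trigger conditions hold (hypothetical helper)."""
    methods: list[str] = []
    # A new sign-up or a blacklist update triggers the first inspection method.
    if new_customer or blacklist_updated:
        methods.append("first")
    # A scheduled time or enough accumulated calls makes the second and third
    # methods candidates; the source does not say which of the two is chosen.
    if period_elapsed or accumulated_calls >= call_threshold:
        methods.extend(["second", "third"])
    return methods


print(select_inspection_methods(True, False, False, 0, 100))
```

An administrator command, which the source also allows, would simply bypass this function and name the method directly.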

However, these are only some examples, and the conditions under which the first to third inspection methods are performed may of course be modified in various ways. The first to third inspection methods may also be performed at the command of a system administrator.

In the first inspection method, the inspection target may be a specific customer and the blacklist; in the second inspection method, the inspection target may be a plurality of video calls attributed to a specific customer's account; and in the third inspection method, the inspection target may be video calls attributed to the accounts of a plurality of customers.

Next, once the inspection target is specified, the server 100 derives facial images from the video call data of the inspection target (S130). For example, the server 100 may extract several frames from the video-format video call data by sampling and extract the facial image corresponding to the customer's face from the extracted frames. The specific method of extracting the facial image is described in detail later with reference to FIG. 5.
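The source does not specify how frames are sampled. As a minimal sketch, assuming evenly spaced sampling (an assumption, not the patent's stated method), the frame indices to extract could be chosen as follows. `sample_frame_indices` and its parameters are hypothetical names.

```python
from __future__ import annotations


def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Return up to num_samples evenly spaced frame indices (hypothetical helper)."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Fewer frames than requested samples: use every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the frame at the middle of each of num_samples equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
```

Sampling from the middle of each segment avoids the first and last frames of a call, where the face is less likely to be in view; other strategies (fixed time intervals, quality-based selection) would fit the description equally well.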

Next, the server 100 calculates a vector value (or feature) for each extracted facial image using the deep learning module (S140).

Here, the deep learning module outputs one vector value for one image. The deep learning module may be trained to output the same (or similar) vector values for images of the same person. The structure and training method of the deep learning module are described in detail later with reference to FIGS. 8 to 10.

Next, the server 100 generates a matrix for a specific customer using that customer's 'number of video calls' and the vector value of the facial image corresponding to each video call (S150). The server 100 may generate a separate matrix for each customer.

The matrix may be expressed as the product of a first matrix related to the number of video calls attributed to the customer and a second matrix related to the vector values of the facial images corresponding to each of those video calls. The first matrix and the second matrix may each include one or more entries (or elements) corresponding to the number of video calls.

For example, when three pieces of video call data are attributed to a specific customer's account, the customer's matrix may consist of a 1x3 first matrix and a 3x1 second matrix. Each entry of the first matrix may hold a scalar value, and each entry of the second matrix may hold a vector value. In other words, the first matrix may consist of reference numbers or validity values that identify specific video calls, and the second matrix may consist of the vector values of the facial images associated with those video calls.
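The two component matrices in this example can be sketched as a simple data structure. This is a minimal sketch, not the patent's implementation: `CustomerMatrix` and `build_customer_matrix` are hypothetical names, and using 1 as the validity value for every real call is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CustomerMatrix:
    """Hypothetical container for one customer's matrix."""

    first: list[int]           # 1xN: one scalar per video call (1 = valid call, assumed)
    second: list[list[float]]  # Nx1: one facial-image vector per video call


def build_customer_matrix(embeddings: list[list[float]]) -> CustomerMatrix:
    """Build the two component matrices from per-call facial-image vectors."""
    return CustomerMatrix(
        first=[1] * len(embeddings),
        second=[list(vec) for vec in embeddings],
    )


# Three video calls, shown with toy 2-dimensional vectors for brevity.
matrix = build_customer_matrix([[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]])
print(len(matrix.first), len(matrix.second))  # 3 3
```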

However, this is only one example, and the combination of matrices constituting the matrix, as well as the size and entries of each matrix, may of course be modified in various ways.

Next, the server 100 converts the matrices of the customers to the same size (S160). This means making the matrix sizes equal so that uniform matrix operations can be performed on the server 100.

Specifically, referring to FIG. 4, the server 100 sets the number of video calls of the customer with the most call history among the customers under inspection as the 'maximum number of video calls' (S161).

Next, based on the 'maximum number of video calls', the server 100 converts all the component matrices of each customer's matrix to the same size (S163).

For example, when a specific customer has 3 video calls, the customer's matrix may consist of a 1x3 first matrix and a 3x1 second matrix. If the customer with the most video calls among the customers under inspection has 5 video calls (that is, the maximum number of video calls is 5), the server 100 may convert the customer's first matrix to size 1x5 and the second matrix to size 5x1. In this process, the server 100 may add scalar values of magnitude 0 to the first matrix and vector values of magnitude 0 to the second matrix. In this way, the matrices of all customers included in the inspection target can be converted to the same size.
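Steps S161 and S163 can be sketched as zero-padding every customer's component matrices up to the maximum call count. The function name and the dictionary layout below are hypothetical, and the toy vectors are 2-dimensional for brevity.

```python
from __future__ import annotations


def pad_to_max_calls(
    customers: dict[str, tuple[list[int], list[list[float]]]],
) -> dict[str, tuple[list[int], list[list[float]]]]:
    """Pad every customer's (first, second) pair to the maximum call count."""
    if not customers:
        return {}
    # S161: the largest call count among the inspected customers.
    max_calls = max(len(first) for first, _ in customers.values())
    # Vector dimension, taken from the first available vector (0 if none exist).
    dim = next((len(vecs[0]) for _, vecs in customers.values() if vecs), 0)
    padded = {}
    for name, (first, second) in customers.items():
        missing = max_calls - len(first)
        # S163: append zero scalars and zero vectors for the missing calls.
        padded[name] = (
            list(first) + [0] * missing,
            [list(v) for v in second] + [[0.0] * dim for _ in range(missing)],
        )
    return padded


customers = {
    "A": ([1, 1, 1], [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]),
    "B": ([1, 1, 1, 1, 1], [[0.5, 0.5]] * 5),
}
padded = pad_to_max_calls(customers)
print(padded["A"][0])  # [1, 1, 1, 0, 0]
```

Keeping the zero scalars in the first matrix gives later steps a way to tell real calls from padding, since a zero vector alone could be mistaken for a degenerate embedding.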

However, this is only one embodiment of converting the matrices of the present invention to the same size, and the method of converting the matrix size may of course be modified and practiced in various ways by those skilled in the art.

Next, referring again to FIG. 3, the server 100 calculates the facial similarity for the inspection target through operations between the converted matrices (S170). Here, the server 100 may calculate the similarity between images using a method for calculating the similarity between vectors (for example, cosine distance). However, the present invention is not limited thereto; the server 100 may use various methods to calculate the facial similarity, and because these methods are already publicly known, a detailed description is omitted here.
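One way to read step S170 is as a single matrix product between two customers' second matrices, which yields the cosine similarity of every pair of calls at once. The sketch below uses plain Python lists to stay self-contained (a numerical library would normally be used), and the function names and the use of the first matrix as a padding mask are assumptions. Cosine similarity is shown here; cosine distance is one minus this value.

```python
from __future__ import annotations

import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def pairwise_cosine_similarity(first_a, second_a, first_b, second_b):
    """Return (similarities, validity mask) for all call pairs of two customers."""
    a = [l2_normalize(v) for v in second_a]
    b = [l2_normalize(v) for v in second_b]
    # Product of A with the transpose of B: entry [i][j] is cos(a_i, b_j).
    sims = [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]
    # A pair counts only if neither call is a zero-padded placeholder.
    valid = [[fa != 0 and fb != 0 for fb in first_b] for fa in first_a]
    return sims, valid


sims, valid = pairwise_cosine_similarity(
    [1, 0], [[1.0, 0.0], [0.0, 0.0]],  # customer A: one real call, one padded
    [1, 1], [[2.0, 0.0], [0.0, 3.0]],  # customer B: two real calls
)
print(sims[0], valid[1])  # [1.0, 0.0] [False, False]
```

Because every customer's matrices have the same size after conversion, the same product can be applied to every pair of customers without per-customer special cases.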

In this way, the present invention can quickly calculate the facial similarity between facial images extracted from a large number of video calls and, by shortening the time required to calculate the similarity between video calls, can speed up the determination of whether an abnormal customer exists.

Additionally, although not explicitly shown in the drawings, the server 100 may compare the calculated facial similarity with a preset reference value (or reference range) to determine whether the inspection target is abnormal.

Next, when an abnormality is found, the server 100 may forward the case to the counselor terminal 300 or an administrator terminal (not shown) and request a visual inspection of the abnormality. In this way, the server 100 can reconfirm whether the persons are the same.

Next, the server 100 may determine, based on the visual inspection result received from the counselor terminal 300, whether the abnormality determination was erroneous.

If the abnormality determination was erroneous, the server 100 may retrain the deep learning module using the images of that case and update the system so that the same error does not recur.

In this way, the present invention can improve the accuracy of facial similarity determination by retraining the deep learning module using cases whose abnormality determination has been verified. Accordingly, the present invention can improve the accuracy of similarity determination across many video calls, prevent financial incidents, and improve the stability of financial services.

FIG. 5 is a diagram illustrating the details of the steps of FIG. 3 that derive facial images and calculate their vector values. FIG. 6 is a diagram illustrating an example of calculating the facial similarity across a plurality of video calls in FIG. 5. FIG. 7 is a diagram illustrating an example of calculating the facial similarity between a video call and a reference image in FIG. 5. Hereinafter, descriptions that overlap with the foregoing are omitted, and the description focuses on the differences.

Referring to FIG. 5, after the inspection target for facial similarity determination is determined (S120 of FIG. 3) and facial images are derived from the video call data of the customers under inspection (S130 of FIG. 3), the server 100 detects the customer's face (that is, the facial image) in the sampled images of the video call data (S210).

Here, face detection means detecting, as a bounding box, the part of an image that is recognized as a human face. For example, in step S11 of FIG. 6, the customer's face in the sampled image may be marked with a rectangular bounding box.

The server 100 may detect the customer's face in the sampled image using a pre-trained deep learning model (for example, MTCNN, Retinaface, or Blazeface). Of course, the deep learning model used by the server 100 may be varied.

Next, the server 100 detects facial landmarks in the facial image within the detected bounding box (S220). Here, facial landmarks are the parts that make up the features of a face, such as the eyes, nose, mouth, jawline, and nose bridge. The number of facial landmarks may be set variously, from a minimum of 5 points (for example, 2 for the eyes, 1 for the nose, and 2 for the mouth) up to 68, 96, or 109 points.

The server 100 may detect the facial landmarks within the detected bounding box using a pre-trained deep learning model (for example, MTCNN, Retinaface, or Blazeface). Likewise, the deep learning model used by the server 100 may be varied.

Next, the server 100 performs face alignment based on the detected facial landmarks (S230). Here, face alignment means that, if the face detected in the image is turned to an angle other than the front, the facial image is rotated so that the face looks toward the front center (that is, step S12 of FIG. 6).

For example, when 5-point landmarks are used, the server 100 may draw a straight line between the two eyes, measure the angle between that line and the horizontal, and rotate the facial image by the opposite angle. However, the face alignment method may of course be modified in various ways.
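The eye-line alignment described in this example can be sketched with basic trigonometry. This is a minimal sketch under stated assumptions: the function names are hypothetical, points are (x, y) pairs, and the rotation is applied to landmark coordinates only; rotating the actual pixels would be done with an image library, whose angle sign convention may differ.

```python
from __future__ import annotations

import math


def alignment_rotation_degrees(left_eye, right_eye) -> float:
    """Return the rotation (in degrees) that makes the eye line horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # Angle between the eye line and the horizontal axis.
    tilt = math.degrees(math.atan2(dy, dx))
    # Rotate by the opposite angle to cancel the tilt.
    return -tilt


def rotate_point(point, center, degrees: float):
    """Rotate a point about a center by the given angle (standard 2D rotation)."""
    rad = math.radians(degrees)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + dx * math.cos(rad) - dy * math.sin(rad),
        center[1] + dx * math.sin(rad) + dy * math.cos(rad),
    )


left, right = (0.0, 0.0), (10.0, 10.0)
angle = alignment_rotation_degrees(left, right)
aligned_right = rotate_point(right, left, angle)
# After rotation the right eye has the same y coordinate as the left eye.
print(angle, round(aligned_right[1], 6))
```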

Next, the server 100 extracts a feature (that is, a vector value) from the aligned facial image using the deep learning module (S240). The server 100 may derive the vector value for the facial image using a pre-trained deep learning module.

Here, the deep learning module outputs one real-valued vector for each image. The deep learning module can map the unique features of a face to a real-valued vector, and the distance between two vectors indicates how similar the corresponding faces are. For example, the deep learning module may output a '512-dimensional vector value' or a '512-dimensional 32-bit floating-point (float32) vector'. However, this is only one example of a vector value, and the present invention is not limited thereto.
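As an illustration of how the distance between two such vectors can express similarity, the sketch below compares 512-dimensional float32 vectors with cosine similarity. The choice of cosine similarity here is an assumption for illustration (the description names a cosine distance function only for the training stage), and the random vectors merely stand in for real model outputs.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity in [-1, 1]; values near 1 mean the vectors point the same way."""
    u = np.asarray(u, dtype=np.float32)
    v = np.asarray(v, dtype=np.float32)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
a = rng.standard_normal(512).astype(np.float32)              # stand-in for one face vector
b = a + 0.05 * rng.standard_normal(512).astype(np.float32)   # slightly perturbed copy of a
c = rng.standard_normal(512).astype(np.float32)              # unrelated vector

same_face_score = cosine_similarity(a, b)
other_face_score = cosine_similarity(a, c)
```

In this toy setup the perturbed copy scores much higher than the unrelated vector, which mirrors the intended behavior for images of the same person versus different people.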

In this case, the deep learning module may be trained to output the same (or similar) vector values for images of the same person. The structure and training method of the deep learning module are described in detail below with reference to FIGS. 8 to 10.

이어서, 서버(100)는 추출된 벡터값을 기초로 고객별로 매트릭스를 생성하고, 매트릭스 연산을 이용하여 매트릭스 간 안면유사도를 산출할 수 있다(S250). 이를 통해, 서버(100)는 많은 수의 영상통화에 대한 안면유사도를 빠르게 연산할 수 있다.Subsequently, the server 100 may generate a matrix for each customer based on the extracted vector value, and calculate facial similarity between matrices using matrix operation (S250). Through this, the server 100 can quickly calculate facial similarities for a large number of video calls.

Subsequently, the server 100 may compare the calculated facial similarity with a preset reference value (or reference range) to determine whether there is an anomaly for the customer (S260). In addition, samples for which the facial similarity was judged incorrectly may be used to retrain the deep learning module. This is described below with reference to FIG. 12.

한편, 도 6은 복수의 영상통화 데이터에 대한 안면이미지의 유사도를 판단하는 과정을 개략적으로 나타낸다.Meanwhile, FIG. 6 schematically illustrates a process of determining similarities of facial images with respect to a plurality of video call data.

구체적으로, 도 6을 참조하면, 서버(100)는 서로 다른 제1 영상통화 데이터(VD1) 및 제2 영상통화 데이터(VD2)를 수신한다. 이때, 각각의 영상통화 데이터(VD1, VD2)는 동일 고객에 관한 것이거나(예를 들어, 제2 검사방법), 서로 다른 고객에 관한 것일 수 있다(예를 들어, 제3 검사방법).Specifically, referring to FIG. 6 , the server 100 receives different first video call data VD1 and second video call data VD2. In this case, each of the video call data VD1 and VD2 may be related to the same customer (eg, the second inspection method) or different customers (eg, the third inspection method).

이어서, 서버(100)는 각각의 영상통화 데이터(VD1, VD2)에 대해 특정 프레임을 추출하는 샘플링 과정을 수행한다(S11, S21). Subsequently, the server 100 performs a sampling process of extracting a specific frame for each of the video call data VD1 and VD2 (S11 and S21).

예를 들어, 서버(100)는 영상통화 데이터(VD1, VD2)에 대해 일정 시간 간격으로 프레임을 샘플링하거나, 옵티컬 플로우가 기준치보다 작은 영상 프레임을 도출하여 샘플링 할 수 있다.For example, the server 100 may sample frames of the video call data VD1 and VD2 at regular time intervals or derive and sample video frames having an optical flow smaller than a reference value.
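The two sampling options just mentioned can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and the per-frame optical-flow magnitudes are assumed to have been computed beforehand by some other component.

```python
def sample_frame_indices(total_frames, fps, interval_sec):
    """Indices of frames taken at a fixed time interval."""
    step = max(1, round(fps * interval_sec))
    return list(range(0, total_frames, step))

def low_motion_frames(flow_magnitudes, threshold):
    """Indices of frames whose (precomputed) optical-flow magnitude is below the threshold."""
    return [i for i, m in enumerate(flow_magnitudes) if m < threshold]

# A 10-second clip at 30 fps, sampled every 2 seconds.
fixed_interval = sample_frame_indices(total_frames=300, fps=30, interval_sec=2.0)

# Hypothetical flow magnitudes for four frames; small values mean little motion.
still_frames = low_motion_frames([0.9, 0.1, 0.4, 0.05], threshold=0.2)
```

Selecting low-motion frames tends to avoid motion blur, which is one plausible reason for preferring frames with small optical flow.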

다른 예로, 서버(100)는 특정 음성패턴이 감지되는 구간을 검출하여, 해당 구간 내에서 특정 프레임을 샘플링 할 수 있다.As another example, the server 100 may detect a section in which a specific voice pattern is detected and sample a specific frame within the section.

또 다른 예로, 서버(100)는 추출된 영상데이터(VD)에 대해 포즈 검출 알고리즘을 동작시킬 수 있다. 포즈 검출 알고리즘에 의해 미리 정해진 포즈가 검출된 경우, 서버(100)는 포즈 검출 알고리즘을 종료하고 검출된 포즈와 관련된 영상 프레임을 추출할 수 있다. As another example, the server 100 may operate a pose detection algorithm on the extracted image data VD. When a predetermined pose is detected by the pose detection algorithm, the server 100 may terminate the pose detection algorithm and extract an image frame related to the detected pose.

다만, 이는 영상 프레임을 도출하는 몇몇 예시에 불과하며, 본 발명이 이에 제한되는 것은 아니다.However, these are only some examples of deriving an image frame, and the present invention is not limited thereto.

이어서, 서버(100)는 샘플링된 프레임으로부터 안면 검출을 통해 안면이미지를 검출한다(S12, S22).Next, the server 100 detects a face image from the sampled frame through face detection (S12, S22).

이어서, 서버(100)는 검출된 안면이미지가 정면 중앙을 바라보도록 안면 정렬을 수행한다(S13, S23). 이때, 서버(100)는 전술한 안면 랜드마크의 검출을 통하여 안면 정렬을 수행할 수 있다.Next, the server 100 performs face alignment so that the detected face image faces the center of the front (S13, S23). At this time, the server 100 may perform facial alignment through detection of the aforementioned facial landmarks.

이어서, 서버(100)는 미리 학습된 딥러닝 모듈을 이용하여 안면이미지에 대한 벡터값(즉, 피쳐)를 추출한다(S14, S24).Subsequently, the server 100 extracts vector values (ie, features) for the face image using the pre-learned deep learning module (S14 and S24).

이어서, 서버(100)는 추출된 벡터값에 대한 유사도를 판단하여, 각각의 안면이미지에 대한 안면유사도를 산출한다(S41). 이때, 서버(100)는 추출된 벡터값을 포함하는 매트릭스를 고객별로 생성하여, 매트릭스 연산을 통해 복수의 영상에 대한 안면유사도를 빠르게 산출할 수 있다.Next, the server 100 determines the similarity of the extracted vector values and calculates the facial similarity for each facial image (S41). At this time, the server 100 may generate a matrix including the extracted vector values for each customer, and quickly calculate the facial similarity of the plurality of images through matrix operation.

또한, 도 7은 특정 영상통화 데이터와 레퍼런스 이미지 간 안면이미지의 유사도를 판단하는 과정을 개략적으로 나타낸다.In addition, FIG. 7 schematically illustrates a process of determining the similarity of a facial image between specific video call data and a reference image.

구체적으로, 도 7을 참조하면, 서버(100)는 특정 고객의 영상통화 데이터(VD)와, 레퍼런스 이미지 데이터(ID)를 수신한다. 여기에서, 레퍼런스 이미지 데이터는 고객의 신분증 이미지 또는 미리 등록된 블랙리스트에 관한 이미지가 될 수 있다(예를 들어, 제1 검사방법). 이때, 서버(100)는 고객 단말(200)로부터 고객의 신분증 이미지를 수신하거나, 데이터베이스에 미리 등록된 블랙리스트 또는 고객의 신분증 이미지를 수신하여 이용할 수 있다.Specifically, referring to FIG. 7 , the server 100 receives video call data (VD) and reference image data (ID) of a specific customer. Here, the reference image data may be a customer's identification card image or a pre-registered blacklist image (eg, the first inspection method). At this time, the server 100 may receive the customer's ID image from the customer terminal 200 or receive and use a blacklist or customer's ID image previously registered in the database.

이어서, 서버(100)는 각각의 영상통화 데이터(VD)에 대해 특정 프레임을 추출하는 샘플링 과정을 수행한다(S11). 샘플링 과정에 대한 자세한 예시는 전술하였으므로, 여기에서 중복되는 내용은 생략하도록 한다.Subsequently, the server 100 performs a sampling process of extracting a specific frame for each video call data (VD) (S11). Since a detailed example of the sampling process has been described above, redundant information will be omitted here.

이어서, 서버(100)는 샘플링된 프레임 및 레퍼런스 이미지 데이터(ID)로부터 안면 검출을 통해 각각의 안면이미지를 검출한다(S12, S32).Subsequently, the server 100 detects each face image through face detection from the sampled frame and the reference image data (ID) (S12, S32).

이어서, 서버(100)는 검출된 안면이미지가 정면 중앙을 바라보도록 안면 정렬을 수행한다(S13, S33). 이때, 서버(100)는 전술한 안면 랜드마크의 검출을 통하여 안면 정렬을 수행할 수 있다.Next, the server 100 performs face alignment so that the detected face image faces the center of the front (S13, S33). At this time, the server 100 may perform facial alignment through detection of the aforementioned facial landmarks.

이어서, 서버(100)는 미리 학습된 딥러닝 모듈을 이용하여 각각의 안면이미지에 대한 벡터값(즉, 피쳐)를 추출한다(S14, S34).Next, the server 100 extracts vector values (ie, features) for each facial image using the pre-learned deep learning module (S14, S34).

이어서, 서버(100)는 추출된 벡터값에 대한 유사도를 판단하여, 각각의 안면이미지에 대한 안면유사도를 산출한다(S42). 이때, 서버(100)는 추출된 벡터값을 포함하는 매트릭스를 고객별로 생성하여, 매트릭스 연산을 통해 복수의 영상에 대한 안면유사도를 빠르게 산출할 수 있다.Next, the server 100 determines the similarity of the extracted vector values and calculates the facial similarity for each facial image (S42). At this time, the server 100 may generate a matrix including the extracted vector values for each customer, and quickly calculate the facial similarity of the plurality of images through matrix operation.

이하에서는, 안면이미지의 벡터값을 추출하는 딥러닝 모듈에 대해 자세히 설명하도록 한다.Hereinafter, a deep learning module for extracting a vector value of a facial image will be described in detail.

FIG. 8 is a block diagram for explaining the deep learning module of step S140 of FIG. 3.

Referring to FIG. 8, the deep learning module (DM) may receive image data of a customer as input and output a vector value for the image. Here, the image data may be a facial image that has already been extracted and aligned, and the output vector value may be a 512-dimensional 32-bit floating-point (float32) vector. However, the format of the output vector value may of course be modified in various ways.

딥러닝 모듈(DM)은 빅데이터를 기초로 학습된 인공신경망을 이용하여, 안면이미지에 대한 고유의 벡터값을 도출할 수 있다. 이에 따라, 딥러닝 모듈(DM)은 동일한 이미지에 대해 동일한 벡터값을 출력할 수 있다.The deep learning module (DM) may derive a unique vector value for a facial image using an artificial neural network trained on the basis of big data. Accordingly, the deep learning module (DM) may output the same vector value for the same image.

딥러닝 모듈(DM)은 입력된 데이터를 기초로 도출된 별도의 파라미터에 대한 매핑 데이터를 이용하여 인공신경망 학습을 수행할 수 있다. 딥러닝 모듈(DM)은 학습 인자로 입력되는 파라미터들에 대하여 머신 러닝(machine learning)을 수행할 수 있다. 이때, 서버(100)의 메모리에는 머신 러닝에 사용되는 데이터 및 결과 데이터 등이 저장될 수 있다.The deep learning module (DM) may perform artificial neural network learning using mapping data for separate parameters derived based on input data. The deep learning module (DM) may perform machine learning on parameters input as learning factors. In this case, the memory of the server 100 may store data used for machine learning and result data.

보다 자세히 설명하자면, 머신 러닝(Machine Learning)의 일종인 딥러닝(Deep Learning) 기술은 데이터를 기반으로 다단계로 깊은 수준까지 내려가 학습하는 것이다.To explain in more detail, deep learning technology, a type of machine learning, learns by going down to a deep level in multiple stages based on data.

딥러닝(Deep learning)은, 단계를 높여가면서 복수의 데이터들로부터 핵심적인 데이터를 추출하는 머신 러닝(Machine Learning) 알고리즘의 집합을 나타낸다.Deep learning represents a set of machine learning algorithms that extract core data from a plurality of data while stepping up.

딥러닝 모듈(DM)은 공지된 다양한 딥러닝 구조를 이용할 수 있다. 예를 들어, 딥러닝 모듈(DM)은 CNN(Convolutional Neural Network), RNN(Recurrent Neural Network), DBN(Deep Belief Network), GNN(Graph Neural Network) 등의 구조를 이용할 수 있다.The deep learning module (DM) may use various known deep learning structures. For example, the deep learning module (DM) may use a structure such as a convolutional neural network (CNN), a recurrent neural network (RNN), a deep belief network (DBN), or a graph neural network (GNN).

Specifically, a CNN (Convolutional Neural Network) is a model that mimics human brain function, built on the assumption that when a person recognizes an object, the brain first extracts the basic features of the object, then performs complex computations, and recognizes the object based on the result.

An RNN (Recurrent Neural Network) is widely used in natural language processing and similar tasks. It is a structure well suited to processing time-series data that changes over time, and an artificial neural network structure can be built by stacking a layer at each time step.

A DBN (Deep Belief Network) is a deep learning structure built by stacking multiple layers of RBMs (Restricted Boltzmann Machines), a deep learning technique. When RBM training is repeated until a certain number of layers is reached, a DBN having that number of layers can be constructed.

A GNN (Graph Neural Network) is an artificial neural network structure that derives similarities and feature points between modeling data, using modeling data built from data mapped between specific parameters.

한편, 딥러닝 모듈(DM)의 인공신경망 학습은 주어진 입력에 대하여 원하는 출력이 나오도록 노드간 연결선의 웨이트(weight)를 조정(필요한 경우 바이어스(bias) 값도 조정)함으로써 이루어질 수 있다. 또한, 인공신경망은 학습에 의해 웨이트(weight) 값을 지속적으로 업데이트시킬 수 있다. 또한, 인공신경망의 학습에는 역전파(Back Propagation) 등의 방법이 사용될 수 있다.Meanwhile, learning of the artificial neural network of the deep learning module (DM) can be performed by adjusting the weight of the connection line between nodes (and adjusting the bias value if necessary) so that a desired output is produced for a given input. In addition, the artificial neural network may continuously update a weight value by learning. In addition, a method such as back propagation may be used to learn the artificial neural network.

한편, 서버(100)의 메모리에는 머신 러닝으로 미리 학습된 인공신경망(Artificial Neural Network)이 탑재될 수 있다.Meanwhile, the memory of the server 100 may be equipped with an artificial neural network pre-learned through machine learning.

딥러닝 모듈(DM)은 도출된 파라미터에 대한 모델링 데이터를 입력 데이터로 하는 머신 러닝(machine learning) 기반의 개선 프로세스 추천 동작을 수행할 수 있다. 이때, 인공신경망의 머신 러닝 방법으로는 준지도학습(semi-supervised learning)과 지도학습(supervised learning)이 모두 사용될 수 있다. 또한, 딥러닝 모듈(DM)은 설정에 따라 이상여부가 발견된 샘플데이터에 대하여 정확한 벡터값을 출력하기 위한 인공신경망 구조를 자동 업데이트하도록 제어될 수 있다.The deep learning module (DM) may perform a machine learning-based improvement process recommendation operation using modeling data for the derived parameters as input data. In this case, both semi-supervised learning and supervised learning may be used as machine learning methods of the artificial neural network. In addition, the deep learning module (DM) can be controlled to automatically update the artificial neural network structure for outputting an accurate vector value for sample data in which abnormalities are found according to settings.

추가적으로, 도면에 명확하게 도시하지는 않았으나, 본 발명의 다른 실시예에서, 딥러닝 모듈(DM)의 동작은 서버(100) 또는 별도의 클라우드 서버(미도시)에서 실시될 수 있다. Additionally, although not clearly shown in the drawings, in another embodiment of the present invention, the operation of the deep learning module (DM) may be implemented in the server 100 or a separate cloud server (not shown).

이하에서는, 전술한 본 발명의 실시예에 따른 딥러닝 모듈(DM)의 구성에 대해 살펴보도록 한다.Hereinafter, the configuration of the deep learning module (DM) according to the above-described embodiment of the present invention will be described.

FIG. 9 is a block diagram for explaining an example of the deep learning module of FIG. 8.

Referring to FIG. 9, the deep learning module (DM) includes an input layer (Input) whose input nodes receive the extracted and aligned facial image, an output layer (Output) whose output nodes produce the vector value for the feature points of that facial image, and M hidden layers disposed between the input layer and the output layer.

여기서, 각 레이어들의 노드를 연결하는 에지(edge)에는 가중치가 설정될 수 있다. 이러한 가중치 혹은 에지의 유무는 학습 과정에서 추가, 제거, 또는 업데이트 될 수 있다. 따라서, 학습 과정을 통하여, k개의 입력노드와 i개의 출력노드 사이에 배치되는 노드들 및 에지들의 가중치는 업데이트될 수 있다.Here, a weight may be set to an edge connecting nodes of each layer. The presence or absence of these weights or edges can be added, removed, or updated in the learning process. Therefore, through the learning process, weights of nodes and edges disposed between k input nodes and i output nodes may be updated.

딥러닝 모듈(DM)이 학습을 수행하기 전에는 모든 노드와 에지는 초기값으로 설정될 수 있다. 그러나, 누적하여 정보가 입력될 경우, 노드 및 에지들의 가중치는 변경되고, 이 과정에서 학습인자로 입력되는 파라미터들(즉, 구간 별 음성데이터 및 음성 패턴)과 출력노드로 할당되는 값(즉, 구간 별 음성 유사도) 사이의 매칭이 이루어질 수 있다.All nodes and edges may be set to initial values before the deep learning module (DM) performs learning. However, when information is accumulated and input, the weights of nodes and edges are changed, and in this process, the parameters input as learning factors (i.e., voice data and voice patterns for each section) and the values assigned to output nodes (i.e., Voice similarity for each section) may be matched.

추가적으로, 클라우드 서버(미도시)를 이용하는 경우, 딥러닝 모듈(DM)은 많은 수의 파라미터들을 수신하여 처리할 수 있다. 따라서, 딥러닝 모듈(DM)은 방대한 데이터에 기반하여 학습을 수행할 수 있다.Additionally, when using a cloud server (not shown), the deep learning module (DM) may receive and process a large number of parameters. Therefore, the deep learning module (DM) can perform learning based on massive data.

딥러닝 모듈(DM)을 구성하는 입력노드와 출력노드 사이의 노드 및 에지의 가중치는 딥러닝 모듈(DM)의 학습 과정에 의해 업데이트될 수 있다. 또한, 딥러닝 모듈(DM)에서 출력되는 파라미터는 이미지에 대한 벡터값 외에도 다양한 데이터로 추가 확장될 수 있음은 물론이다. The weights of nodes and edges between an input node and an output node constituting the deep learning module (DM) may be updated by the learning process of the deep learning module (DM). In addition, it goes without saying that parameters output from the deep learning module (DM) can be additionally expanded with various data in addition to vector values for images.

예를 들어, 딥러닝 모듈(DM)은 서로 다른 복수의 이미지를 입력받고, 입력받은 복수의 이미지에 대한 유사도를 출력하도록 변형되어 실시될 수 있다. 다만, 이는 하나의 확장 실시예에 대한 예시에 불과하며 본 발명이 이에 한정되는 것은 아니다.For example, the deep learning module (DM) may be implemented by receiving a plurality of different images and outputting a similarity of the plurality of input images. However, this is only an example of one extended embodiment and the present invention is not limited thereto.

이하에서는, 본 발명의 실시예에서, 삼중항 손실(Triplet Loss) 구조를 이용하여 본 발명의 딥러닝 모듈(DM)이 학습되는 과정에 대해 살펴보도록 한다. 다만, 이는 딥러닝 모듈의 학습과정의 하나의 예시일 뿐, 본 발명이 이에 한정되는 것은 아니다.Hereinafter, in an embodiment of the present invention, a process in which the deep learning module (DM) of the present invention is learned using a triplet loss structure will be described. However, this is only one example of the learning process of the deep learning module, and the present invention is not limited thereto.

FIG. 10 is a block diagram for explaining the training stage of the deep learning module of FIG. 8.

도 10을 참조하면, 딥러닝 모듈(DM)은 서로 다른 이미지가 입력되는 복수의 뉴럴 네트워크 모듈(11), 각 뉴럴 네트워크 모듈에서 출력된 값의 유사도를 산출하는 유사도 판단 모듈(13)(distance calculator), 산출된 각 유사도에 대한 가중치를 조절하는 가중치 모듈(15)(weight calculator) 및 결과값의 오차에 대한 피드백을 제공하는 피드백 모듈(17)(feedback module)을 포함할 수 있다.Referring to FIG. 10, the deep learning module (DM) includes a plurality of neural network modules 11 into which different images are input, and a similarity determination module 13 (distance calculator) that calculates the similarity of values output from each neural network module. ), a weight module 15 (weight calculator) for adjusting the weight for each calculated similarity, and a feedback module 17 (feedback module) for providing feedback on the error of the result value.

The deep learning module (DM) basically uses a triplet-loss machine learning algorithm. Accordingly, the neural network module 11 includes three different neural network submodules 11a, 11b, and 11c, and a different image is input to each of the submodules 11a, 11b, and 11c.

For example, a reference image I1 (anchor image) that is the subject of the judgment is input to the first submodule 11a, a positive image I2 containing the same subject as the reference image I1 is input to the second submodule 11b, and a negative image I3 that is dissimilar to I1 and I2 may be input to the third submodule 11c.

이때 각각의 서브 모듈(11a, 11b, 11c) 간에는 뉴럴 네트워크의 가중치(weight)가 공유될 수 있다.In this case, the weight of the neural network may be shared among the submodules 11a, 11b, and 11c.

각각의 서브 모듈(11a, 11b, 11c)에서 출력된 출력값(Av, Pv, Nv)은 벡터값을 가질 수 있으며, 각각의 벡터값은 전술한 벡터값과 동일한 형식을 취할 수 있다.The output values Av, Pv, and Nv output from each of the submodules 11a, 11b, and 11c may have vector values, and each vector value may have the same format as the vector value described above.

이어서, 각각의 서브 모듈(11a, 11b, 11c)에서 출력된 출력값(Av, Pv, Nv)은 유사도 판단 모듈(13)에 입력된다. 또한, 유사도 판단 모듈(13)에는 입력된 기준 이미지(I1)에 대한 기준값(Ground Truth; GT)이 입력된다.Subsequently, the output values Av, Pv, and Nv output from each of the submodules 11a, 11b, and 11c are input to the similarity determination module 13. Also, a ground truth (GT) for the input reference image I1 is input to the similarity determination module 13 .

유사도 판단 모듈(13)은 입력된 출력값(Av, Pv, Nv) 및 기준값(GT)을 이용하여 각 값들 간의 유사도를 계산한다. 예를 들어, 유사도 판단 모듈(13)은 코사인 디스턴스(Cosine distance) 함수를 이용하여 입력된 값들의 유사도를 산출할 수 있다. The similarity determination module 13 calculates the similarity between the respective values using the input output values Av, Pv, and Nv and the reference value GT. For example, the similarity determination module 13 may calculate the similarity of input values using a cosine distance function.
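For orientation, the textbook form of a triplet loss built on cosine distance can be sketched as below. This is the standard formulation, not the weighted multi-similarity scheme of this embodiment, which also involves the reference value GT and separate weights; the `margin` value and the two-dimensional sample vectors are illustrative assumptions.

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity; 0 for vectors pointing the same way."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: the anchor should be closer to the positive
    than to the negative by at least `margin` (a hypothetical value here)."""
    d_ap = cosine_distance(anchor, positive)
    d_an = cosine_distance(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)

av = np.array([1.0, 0.0])   # stand-in for Av (anchor output)
pv = np.array([0.9, 0.1])   # stand-in for Pv (positive output)
nv = np.array([0.0, 1.0])   # stand-in for Nv (negative output)

well_separated = triplet_loss(av, pv, nv)  # positive is close, negative is far
swapped = triplet_loss(av, nv, pv)         # roles reversed, so the loss is large
```

The loss is zero once the positive is sufficiently closer to the anchor than the negative, so training effort concentrates on triplets that are still confused.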

In this case, the similarity determination module 13 may derive a first similarity between the first result value Av for the reference image and the second result value Pv for the positive image, a second similarity between the first result value Av and the third result value Nv for the negative image, a third similarity between the second result value Pv and the third result value Nv, a fourth similarity between the first result value Av and the reference value GT, and a fifth similarity between the second result value Pv and the reference value GT, and pass them to the weight module 15. In addition, although not shown in the drawing, the similarity determination module 13 may further derive a sixth similarity between the third result value Nv and the reference value GT and pass it to the weight module 15.

이어서, 가중치 모듈(15)은 수신된 유사도에 미리 설정된 가중치를 적용하여 제1 시점의 결과값(T(t))을 출력할 수 있다. 예를 들어, 가중치 모듈(15)은 제1 내지 제3 유사도에는 제1 가중치를 적용하고, 제4 및 제5 유사도에는 제1 가중치와 다른 제2 가중치를 적용함으로써 제1 시점의 결과값(T(t))을 도출할 수 있다.Next, the weight module 15 may apply a preset weight to the received similarity and output a result value T(t) at the first time point. For example, the weight module 15 applies a first weight to the first to third similarities, and applies a second weight different from the first weight to the fourth and fifth similarities, thereby applying a resultant value at the first point in time (T (t)) can be derived.

Subsequently, the result value T(t) output from the weight module 15 may be provided to the feedback module 17. The feedback module 17 may derive the difference between the first result value T(t) received from the weight module 15 at the first time point and the second result value T(t-1) received at the second time point, and provide the derived value to the neural network module 11 as a feedback value.

뉴럴 네트워크 모듈(11)은 수신된 피드백 값을 이용하여 각 뉴럴 네트워크 서브 모듈(11a, 11b, 11c)에 대한 가중치를 조절할 수 있다.The neural network module 11 may adjust the weight of each neural network submodule 11a, 11b, and 11c using the received feedback value.
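The weighting and feedback steps can be illustrated with the sketch below. The description does not state how the weighted similarities are combined, so the weighted sum used here is an assumption, as are the function names and the numeric values.

```python
def weighted_result(similarities, w1, w2):
    """Combine five similarities into T(t).
    Assumption: a weighted sum, with w1 applied to the first to third
    similarities and w2 applied to the fourth and fifth."""
    s1, s2, s3, s4, s5 = similarities
    return w1 * (s1 + s2 + s3) + w2 * (s4 + s5)

def feedback_value(t_now, t_prev):
    """Difference between the result at the current step, T(t), and the previous one, T(t-1)."""
    return t_now - t_prev

# Hypothetical similarity values at two consecutive steps.
t_prev = weighted_result((0.5, 0.5, 0.5, 0.5, 0.5), w1=1.0, w2=2.0)
t_now = weighted_result((0.5, 0.25, 0.25, 0.5, 0.5), w1=1.0, w2=2.0)
feedback = feedback_value(t_now, t_prev)
```

How the neural network module turns this feedback value into weight updates is not specified at this point in the description, so it is left out of the sketch.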

또한, 딥러닝 모듈(DM)은 학습 모드와 수행 모드로 나누어 동작할 수 있다.In addition, the deep learning module (DM) can operate by dividing into a learning mode and a performance mode.

학습 모드에서, 딥러닝 모듈(DM)은 미리 설정된 학습 데이터셋을 통해 각 이미지들의 유사도 판단의 정확성을 높일 수 있도록 학습될 수 있다. 딥러닝 모듈(DM)은 데이터셋을 이용한 충분한 학습을 통해, 동일한 대상을 포함하는 기준 이미지와 포지티브 이미지에 대해 동일한 또는 유사도가 높은 벡터값을 출력할 수 있다.In the learning mode, the deep learning module (DM) may be trained to increase the accuracy of determining the similarity of each image through a preset training dataset. The deep learning module (DM) may output vector values having the same or high similarity for the reference image and the positive image including the same object through sufficient learning using the dataset.

수행 모드에서, 딥러닝 모듈(DM)에는 하나의 이미지만 입력될 수 있으며, 이에 따라 딥러닝 모듈(DM)은 학습된 뉴럴 네트워크를 이용하여 입력된 이미지에 대한 벡터값을 출력할 수 있다.In the execution mode, only one image may be input to the deep learning module (DM), and accordingly, the deep learning module (DM) may output a vector value for the input image using the trained neural network.

딥러닝 모듈(DM)에서 출력된 벡터값은 각 입력 이미지에 대응되도록 고객의 계정에 저장될 수 있으며, 전술한 바와 같이 고객의 매트릭스 생성에 이용될 수 있다.Vector values output from the deep learning module (DM) can be stored in the customer's account to correspond to each input image, and can be used to generate the customer's matrix as described above.

도 11은 도 3의 S120에서 안면유사도 판단에 대한 검사대상이 결정되는 과정을 설명하기 위한 도면이다.FIG. 11 is a diagram for explaining a process of determining a test target for determining facial similarity in S120 of FIG. 3 .

도 11을 참조하면, 서버(100)는 미리 설정된 복수의 검사방법 중 어느 하나에 대한 선택을 수신하고, 선택된 검사방법을 기초로 안면유사도 산출에 대한 검사대상을 결정할 수 있다. Referring to FIG. 11 , the server 100 may receive a selection of one of a plurality of preset examination methods and determine a test target for facial similarity calculation based on the selected examination method.

또한, 서버(100)는 미리 설정된 조건의 충족여부, 미리 설정된 시기의 도래여부, 미리 설정된 기준치 이상의 영상통화가 누적되는 경우 등에 따라 특정 검사방법을 자동으로 수행하고, 이에 따라 검사대상을 결정할 수 있다.In addition, the server 100 may automatically perform a specific inspection method depending on whether a preset condition is met, whether a preset time has arrived, and video calls exceeding a preset reference value are accumulated, and determine the test subject accordingly. .

구체적으로, 서버(100)가 관리자로부터 복수의 검사방법 중 어느 하나에 대한 선택을 수신하는 경우, 서버(100)는 해당 검사방법을 기초로 검사대상을 결정한다(S310).Specifically, when the server 100 receives a selection for any one of a plurality of inspection methods from the administrator, the server 100 determines the inspection target based on the inspection method (S310).

예를 들어, 서버(100)가 제1 검사방법에 대한 선택을 수신하는 경우, 서버(100)는 특정 고객과 미리 설정된 블랙리스트에 등록된 레퍼런스 이미지 간의 안면유사도를 판단하는 동작을 수행할 수 있다(S341). 이 경우, 검사대상은 특정 고객에 대한 영상통화 데이터와 미리 설정된 블랙리스트에 등록된 레퍼런스 이미지가 될 수 있다.For example, when the server 100 receives a selection for the first inspection method, the server 100 may perform an operation of determining facial similarity between a specific customer and a reference image registered in a preset blacklist. (S341). In this case, the inspection target may be video call data for a specific customer and a reference image registered in a preset blacklist.

다른 예로, 서버(100)가 제2 검사방법에 대한 선택을 수신하는 경우, 서버(100)는 특정 고객에 귀속된 복수의 영상통화에 대한 안면이미지 간의 안면유사도를 판단하는 동작을 수행할 수 있다(S343). 이 경우, 검사대상은 특정 고객의 계정에 귀속된 영상통화 데이터가 될 수 있다.As another example, when the server 100 receives a selection for the second inspection method, the server 100 may perform an operation of determining facial similarity between facial images of a plurality of video calls belonging to a specific customer. (S343). In this case, the inspection target may be video call data belonging to a specific customer's account.

또 다른 예로, 서버(100)가 제3 검사방법에 대한 선택을 수신하는 경우, 서버(100)는 서로 다른 고객에 귀속된 각각의 영상통화에 대한 안면이미지 간의 안면유사도를 판단하는 동작을 수행할 수 있다(S345). 이 경우, 검사대상은 특정 고객의 계정에 귀속된 영상통화 데이터와, 다른 고객의 계정에 귀속된 영상통화 데이터가 될 수 있다.As another example, when the server 100 receives a selection for the third inspection method, the server 100 may perform an operation of determining facial similarity between facial images for video calls belonging to different customers. It can (S345). In this case, the inspection target may be video call data belonging to a specific customer's account and video call data belonging to another customer's account.

이때, 제2 검사방법 및 제3 검사방법은, 전술한 바와 같이 고객마다 동일한 크기로 변환된 고유의 매트릭스를 이용하여 안면유사도를 산출할 수 있다.In this case, the second inspection method and the third inspection method may calculate facial similarity using a unique matrix converted to the same size for each customer as described above.

또한, 제1 검사방법은, 미리 등록된 블랙리스트의 레퍼런스 이미지의 수에 대응되는 행렬과, 검사대상이 되는 고객의 영상통화수에 대응되는 행렬의 크기를 일치시킨 뒤, 매트릭스 연산을 통하여 안면유사도를 산출할 수 있다. 만약, 제1 검사방법에서 검사대상이 되는 고객이 복수인 경우, 레퍼런스 이미지의 수와 복수의 고객에 대한 영상통화수 중 가장 큰 값을 기초로 변환될 매트릭스의 크기를 결정할 수 있다.In addition, the first inspection method matches the size of the matrix corresponding to the number of reference images of the blacklist registered in advance with the size of the matrix corresponding to the number of video calls of the customer to be inspected, and then calculates the facial similarity through matrix operation. can be calculated. If there are a plurality of customers to be inspected in the first inspection method, the size of the matrix to be converted may be determined based on the largest value among the number of reference images and the number of video calls to the plurality of customers.

다만, 본 발명의 실시예에 따른 검사방법이 제1 내지 제3 검사방법에만 한정되는 것은 아니며, 추가적인 검사방법 또는 변형된 검사방법이 이용될 수 있음은 물론이다.However, the inspection method according to the embodiment of the present invention is not limited to the first to third inspection methods, and additional inspection methods or modified inspection methods may be used.

한편, 서버(100)에서 검사방법을 수신하지 않는 경우, 서버(100)는 미리 설정된 조건에 의해 검사방법의 실시여부를 결정할 수 있다.Meanwhile, when the server 100 does not receive the inspection method, the server 100 may determine whether to implement the inspection method according to a preset condition.

예를 들어, 서버(100)에 새로 가입한 신규고객이 발생한 경우, 서버(100)는 전술한 제1 검사방법을 수행할 수 있다(S320).For example, when a new customer newly subscribes to the server 100 occurs, the server 100 may perform the first inspection method described above (S320).

또한, 서버(100)에 새로 가입된 신규고객이 없더라도, 서버(100)에서 관리하는 블랙리스트에 업데이트가 있는 경우, 서버(100)는 전술한 제1 검사방법을 수행할 수 있다(S330).In addition, even if there is no new customer newly subscribed to the server 100, if there is an update in the blacklist managed by the server 100, the server 100 may perform the first inspection method described above (S330).

또한, 도면에 명확히 도시하지는 않았으나, 미리 설정된 주기가 도래하는 경우, 서버(100)는 전술한 제2 검사방법 또는 제3 검사방법을 자동으로 수행할 수 있다.In addition, although not clearly shown in the drawings, when a preset period arrives, the server 100 may automatically perform the above-described second inspection method or third inspection method.

다만, 이러한 특정 검사방법의 실시여부를 결정하는 조건은 다양하게 변형되어 실시될 수 있음은 자명하다.However, it is obvious that the conditions for determining whether or not to implement this specific inspection method can be variously modified and implemented.
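One possible way to organize the selection logic described above is sketched below. The names `Inspection` and `choose_inspection`, the priority order of the conditions, and the default of running the second method on a periodic trigger are all assumptions made for illustration.

```python
from enum import Enum

class Inspection(Enum):
    FIRST = 1   # customer video calls vs. blacklist reference images
    SECOND = 2  # video calls belonging to the same customer
    THIRD = 3   # video calls belonging to different customers

def choose_inspection(selected=None, new_customer=False,
                      blacklist_updated=False, period_elapsed=False,
                      periodic=Inspection.SECOND):
    """Return the inspection method to run, or None if nothing is triggered."""
    if selected is not None:               # explicit choice by the administrator (S310)
        return selected
    if new_customer or blacklist_updated:  # automatic triggers (S320, S330)
        return Inspection.FIRST
    if period_elapsed:                     # preset period has arrived
        return periodic
    return None

manual = choose_inspection(selected=Inspection.THIRD, new_customer=True)
automatic = choose_inspection(blacklist_updated=True)
idle = choose_inspection()
```

Giving the administrator's selection priority over automatic triggers is a design choice of this sketch; the description leaves the exact conditions open to modification.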

부가적으로, 제1 검사방법에서, 서버(100)는 전체 고객과 블랙리스트 레퍼런스 이미지 간의 안면유사도를 산출할 수 있다. Additionally, in the first inspection method, the server 100 may calculate facial similarity between all customers and the blacklist reference image.

For example, the server 100 may extract a 512-dimensional vector value for each facial image from the customer's video calls and for each blacklist reference image. In this case, a 'number of video calls' x '512-dimensional vector value' matrix is generated for each customer, and a 'number of reference images' x '512-dimensional vector value' matrix is generated for the blacklist. Next, the server 100 calculates the facial similarity between the two matrices, for the customer and for the blacklist, over the 512-dimensional vector values. As a result, a result matrix of size 'number of video calls' x 'number of reference images' is obtained. The server 100 may then determine whether each value in the result matrix exceeds a predetermined reference value, and thereby determine whether the corresponding video call is similar to a blacklist entry.
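The matrix product described above can be sketched with NumPy as follows. This is a minimal illustration that assumes cosine similarity as the measure (rows are L2-normalized before multiplying); the threshold of 0.8 and the random data are hypothetical.

```python
import numpy as np

def l2_normalize(m):
    """Scale each row to unit length; all-zero rows stay zero."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.maximum(norms, 1e-12)

def similarity_matrix(calls, refs):
    """calls: (n_calls, d), refs: (n_refs, d) -> (n_calls, n_refs) cosine similarities."""
    return l2_normalize(calls) @ l2_normalize(refs).T

rng = np.random.default_rng(1)
refs = rng.standard_normal((4, 512)).astype(np.float32)   # 4 blacklist reference images
calls = rng.standard_normal((3, 512)).astype(np.float32)  # 3 video calls of one customer
calls[1] = refs[2]                                        # plant one matching face

sims = similarity_matrix(calls, refs)   # shape: (number of calls, number of references)
hits = np.argwhere(sims > 0.8)          # (call index, reference index) pairs above the threshold
```

A single matrix multiplication yields every call-to-reference score at once, which is the speed-up the description attributes to matrix operations.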

Meanwhile, in the second inspection method, the server 100 may calculate, for every customer, the facial similarity between that customer's own video calls.

For example, if a specific customer has n video calls, the number of pairs of facial images to compare is "n choose 2", that is, n(n-1)/2. In other words, the server 100 must calculate the facial similarity once for each pair. Extended to all customers, the facial similarity calculation must be repeated as many times as there are pairs of video calls for each customer. However, because the number of video calls differs from customer to customer, the facial similarity would have to be calculated sequentially, customer by customer, for each customer's own number of pairs, which may place a heavy load on the system. To solve this problem, the present invention sets the number of video calls of the customer with the longest call history as the 'maximum number of video calls' and, for every customer, appends '512-dimensional zero vectors' to that customer's matrix based on the 'maximum number of video calls', so that all customers have the same number of '512-dimensional vector values'. As a result, every customer has a matrix of the same size (or dimension), and because all matrices are the same size, the server 100 can use matrix operations to speed up the facial similarity calculation.
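A minimal sketch of the zero-padding step follows, again assuming cosine similarity. The helper names `pad_to_max_calls` and `within_customer_similarity` and the boolean mask are assumptions added for illustration; the text only states that zero vectors are appended up to the maximum number of video calls. The mask is one way to keep the padded rows, which are not real calls, out of the later threshold comparison.

```python
import numpy as np

def pad_to_max_calls(customer_vectors):
    """Stack per-customer (n_i, d) arrays into one (customers, max_calls, d) array.

    Rows beyond a customer's real calls are zero vectors; `mask` marks real rows.
    """
    max_calls = max(len(v) for v in customer_vectors)
    dim = customer_vectors[0].shape[1]
    batch = np.zeros((len(customer_vectors), max_calls, dim))
    mask = np.zeros((len(customer_vectors), max_calls), dtype=bool)
    for i, vectors in enumerate(customer_vectors):
        batch[i, : len(vectors)] = vectors
        mask[i, : len(vectors)] = True
    return batch, mask

def within_customer_similarity(batch, mask):
    """Cosine similarity between all call pairs of each customer in one batched product."""
    norms = np.linalg.norm(batch, axis=2, keepdims=True)
    # Zero (padding) rows stay zero instead of being divided by zero.
    unit = np.divide(batch, norms, out=np.zeros_like(batch), where=norms > 0)
    scores = unit @ unit.transpose(0, 2, 1)  # (customers, max_calls, max_calls)
    # Keep each unordered pair of two real calls once (strict upper triangle).
    both_real = mask[:, :, None] & mask[:, None, :]
    upper = np.triu(np.ones(scores.shape[1:], dtype=bool), k=1)
    return scores, both_real & upper

rng = np.random.default_rng(0)
customers = [rng.normal(size=(n, 512)) for n in (2, 4, 3)]  # synthetic data
batch, mask = pad_to_max_calls(customers)
scores, valid = within_customer_similarity(batch, mask)
print(batch.shape)               # (3, 4, 512)
print(valid.sum(axis=(1, 2)))    # [1 6 3], i.e. "n choose 2" per customer
```

Because every customer now occupies a slice of the same shape, the per-customer loops collapse into one batched matrix product.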

On the other hand, in the third inspection method, the server 100 may calculate, across all customers, the facial similarity between the video calls of different customers. The third inspection method may operate in substantially the same way as the second inspection method, and may differ only in how the calculated facial similarity is compared with the reference value to determine whether there is an abnormality.

Hereinafter, the process of determining whether a facial image is abnormal using the facial similarity calculation result of the second and third inspection methods described above, and of retraining the deep learning module on that basis, will be described in detail.

FIG. 12 is a flowchart illustrating a process of retraining the deep learning module according to the inspection method in some embodiments of the present invention. Hereinafter, for convenience of description, an example in which the counselor terminal 300 determines whether there is an abnormality will be described. However, in the present invention, the entity that makes this determination may of course be changed.

Referring to FIG. 12, following step S170 of FIG. 3, in which the facial similarity is calculated, the server 100 determines whether the calculated facial similarity is smaller than a preset reference value (S410).

Subsequently, when the facial similarity is smaller than the reference value, the server 100 determines whether the inspection method that was performed is the second inspection method (S421).

Subsequently, when the facial similarity was calculated by the second inspection method and is smaller than the preset reference value, the server 100 forwards the case to the counselor terminal 300 and requests a visual inspection for abnormality (S423). That is, because the server 100 performed the second inspection method, which determines whether the inspection targets are the same person, the counselor terminal 300 determines again whether the case involves the same person (S425).

If the visual inspection at the counselor terminal 300 determines that the case involves the same person, the server 100 retrains the deep learning module using the images of the case (S427). That is, because the case is one in which images of the same person were wrongly judged to have low similarity, the deep learning module is retrained so that it can derive an accurate similarity for such a case in the future.

On the other hand, when the facial similarity is greater than the reference value, the server 100 determines whether the inspection method that was performed is the third inspection method (S431).

Subsequently, when the facial similarity was calculated by the third inspection method and is greater than the preset reference value, the server 100 forwards the case to the counselor terminal 300 and requests a visual inspection for abnormality (S433). That is, because the server 100 performed the third inspection method, which determines whether the inspection targets are different persons, the counselor terminal 300 determines again whether the case involves the same person (S435).

If the visual inspection at the counselor terminal 300 determines that the case does not involve the same person, the server 100 retrains the deep learning module using the images of the case (S437). That is, because the case is one in which images of different persons were wrongly judged to have high similarity, the deep learning module is likewise retrained so that it can derive an accurate similarity for such a case in the future.
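The branching of steps S410 to S437 can be summarized in a short sketch. The names `Method`, `needs_visual_inspection`, and `should_retrain` are hypothetical, and the handling of a similarity exactly equal to the reference value (not escalated here) is an assumption, since the description only covers the "smaller" and "greater" cases.

```python
from enum import Enum

class Method(Enum):
    SAME_CUSTOMER = 2        # second inspection method
    DIFFERENT_CUSTOMERS = 3  # third inspection method

def needs_visual_inspection(similarity, threshold, method):
    """Decide whether a case is forwarded to the counselor terminal."""
    if similarity < threshold:
        # S410 -> S421 -> S423: low similarity between calls of one customer.
        return method is Method.SAME_CUSTOMER
    if similarity > threshold:
        # S431 -> S433: high similarity between calls of different customers.
        return method is Method.DIFFERENT_CUSTOMERS
    return False  # assumption: equality with the threshold is not escalated

def should_retrain(method, counselor_says_same_person):
    """For an escalated case, decide whether the model's judgment was wrong."""
    if method is Method.SAME_CUSTOMER:
        # S425 -> S427: same person, yet the model scored the pair as dissimilar.
        return counselor_says_same_person
    # S435 -> S437: different persons, yet the model scored the pair as similar.
    return not counselor_says_same_person

print(needs_visual_inspection(0.3, 0.5, Method.SAME_CUSTOMER))   # True
print(should_retrain(Method.DIFFERENT_CUSTOMERS, False))         # True
```

In both branches, retraining is triggered only when the counselor's judgment contradicts the similarity score.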

This retraining of the deep learning module may be performed each time a misjudgment is found, or may be performed in a single batch once a predetermined number of cases or more has accumulated.
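Both retraining schedules can be expressed with one small accumulator, sketched below. The class name `RetrainQueue` and the `retrain` callback are hypothetical; the callback stands in for whatever routine actually retrains the deep learning module, which the description does not specify.

```python
class RetrainQueue:
    """Collect misjudged cases and trigger retraining once enough have accumulated.

    batch_size=1 retrains on every misjudged case; a larger value retrains in bulk.
    """

    def __init__(self, batch_size, retrain):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self.retrain = retrain  # hypothetical callback taking a list of cases
        self.pending = []

    def add(self, case):
        """Queue one case; return True if this call triggered retraining."""
        self.pending.append(case)
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            self.retrain(batch)
            return True
        return False

batches = []
queue = RetrainQueue(batch_size=3, retrain=batches.append)
triggered = [queue.add(case) for case in ["a", "b", "c", "d"]]
print(triggered)      # [False, False, True, False]
print(batches)        # [['a', 'b', 'c']]
print(queue.pending)  # ['d']
```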

Through this retraining of the deep learning module, the present invention can increase the accuracy of facial similarity determination. As a result, the present invention can increase the accuracy of similarity determination across a plurality of video calls, prevent financial accidents, and improve the stability of financial services.

FIG. 13 is a diagram illustrating a hardware implementation of a system that performs the facial similarity calculation method according to some embodiments of the present invention.

Referring to FIG. 13, the server 100 that performs the facial similarity calculation method according to some embodiments of the present invention may be implemented as an electronic device 1000. The electronic device 1000 may include a controller 1010, an input/output (I/O) device 1020, a memory device 1030, an interface 1040, and a bus 1050. The controller 1010, the input/output device 1020, the memory device 1030, and/or the interface 1040 may be coupled to one another through the bus 1050. Here, the bus 1050 corresponds to a path through which data moves.

Specifically, the controller 1010 may include at least one of a central processing unit (CPU), a micro processor unit (MPU), a micro controller unit (MCU), a graphic processing unit (GPU), a microprocessor, a digital signal processor, a microcontroller, an application processor (AP), and logic elements capable of performing similar functions.

The input/output device 1020 may include at least one of a keypad, a keyboard, a touch screen, and a display device.

The memory device 1030 may store data and/or programs.

The interface 1040 may transmit data to a communication network or receive data from the communication network. The interface 1040 may be wired or wireless. For example, the interface 1040 may include an antenna or a wired or wireless transceiver. Although not shown, the memory device 1030 may further include high-speed DRAM and/or SRAM as working memory for improving the operation of the controller 1010. The memory device 1030 may store programs or applications.

The server 100 and the customer terminal 200 according to embodiments of the present invention may each be a system formed by a plurality of electronic devices 1000 connected to one another through a network. In this case, each module or combination of modules may be implemented as an electronic device 1000. However, the present embodiment is not limited thereto.

Additionally, the server 100 may be implemented as at least one of a workstation, a data center, an internet data center (IDC), a direct attached storage (DAS) system, a storage area network (SAN) system, a network attached storage (NAS) system, a RAID (redundant array of inexpensive disks, or redundant array of independent disks) system, and an electronic document management system (EDMS), but the present embodiment is not limited thereto.

In addition, the server 100 may transmit data over a network in communication with the customer terminal 200. The network may include networks based on wired Internet technology, wireless Internet technology, and short-range communication technology. Wired Internet technology may include, for example, at least one of a local area network (LAN) and a wide area network (WAN).

Wireless Internet technology may include, for example, at least one of wireless LAN (WLAN), DLNA (Digital Living Network Alliance), WiBro (Wireless Broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), IEEE 802.16, Long Term Evolution (LTE), LTE-A (Long Term Evolution-Advanced), Wireless Mobile Broadband Service (WMBS), and 5G NR (New Radio) technology. However, the present embodiment is not limited thereto.

Short-range communication technology may include, for example, at least one of Bluetooth, RFID (Radio Frequency Identification), infrared communication (Infrared Data Association: IrDA), UWB (Ultra-Wideband), ZigBee, Near Field Communication (NFC), Ultra Sound Communication (USC), Visible Light Communication (VLC), Wi-Fi, Wi-Fi Direct, and 5G NR (New Radio). However, the present embodiment is not limited thereto.

The server 100, which communicates through the network, may comply with technical standards and standard communication schemes for mobile communication. For example, the standard communication scheme may include at least one of GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), CDMA2000 (Code Division Multiple Access 2000), EV-DO (Evolution-Data Optimized or Evolution-Data Only), WCDMA (Wideband CDMA), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), and 5G NR (New Radio). However, the present embodiment is not limited thereto.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the present embodiment belongs will be able to make various modifications and variations without departing from its essential characteristics. Accordingly, the present embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be construed according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the present embodiment.

Claims

In the facial similarity calculation method performed in the server associated with the customer terminal,
Collecting a plurality of video call data including a customer's face;
deriving facial images from the collected video call data and assigning them to each customer's account;
Determining a test target for determining facial similarity among a plurality of customers;
Calculating a vector value of a facial image belonging to an account of a customer included in the inspection target;
generating a matrix for each customer by using the number of video calls of the corresponding customer and the vector value of the face image corresponding to each video call;
converting the size of the matrix to be the same for each customer based on the number of video calls of the customer with the most call history among the customers belonging to the test target; and
Comprising the step of calculating facial similarity using an operation between the transformed matrices
Facial similarity calculation method.

According to claim 1,
The matrix is
It is expressed as a product of a first matrix related to the number of video calls and a second matrix related to vector values of face images corresponding to each video call,
The first matrix includes one or more entries or elements corresponding to the number of video calls of the corresponding customer,
The entry or element is a positive integer
Facial similarity calculation method.

According to claim 2,
The step of converting the size of the matrix to be the same,
Setting the number of video calls of the customer with the most call history as the 'maximum number of video calls';
Based on the maximum number of video calls, converting all the sizes of the first matrix constituting the matrix of each customer to be the same
Facial similarity calculation method.

According to claim 1,
The step of calculating the vector value is,
Detecting a landmark in the face image;
Aligning the face image using the detected landmark;
For the aligned facial image, calculating the vector value using a deep learning module
Facial similarity calculation method.

According to claim 4,
Comparing the calculated facial similarity with a preset reference value to determine whether there is an abnormality;
If the abnormality is found, reconfirming whether the person is the same person based on the result of the visual inspection for the case;
Further comprising the step of re-learning the deep learning module using the image for the corresponding case if there is an error in the determination of the abnormality
Facial similarity calculation method.

According to claim 1,
The step of calculating the vector value is,
Calculate the vector value using a pre-trained deep learning module so that the same vector value is output for the image of the same person,
The deep learning module,
An input layer having the facial image as an input node;
An output layer having the facial similarity as an output node;
Including one or more hidden layers disposed between the input layer and the output layer;
The weights of nodes and edges between the input node and the output node are updated by the learning process of the deep learning module.
Facial similarity calculation method.

According to claim 1,
The step of calculating the vector value is,
Calculate the vector value using a pre-trained deep learning module so that the same vector value is output for the image of the same person,
The deep learning module,
A plurality of neural network modules into which different images are input;
a similarity determination module for calculating a similarity between values output from the plurality of neural network modules;
A weight module for adjusting weights for each of the calculated similarities;
And a feedback module for transmitting feedback about the error of the result value output from the weight module to the plurality of neural network modules.
Facial similarity calculation method.

According to claim 1,
The step of determining the inspection target includes receiving a selection for any one of a plurality of inspection methods and determining the inspection target based on the selected inspection method;
The inspection method,
A first inspection method for determining facial similarity between a customer and a reference image registered in a blacklist;
A second inspection method for determining facial similarity between facial images of a plurality of video calls belonging to a specific customer;
Including a third inspection method for determining the degree of facial similarity between facial images of different customers' video calls
Facial similarity calculation method.

According to claim 8,
The second inspection method and the third inspection method determine facial similarity using the matrix converted to the same size for each customer.
Facial similarity calculation method.

According to claim 8,
Whether to perform the first inspection method is determined based on whether the blacklist has been updated or whether the customer is a new customer
Facial similarity calculation method.

In the facial similarity calculation method performed in the server associated with the customer terminal,
(a) collecting a plurality of video call data including a customer's face;
(b) deriving a facial image from the collected video call data;
(c) calculating a vector value of the derived facial image and assigning it to the customer's account;
(d) repeating steps (a) to (c) for a plurality of customers;
(e) generating a matrix for each customer by using the number of video calls of the corresponding customer and the vector value of the face image corresponding to each video call;
(f) converting the size of the matrix to be the same for each customer based on the number of video calls of the customer with the most call history among the customers belonging to the test target; and
(g) calculating facial similarity using an operation between the transformed matrices
Facial similarity calculation method.

a data collection unit that collects a plurality of video call data including a customer's face;
Among a plurality of customers, a test target determining unit for determining a test target for facial similarity determination;
a facial image derivation unit deriving a facial image for the plurality of video call data of the test subject; and
Including a calculation unit for calculating the facial similarity with respect to the facial image,
The calculation unit,
A feature calculation unit for calculating a vector value of a facial image belonging to an account of a customer included in the inspection target;
For each customer, a matrix generator for generating a matrix using the number of video calls of the corresponding customer and the vector value of the face image corresponding to each video call;
a matrix converter for converting the size of the matrix to be the same for each customer based on the number of video calls of the customer with the most call history among the customers belonging to the test target;
Comprising a facial similarity calculation unit for calculating facial similarity using an operation between the transformed matrices
Facial similarity calculation system.

According to claim 12,
The matrix is
It is expressed as a product of a first matrix related to the number of video calls and a second matrix related to vector values of face images corresponding to each video call,
The first matrix includes one or more entries or elements corresponding to the number of video calls of the corresponding customer,
The matrix conversion unit,
The number of video calls of the customer with the most call history is set as the 'maximum number of video calls',
Converting the size of the first matrix constituting the matrix of each customer to be the same based on the maximum number of video calls
Facial similarity calculation system.

According to claim 12,
The feature calculation unit,
Calculate the vector value using a pre-trained deep learning module so that the same vector value is output for the image of the same person,
The deep learning module,
An input layer having the facial image as an input node;
An output layer having the facial similarity as an output node;
Including one or more hidden layers disposed between the input layer and the output layer;
The weights of nodes and edges between the input node and the output node are updated by the learning process of the deep learning module.
Facial similarity calculation system.

According to claim 12,
The inspection target determining unit,
Receiving a selection for any one of a plurality of inspection methods, and determining the inspection target based on the selected inspection method;
The inspection method,
A first inspection method for determining facial similarity between a customer and a reference image registered in a blacklist;
A second inspection method for determining facial similarity between facial images of a plurality of video calls belonging to a specific customer;
Including a third inspection method for determining the degree of facial similarity between facial images of different customers' video calls
Facial similarity calculation system.