KR102194409B1

KR102194409B1 - Face Detection and Recognition System and Method using Multi-focusing AI Convolution Neural Network

Info

Publication number: KR102194409B1
Application number: KR1020200068528A
Authority: KR
Inventors: 박노원; 김현정; 강지훈; 신오석; 오병철
Original assignee: 주식회사 컴트루테크놀로지; 주식회사 핀샷
Priority date: 2020-06-05
Filing date: 2020-06-05
Publication date: 2020-12-24

Abstract

The present invention relates to a system for performing face detection and face recognition in a cost-effective manner. According to the present invention, the system comprises an input unit that receives image data and a management server that receives the image data from the input unit and processes it. The management server comprises a back-end server, which receives the image data from the input unit as a stream and adjusts the load and frames of the received image data, and an AI analysis server, which analyzes the image data adjusted by the back-end server through an AI neural network algorithm to detect or recognize a face in the image data. Accordingly, the system monitors and verifies the identity of a user through continuous authentication carried out without the user's awareness and without explicit notification, so that detection and recognition remain valid without interfering with the normal flow of service.

Description

Face Detection and Recognition System and Method using Multi-focusing AI Convolution Neural Network

The present invention relates to a cost-effective method and system for face detection and face recognition based on AI neural network learning, which employs a multi-focusing AI neural network and can perform face detection, face recognition, and face authentication simultaneously. More specifically, it relates to a face detection and identification system that operates a multi-layer backbone neural network, senses environmental changes (illuminance, resolution) in the user image and runs an appropriate auxiliary neural network to raise the face detection rate and recognition rate, and uses an auxiliary neural network for liveness verification to counter security attacks such as face spoofing.

In a typical camera, the optimal focal length is determined by the type of lens. Image acquisition terminals such as CCTV and web camera terminals likewise acquire images by adjusting the depth of field and the amount of light according to the distance of the object being captured, so that a target at the desired distance is captured at optimal resolution.

Here, the number of pixels making up a facial image depends on the resolution the capturing terminal supports, but even on the same terminal, as shown in FIG. 1, the number of pixels in the facial region varies with the distance to the object (the face).

That is, in FIG. 1(A), where the distance is long, the facial region contains few pixels, whereas in FIG. 1(C), where the distance is short, it contains many.

In addition, the face detection rate and recognition rate are affected not only by the number of pixels but also by the amount of light and by illuminance and luminance.

That is, the higher the illuminance and luminance, the more light reaches the camera sensor, and the resulting change in the pixel values (RGB) of the image file affects the face detection rate and recognition rate.

Meanwhile, when a face detection/recognition system is used, the mere presence of a camera may make users feel resistant or uncomfortable, so there are also cases in which face detection and face recognition need to take place without the subjects being aware of it.

When services such as non-face-to-face authentication for financial services are used, risks in financial transactions caused by face spoofing attacks and the like must be prevented. To that end, risk factors originating from the service user need to be checked in the background through continuous authentication without the user's awareness, even when no procedure is explicitly required of the customer.

Meanwhile, Korean Patent Laid-Open Publication No. 10-2008-0073598 discloses a technique for determining identity by recognizing a face in an input image, and Korean Patent Laid-Open Publication No. 10-2020-0017594 discloses a technique for identifying large-scale objects in an input image.

However, the prior art described above has the following problems.

An AI neural network trained and optimized at a specific resolution can be regarded as a single template. This limits the use of such a backbone neural network for diverse purposes. For a single network to perform face detection and recognition simultaneously at all distances and all resolutions, it would require a very large number of parameters, many stacked layers, and more training time.

(001) Korean Patent Laid-Open Publication No. 10-2008-0073598
(002) Korean Patent Laid-Open Publication No. 10-2020-0017594

When the faces to be recognized in an input image are located at a mixture of near and far distances, when regions of sufficient and insufficient resolution are mixed, when illuminance varies widely, or when the surrounding conditions for recognizing and identifying the input image are otherwise non-uniform, a conventional single backbone neural network trained as a single AI neural network cannot analyze and detect facial data effectively.

That is, a typical "backbone neural network for face recognition" is a template produced by learning facial photographs under specific environmental conditions, so its reference scale is always fixed. Consequently, analyzing a 20-pixel image and a 200-pixel image by the same criteria and scale makes it difficult to obtain accurate results.

In addition, a native app for financial transactions on a mobile device needs a liveness verification function to cope with cases in which, during a non-face-to-face financial transaction, a person conceals their identity by using another person's photograph or video. A security means is needed that can confirm liveness even when the mobile device has no special sensor hardware (infrared sensor, distance sensor, thermal sensor).

The present invention therefore aims to move beyond face detection/identification systems that serve only a specific purpose and to provide a multi-focusing face detection and identification system, composed of a multi-layer backbone neural network and auxiliary neural networks, that can detect and recognize faces for general use by adapting to the environmental conditions of the input image.

According to a feature of the present invention for achieving the above object, the present invention comprises an input unit that receives image data, and a management server that receives and processes the image data from the input unit. The management server comprises a back-end server that receives the image data from the input unit as a stream and adjusts the load and frames of the received image data, and an AI analysis server that analyzes the image data adjusted by the back-end server through an artificial neural network algorithm to detect or identify faces in the image data.

Here, the input unit may include one or more of: a mobile terminal that provides a self-capture function through its built-in camera; a web terminal connected to a PC that can recognize multiple persons in a browser environment using a USB web camera; and an IP camera (local terminal) installed in a public place to observe many persons and configured to communicate.

The AI analysis server may comprise: a region specifying unit that, after receiving image data input through an input device such as an IP camera, uses an RPN (region proposal network) to specify facial regions (detecting faces and bounding the facial regions from the detected face locations); a K-value unit that determines the number of facial regions (faces), the resolution information of the input image data, and the number of input image data items; an AI analysis unit that includes a face detection algorithm and a face recognition algorithm and analyzes facial information by selectively applying one of them to the input image data; and an image correction unit that includes an illuminance correction algorithm, a resolution correction algorithm, an angle correction algorithm, and a liveness correction algorithm for performing illuminance, resolution, angle, and liveness correction on the input image data, and that selects and runs correction algorithms according to the type of facial-information analysis algorithm chosen by the AI analysis unit.

In addition, the face recognition algorithm may include an identification recognition algorithm that distinguishes the personal features of a recognized face, and an authentication recognition algorithm that can confirm that the biometric information belongs to a specific person.

Meanwhile, the AI analysis unit may select its analysis algorithm by any one of the following methods.

That is, based on the number of pixels in the facial regions of the image to be analyzed, the AI analysis unit may select the face recognition algorithm as the analysis algorithm when there are facial regions whose pixel count is at or above the face-recognition reference pixel count and the mean deviation of the pixel counts of those regions is at or above a preset value.

Alternatively, the AI analysis unit may count the facial regions and, when the average number of pixels within the facial regions (boundaries) matches the face-recognition reference pixel count within an error range, select the face recognition algorithm if the mean deviation of the regions' pixel counts is at or above a preset value, and the face detection algorithm if it is below the preset value.

Finally, when the average number of pixels in the facial regions matches the authentication reference pixel count within an error range, the AI analysis unit may select the authentication face recognition algorithm if the mean deviation of the regions' pixel counts is at or above a preset value, and the identification face recognition algorithm if it is below the preset value.
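The last two selection rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: "mean deviation" is read here as the mean absolute deviation from the average, the names `select_algorithm`, `deviation_threshold`, and `tolerance` are hypothetical, and the fallback to detection when neither reference matches is an assumption. The 50×50 and 90×90 reference sizes are the ones stated in the text.

```python
from statistics import mean

# Reference pixel counts per facial region, taken from the text
# (50x50 for recognition, 90x90 for authentication).
RECOGNITION_REF = 50 * 50
AUTHENTICATION_REF = 90 * 90


def mean_abs_deviation(values):
    """Mean absolute deviation from the mean (one reading of 'mean deviation')."""
    m = mean(values)
    return mean(abs(v - m) for v in values)


def select_algorithm(face_pixel_counts, deviation_threshold, tolerance=0.2):
    """Pick an analysis algorithm from the pixel counts of the facial regions.

    deviation_threshold and tolerance are hypothetical presets; the text only
    says 'a preset value' and 'within an error range'.
    """
    avg = mean(face_pixel_counts)
    dev = mean_abs_deviation(face_pixel_counts)

    def near(ref):
        # "Matches within an error range", modelled as a relative tolerance.
        return abs(avg - ref) <= tolerance * ref

    if near(AUTHENTICATION_REF):
        return "authentication" if dev >= deviation_threshold else "identification"
    if near(RECOGNITION_REF):
        return "recognition" if dev >= deviation_threshold else "detection"
    return "detection"  # assumed fallback, not stated in the text


print(select_algorithm([2000, 3000], deviation_threshold=100))  # recognition
print(select_algorithm([8100, 8100], deviation_threshold=100))  # identification
```

The relative tolerance keeps the two reference bands from overlapping; a deployment would tune both presets to its own cameras.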

The reference pixel count for face detection may be 25×25 pixels per facial region, the reference pixel count for face recognition may be 50×50 pixels per facial region, and the reference pixel count for face authentication may be 90×90 pixels per facial region.

The K-value unit may calculate a K value by determining the number of pixels contained in each face from the minimum recognizable face size, the face ratio in the image, and the number of faces, relative to the pixel resolution of the input image data. The AI analysis unit may then analyze facial information by selecting one of the face detection algorithm, the identification recognition algorithm, and the authentication recognition algorithm according to the K value.
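As a rough illustration of how a K value could map to an algorithm, the sketch below uses the side length of a square facial region as the deciding quantity and the 25, 50, and 90 pixel reference sizes given in the text. The text does not specify how K is encoded, so the function name `algorithm_for_side` and the "largest threshold met wins" rule are assumptions.

```python
# Reference side lengths (pixels) of a square facial region, from the text,
# ordered from the most to the least demanding algorithm.
REFERENCE_SIDES = [
    (90, "authentication"),
    (50, "identification"),
    (25, "detection"),
]


def algorithm_for_side(side_px):
    """Return the most demanding algorithm whose reference size is met.

    Returns None when the face is smaller than the detection reference size.
    """
    for min_side, name in REFERENCE_SIDES:
        if side_px >= min_side:
            return name
    return None


for side in (100, 60, 30, 10):
    print(side, algorithm_for_side(side))
```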

In addition, a face recognition program may be installed on the input unit.

When no face recognition program is installed, the input unit may send its terminal identification number to the back-end server, request an API link from the back-end server, receive an encrypted connection address, and access the URL (API).

In this case, a mobile device accesses the provided URL using its default browser (Safari on iOS, Chrome on Android devices).

In addition, in order to analyze sequentially the streaming data transmitted from multiple input units at the same time, the back-end server may include a delay unit that sends a response request signal to each input unit and locally delays that unit's data transmission rate according to its response speed.

Specifically, the delay unit of the back-end server sets a baseline for the transmission rate of streaming data, expressed as the number of frames transmitted per second, in view of the processing capacity of the configured local terminals and the management server. When the processing load of a particular input unit among the multiple input units increases, the delay unit may adjust the number of frames per second transmitted by that input unit.
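The frame-rate adjustment described above might look like the following sketch. The text states only that a baseline frames-per-second value is set and that an overloaded input unit's rate is adjusted; the proportional scaling rule, the load units, and the names `adjust_frame_rates`, `capacity`, and `min_fps` are all assumptions made for illustration.

```python
def adjust_frame_rates(loads, base_fps, capacity, min_fps=1):
    """Return a frames-per-second setting for each input unit.

    loads:    dict mapping an input-unit id to its current processing load
              (arbitrary units, hypothetical).
    base_fps: baseline frames per second chosen for all units.
    capacity: load a single unit may impose before it is throttled.
    """
    rates = {}
    for unit, load in loads.items():
        if load > capacity:
            # Scale the rate down in proportion to the overload (assumed rule).
            rates[unit] = max(min_fps, int(base_fps * capacity / load))
        else:
            rates[unit] = base_fps
    return rates


print(adjust_frame_rates({"cam1": 50, "cam2": 200}, base_fps=30, capacity=100))
# {'cam1': 30, 'cam2': 15}
```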

The illuminance correction algorithm uses LBP (Local Binary Pattern) to reduce the sensitivity of grayscale pixel values to changes in illuminance, so that illuminance correction can be applied to a black-and-white image. For example, the illuminance correction algorithm may scan the entire input image with a 3×3-pixel window, compare the values extracted in the window with the window's center pixel, and binarize them according to the difference.
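A basic 3×3 LBP pass of the kind described above can be sketched in pure Python as follows. This is the textbook form of the operator, offered as an illustration rather than the exact procedure of the invention; the bit ordering (clockwise from the top-left neighbour) and the `>=` comparison are common conventions assumed here.

```python
def lbp_3x3(gray):
    """Return the 8-bit LBP code of every interior pixel of a grayscale image.

    gray is a list of rows of integer intensities. Border pixels are skipped,
    so the result is two rows and two columns smaller than the input.
    """
    height, width = len(gray), len(gray[0])
    # Neighbour offsets (dy, dx), clockwise from the top-left of the window.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Each neighbour contributes one bit: 1 if it is at least
                # as bright as the center pixel, otherwise 0.
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


image = [[10, 20, 30], [40, 25, 5], [60, 15, 35]]
brighter = [[v + 40 for v in row] for row in image]
print(lbp_3x3(image) == lbp_3x3(brighter))  # True
```

Because only comparisons against the center pixel are kept, adding a constant brightness offset to the whole image leaves the codes unchanged, which is the reduced sensitivity to illuminance that the text refers to.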

In addition, the resolution correction algorithm is performed when the square region containing the facial image measures between 25×25 and 50×50 pixels, and may perform a padding operation that masks the border of the square region containing the facial image.
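The text does not spell out how the padding is applied, so the sketch below shows only one plausible reading: a square face crop whose side lies in the 25 to 50 pixel range is surrounded by a constant-valued border until it reaches a fixed size. The target size of 50, the fill value, and the name `pad_face_crop` are assumptions.

```python
def pad_face_crop(crop, target=50, fill=0):
    """Pad a square grayscale crop (list of rows) to target x target pixels.

    Crops outside the 25..50 pixel range named in the text are returned
    unchanged, since resolution correction is not applied to them.
    """
    side = len(crop)
    if any(len(row) != side for row in crop):
        raise ValueError("expected a square crop")
    if not 25 <= side <= 50:
        return crop
    before = (target - side) // 2          # border rows/columns on top/left
    after = target - side - before         # border rows/columns on bottom/right
    padded = [[fill] * target for _ in range(before)]
    for row in crop:
        padded.append([fill] * before + list(row) + [fill] * after)
    padded.extend([fill] * target for _ in range(after))
    return padded


face = [[1] * 30 for _ in range(30)]
result = pad_face_crop(face)
print(len(result), len(result[0]))  # 50 50
```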

To prevent spoofing attacks that use photographs or videos of a person, the liveness correction algorithm does not rely on hardware such as distance sensors or ultraviolet sensors. Instead, it uses a machine learning classifier, that is, a classifier network trained solely through CNN learning, and detects spoofing attacks with an object detection module.

That is, the liveness correction algorithm may be trained through a CNN to detect the objects contained in the input image data and to classify each detected object as a real-face object, a non-real-face object, or a non-face object.

The face detection and identification system according to the present invention described above, which uses a multi-focusing multi-layer backbone neural network and auxiliary neural networks trained as AI artificial neural networks, can be expected to provide the following effects.

Facial images can be searched using multiple templates (face recognition neural networks, or detectors). Templates suited to various image resolutions and environmental conditions (illuminance, angle, liveness verification) can be configured and used for the following purposes.

1) Use as a marketing information tool (detection): analysis of foot traffic in exhibition halls, seminar authentication

2) Detection and comparison of specific persons (identity recognition): criminals, finding missing children, tracking VIP movements

3) Personal authentication (identity authentication): processing face recognition data with the same level of accuracy as biometric authentication (unmanned stores, door access control, non-face-to-face identity authentication)

In addition, through continuous face authentication carried out without the user's awareness, the present invention monitors and verifies identity by recognizing the faces of suspected offenders without explicit notice to the user, and can thereby achieve its security purpose without disrupting the normal flow of service.

Furthermore, even when environmental factors of the input image, such as differences in resolution and variations in illuminance, deviate widely, the present invention can effectively detect or recognize faces in accordance with the state of the input image.

Meanwhile, the present invention can provide a face detection and identification system that uses a multi-layer backbone AI convolutional neural network and auxiliary AI convolutional neural networks (templates, detectors) so that the neural networks can also detect or recognize such images effectively.

FIG. 1 is an exemplary view showing various examples of the resolution of an input image received through an image input device.
FIG. 2 is a block diagram showing the configuration of a face detection and identification system according to a specific embodiment of the present invention.
FIG. 3 is an exemplary view showing the operating process of the present invention in various fields of application.
FIG. 4 is an exemplary view showing the configuration of the AI analysis server of a specific embodiment of the present invention.
FIG. 5 is an exemplary view showing the process of using the face recognition API between the input unit, the back-end server, and a mobile device in a specific embodiment of the present invention.
FIG. 6 is an exemplary view showing the correction principle of the liveness correction algorithm in a specific embodiment of the present invention.
FIG. 7 is an exemplary view showing the operating process of the backbone neural networks according to a specific embodiment of the present invention.
FIG. 8 is an exemplary view showing size and resolution statistics of the images stored in an image set.

Hereinafter, a face detection and identification system for multi-focusing content using an artificial neural network according to a specific embodiment of the present invention will be described with reference to the accompanying drawings.

Before the detailed description of the present invention, the face recognition technology applied to it is first described briefly.

The K value of the K-value unit of the present invention may be expressed as a function of 1) the input image resolution level, 2) the average number of pixels per facial region, and 3) the face ratio, as in [Table 1] below, or may be computed from the actual pixel count of each facial region.

[Table 1] below defines, for face detection, face recognition, and face authentication respectively, the minimum required pixel values based on the maximum number of facial regions that can be processed at each camera output resolution. Depending on the processing performance of the face detection/recognition system, the face ratio may be raised further, in which case a larger number of faces can be processed.

[Table 1] Example of minimum pixel calculation for facial data analysis

Detection
Class | Camera output resolution | Max. faces | Min. pixels (side) | Min. pixels (area) | Total face area (pixels) | Face ratio (%)
SD    | 720 × 480   | 30  | 24.0 | 576 | 17,280 | 5.00
HD    | 1280 × 720  | 50  | 25.6 | 655 | 32,768 | 3.56
FHD   | 1920 × 1080 | 80  | 24.0 | 576 | 46,080 | 2.22
QHD   | 2560 × 1440 | 100 | 25.6 | 655 | 65,536 | 1.78
UHD   | 3840 × 2160 | 150 | 25.6 | 655 | 98,304 | 1.19

Recognition
Class | Camera output resolution | Max. faces | Min. pixels (side) | Min. pixels (area) | Total face area (pixels) | Face ratio (%)
SD    | 720 × 480   | 15 | 48.0 | 2304 | 34,560  | 10.00
HD    | 1280 × 720  | 25 | 51.2 | 2621 | 65,536  | 7.11
FHD   | 1920 × 1080 | 40 | 48.0 | 2304 | 92,160  | 4.44
QHD   | 2560 × 1440 | 50 | 51.2 | 2621 | 131,072 | 3.56
UHD   | 3840 × 2160 | 75 | 51.2 | 2621 | 196,608 | 2.37

Authentication
Class | Camera output resolution | Max. faces | Min. pixels (side) | Min. pixels (area) | Total face area (pixels) | Face ratio (%)
SD    | 720 × 480   | 7  | 102.9 | 10580 | 74,057  | 21.43
HD    | 1280 × 720  | 12 | 106.7 | 11378 | 136,533 | 14.81
FHD   | 1920 × 1080 | 19 | 101.1 | 10212 | 194,021 | 9.36
QHD   | 2560 × 1440 | 25 | 102.4 | 10485 | 262,144 | 7.11
UHD   | 3840 × 2160 | 39 | 98.5  | 9695  | 378,092 | 4.56
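The columns of [Table 1] appear to be related by simple arithmetic: the minimum area per face is the total face area divided by the maximum number of faces, the minimum side is its square root, and the face ratio is the total face area as a share of the frame. The Python sketch below recomputes these from the other columns. The relationships are inferred from the numbers rather than stated in the text, and `derive_row` is a hypothetical helper name.

```python
import math


def derive_row(width, height, max_faces, total_face_area):
    """Recompute the derived columns of one table row from its inputs."""
    area_per_face = total_face_area / max_faces
    return {
        "min_side": round(math.sqrt(area_per_face), 1),
        "min_area": area_per_face,
        "face_ratio_pct": round(100 * total_face_area / (width * height), 2),
    }


# SD detection row: 720 x 480 frame, 30 faces, 17,280 face pixels in total.
print(derive_row(720, 480, 30, 17280))
# {'min_side': 24.0, 'min_area': 576.0, 'face_ratio_pct': 5.0}
```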

Here, as one embodiment, the method of taking the average pixel count per facial region is applied, using the following formula.

A_avg = (A_1 + A_2 + … + A_n) / n

In the input images, the resolution of each person's facial region differs. Optimizing a single backbone neural network for every resolution is therefore costly and time-consuming to train, and because speed and accuracy trade off against each other, analyzing and recognizing images of all resolutions with one backbone neural network is inefficient: it requires a very large network structure and the computation of many network parameters.

Alternatively, the K-value unit may determine the number of pixels in each facial region individually, from the minimum recognizable face size, the face ratio in the image, and the number of faces relative to the pixel resolution of the input image data, calculate the K value from these, and have facial information analyzed by selecting one of the face detection algorithm, the identification recognition algorithm, and the authentication recognition algorithm according to the calculated K value.

When the K value is calculated from the individual pixel count of each facial region in this way, computational efficiency may be lower than when it is calculated from the average pixel count of the facial regions as described earlier, but the occurrence of identification errors can be reduced.

The method of calculating the K value can therefore be chosen according to the system specifications, the range of near and far distances in the space where the system is applied, the amount of data to be processed, and similar factors.
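The trade-off between the two ways of calculating the K value can be seen in a small Python sketch. It assumes, for illustration only, that the deciding quantity is the side length of a square facial region compared against the 25, 50, and 90 pixel reference sizes; the function names are hypothetical.

```python
import math


def tier(pixel_count):
    """Map a facial region's pixel count to an algorithm tier by side length."""
    side = math.sqrt(pixel_count)
    if side >= 90:
        return "authentication"
    if side >= 50:
        return "identification"
    if side >= 25:
        return "detection"
    return "too small"


def tier_from_average(pixel_counts):
    """One decision for the whole image, based on the average pixel count."""
    return tier(sum(pixel_counts) / len(pixel_counts))


def tiers_per_face(pixel_counts):
    """One decision per facial region, based on its own pixel count."""
    return [tier(count) for count in pixel_counts]


# A distant 30x30 face and a nearby 100x100 face in the same frame.
faces = [30 * 30, 100 * 100]
print(tier_from_average(faces))  # identification
print(tiers_per_face(faces))     # ['detection', 'authentication']
```

With the average, both faces are handled as identification-grade even though one is too small for it and the other would support authentication. The per-face variant avoids that mismatch at the cost of one evaluation per face.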

Hereinafter, the process of the present invention for face detection or face recognition using backbone neural networks is described. The face recognition process according to the present invention is illustrated in FIG. 7. As shown in FIG. 7, the part labeled backbone neural network #1 handles face detection and localization, and the part labeled backbone neural network #2 is the neural network for face recognition.

For an input image, the K-value unit makes its determination by considering factors such as 1) the input image resolution level, 2) the average number of pixels per facial region, and 3) the face ratio. When the image is judged to be for face detection, processing ends after the image has passed only through backbone neural network #1 for face detection (face detection network, localization network).

또한, K밸류부에서 안면인식용으로 판단할 경우, 안면인식용 BackBone 신경망#1(Face Detection 신경망, Localization 신경망)을 이용한 후, 다시 LandMarking 과정을 거쳐서 기준 Model을 만든 후, 해당 이미지는 안면인식용 BackBone신경망#2(Recogniton 신경망)을 거쳐서 해당 이미지에 고유한 인코딩 벡터를 생성한다. In addition, if the K-value department determines that it is for facial recognition, use BackBone Neural Network #1 (Face Detection Neural Network, Localization Neural Network) for facial recognition, then go through the LandMarking process again to create a reference model, and then the image is used for facial recognition. Through the BackBone Neural Network #2 (Recogniton Neural Network), an encoding vector that is unique to the image is created.

또한, K밸류부에서 안면인증용으로 판단할 경우, 안면인증용 BackBone 신경망#1(Face Detection 신경망, Localization 신경망), 과 안면인증용 BackBone신경망#2(Recogniton 신경망)을 거쳐서 해당 이미지에 고유한 인코딩 벡터를 생성한다. In addition, if the K Value Department determines that it is for facial authentication, the image is uniquely encoded through BackBone neural network #1 for facial authentication (Face Detection neural network, Localization neural network) and BackBone neural network for facial authentication #2 (Recogniton neural network). Create a vector.
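The three routing paths described above can be sketched as follows. This is only an illustration of the stage ordering; the enum, the function name, and the stage labels are hypothetical and do not come from the specification.

```python
from enum import Enum
from typing import List


class Purpose(Enum):
    """Purpose decided by the K-value unit (names are illustrative)."""
    DETECTION = "detection"
    RECOGNITION = "recognition"
    AUTHENTICATION = "authentication"


def stages_for(purpose: Purpose) -> List[str]:
    """Return the ordered processing stages for a given K-value decision."""
    # Every path starts with BackBone #1 (face detection + localization).
    stages = ["backbone1_detect_localize"]
    if purpose is Purpose.DETECTION:
        # Detection ends after BackBone #1.
        return stages
    if purpose is Purpose.RECOGNITION:
        # Recognition adds landmarking and the reference model.
        stages += ["landmarking", "reference_model"]
    # Recognition and authentication both end with BackBone #2,
    # which produces the encoding vector.
    stages.append("backbone2_encode")
    return stages
```

For example, `stages_for(Purpose.AUTHENTICATION)` yields BackBone #1 followed directly by BackBone #2, matching the authentication path as it is described in the text.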

Hereinafter, the process of face detection or face recognition using auxiliary neural networks will be described.

FIG. 7 illustrates the facial recognition process according to the present invention. As shown there, the part marked auxiliary neural network #1 is where liveness authentication is applied, and the part marked auxiliary neural network #2 is the neural network that performs angle correction and illuminance correction.

Assume that image data has been transmitted and the K value designates facial authentication only. In that case, auxiliary neural network #1 (the Liveness auxiliary neural network) processes the image data first, before it passes through the backbone neural networks.

That is, a facial spoofing attack can be detected by the Liveness auxiliary neural network, which is trained as a CNN (Convolutional Neural Network), without using special infrared sensors, distance sensors, or the like.

The Liveness auxiliary neural network includes an object detection module and is a neural network trained to extract object features from non-real inputs (that is, photos or videos) transmitted from a mobile terminal.

Auxiliary neural network #2 runs after LandMarking of the geometric positions on the face is complete and before the reference model is finalized. It creates the reference model by correcting the face to look straight ahead by the amount it is rotated up, down, left, or right, or, when the current illuminance level is too high or too low, by adjusting the image so that it is insensitive to such variation.

Meanwhile, to aid understanding of the present invention, resolution correction and illuminance correction are described below.

As shown in FIG. 8, the ImageNet data set stores millions of images, and images 40 to 140 pixels in size account for more than 80% of them.

The overall resolution of the image files actually received from the input unit, however, varies widely, as shown.

That is, if ImageNet data is used to train an artificial neural network, the trained network will be optimized for the resolution range (pixel range) it was trained on, whereas image files containing such objects will be transmitted from the input unit at many different resolutions.

Because input images differ in resolution, optimizing a single neural network for every resolution is costly and time-consuming. Moreover, speed and accuracy trade off against each other, so recognizing every resolution with one backbone neural network requires a very large network and the computation of many parameters, which reduces speed.

Illuminance is likewise strongly affected by environmental changes. It varies with the weather (bright or dark) and with the location, such as a living room, office, or corridor, and this variation must be corrected.

Classification of illuminance by grade

| Case | Illuminance (Lux) |
|------|-------------------|
| L1   | 50                |
| L2   | 100               |
| L3   | 200               |
| L4   | 300               |
| L5   | 500               |
| L6   | 1000              |
| L7   | 2000              |
| L8   | 5000              |

By sensing each environmental change and performing face detection/recognition with the appropriate backbone and auxiliary neural networks, a stable face detection/recognition system can be provided that responds actively to environmental changes and is relatively unaffected by them.

FIG. 2 is a block diagram showing the configuration of a face detection and identification system according to a specific embodiment of the present invention.
FIG. 3 is an exemplary diagram showing the operating process of the present invention for various fields of application.
FIG. 4 is an exemplary diagram showing the configuration of the AI analysis server of a specific embodiment of the present invention.
FIG. 5 is an exemplary diagram showing how the facial recognition API is used among the input unit, the backend server, and a mobile device in a specific embodiment of the present invention.
FIG. 6 is an exemplary diagram showing the correction principle of the liveness correction algorithm in a specific embodiment of the present invention.
FIG. 7 is an exemplary diagram showing the operating process of the backbone neural networks according to a specific embodiment of the present invention.
FIG. 8 is an exemplary diagram showing size and resolution statistics of the images stored in an image repository (image set).

First, as shown in FIG. 2, the face detection and identification system according to the present invention comprises input units (110, 120, 130) and a management server (200).

Various types of image input devices (mobile terminal (110), web terminal (120), local terminal (130)) may serve as the input unit: any device capable of capturing images with a camera (mobile device, 110), a web camera connected to a PC via USB (web terminal, 120), or a local IP camera (130).

The captured video is transmitted to the backend server (210) in real time as streaming data, and frames are divided appropriately according to the number of cameras and the load.

Load balancing may be performed among the web terminal (120), the mobile terminal (110), and the local terminal (130), or, for example, among several IP cameras within the local terminal (130).

Meanwhile, the backend server (210) achieves load balancing by lowering the frame rate set for the video streaming data of every connected device.

For example, video transmitted from the local terminal (130) to the backend server (210) is divided into 10 to 30 frames for load balancing, and the AI analysis server (220) analyzes the resulting frame data. Given a designated transmission frame rate, for example 24 FPS for the local terminal (130), the backend server identifies any terminal that falls behind 24 FPS and lowers that terminal's rate from 24 to 20 FPS, so that the terminal places less load on the system as a whole.
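The frame-rate adjustment in this example can be sketched as below. The 24 and 20 FPS figures come from the text; the function name, the dictionary of measured rates, and the rule of applying one fixed reduced rate are assumptions made for illustration.

```python
from typing import Dict


def adjust_fps(measured_fps: Dict[str, float],
               target_fps: int = 24,
               reduced_fps: int = 20) -> Dict[str, int]:
    """Assign each terminal a transmission frame rate.

    A terminal whose measured rate falls behind the target is lowered to
    the reduced rate; all other terminals keep the target rate.
    """
    return {
        terminal_id: (reduced_fps if fps < target_fps else target_fps)
        for terminal_id, fps in measured_fps.items()
    }


# Hypothetical measurement: cam2 cannot keep up with 24 FPS.
assigned = adjust_fps({"cam1": 24.0, "cam2": 21.5})
print(assigned)  # {'cam1': 24, 'cam2': 20}
```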

The mobile terminal (110) has a built-in camera that provides an image-capture function. The web terminal (120) preferably can recognize multiple subjects (persons) in a browser environment using a web camera connected to a PC through a USB port.

The local terminal (130) may be a security IP camera installed in a public place so that it can observe many people and configured for IP communication.

Meanwhile, the management server (200) comprises a back-end server (210), which receives the image data from the input unit as a stream and adjusts the load and frames of the received image data, and an AI analysis server (220), which analyzes the image data adjusted by the backend server (210) with an artificial neural network algorithm to detect or identify faces in the image data.

The backend server (210) and the AI analysis server (220) may be implemented on the same local server or as separate servers in the cloud.

Meanwhile, as shown in FIG. 3, when input video (streaming data) is transmitted, it passes through the load-balancing process in the backend server (210) and is delivered to the AI analysis server (220).

The AI analysis server (220) then performs the analysis (face detection/recognition) on image files obtained by splitting the delivered video into frames.

To this end, the AI analysis server (220) comprises a region specifying unit, a K-value unit, an AI analysis unit, and an image correction unit.

The region specifying unit specifies facial regions in the input image data through a face detection neural network (RPN neural network), that is, it detects faces and bounds each facial region using the detected face position. A facial region can be detected by sensing the shape of the face, the contrast between skin color and the surrounding boundary, and the movement of parts of the face such as the mouth (head shape / color contrast / head movement / mouth speaking).

Next, the K-value unit determines, over the entire image area, the number of facial regions, the resolution information of the input image data, and the number of input image data items; the K values derived from these determine how the image data is analyzed. The K value may be calculated as a single value, but is preferably expressed as a set of values for various characteristic factors.

The AI analysis unit comprises a face detection algorithm and a face recognition algorithm. For input image data, it either performs image analysis that detects (specifies) faces in the whole image through the face detection algorithm, or performs analysis that recognizes the faces contained in the input image and identifies the facial images through the face recognition algorithm.

The analysis process of the AI analysis unit and the algorithms used for it are described in detail later.

Meanwhile, the image correction unit includes an illuminance correction algorithm, a resolution correction algorithm, an angle correction algorithm, and a liveness correction algorithm so that it can correct the input image data for illuminance, resolution, angle, and liveness, and it selects and runs the appropriate correction algorithm according to the type of facial information analysis algorithm selected by the AI analysis unit.

Hereinafter, the analysis process of the AI analysis unit and the correction process of the image correction unit are described in more detail.

First, the analysis by the AI analysis unit is broadly divided into face detection and face recognition, and face recognition is further subdivided into face recognition for identification and face recognition for authentication, giving three types of analysis in total.

Here, face detection distinguishes human faces within the whole input image and can be used to calculate the number of people in the image, the amount of movement, the direction of movement, and so on. To distinguish a face from other objects, the facial region preferably consists of at least approximately 25×25 pixels.

Face detection distinguishes a person's facial image from other objects. It cannot identify whose face it is, but it determines where a region judged to have the form of a human face (positions of the eyes, ears, and so on) is located and draws an outline around it (bounding). In other words, this stage yields only the classification and location of the object.

The number of people in a particular crowded area can be determined in real time, but this stage cannot track whether a face belongs to the same person. Macroscopic statistics can be inferred from data such as the number and gender of people present in a particular place.

For face detection, the facial region preferably consists of approximately 25×25 pixels or more.

Face recognition for identification distinguishes a person's facial image and further identifies the features of that face. This stage identifies whose face it is, and characteristic information of the facial image (race, gender, age, skin color, facial structure, and so on) can additionally be determined.

To determine the characteristic information of a facial image, the facial region must consist of approximately 50×50 pixels or more.

Finally, face recognition for authentication confirms whose face the image shows and is the analysis used for user authentication and similar purposes. To obtain a reliable authentication result, the facial region preferably consists of approximately 90×90 pixels or more.

Face detection and recognition are performed by the face detection algorithm, the recognition algorithm for identification, and the recognition algorithm for authentication, respectively. Each algorithm is a CNN and is implemented through machine learning.

The conditions of use of these BackBone CNNs may be distinguished as shown in [Table 3] below.

| Network | Minimum recognized face size (Min.Face.Size) | Face ratio of window | No. of faces |
|---------|----------------------------------------------|----------------------|--------------|
| BackBone CNN#1 (face detection algorithm) | 25x25 | 20% or more | 50 or more |
| BackBone CNN#2 (recognition algorithm for identification) | 50x50 | 10-20% | 20-50 |
| BackBone CNN#3 (recognition algorithm for authentication) | 90x90 | 10% or less | 1-3 |

As described above, the AI analysis unit analyzes the input image data by selecting one of the face detection algorithm, the recognition algorithm for identification, or the recognition algorithm for authentication.

Depending on which algorithm is selected, the analysis performed on the input image serves as face detection, face identification, or face authentication.

By default, the AI analysis unit performs face detection with the face detection algorithm when most of the facial images in the input image measure between 25×25 and 50×50 pixels, performs face recognition for identification with the recognition algorithm for identification when most measure between 50×50 and 90×90 pixels, and performs face recognition for authentication with the recognition algorithm for authentication when most measure 90×90 pixels or more.
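A minimal sketch of this default selection rule follows. It assumes each facial region is summarized by the side length of its square bounding region, and it reads "most" as the size band containing the largest number of faces; both are interpretive assumptions, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def band_for(side: int) -> Optional[str]:
    """Map a face side length (pixels) to the analysis it supports."""
    if side < 25:
        return None               # too small even for detection
    if side < 50:
        return "detection"        # 25x25 up to 50x50
    if side < 90:
        return "identification"   # 50x50 up to 90x90
    return "authentication"       # 90x90 or more


def select_algorithm(face_sides: List[int]) -> Optional[str]:
    """Pick the algorithm whose size band holds the most faces."""
    counts = Counter(
        band for band in map(band_for, face_sides) if band is not None
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(select_algorithm([30, 35, 40, 95]))  # detection
```

The sketch does not handle the case where two bands hold similar numbers of faces, which the description treats separately.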

However, when the facial images in the input image are split in similar proportions between those at or above a reference pixel count and those below it, a criterion for selecting the analysis algorithm is needed.

Here, the reference pixel count means either the pixel count that separates face detection from face recognition (the facial recognition reference pixel count, 50×50) or the pixel count that separates face recognition for identification from face recognition for authentication (the authentication reference pixel count, 90×90).

In the present invention, when the average pixel count of the facial regions matches the facial recognition reference pixel count within a margin of error, the face recognition algorithm is selected as the analysis algorithm if the mean deviation of the pixel counts of the facial regions is at or above a preset value, and the face detection algorithm is selected if it is below the preset value.

The same applies to the authentication reference pixel count: when the average pixel count of the facial regions matches the authentication reference pixel count within a margin of error, the recognition algorithm for authentication is selected if the mean deviation of the pixel counts of the facial regions is at or above a preset value, and the recognition algorithm for identification is selected if it is below the preset value.

That is, the average pixel count of the facial regions can match the reference pixel count within the margin of error in two cases: all facial region images have pixel counts close to the reference pixel count, or the facial region images differ greatly from the reference pixel count but the positive (+) and negative (-) differences are of similar magnitude.

In the latter case, some facial images are close to the image input terminal while others are far away in the background, so the analysis is based on the high-resolution facial images close to the terminal. In the former case, the facial images as a whole are clustered around the boundary of the minimum pixel count that permits analysis, so the analysis based on the lower pixel count is performed.
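The boundary rule can be sketched as follows. The text does not define how the mean deviation is computed, so this sketch assumes the mean absolute deviation of the face side lengths from their average; the function name, the parameters, and the threshold values in the example are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def resolve_boundary(face_sides: List[int],
                     reference: float,
                     tolerance: float,
                     deviation_threshold: float,
                     lower: str,
                     higher: str) -> Optional[str]:
    """Choose between two algorithms when the average sits on a boundary.

    Returns None when the average face size is not within `tolerance`
    of the reference size, i.e. this is not a boundary case.
    """
    avg = mean(face_sides)
    if abs(avg - reference) > tolerance:
        return None
    # Assumed definition: mean absolute deviation from the average.
    mean_dev = mean([abs(side - avg) for side in face_sides])
    # Large spread: near and far faces are mixed, so analyze the near ones.
    # Small spread: all faces cluster at the boundary, so use the lower tier.
    return higher if mean_dev >= deviation_threshold else lower


# Mixed near/far faces averaging 50 pixels per side.
print(resolve_boundary([20, 22, 80, 78], 50, 2, 10,
                       "detection", "identification"))  # identification
```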

The selection of the analysis algorithm described above may be determined by the K value calculated by the K-value unit.

That is, the K value is calculated by determining the number of pixels contained in the faces, relative to the resolution in pixels of the input image data, according to the minimum recognized face size, the proportion of the image occupied by faces, and the number of faces. The K value may be reduced to a single number or may be a set of data for the various determining factors.
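One way to picture the K value as a set of data is shown below. The field names, the bounding-box format, and the definition of the face ratio as total face area over image area (ignoring overlaps) are assumptions for illustration; the specification does not give a formula.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of a facial region


@dataclass
class KValue:
    face_count: int         # number of facial regions
    min_face_side: int      # smallest face side length in pixels
    avg_face_pixels: float  # average pixel count per facial region
    face_ratio: float       # share of the image covered by faces


def compute_k(image_size: Tuple[int, int], boxes: List[Box]) -> KValue:
    """Summarize detected facial regions as a set of K factors."""
    if not boxes:
        return KValue(0, 0, 0.0, 0.0)
    img_w, img_h = image_size
    areas = [w * h for _, _, w, h in boxes]
    return KValue(
        face_count=len(boxes),
        min_face_side=min(min(w, h) for _, _, w, h in boxes),
        avg_face_pixels=sum(areas) / len(areas),
        face_ratio=sum(areas) / (img_w * img_h),
    )
```

Keeping the factors separate, rather than collapsing them into one number, corresponds to the option of expressing the K value as data for several determining factors.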

Hereinafter, the various image corrections performed by the image correction unit are described.

First, the illuminance correction algorithm may scan the entire input image with a 3×3 pixel window, compare the feature points extracted within the window with the window's center pixel, and binarize them (as 0 or 1) according to the difference, thereby performing the correction.
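The 3×3 scan described here resembles the well-known local binary pattern (LBP) operator, and the sketch below uses that operator as a stand-in. Treating all eight neighbors as the compared points, the `>=` comparison, and the bit ordering are assumptions, not details from the specification.

```python
from typing import List

# Neighbor offsets (dy, dx), clockwise from the top-left of the 3x3 window.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def binarize_3x3(gray: List[List[int]]) -> List[List[int]]:
    """Return one 8-bit code per interior pixel of a grayscale image.

    Each neighbor contributes a 1 bit if it is at least as bright as the
    center pixel of the window, and a 0 bit otherwise.
    """
    height, width = len(gray), len(gray[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes
```

Because only the comparison with the center pixel matters, adding a constant brightness offset to every pixel leaves the codes unchanged, which is why this kind of encoding is insensitive to uniform illuminance changes.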

The resolution correction algorithm runs when the square region containing the facial image consists of 25×25 to 50×50 pixels. In this case the basic requirement for facial recognition is met but the resolution is insufficient, so correction is performed to improve the resolution.

Here, the resolution correction may be performed by a padding operation that masks the border of the square region containing the facial image.

That is, for a facial region with relatively low resolution, the padding operation creates a boundary layer that contrasts with the facial image so that small facial images are easier to detect even at low resolution.
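A minimal sketch of such a padding step is given below. The specification does not say how the contrasting boundary layer is chosen; here the border is filled with black for bright faces and white for dark faces, which is purely an assumption, as are the function name and the default border width.

```python
from typing import List


def pad_with_border(face: List[List[int]], border: int = 2) -> List[List[int]]:
    """Surround a grayscale face crop with a contrasting constant border."""
    pixels = [p for row in face for p in row]
    # Assumed rule: pick the fill value opposite to the crop's mean brightness.
    fill = 0 if sum(pixels) / len(pixels) >= 128 else 255
    padded_width = len(face[0]) + 2 * border
    padded = [[fill] * padded_width for _ in range(border)]
    for row in face:
        padded.append([fill] * border + list(row) + [fill] * border)
    padded += [[fill] * padded_width for _ in range(border)]
    return padded
```

A 25×25 crop padded with `border=2` becomes 29×29, with the original pixels untouched in the center.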

Meanwhile, the characteristic elements of a face in a facial image are mainly the position, size, and shape of each part (eyes, nose, mouth, cheekbones, chin) and the texture of the skin or hair (wrinkles, patterns, spots).

For face detection, two approaches can be used: appearance-based and learning-based.

Meanwhile, the liveness correction algorithm determines whether an object region in the input image is a face or a non-face and, if it is a face, whether it is a real face or a non-real face (a re-captured image of a face), and corrects the object region accordingly. The algorithm is trained through a CNN and, as shown in FIG. 6, detects the objects contained in the input image data, classifies each detected object as a real-face object, a non-real-face object, or a non-face object, and applies correction processing to the region when the object is not a real face.

According to another embodiment of the present invention, a facial recognition program may be installed on the input unit.

When no facial recognition program is installed on the input unit, the input unit transmits its terminal identification number to the backend server (210), requests an API link from the backend server (210), receives an encrypted URL and connection address, and accesses the provided URL (API) using the mobile device's default browser (Safari on iOS, Chrome on Android devices).

In addition, to analyze streaming data sent simultaneously from multiple input units in sequence, the backend server (210) may include a delay unit that sends a response request signal to an input unit and, according to that input unit's response speed, allows the data transmission rate to be delayed locally.

The scope of the present invention is not limited to the embodiments described above but is defined by the claims, and it is evident that a person of ordinary skill in the art can make various modifications and adaptations within the scope of the claims.

The present invention relates to a system that performs face detection and face recognition cost-effectively using AI neural network learning. Through unaware continuous authentication of the face on a mobile device used for non-face-to-face financial transactions and the like, the invention monitors and verifies the user's identity without explicit notice to the user, which removes security risks without interrupting the normal flow of service.

110: mobile terminal 120: web terminal
130: local terminal 200: management server
210: backend server 220: AI analysis server

Claims

A face detection and face identification system using a multi-focusing AI neural network, comprising:
an input unit for receiving image data; and
a management server for receiving and processing the image data from the input unit,
wherein the management server comprises:
a backend server that receives the image data from the input unit in streaming form and adjusts the load and frames of the received image data; and
an AI analysis server that analyzes the image data adjusted by the backend server through an artificial neural network algorithm to detect or identify a face in the image data,
wherein the AI analysis server comprises:
a region specifying unit that specifies a facial region in the input image data through an RPN neural network (that is, bounds the facial region using the face detection result and the corresponding detected face position);
a K-value unit that determines the number of facial areas, the resolution information of the input image data, and the number of input image data;
an AI analysis unit comprising a face detection algorithm that identifies the shape of a facial image by distinguishing it from other objects and a face recognition algorithm that identifies and authenticates characteristic information of the facial image, the AI analysis unit selectively applying the face detection algorithm or the face recognition algorithm to the input image data to analyze facial information; and
an image correction unit comprising an illuminance correction algorithm, a resolution correction algorithm, an angle correction algorithm, and a real-object (liveness) correction algorithm so that illuminance, resolution, angle, and liveness correction can be performed on the input image data, the image correction unit selecting and performing correction algorithms according to the type of facial information analysis algorithm selected by the AI analysis unit,
wherein the face recognition algorithm comprises:
an identification recognition algorithm that identifies characteristic information (race, sex, age, skin color, and shape structure) of the facial image; and
an authentication recognition algorithm that verifies and authenticates that the person is the same as a specific person,
wherein the K-value unit calculates a K value according to the average number of pixels per facial area of the input image data, and
wherein the AI analysis unit analyzes facial information by selecting one of the face detection algorithm, the identification recognition algorithm, or the authentication recognition algorithm according to the average number of pixels per facial area of the input image.
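
As an illustration of the quantity on which the K-value unit operates, the following minimal Python sketch computes the average number of pixels per facial area from a list of face bounding boxes. The function name and the (x, y, width, height) box format are assumptions made for this example; the claim does not specify how the K value itself is derived from this average, so the sketch stops at the average.

```python
def average_pixels_per_face(boxes):
    """Return the average pixel count over all facial bounding boxes.

    boxes: list of (x, y, width, height) tuples in pixels. This format is a
    hypothetical choice for illustration; the patent does not define one.
    """
    if not boxes:
        # No facial area was specified, so there is nothing to average.
        return 0.0
    # Each facial area contributes width * height pixels.
    total_pixels = sum(w * h for _, _, w, h in boxes)
    return total_pixels / len(boxes)


if __name__ == "__main__":
    faces = [(0, 0, 50, 50), (10, 10, 30, 30)]
    print(average_pixels_per_face(faces))
```

For one 50×50 face and one 30×30 face, the average is (2500 + 900) / 2 = 1700 pixels per facial area.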

The system of claim 1, wherein the AI analysis unit, according to the K value:
selects the face detection algorithm to analyze facial information if the average number of pixels per facial area of the input image is between the reference number of pixels for face detection (25×25 pixels) and the reference number of pixels for identification (50×50 pixels);
selects the identification recognition algorithm to analyze facial information if the average number of pixels per facial area of the input image is between the reference number of pixels for identification (50×50 pixels) and the reference number of pixels for authentication (90×90 pixels);
selects the authentication recognition algorithm to analyze facial information if the average number of pixels per facial area of the input image exceeds the reference number of pixels for authentication (90×90 pixels);
if the average number of pixels per facial area of the input image matches the reference number of pixels for identification (50×50 pixels) within an error range,
selects the identification recognition algorithm to analyze facial information when the average deviation of the number of constituent pixels of each facial area is greater than or equal to a preset value, and
selects the face detection algorithm to analyze facial information when the average deviation of the number of constituent pixels of each facial area is less than the preset value; and
if the average number of pixels per facial area of the input image matches the reference number of pixels for authentication (90×90 pixels) within the error range,
selects the authentication recognition algorithm to analyze facial information when the average deviation of the number of constituent pixels of each facial area is greater than or equal to the preset value, and
selects the identification recognition algorithm to analyze facial information when the average deviation of the number of constituent pixels of each facial area is less than the preset value.
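
The selection rule of this claim can be sketched in Python as follows. This is a reading of the claim under stated assumptions, not the patented implementation: the reference sizes are interpreted as pixel counts (25×25 = 625, 50×50 = 2500, 90×90 = 8100), the "average deviation" is taken as the mean absolute deviation, and the `tolerance` and `deviation_threshold` defaults are hypothetical, since the claim only refers to an "error range" and a "preset value".

```python
from statistics import mean

DETECT_REF = 25 * 25    # reference pixel count for face detection
IDENTIFY_REF = 50 * 50  # reference pixel count for identification
AUTH_REF = 90 * 90      # reference pixel count for authentication


def select_algorithm(face_pixel_counts, tolerance=0.05, deviation_threshold=200.0):
    """Return 'detection', 'identification', 'authentication', or None.

    face_pixel_counts: pixel count of each facial area in the input image.
    tolerance and deviation_threshold are hypothetical stand-ins for the
    claim's "error range" and "preset value".
    """
    if not face_pixel_counts:
        return None
    avg = mean(face_pixel_counts)
    # Mean absolute deviation of the per-face pixel counts (an assumption).
    avg_dev = mean(abs(n - avg) for n in face_pixel_counts)

    def near(ref):
        # True when the average matches the reference within the error range.
        return abs(avg - ref) <= tolerance * ref

    # Boundary cases: the deviation decides between the two adjacent algorithms.
    if near(IDENTIFY_REF):
        return "identification" if avg_dev >= deviation_threshold else "detection"
    if near(AUTH_REF):
        return "authentication" if avg_dev >= deviation_threshold else "identification"
    # Regular cases: pick the algorithm by the range the average falls in.
    if avg < DETECT_REF:
        return None  # below the detection reference; the claim does not cover this
    if avg < IDENTIFY_REF:
        return "detection"
    if avg < AUTH_REF:
        return "identification"
    return "authentication"


if __name__ == "__main__":
    print(select_algorithm([30 * 30, 30 * 30]))
    print(select_algorithm([2000, 3000]))
    print(select_algorithm([100 * 100]))
```

Checking the boundary cases before the range cases keeps the deviation rule from being shadowed by the plain thresholds when the average sits close to 50×50 or 90×90.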

The system of claim 2, wherein the input unit comprises any one or more of:
a mobile terminal provided with a self-photographing function through a built-in camera;
a web terminal connected to a PC and capable of recognizing multiple people in a browser environment using a web camera; or
a local terminal installed in an open place so as to observe many people and configured to enable communication.

delete

The system according to any one of claims 1 to 3, wherein a face recognition program is installed on the input unit.

The system of claim 7, wherein, if the face recognition program is not installed, the input unit transmits a terminal identification number to the backend server, requests an API link from the backend server, receives an encrypted URL and connection address, and accesses the provided URL (API) using the default browser of the mobile device.

The system of claim 8, wherein the backend server includes a delay unit that, in order to sequentially analyze streaming data transmitted simultaneously from multiple input units, sends a response request signal to each input unit and locally delays the data transmission rate according to the response speed of that input unit.

The system of claim 7, wherein the illuminance correction algorithm scans the entire input image with a window of 3×3 pixels, compares the feature points extracted in the window with the center pixel of the window, and performs correction by quantifying the comparison through binarization (classification into 0 and 1).
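
The 3×3 window comparison described in this claim resembles the well-known local binary pattern (LBP) operator. The following Python sketch shows that operator under stated assumptions: the function name, the clockwise bit ordering, and the "greater than or equal to the center" rule are choices made for this example, and the claim does not say that LBP is the exact method used.

```python
def local_binary_pattern(image):
    """Return an 8-bit code for every interior pixel of a grayscale image.

    image: list of equal-length rows of integer intensities.
    The result has two fewer rows and columns than the input because
    border pixels do not have a full 3x3 neighborhood.
    """
    # Neighbor offsets (dy, dx), clockwise from the top-left corner.
    # The ordering is an assumption; any fixed ordering works.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            center = image[y][x]
            code = 0
            for dy, dx in offsets:
                # Binarize each neighbor against the center pixel (1 or 0).
                bit = 1 if image[y + dy][x + dx] >= center else 0
                code = (code << 1) | bit
            codes[y - 1][x - 1] = code
    return codes


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    brighter = [[value + 100 for value in row] for row in patch]
    # A uniform brightness shift leaves the codes unchanged.
    print(local_binary_pattern(patch) == local_binary_pattern(brighter))
```

Because each code depends only on whether a neighbor is brighter or darker than the center, adding a constant offset to every pixel does not change the result, which is why this kind of binarization is commonly used to reduce sensitivity to illumination.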

The system of claim 10, wherein the resolution correction algorithm is performed when the forward area including the facial image consists of 25×25 to 50×50 pixels, and performs a padding operation that masks the edges of the forward area including the facial image.

The system of claim 7, wherein the real-object (liveness) correction algorithm is trained through a CNN neural network, includes an object detection module that detects objects included in the input image data, and is trained through the CNN artificial neural network to classify each detected object as 1) a real facial object, 2) a non-real facial object, or 3) a non-facial object.