KR102184490B1

KR102184490B1 - Edge Device for Face Recognition

Info

Publication number: KR102184490B1
Application number: KR1020180168667A
Authority: KR
Inventors: 박성찬; 현승훈; 김수상; 백지현; 김영재; 석재호; 김채홍
Original assignee: 주식회사 포스코아이씨티
Priority date: 2018-12-24
Filing date: 2018-12-24
Publication date: 2020-11-30
Also published as: KR20200079095A

Abstract

An edge device for face recognition according to an aspect of the present invention performs face recognition and authentication on the edge device itself, using a face recognition model that is generated in a central server and distributed to each edge device. The edge device comprises: a first photographing unit that acquires a photographed image of a target user to be authenticated; a face recognition unit that inputs input images generated from the photographed image into the face recognition model, extracts a target face image from the input images, and generates a target feature vector from the extracted target face image; an authentication unit that authenticates the target user by comparing the target feature vector with an array file composed of a plurality of arrays, each having a user's identification information and feature vectors extracted from that user's face image; and an interface unit that receives the array file and the face recognition model from the central server.

Description

Edge Device for Face Recognition

The present invention relates to face recognition and, more specifically, to a device for face recognition.

Face recognition is a field of biometrics in which a machine automatically identifies and authenticates a person using the unique feature information contained in that person's face. Because it offers better security than conventional authentication methods such as passwords, it has recently come into wide use in various fields.

In a general face recognition system, a device installed at an entrance gate or the like transmits a captured face image to a server. The server performs face recognition and the user authentication based on it, then transmits the authentication result to the device, which determines whether to open the entrance gate.

In such a general face recognition system, both the face recognition function and the authentication function based on the face recognition result are implemented in the server. Applying the system to a site with many users entering and exiting therefore requires a high-performance, expensive server, which increases the cost of building the face recognition system.

In addition, because the face recognition function is concentrated in the server, a failure in the server or the network makes it impossible to provide the face recognition service at all.

In addition, because the face image captured by the device must be transmitted to the server over the network, the face image may be leaked through hacking or the like, which leaves personal information poorly protected.

In addition, a general face recognition system may fail to recognize the same person when the face is photographed in a different environment or under different illuminance.

The present invention is intended to solve the above problems. Its technical object is to provide an edge device for face recognition that can perform face recognition and authentication on the edge device by distributing a face recognition model generated in a central server to each edge device.

Another technical object of the present invention is to provide an edge device for face recognition that can perform face recognition and authentication without storing a user's face image or personal information.

Yet another technical object of the present invention is to provide an edge device for face recognition in which the array file corresponding to a new user's face image can be easily updated.

To achieve the above objects, an edge device for face recognition according to an aspect of the present invention comprises: a first photographing unit that acquires a photographed image of a target user to be authenticated; a face recognition unit that inputs input images generated from the photographed image into a face recognition model, extracts a target face image from the input images, and generates a target feature vector from the extracted target face image; an authentication unit that authenticates the target user by comparing the target feature vector with an array file composed of a plurality of arrays, each having a user's identification information and feature vectors extracted from that user's face image; and an interface unit that receives the array file and the face recognition model from a central server.

According to the present invention, the face recognition model generated in the central server is distributed to each edge device, so face recognition and authentication are performed on the edge devices. Even when the face recognition system is applied to a site with many users entering and exiting, a high-performance, expensive server is not required, which reduces the cost of building the system.

In addition, because face recognition and authentication are performed on the edge devices, the face recognition service can continue to be provided even if the server or network fails, which improves service reliability. Because face images captured by an edge device are not transmitted to the server, the possibility of a face image leaking to the outside is blocked in advance, which improves personal information security.

In addition, only the face recognition model and the array file are stored in the edge device; the user's face image and personal information are not. Even if the edge device is hacked, the user's personal information is therefore not exposed, which strengthens security.

In addition, when a new user is added, only the array file corresponding to the new user's face image needs to be updated on the edge device, with no change to the hardware or the face recognition model, so new users can be added easily.

FIG. 1 is a block diagram schematically showing the configuration of a face recognition system according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically showing the configuration of a central server according to an embodiment of the present invention.
FIG. 3A is a diagram illustrating a method of down-sampling a user image to obtain a plurality of user images having different resolutions.
FIG. 3B is a diagram illustrating landmark coordinates in a face image.
FIGS. 4A to 4D are block diagrams showing the configuration of a face image extraction unit constituting the face recognition model.
FIG. 5 is a block diagram schematically showing the configuration of a feature vector extraction unit according to the present invention.
FIG. 6 is a block diagram showing the configuration of a first unit included in a face image processing unit.
FIG. 7 is a block diagram showing the configuration of a second unit included in the face image processing unit.
FIG. 8 is a diagram showing an example in which a general face recognition model fails to recognize the same person.
FIG. 9 is a diagram showing an example in which overlapping regions arise when training images are placed in a vector space according to the distances between them.
FIG. 10 is a diagram showing an example of placing training images on a two-dimensional angular plane.
FIG. 11 is a diagram illustrating training images being separated from one another on the two-dimensional angular plane by applying a margin angle between them.
FIG. 12 is a diagram showing an example in which images of the same person photographed in different environments are classified correctly once error reduction has been performed by the error reduction unit.
FIG. 13 is a block diagram schematically showing the configuration of an edge device according to a first embodiment of the present invention.
FIG. 14 is a diagram illustrating how the authentication unit authenticates a target user.
FIG. 15 is a block diagram schematically showing the configuration of an edge device according to a second embodiment of the present invention.
FIGS. 16A and 16B are diagrams showing an example of a depth image generated by a second photographing unit.

The terms used in this specification should be understood as follows.

Singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "first" and "second" serve to distinguish one element from another, and the scope of rights should not be limited by these terms.

Terms such as "comprise" or "have" should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

The term "at least one" should be understood to include every combination that can be formed from one or more related items. For example, "at least one of a first item, a second item, and a third item" means not only each of the first, second, and third items individually but also every combination of two or more of them.

Hereinafter, the configuration of the face recognition system according to the present invention is described in more detail with reference to FIG. 1.

FIG. 1 is a block diagram schematically showing the configuration of a face recognition system according to an embodiment of the present invention. As shown in FIG. 1, the face recognition system 100 according to an embodiment of the present invention includes a central server 110 and a plurality of edge devices 120.

The central server 110 generates a face recognition model. Using the generated model, it extracts feature vectors from the user's face information input from the user terminal 130, and from those feature vectors it generates an array file for authenticating a target user. The central server 110 transmits the generated array file to the edge devices 120 so that each edge device 120 can authenticate the target user.
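
As a rough illustration of the data involved, an array file can be pictured as a list of arrays, each pairing a user's identification information with the feature vectors extracted from that user's face images. The following Python sketch shows one way an edge device might compare a target feature vector against such a file. The comparison measure (cosine similarity), the threshold of 0.8, and the names `ArrayEntry` and `authenticate` are assumptions made for illustration and are not taken from the specification.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArrayEntry:
    """One array of the array file: a user ID plus that user's feature vectors."""
    user_id: str
    feature_vectors: List[List[float]]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def authenticate(target: List[float], array_file: List[ArrayEntry],
                 threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching user ID, or None if no similarity reaches the threshold."""
    best_id, best_score = None, threshold
    for entry in array_file:
        for vec in entry.feature_vectors:
            score = cosine_similarity(target, vec)
            if score >= best_score:
                best_id, best_score = entry.user_id, score
    return best_id
```

Note that the file holds only identifiers and numeric vectors, which is consistent with the stated aim of keeping face images and personal information off the edge device.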

To this end, as shown in FIG. 2, the central server 110 according to the present invention includes a user registration unit 210, an input image generation unit 215, a face recognition unit 220, a face recognition model 225, an array file generation unit 230, a face recognition model training unit 240, and an interface unit 250.

The user registration unit 210 receives one or more user images from the user terminal 130 of a user who wishes to register. When a user image is received, the user registration unit 210 checks whether the user is the person shown in the user image. If so, it obtains the access permission information granted to the user and registers it, together with the user image, in the user database 212.

In an embodiment, the user registration unit 210 may receive the user's identification information from the user terminal 130 together with the user image. For example, it may receive identification information such as the user's ID, name, phone number, or employee number together with the user image. In this embodiment, the user registration unit 210 may register the user's identification information and access permission information in the user database 212 together with the user image.

When receiving a plurality of user images from the user terminal 130, the user registration unit 210 may prompt the user to input images that differ from one another. For example, it may prompt the user to input, through the user terminal 130, user images photographed in different environments or under different illuminance. Receiving a plurality of images of one user photographed in different environments or under different illuminance improves the accuracy of face recognition.

The input image generation unit 215 generates input images to be used for face recognition from the user image received by the user registration unit 210. Specifically, it down-samples or up-samples one user image up to a predetermined level, thereby generating from that image a plurality of user images with different resolutions. For example, as shown in FIG. 3A, the input image generation unit 215 may down-sample one user image 400 to generate a plurality of user images 400a to 400n with different resolutions.

In an embodiment, the input image generation unit 215 may generate down-sampled user images by applying a Gaussian pyramid to the user image, or up-sampled user images by applying a Laplacian pyramid to the user image.
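
The down-sampling branch can be sketched as follows. This is a minimal pure-Python illustration of a Gaussian pyramid on a grayscale image stored as a list of rows; the 5-tap binomial kernel, the border handling, and the function names `pyr_down` and `build_pyramid` are assumptions chosen for the example, not details given in the specification.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values

# 5-tap binomial kernel, a common approximation of a Gaussian (assumed here)
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def _blur_1d(values: List[float]) -> List[float]:
    """Blur one row or column, clamping indices at the borders."""
    n = len(values)
    return [
        sum(w * values[min(max(i + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
        for i in range(n)
    ]


def pyr_down(image: Image) -> Image:
    """One Gaussian-pyramid step: blur, then keep every second row and column."""
    rows = [_blur_1d(row) for row in image]             # horizontal blur
    cols = [_blur_1d(list(col)) for col in zip(*rows)]  # vertical blur
    blurred = [list(row) for row in zip(*cols)]         # transpose back
    return [row[::2] for row in blurred[::2]]


def build_pyramid(image: Image, levels: int) -> List[Image]:
    """Return the original image followed by up to `levels` smaller copies."""
    pyramid = [image]
    for _ in range(levels):
        if len(pyramid[-1]) < 2 or len(pyramid[-1][0]) < 2:
            break
        pyramid.append(pyr_down(pyramid[-1]))
    return pyramid
```

Each level roughly halves the width and height, which yields the set of differently sized copies that the later window scan operates on.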

Once the plurality of user images with different resolutions has been generated, the input image generation unit 215 generates input images for each user image by moving a window 405 of a predetermined pixel size across the user image 400, as shown in FIG. 3B, and taking the image obtained at each position. The input image generation unit 215 inputs the resulting plurality of input images to the face recognition unit 220.
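
The window scan described above can be sketched in a few lines of Python. The window size and stride are left as parameters because the specification only says the window has a predetermined pixel size; the function name `sliding_windows` is hypothetical.

```python
from typing import Iterator, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def sliding_windows(image: Image, window: int,
                    stride: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (x, y, patch) for every window position that fits inside the image."""
    height, width = len(image), len(image[0])
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            patch = [row[x:x + window] for row in image[y:y + window]]
            yield x, y, patch
```

Running this over every resolution of the same user image lets a fixed-size window cover faces of different apparent sizes.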

The face recognition unit 220 inputs the plurality of input images generated by the input image generation unit 215 into the face recognition model 225 trained by the face recognition model training unit 240, thereby obtaining a face image containing the face region, and extracts a feature vector from the obtained face image.

In an embodiment, the face recognition model 225 may include a face image extraction unit 227 that extracts a face image from an input image and a feature vector extraction unit 229 that extracts a feature vector from the face image.

Hereinafter, with reference to FIGS. 4 to 7, it is described in detail how the face recognition unit 220 extracts a face image and a feature vector from an input image using the face image extraction unit 227 and the feature vector extraction unit 229 included in the face recognition model 225.

FIGS. 4A to 4D are block diagrams showing the configuration of the face image extraction unit constituting the face recognition model. The face image extraction unit 227 according to the present invention is based on a convolutional neural network (CNN) and extracts a face image containing the face region from an input image. As shown in FIG. 4A, the face image extraction unit 227 includes a first face detection unit 310, a second face detection unit 320, a third face detection unit 330, and a face image alignment unit 340.

The first face detection unit 310 applies a convolution operation to each input image supplied by the face recognition unit 220 to extract its features, and performs a first extraction of the face region on the input image based on the extracted features.

To this end, as shown in FIG. 4B, the first face detection unit 310 includes n convolution operation units 311a to 311c, a sampling unit 313, first and second dimension reduction units 315a and 315b, and a first probability value calculation unit 317.

In an embodiment, the first face detection unit 310 may include three convolution operation units 311a to 311c. FIG. 4B shows three convolution operation units 311a to 311c for convenience of description, but this is only an example; the first face detection unit 310 may include four or more convolution operation units, or only one or two.

Each of the first to third convolution operation units 311a to 311c generates a feature map by applying a convolution filter to its input image and applies an activation function to the generated feature map, thereby introducing nonlinearity into the feature map. The convolution filters applied by the first to third convolution operation units 311a to 311c may differ from one another.
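
For readers unfamiliar with the operation, the following Python sketch shows how applying a single convolution filter to an image yields a feature map. It assumes no padding and a stride of 1, and, as is usual in CNN frameworks, it does not flip the kernel; none of these choices is stated in the specification, and the name `conv2d_valid` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # weighted sum of the image patch under the kernel at position (x, y)
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]
```

In a trained network the kernel weights are learned, and each convolution operation unit would typically hold many such kernels, producing one feature-map channel per kernel.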

In an embodiment, the activation function used in the first to third convolution operation units 311a to 311c may be one that outputs positive pixel values of the feature map unchanged and outputs negative values reduced by a predetermined magnitude. Here, an activation function means a function that weights and combines a plurality of pieces of input information and outputs the resulting value.
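
The behavior described here, with positive values passed through and negative values shrunk, resembles what is commonly called a leaky ReLU. The specification does not name the function, so the sketch below is only one plausible reading; the slope value of 0.01 is an assumption.

```python
from typing import List


def leaky_relu(feature_map: List[List[float]],
               negative_slope: float = 0.01) -> List[List[float]]:
    """Keep positive values unchanged; scale non-positive values by `negative_slope`."""
    return [
        [v if v > 0 else negative_slope * v for v in row]
        for row in feature_map
    ]
```

Unlike a plain ReLU, which zeroes all negative values, this form keeps a small gradient for negative inputs.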

The sampling unit 313 applies a sampling filter to the feature map output by the first convolution operation unit 311a to extract feature values from it. In an embodiment, the sampling unit 313 may extract, as the feature value, the maximum of the pixel values in the region of the feature map covered by the sampling filter. In this embodiment, the sampling unit 313 may be implemented as a max pooling layer, which reduces the dimension of the feature map. The sampling unit 313 inputs the dimension-reduced feature map to the second convolution operation unit 311b.
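
Max pooling as described can be sketched as follows. The 2x2 filter size and stride of 2 are illustrative defaults, not values from the specification, and the name `max_pool2d` is hypothetical.

```python
from typing import List


def max_pool2d(feature_map: List[List[float]], size: int = 2,
               stride: int = 2) -> List[List[float]]:
    """Replace each size x size region with its maximum value."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[y + i][x + j]
                for i in range(size) for j in range(size))
            for x in range(0, width - size + 1, stride)
        ]
        for y in range(0, height - size + 1, stride)
    ]
```

With these defaults a 4x4 feature map becomes 2x2, which is the dimension reduction the paragraph refers to.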

The first dimension reduction unit 315a reduces the dimension of the feature map output by the third convolution operation unit 311c by applying a first dimension reduction filter to it. In an embodiment, the first dimension reduction filter may be set to a filter that reduces the feature map to two dimensions.

The first probability value calculation unit 317 applies a predetermined classification function to the two-dimensional output data from the first dimension reduction unit 315a to calculate a first probability value indicating whether the input image contains a face region. In an embodiment, the first probability value calculation unit 317 may determine that the input image contains a face region when the calculated first probability value is greater than or equal to a first threshold.
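
The specification does not say which classification function is used. A softmax over the two output values is a common choice for a face/non-face decision, and the sketch below assumes it; the ordering of the two scores, the threshold of 0.6, and the name `contains_face` are likewise assumptions.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def contains_face(logits: Sequence[float], threshold: float = 0.6) -> bool:
    """`logits` is (non-face score, face score); compare the face probability to the threshold."""
    face_probability = softmax(logits)[1]
    return face_probability >= threshold
```

Only windows for which this check passes would go on to the bounding-box step.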

When the first probability value calculated by the first probability value calculation unit 317 is greater than or equal to the first threshold, the second dimension reduction unit 315b reduces the dimension of the feature map output by the third convolution operation unit 311c by applying a second dimension reduction filter to it. In an embodiment, the second dimension reduction filter may be set to a filter that reduces the feature map to four dimensions, and the second dimension reduction unit 315b takes the four output values as the coordinates of the face region on the input image. When the region containing the face is drawn as a rectangular bounding box, the face region coordinates may be defined as the coordinates of the upper-left and lower-right vertices, or as the coordinates of the upper-right and lower-left vertices.
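
Since either pair of diagonally opposite vertices may be used, downstream code benefits from normalizing both conventions to one representation. A minimal sketch, with the hypothetical names `BoundingBox` and `box_from_corners` and assuming image coordinates where y increases downward:

```python
from typing import NamedTuple, Tuple


class BoundingBox(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def box_from_corners(corner_a: Tuple[float, float],
                     corner_b: Tuple[float, float]) -> BoundingBox:
    """Build a box from any two diagonally opposite corners given as (x, y)."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    return BoundingBox(min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb))
```

Both the upper-left/lower-right and the upper-right/lower-left conventions then produce the same box.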

The second face detection unit 320 receives the input images that the first face detection unit 310 determined to contain a face region, together with the coordinates of the face region on those images. It applies a convolution operation to the first sub-input images, which are the portions of the input images at the face region coordinates, to extract their features, and performs a second extraction of the face region on the first sub-input images based on the extracted features.

To this end, as shown in FIG. 4C, the second face detection unit 320 includes n convolution operation units 321a to 321c, second and third sampling units 323a and 323b, a first dimension increase unit 325, third and fourth dimension reduction units 327a and 327b, and a second probability value calculation unit 328.

In an embodiment, the second face detection unit 320 may include three convolution operation units 321a to 321c. FIG. 4C shows three convolution operation units 321a to 321c for convenience of description, but this is only an example; the second face detection unit 320 may include at least as many convolution operation units as the number of convolution operation units 311a to 311c included in the first face detection unit 310.

Each of the fourth to sixth convolution operation units 321a to 321c generates a feature map by applying a convolution filter to its input image and applies an activation function to the generated feature map, thereby introducing nonlinearity into the feature map. In an embodiment, the activation function used by the fourth to sixth convolution operation units 321a to 321c may be one that outputs positive pixel values of the feature map unchanged and outputs negative values reduced by a predetermined magnitude.

The second sampling unit 323a applies a sampling filter to the feature map output by the fourth convolution operation unit 321a to extract feature values from it, and the third sampling unit 323b does the same for the feature map output by the fifth convolution operation unit 321b. In an embodiment, the second and third sampling units 323a and 323b may each extract, as the feature value, the maximum of the pixel values in the region of the feature map covered by the sampling filter. In this embodiment, the second and third sampling units 323a and 323b may be implemented as max pooling layers, which reduce the dimension of each feature map.

The first dimension increase unit 325 uses a plurality of nodes to increase the dimension of the feature map output from the sixth convolution operation unit 321c so that it has a predetermined size. In one embodiment, the first dimension increase unit 325 may increase the dimension so that the feature map output from the sixth convolution operation unit 321c has a size of 128*128 or 256*256.

The first dimension increase unit 325 also applies an activation function to the dimension-increased feature map so that it reflects nonlinear characteristics. In one embodiment, the first dimension increase unit 325 may apply an activation function that outputs positive pixel values of the feature map unchanged and outputs negative pixel values reduced in magnitude by a predetermined amount.

The third dimension reduction unit 327a reduces the dimension of the feature map output from the first dimension increase unit 325 by applying a third dimension reduction filter to it. In one embodiment, the third dimension reduction filter may be set as a filter that reduces the feature map to two dimensions.

The second probability value calculation unit 329 applies a predetermined classification function to the two-dimensional output data of the third dimension reduction unit 327a to calculate a second probability value indicating whether the corresponding first sub-input image contains a face region. In one embodiment, the second probability value calculation unit 329 may determine that the first sub-input image contains a face region if the calculated second probability value is greater than or equal to a second threshold, where the second threshold is greater than the first threshold.

When the second probability value calculated by the second probability value calculation unit 329 is greater than or equal to the second threshold, the fourth dimension reduction unit 327b reduces the dimension of the feature map output from the first dimension increase unit 325 by applying a fourth dimension reduction filter to it. In one embodiment, the fourth dimension reduction filter may be set as a filter that reduces the feature map to four dimensions, and the fourth dimension reduction unit 327b takes the four output values as the face-region coordinates on the corresponding first sub-input image. When the region containing the face is drawn as a rectangular bounding box, the face-region coordinates may be defined as the coordinates of the top-left and bottom-right vertices, or as the coordinates of the top-right and bottom-left vertices.
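
The two-output classification and four-output box regression described above might be combined as in the following sketch. It assumes the "predetermined classification function" is a softmax over [non-face, face] scores and that the four values are ordered (x1, y1, x2, y2) for the top-left and bottom-right vertices; neither the function nor the ordering is specified in the text, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def detect_face(class_scores, box_values, threshold):
    """Return a detection if the face probability reaches the threshold.

    class_scores: assumed [non_face_score, face_score] (2-dimensional output).
    box_values:   assumed [x1, y1, x2, y2] (4-dimensional output).
    """
    face_probability = softmax(class_scores)[1]
    if face_probability < threshold:
        return None  # no box is computed for rejected sub-images
    x1, y1, x2, y2 = box_values
    return {
        "probability": face_probability,
        "top_left": (x1, y1),
        "bottom_right": (x2, y2),
    }


if __name__ == "__main__":
    print(detect_face([0.0, 3.0], [10, 20, 50, 80], threshold=0.7))
```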

The third face detection unit 330 receives the first sub-input images that the second face detection unit 320 determined to contain a face region, together with the face-region coordinates on those first sub-input images. It applies convolution operations to the second sub-input images, that is, the portions of the first sub-input images located at the face-region coordinates, to extract their features, and on the basis of the extracted features performs a third-stage extraction of the face region from the second sub-input images.

To this end, as shown in FIG. 4D, the third face detection unit 330 includes n+1 convolution operation units 331a to 331d, fourth to sixth sampling units 333a to 333c, a second dimension increase unit 335, fifth to seventh dimension reduction units 337a to 337c, and a third probability value calculation unit 339.

In one embodiment, the third face detection unit 330 may include four convolution operation units 331a to 331d. FIG. 4D shows the third face detection unit 330 with four convolution operation units 331a to 331d for convenience of explanation, but this is only an example: the number is not limited as long as the third face detection unit 330 includes at least as many convolution operation units as the second face detection unit 320 (convolution operation units 321a to 321c).

Each of the seventh to tenth convolution operation units 331a to 331d generates a feature map by applying a convolution filter to its input image, and then applies an activation function to the generated feature map so that the feature map reflects nonlinear characteristics. In one embodiment, the activation function used by the seventh to tenth convolution operation units 331a to 331d outputs positive pixel values of the feature map unchanged and outputs negative pixel values reduced in magnitude by a predetermined amount.

The fourth sampling unit 333a applies a sampling filter to the feature map output from the seventh convolution operation unit 331a to extract feature values from that feature map, the fifth sampling unit 333b does the same for the feature map output from the eighth convolution operation unit 331b, and the sixth sampling unit 333c does the same for the feature map output from the ninth convolution operation unit 331c. In one embodiment, the fourth to sixth sampling units 333a to 333c may extract, as the feature value, the maximum of the pixel values in the region of each feature map covered by the sampling filter. In this embodiment, the fourth to sixth sampling units 333a to 333c may be implemented as max-pooling layers, and the dimension of each feature map is reduced through the max-pooling layer.

The second dimension increase unit 335 uses a plurality of nodes to increase the dimension of the feature map output from the tenth convolution operation unit 331d so that it has a predetermined size. In one embodiment, the second dimension increase unit 335 may increase the dimension so that the feature map output from the tenth convolution operation unit 331d has a size of 128*128 or 256*256.

The second dimension increase unit 335 also applies an activation function to the dimension-increased feature map so that it reflects nonlinear characteristics. In one embodiment, the second dimension increase unit 335 may apply an activation function that outputs positive pixel values of the feature map unchanged and outputs negative pixel values reduced in magnitude by a predetermined amount.

The fifth dimension reduction unit 337a reduces the dimension of the feature map output from the second dimension increase unit 335 by applying a fifth dimension reduction filter to it. In one embodiment, the fifth dimension reduction filter may be set as a filter that reduces the feature map to two dimensions.

The third probability value calculation unit 339 applies a predetermined classification function to the two-dimensional output data of the fifth dimension reduction unit 337a to calculate a third probability value indicating whether the corresponding second sub-input image contains a face region. In one embodiment, the third probability value calculation unit 339 determines that the second sub-input image contains a face region if the calculated third probability value is greater than or equal to a third threshold, where the third threshold is greater than the second threshold.

When the third probability value calculated by the third probability value calculation unit 339 is greater than or equal to the third threshold, the sixth dimension reduction unit 337b reduces the dimension of the feature map output from the second dimension increase unit 335 by applying a sixth dimension reduction filter to it. In one embodiment, the sixth dimension reduction filter may be set as a filter that reduces the feature map to four dimensions, and the sixth dimension reduction unit 337b takes the four output values as the face-region coordinates on the corresponding second sub-input image. When the region containing the face is drawn as a rectangular bounding box, the face-region coordinates may be defined as the coordinates of the top-left and bottom-right vertices, or as the coordinates of the top-right and bottom-left vertices.

Using the calculated face-region coordinates, the sixth dimension reduction unit 337b extracts the face image from the second sub-input image determined to contain a face region.

When the third probability value calculated by the third probability value calculation unit 339 is greater than or equal to the third threshold, the seventh dimension reduction unit 337c reduces the dimension of the feature map output from the second dimension increase unit 335 by applying a seventh dimension reduction filter to it. In one embodiment, the seventh dimension reduction filter may be set as a filter that reduces the feature map to ten dimensions, and the seventh dimension reduction unit 337c takes the ten output values as the landmark coordinates on the corresponding second sub-input image. As shown in FIG. 3C, the landmark coordinates are the coordinates 420 and 425 of the two eyes, the coordinate 430 of the nose, and the two mouth coordinates 440 and 450 on the second sub-input image 350, where the two mouth coordinates are the coordinate 440 of the left corner of the mouth and the coordinate 450 of the right corner of the mouth.
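
To make the ten-value landmark output concrete, the sketch below unpacks ten numbers into five named (x, y) points. The interleaved x, y ordering and the landmark names are assumptions made for illustration; the text only states that ten values yield two eye, one nose, and two mouth-corner coordinates.

```python
# Hypothetical landmark order; the source does not fix the order of the ten values.
LANDMARK_NAMES = ("left_eye", "right_eye", "nose", "mouth_left", "mouth_right")


def decode_landmarks(values):
    """Turn a 10-value regression output into five named (x, y) points."""
    if len(values) != 2 * len(LANDMARK_NAMES):
        raise ValueError("expected 10 values (5 landmarks x 2 coordinates)")
    return {
        name: (values[2 * i], values[2 * i + 1])
        for i, name in enumerate(LANDMARK_NAMES)
    }


if __name__ == "__main__":
    print(decode_landmarks([30, 40, 70, 40, 50, 60, 35, 80, 65, 80]))
```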

As described above, according to the present invention the face image extraction unit 300 is implemented with the first to third face detection units 310 to 330. The first face detection unit 310 is a network of shallower depth than the second face detection unit 320, and the second face detection unit 320 is a network of shallower depth than the third face detection unit 330, so that the face image extraction unit 300 as a whole forms a staged shallow-to-deep structure. This improves the accuracy of face image extraction while reducing complexity, which yields a gain in face recognition speed.
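
The shallow-to-deep cascade can be pictured as successive filtering with increasingly strict thresholds, so that the deeper and costlier stages see fewer candidates. The sketch below is a simplified assumption: each stage is reduced to a scoring function, the threshold values 0.6, 0.7, and 0.8 are invented for illustration, and the cropping of sub-input images between stages is omitted.

```python
def run_cascade(candidates, stages):
    """Filter candidates through (score_fn, threshold) stages in order.

    stages should be ordered from the shallowest network to the deepest,
    with thresholds that increase from stage to stage.
    """
    survivors = list(candidates)
    for score_fn, threshold in stages:
        survivors = [c for c in survivors if score_fn(c) >= threshold]
    return survivors


if __name__ == "__main__":
    # Toy candidates with precomputed per-stage face probabilities.
    candidates = [
        {"id": "a", "scores": (0.90, 0.95, 0.99)},
        {"id": "b", "scores": (0.70, 0.65, 0.90)},
        {"id": "c", "scores": (0.30, 0.90, 0.90)},
    ]
    stages = [
        (lambda c: c["scores"][0], 0.6),  # first (shallowest) detector
        (lambda c: c["scores"][1], 0.7),  # second detector
        (lambda c: c["scores"][2], 0.8),  # third (deepest) detector
    ]
    print([c["id"] for c in run_cascade(candidates, stages)])
```

Only the candidates that pass a cheap stage reach the next, more expensive one, which is where the speed gain claimed in the text would come from.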

The face image alignment unit 340 aligns the face image using the landmark coordinates output from the third face detection unit 330. Specifically, it uses the landmark coordinates of the extracted face image to perform at least one of rotation, translation, enlargement, and reduction on the face image. The face image is aligned in this way to give consistency to the face images provided as input to the feature vector extraction unit 229, which improves face recognition performance.

In one embodiment, the face image alignment unit 340 resizes the face image extracted by the third face detection unit 330 to the resolution of the face images used by the feature vector extraction unit 229. It then moves the landmark coordinates calculated by the third face detection unit 330 toward the reference landmark coordinates defined for a face image of that resolution, thereby rotating, translating, enlarging, or reducing the face image.
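
One common way to move detected landmarks onto reference landmarks with rotation, translation, and scaling is a least-squares similarity transform. The sketch below fits one using complex arithmetic. It is an illustrative choice, not the method stated in the patent, and the reference coordinates in the example are hypothetical.

```python
def fit_similarity(detected, reference):
    """Least-squares fit of z -> a*z + b mapping detected onto reference points.

    Points are (x, y) tuples treated as complex numbers; the complex factor a
    encodes scale and rotation, and b encodes translation.
    """
    src = [complex(x, y) for x, y in detected]
    dst = [complex(x, y) for x, y in reference]
    src_mean = sum(src) / len(src)
    dst_mean = sum(dst) / len(dst)
    numerator = sum(
        (d - dst_mean) * (s - src_mean).conjugate() for s, d in zip(src, dst)
    )
    denominator = sum(abs(s - src_mean) ** 2 for s in src)
    a = numerator / denominator
    b = dst_mean - a * src_mean
    return a, b


def apply_similarity(a, b, point):
    """Apply the fitted transform to one (x, y) point."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


if __name__ == "__main__":
    # Hypothetical reference landmarks for the recognizer's input resolution.
    reference = [(30, 40), (70, 40), (50, 60), (35, 80), (65, 80)]
    # Detected landmarks: the same face, twice as large and shifted.
    detected = [(2 * x + 10, 2 * y + 20) for x, y in reference]
    a, b = fit_similarity(detected, reference)
    print(a, b, apply_similarity(a, b, detected[0]))
```

The same transform would then be applied to the image pixels, for example through an image library's affine warp, which is outside the scope of this sketch.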

Referring again to FIG. 2, when a face image is input from the face image extraction unit 227, the feature vector extraction unit 229 extracts a feature vector from the face contained in that face image. The feature vector extraction unit according to the present invention is described in more detail below with reference to FIG. 5.

FIG. 5 is a block diagram schematically showing the configuration of the feature vector extraction unit according to the present invention. As shown in FIG. 5, the feature vector extraction unit according to an embodiment of the present invention includes a plurality of face image processing units 510a to 510n and a feature vector generation unit 520.

Each of the face image processing units 510a to 510n processes its input data to generate output data. The first face image processing unit 510a receives as its input the face image output from the face image extraction unit 227, and the (n+1)-th face image processing unit 510n+1 receives as its input the output data of the n-th face image processing unit 510n.

For example, the first face image processing unit 510a processes the face image to generate output data and passes that output data to the second face image processing unit 510b. The second face image processing unit 510b processes the output data of the first face image processing unit 510a to generate new output data and passes it to the third face image processing unit 510c.

The face image processing units 510a to 510n shown in FIG. 5 have the same function and detailed configuration, so the first face image processing unit 510a is described below as a representative example. For convenience, the first face image processing unit 510a is referred to below as the face image processing unit 510.

As shown in FIG. 5, the face image processing unit 510 consists of a first unit 512 that generates a feature map by performing a convolution operation on the input data (a face image, or the output data of the preceding face image processing unit), a second unit 514 that assigns weights to the feature map generated by the first unit 512, and an operation unit 516.

The configuration of the first unit 512 is described in more detail below with reference to FIG. 6, which is a block diagram showing the configuration of the first unit included in the face image processing unit. As shown in FIG. 6, the first unit 512 includes a normalization unit 600, a first convolution operation unit 610, and a nonlinearization unit 620.

The normalization unit 600 normalizes the face images input from the face image extraction unit 227 in units of batches, where a batch is the number of face images processed at one time. One reason for normalizing per batch is that the mean and variance of an individual face image may differ from the mean and variance of the whole batch; this difference acts as a kind of noise, which can improve overall performance.

In addition, batch normalization helps prevent internal covariate shift, the phenomenon in which the distribution of the inputs to each layer of the network changes inconsistently. Left unchecked, this shift increases the complexity of training and causes gradients to vanish or explode.
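
A minimal sketch of the batch-wise normalization step, for a single feature observed across the images of one batch. The learned scale and shift parameters that batch normalization layers usually carry are left out, and the `epsilon` value is an assumed stabilizer; neither is discussed in the text.

```python
import math


def batch_normalize(batch, epsilon=1e-5):
    """Normalize one feature across a batch to roughly zero mean, unit variance.

    batch:   one value per image in the batch.
    epsilon: assumed small constant that avoids division by zero.
    """
    mean = sum(batch) / len(batch)
    variance = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [(v - mean) / math.sqrt(variance + epsilon) for v in batch]


if __name__ == "__main__":
    print(batch_normalize([2.0, 4.0, 6.0, 8.0]))
```

Because the mean and variance come from the batch rather than from each image alone, the same image is normalized slightly differently depending on its batch, which is the noise effect the text mentions.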

The first convolution operation unit 610 generates a first feature map by applying a first convolution filter to the face image normalized by the normalization unit 600. In one embodiment, the first convolution filter has a size of 3*3 pixels and a stride of 1. Because a 3*3 filter with a stride of 1 is applied to the face image, the resolution of the first feature map is largely preserved.

The nonlinearization unit 620 gives the first feature map nonlinear characteristics by applying an activation function to it. In one embodiment, the nonlinearization unit 620 applies an activation function that outputs positive pixel values of the first feature map unchanged and outputs negative pixel values with reduced magnitude.

In FIG. 6, the nonlinearization unit 620 is described as giving nonlinear characteristics to the first feature map generated by the first convolution operation unit 610. In a modified embodiment, however, the normalization unit 600 may additionally normalize the first feature maps generated by the first convolution operation unit 610 in units of batches. In this embodiment, the normalization unit 600 provides the normalized first feature map to the nonlinearization unit 620, and the nonlinearization unit 620 applies the activation function to the normalized first feature map to give it nonlinear characteristics.

In the embodiment described above, the first unit 512 includes only the first convolution operation unit 610. In a modified embodiment, however, the first unit 512 may further include a second convolution operation unit 630, as shown in FIG. 6.

Specifically, the second convolution operation unit 630 generates a second feature map by applying a second convolution filter to the first feature map to which the nonlinearization unit 620 has given nonlinear characteristics. In one embodiment, the second convolution filter differs from the first convolution filter: it may have the same size as the first convolution filter but a different stride. For example, the second convolution filter may have a size of 3*3 pixels and a stride of 2.
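
The effect of the two stride values on feature map size can be checked with the standard output-size formula for a convolution. The padding of 1 and the 112-pixel input used below are assumptions for illustration; the text gives only the 3*3 filter size and the strides.

```python
def conv_output_size(input_size, kernel_size=3, stride=1, padding=1):
    """Spatial size of a convolution output along one axis.

    padding=1 is an assumption; the source does not mention padding.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


if __name__ == "__main__":
    side = 112  # hypothetical input width/height
    print(conv_output_size(side, stride=1))  # first convolution filter
    print(conv_output_size(side, stride=2))  # second convolution filter
```

Under these assumptions the stride-1 filter keeps the input size, while the stride-2 filter halves it.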

In this embodiment, the normalization unit 600 may additionally normalize the second feature maps generated by the second convolution operation unit 630 in units of batches.

Although not shown in FIG. 6, the first unit 512 may further include a pre-normalization unit. The pre-normalization unit may normalize the pixel value of each pixel of the face image input from the face image extraction unit. For example, it may normalize the face image by subtracting 127.5 from each pixel value and dividing the result by 128. The pre-normalization unit may provide the pre-normalized face image to the normalization unit 600.
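
The pre-normalization arithmetic can be written directly from the two constants in the text. The sketch assumes 8-bit pixel values in the range 0 to 255, which the text does not state explicitly.

```python
def pre_normalize(pixel_value: float) -> float:
    """Subtract 127.5 and divide by 128, as described for the pre-normalization unit."""
    return (pixel_value - 127.5) / 128.0


def pre_normalize_image(image):
    """Apply pre-normalization to every pixel of a 2-D image."""
    return [[pre_normalize(p) for p in row] for row in image]


if __name__ == "__main__":
    print(pre_normalize_image([[0, 127.5, 255]]))
```

For 8-bit input this maps pixel values into a range just inside -1 to 1, centred on zero.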

Referring again to FIG. 5, the second unit 514 assigns weights to the second feature map generated by the first unit 512. The reason is that a convolution operation multiplies every channel of the input image by the convolution filter and then sums the results, so important and unimportant channels become entangled and the sensitivity of the data is lowered. The second unit 514 therefore weights each channel of the second feature map according to its importance.

The configuration of the second unit 514 is described in more detail below with reference to FIG. 7, which is a block diagram showing the configuration of the second unit included in the face image processing unit. As shown in FIG. 7, the second unit 514 includes a sampling unit 710, a weight reflecting unit 720, and an upscaling unit 730.

First, the sampling unit 710 subsamples the second feature map input from the first unit 512 to reduce its dimension. In one embodiment, the sampling unit 710 may reduce the dimension of the second feature map by applying a global pooling filter to it. For example, when the dimension of the second feature map is H*W*C, the sampling unit 710 may reduce it to 1*1*C through subsampling.

The weight reflecting unit 720 assigns weights to the second feature map whose dimension has been reduced by the sampling unit 710. To this end, as shown in FIG. 7, the weight reflecting unit 720 may include a dimension reduction unit 722, a first nonlinearization unit 724, a dimension increase unit 726, and a second nonlinearization unit 728.

The dimension reduction unit 722 reduces the dimension of the subsampled second feature map by connecting it into a single layer. For example, when the dimension of the second feature map output from the sampling unit 710 is 1*1*C, the dimension reduction unit 722 reduces it to 1*1*C/r. Here, r is the reduction ratio, which may be determined according to the number of feature vectors to be extracted.

The first nonlinearization unit 724 gives nonlinear characteristics to the dimension-reduced second feature map by applying a first activation function to it. In one embodiment, the first activation function outputs positive pixel values of the second feature map unchanged and outputs negative pixel values as 0.

The dimension increase unit 726 increases the dimension of the second feature map to which the first nonlinearization unit 724 has given nonlinear characteristics. For example, when the dimension of that second feature map is 1*1*C/r, the dimension increase unit 726 increases it back to 1*1*C.

The second nonlinearization unit 728 again gives nonlinear characteristics to the second feature map, whose dimension has been increased by the dimension increase unit 726, by applying a second activation function to it. In one embodiment, the second activation function may differ from the first activation function. For example, the second nonlinearization unit 728 may apply a second activation function that makes positive pixel values of the dimension-increased second feature map converge to a predetermined value and outputs negative pixel values as 0.

As described above, the weight reflecting unit 720 according to the present invention assigns weights to the second feature map through the dimension reduction unit 722, the first non-linearization unit 724, the dimension increasing unit 726, and the second non-linearization unit 728. The dimension reduction unit 722 and the dimension increasing unit 726 together form a bottleneck that constrains the gating mechanism, which limits model complexity and aids generalization.
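The bottleneck described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the weight matrices `w_reduce` and `w_increase`, the values C = 4 and r = 2, and the choice of `tanh(max(0, x))` as a second activation function (0 for negative inputs, positive inputs saturating toward 1) are all assumptions made for the example.

```python
import math

def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def first_activation(v):
    """Pass positive values through unchanged; output 0 for negative values."""
    return [max(0.0, x) for x in v]

def second_activation(v):
    """Assumed form: 0 for negative values, positive values saturate toward 1."""
    return [math.tanh(max(0.0, x)) for x in v]

def channel_weights(descriptor, w_reduce, w_increase):
    """Map a 1*1*C descriptor to C weights through a C/r bottleneck."""
    reduced = matvec(w_reduce, descriptor)     # 1*1*C   -> 1*1*C/r
    hidden = first_activation(reduced)         # first non-linearization
    increased = matvec(w_increase, hidden)     # 1*1*C/r -> 1*1*C
    return second_activation(increased)        # second non-linearization

# Hypothetical parameters for C = 4 channels and reduction ratio r = 2.
w_reduce = [[0.5, -0.5, 0.5, -0.5],
            [0.25, 0.25, 0.25, 0.25]]
w_increase = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]
weights = channel_weights([1.0, 2.0, 3.0, 4.0], w_reduce, w_increase)
```

Because the two matrices hold 2·C·C/r parameters instead of C·C, a larger r makes the gate cheaper at the cost of expressiveness.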

The upscaling unit 730 upscales the second feature map weighted by the weight reflecting unit 720 to the same dimension as the face image input to the second unit 514. In one embodiment, when the dimension of the face image input to the second unit 514 is H*W*C, the upscaling unit 730 upscales the weighted second feature map to H*W*C.

Referring again to FIG. 5, the operation unit 516 adds the upscaled second feature map output from the second unit 514 to the face image input to the first unit 512. The present invention performs this addition to prevent the problem of features fading as a convolutional neural network becomes deeper (the vanishing problem).
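The upscaling and summation steps can be illustrated with plain nested lists. The sketch assumes that upscaling means repeating the 1*1*C values at every spatial position; the text does not state the exact upscaling rule, and the function names are hypothetical.

```python
def upscale(weights, height, width):
    """Repeat a 1*1*C vector at every position to form an H*W*C map."""
    return [[list(weights) for _ in range(width)] for _ in range(height)]

def add_maps(a, b):
    """Element-wise sum of two H*W*C maps (the skip connection)."""
    return [[[x + y for x, y in zip(pa, pb)] for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]

# Toy input with H = 2, W = 2, C = 2.
face = [[[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]]]
out = add_maps(face, upscale([0.5, 0.25], 2, 2))
```

Adding the block's input back to its output keeps the original signal available to later layers, which is the stated purpose of the operation unit 516.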

The feature vector generation unit 520 merges the feature map output from the last face image processing unit 510n among the plurality of face image processing units 510a to 510n into a single layer, reducing its dimension and thereby generating a predetermined number of feature vectors. In one embodiment, the feature vector generation unit 520 may output 128 or more feature vectors from the feature map output from the face image processing unit 510n. For example, it may output 512 feature vectors.

Referring again to FIG. 2, the array file generation unit 230 generates an array for each user from the feature vectors generated by the face recognition unit 220, and merges the generated arrays into a single file to produce an array file. The array file generation unit 230 may store the generated array file in an array file database (not shown).

In one embodiment, an array generated by the array file generation unit 230 may consist of a plurality of feature vectors obtained from a user's face image and that user's key value. The key value includes the user's identification information and access permission information. As described above, the identification information may be the user's ID, name, telephone number, or employee number, and the access permission information may include information on each floor the user is allowed to enter.

In one embodiment, the array file generation unit 230 may generate an array file for each location where an edge device 120 is installed. For example, a first array file may consist of the arrays of users permitted to access the first floor, and a second array file may consist of the arrays of users permitted to access the second floor. To this end, the array file generation unit 230 may also generate each user's arrays separately for each area that user may enter. For example, when a first user is permitted to access the first and third floors, the array file generation unit 230 may separately generate, for the first user, a first array containing access permission information for the first floor and a second array containing access permission information for the third floor.
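The per-location grouping can be sketched as below. The record layout (`UserArray`, the `id`, `floors`, and `vectors` keys) is hypothetical and chosen only to show that one user yields one array per accessible floor and that each floor gets its own array file.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class UserArray:
    user_id: str           # identification part of the key value
    floor: int             # access permission part of the key value
    feature_vectors: list  # feature vectors extracted from the face image

def build_array_files(users):
    """Return {floor: [UserArray, ...]}, one array file per floor."""
    files = defaultdict(list)
    for user in users:
        for floor in user["floors"]:
            # A separate array is generated for each floor the user may enter.
            files[floor].append(
                UserArray(user["id"], floor, user["vectors"]))
    return dict(files)

users = [{"id": "u1", "floors": [1, 3], "vectors": [[0.1, 0.2]]},
         {"id": "u2", "floors": [1], "vectors": [[0.3, 0.4]]}]
array_files = build_array_files(users)
```

With this layout, an edge device on the third floor would only need `array_files[3]`.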

The array file generation unit 230 according to the present invention generates an array file for each location where an edge device 120 is installed for the following reason. When the edge devices 120 that authenticate users' faces are installed per location, only the array file containing the access permission information for a given location needs to be sent to the edge device 120 installed there, which simplifies both transmission of the array file and its management on the edge device 120.

Although the above embodiment describes the array file generation unit 230 as generating an array file for each location, in a modified embodiment the array file generation unit 230 may generate a single array file containing the permission information for all locations where edge devices 120 are installed, and transmit that array file to all edge devices 120.

The edge device registration unit 240 registers information on the plurality of edge devices 120 installed at each location in the edge device information database 242. In one embodiment, the edge device registration unit 240 may map the identification information of each edge device 120 to the location where that device is installed and store the mapping in the edge device information database 242. Here, the identification information of an edge device 120 may include its manufacturer and serial number.

The edge device registration unit 240 may also receive authentication records from the edge devices 120 through the interface unit 270 at predetermined intervals, and store the received records in the edge device information database 242.

The access permission information management unit 250 changes the access permission information assigned to each user or adds new access permission information. In one embodiment, the access permission information management unit 250 may grant access permission information to each user individually or per organizational unit to which users belong.

The central server 110 according to the present invention may further include a scheduler 260. The scheduler 260 registers new users in batches whenever a predetermined period elapses or a predetermined event occurs. In the above embodiment, the user registration unit 210 was described as performing the registration procedure when a user issues a registration request. When the central server 110 includes the scheduler 260, however, the scheduler 260 may start the operation of the user registration unit 210, the input image generation unit 215, and the face recognition unit 220 at predetermined time intervals or when a predetermined event occurs, so that the new-user registration procedure runs automatically.

The interface unit 270 encrypts the array file generated by the array file generation unit 230 in a predetermined manner and transmits it to each edge device 120. In one embodiment, the interface unit 270 may encrypt the array file using a public-key encryption algorithm before transmitting it to each edge device 120.

The interface unit 270 may transmit the encrypted array file to the edge device 120 according to a protocol agreed upon with the edge device 120.

In addition, the interface unit 270 may receive authentication records from each edge device 120 at predetermined intervals and provide them to the edge device registration unit 240.

The face recognition model training unit 280 generates the face recognition model 225 based on a convolutional neural network and trains the generated face recognition model 225. Specifically, the face recognition model training unit 280 produces an optimal face recognition model by continuously training the convolutional neural network that constitutes the face recognition model 225.

To this end, the face recognition model training unit 280 includes a face image extraction training unit 282 that trains the face image extraction unit 227, a feature vector extraction training unit 284 that trains the feature vector extraction unit 229, and an error reduction unit 286 that reduces the error of the feature vector extraction unit 229.

The face image extraction training unit 282 uses training images to train the first to third face detection units 310 to 330 that constitute the face image extraction unit 227. Specifically, the face image extraction training unit 282 inputs a plurality of training images of a predetermined size to the first face detection unit 310, which has the structure shown in FIG. 4B, to calculate a first probability value that each training image contains a face region together with the face region coordinates. It then feeds the calculated first probability value and face region coordinates back to the first face detection unit 310 according to a backpropagation algorithm, thereby updating the filter coefficients and weights of the convolution filters applied in the first face detection unit 310.

Because FIG. 4B was described with the first face detection unit 310 calculating only the first probability value that an image contains a face region and the face region coordinates in that image, the face image extraction training unit 282 was described above as feeding back only the first probability value and the face region coordinates through the backpropagation algorithm to update the filter coefficients and weights of the convolution filters applied in the first face detection unit 310.

In another embodiment, however, to improve the accuracy of landmark coordinate extraction, the face image extraction training unit 282 may additionally calculate landmark coordinates from the first face detection unit 310 when training it, and feed the calculated landmark coordinates back to the first face detection unit 310 through the backpropagation algorithm together with the first probability value and the face region coordinates, thereby updating the filter coefficients and weights of the convolution filters applied in the first face detection unit 310.

In this embodiment, to obtain the landmark coordinates, the first face detection unit 310 may further include a dimension reduction unit (not shown) that applies a dimension reduction filter to the feature map output from the third convolution operation unit 311c, reducing that feature map to 10 dimensions. The 10 output values are the landmark coordinates: the coordinates of the two eyes, the nose, the left mouth corner, and the right mouth corner.
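Ten values correspond to five landmarks with an (x, y) pair each. The sketch below unpacks such an output; the ordering of the values and the landmark names are assumptions for illustration, since the text does not specify how the 10 values are laid out.

```python
# Assumed ordering: x and y of each landmark stored consecutively.
LANDMARK_NAMES = ("left_eye", "right_eye", "nose",
                  "mouth_left", "mouth_right")

def decode_landmarks(values):
    """Turn a 10-value output into {landmark name: (x, y)}."""
    if len(values) != 2 * len(LANDMARK_NAMES):
        raise ValueError("expected 10 values (5 landmarks x 2 coordinates)")
    return {name: (values[2 * i], values[2 * i + 1])
            for i, name in enumerate(LANDMARK_NAMES)}

landmarks = decode_landmarks([30, 40, 70, 40, 50, 60, 35, 80, 65, 80])
```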

The face image extraction training unit 282 also inputs a plurality of training images of a predetermined size to the second face detection unit 320, which has the structure shown in FIG. 4C, to calculate a second probability value that each training image contains a face region together with the face region coordinates. It feeds the calculated second probability value and face region coordinates back to the second face detection unit 320 using the backpropagation algorithm, thereby updating the filter coefficients and weights of the convolution filters applied in the second face detection unit 320. The training images input to the second face detection unit 320 may be those determined by the first face detection unit 310 to contain a face region.

Because FIG. 4C was described with the second face detection unit 320 calculating only the probability that an image contains a face region and the face region coordinates in that image, the face image extraction training unit 282 was described above as feeding back only the second probability value and the face region coordinates through the backpropagation algorithm to update the filter coefficients and weights of the convolution filters applied in the second face detection unit 320.

In another embodiment, however, to improve the accuracy of landmark coordinate extraction, the face image extraction training unit 282 may also calculate landmark coordinates from the second face detection unit 320 when training it, and feed the calculated landmark coordinates back to the second face detection unit 320 through the backpropagation algorithm together with the second probability value and the face region coordinates, thereby updating the filter coefficients and weights of the convolution filters applied in the second face detection unit 320.

In this embodiment, to obtain the landmark coordinates, the second face detection unit 320 may further include a dimension reduction unit (not shown) that applies a dimension reduction filter to the feature map output from the first dimension increasing unit 325, reducing that feature map to 10 dimensions. The 10 output values are the landmark coordinates: the coordinates of the two eyes, the nose, the left mouth corner, and the right mouth corner.

The face image extraction training unit 282 further inputs a plurality of training images of a predetermined size to the third face detection unit 330, which has the structure shown in FIG. 4D, to calculate a third probability value that each training image contains a face region, the face region coordinates, and the landmark coordinates. It feeds the calculated third probability value, face region coordinates, and landmark coordinates back to the third face detection unit 330 using the backpropagation algorithm, thereby updating the filter coefficients and weights of the convolution filters applied in the third face detection unit 330. The training images input to the third face detection unit 330 may be those determined by the second face detection unit 320 to contain a face region.
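The way training images are passed from one face detection unit to the next can be sketched as a simple cascade filter. The detectors here are toy stand-ins that read a precomputed probability, and the 0.5 decision threshold is an assumption; the text only says that images "determined to contain a face region" are forwarded.

```python
def select_for_next_stage(images, detector, threshold=0.5):
    """Keep only the images the detector judges to contain a face region."""
    return [img for img in images if detector(img) >= threshold]

# Toy data: each "image" carries hypothetical face probabilities
# that the first (p1) and second (p2) detection stages would output.
images = [{"name": "a", "p1": 0.9, "p2": 0.8},
          {"name": "b", "p1": 0.7, "p2": 0.3},
          {"name": "c", "p1": 0.2, "p2": 0.9}]

# Images accepted by stage 1 become the training input of stage 2,
# and images accepted by stage 2 become the training input of stage 3.
stage2_inputs = select_for_next_stage(images, lambda img: img["p1"])
stage3_inputs = select_for_next_stage(stage2_inputs, lambda img: img["p2"])
```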

The feature vector extraction training unit 284 uses training images to train the feature vector extraction unit 229, which has the configuration shown in FIGS. 5 to 7. Specifically, the feature vector extraction training unit 284 inputs a plurality of training images to the feature vector extraction unit 229 in predetermined batch units and extracts feature vectors from each training image.

The feature vector extraction training unit 284 applies the extracted feature vectors to a predetermined classification function to predict the probability that the corresponding training image belongs to a specific class. It then computes the error between the predicted probability (hereinafter, the "predicted value") and the actual value, and feeds the result back to the feature vector extraction unit 229 using the backpropagation algorithm, thereby updating the filter coefficients and weights of the convolution filters applied in the feature vector extraction unit 229.
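The prediction and error computation can be illustrated as below. The text does not name the classification function, so softmax is used here as an assumption; cross entropy is the error function the description mentions later for comparing predicted and actual values.

```python
import math

def softmax(scores):
    """Convert per-class scores into probabilities that sum to 1."""
    peak = max(scores)                      # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Error between the predicted distribution and the actual class."""
    return -math.log(probs[true_class])

# Hypothetical class scores for one training image with 3 classes.
probs = softmax([2.0, 1.0, 0.1])
error = cross_entropy(probs, true_class=0)
```

The error shrinks toward 0 as the probability assigned to the correct class approaches 1, which is what the backpropagation update drives toward.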

The face recognition model training unit 280 according to the present invention can further improve the performance of the feature vector extraction unit 229 by using the error reduction unit 286 to further reduce the error that arises during feature vector extraction.

Specifically, the error reduction unit 286 reduces the error between the actual value and the predicted value based on the feature vectors extracted by the feature vector extraction unit 229 while the feature vector extraction training unit 284 trains it. The error reduction unit 286 calculates the error between the predicted value and the actual value from the feature vectors that the feature vector extraction unit 229 extracted from the training images, places each training image on a two-dimensional angular plane so that this error can be reduced, and lets the feature vector extraction training unit 284 train the feature vector extraction unit 229 using the probability values that result from the placement.

The face recognition model training unit 280 according to the present invention trains the feature vector extraction unit 229 through the error reduction unit 286 for the following reason. As shown in FIG. 8, a general face recognition model produces errors such as failing to recognize the same person when the lighting or environment in which the face was photographed changes. The error reduction unit 286 is therefore used to train the feature vector extraction unit 229 to extract feature vectors that reduce this face recognition error.

As shown in FIG. 2, the error reduction unit 286 according to an embodiment of the present invention includes a face image arranging unit 287 and a probability calculation unit 289.

The face image arranging unit 287 places each training image on a two-dimensional angular plane based on the plurality of feature vectors that the feature vector extraction unit 229 extracted from the training images. Specifically, the face image arranging unit 287 calculates the cosine similarity between training images belonging to different classes and, from the cosine similarity, calculates a reference angle, which is the angular separation between those training images, thereby placing the training images on the two-dimensional angular plane.

If the face image arranging unit 287 instead placed the training images in a vector space according to the distances between them calculated from their feature vectors, overlapping regions 900 between training images would inevitably occur, as shown in FIG. 9, making it difficult to clearly distinguish the same person from other people during training.

For this reason, in the present invention the face image arranging unit 287 calculates the angle between training images belonging to different classes from their cosine similarity, and places each training image on the two-dimensional angular plane based on the calculated angle.
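Deriving an angle from cosine similarity can be sketched as follows. The function names are hypothetical, and the feature vectors are toy two-dimensional values; real feature vectors would have many more components.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between feature vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def reference_angle(a, b):
    """Angular separation in degrees between two feature vectors."""
    # Clamp to [-1, 1] so rounding error cannot push acos out of range.
    cos = max(-1.0, min(1.0, cosine_similarity(a, b)))
    return math.degrees(math.acos(cos))

angle_apart = reference_angle([1.0, 0.0], [0.0, 1.0])
angle_same = reference_angle([2.0, 0.0], [5.0, 0.0])
```

Because the angle ignores vector length, two vectors pointing the same way are treated as identical regardless of magnitude, unlike a distance-based placement.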

The probability calculation unit 289 varies a margin angle that is added, on the two-dimensional angular plane, to the reference angle calculated by the face image arranging unit 287, and calculates for each margin angle the probability that each training image belongs to its class.

Specifically, as shown in FIG. 10, the probability calculation unit 289 varies the margin angles (P1, P2) added to the reference angles (θ₁, θ₂) between training images so that training images with overlapping characteristics are separated on the two-dimensional angular plane. In one embodiment, the margin angles (P1, P2) may be determined according to the learning rate within a range greater than 0 degrees and less than 90 degrees.

For example, the margin angle may be set to increase as the learning rate increases and to decrease as the learning rate decreases. The probability calculation unit 289 may vary the margin angle in steps of a predetermined reference unit.

When the probability calculation unit 289 adds the margin angle to the reference angle, training images whose features overlapped in the vector space become separated from one another, as shown in FIG. 11.

The probability calculation unit 289 calculates, for each margin angle added to the reference angle, the probability that each training image belongs to its class, and provides the calculated probability values to the feature vector extraction training unit 284 so that it can train the feature vector extraction unit 229 based on them. That is, the feature vector extraction training unit 284 trains the feature vector extraction unit 229 by updating at least one of the coefficients and weights of the convolution filters applied in the feature vector extraction unit 229, based on the probability values calculated by the probability calculation unit 289.

In one embodiment, the probability calculation unit 289 may use Equation 1 to calculate, for each margin angle, the probability that each training image belongs to its class.

In Equation 1, x denotes the reference angle, p denotes the margin angle, and n denotes the number of classes.
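Equation 1 itself is not reproduced in this text. As an illustration only, the sketch below shows one widely used formulation built from the same ingredients (an angle per class, a margin added to the angle of the correct class, and a normalization over n classes): an additive angular margin softmax. It is an assumption and should not be read as the patent's Equation 1; the `scale` parameter and function name are hypothetical.

```python
import math

def margin_probability(angles_deg, true_class, margin_deg, scale=1.0):
    """Probability of the true class when a margin is added to its angle.

    angles_deg[j] is the angle between the image's feature vector and
    class j, so len(angles_deg) plays the role of n.
    """
    scores = []
    for j, x in enumerate(angles_deg):
        # The margin penalizes only the correct class, forcing the
        # model to pull same-class images closer to compensate.
        angle = x + margin_deg if j == true_class else x
        scores.append(scale * math.cos(math.radians(angle)))
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    return exps[true_class] / sum(exps)

without_margin = margin_probability([30.0, 80.0, 85.0], 0, margin_deg=0.0)
with_margin = margin_probability([30.0, 80.0, 85.0], 0, margin_deg=20.0)
```

Under this formulation a larger margin lowers the probability of the correct class for the same angles, so training must separate classes more widely to reach the same error.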

In one embodiment, the probability calculation unit 289 may keep varying the margin angle until, when a predetermined test face image is applied to the feature vector extraction unit 229 trained by the feature vector extraction training unit 284 on the probability values from the probability calculation unit 289, the error between the predicted value and the actual value falls to or below a reference value.

That is, the probability calculation unit 289 selects as the final margin angle the margin angle at which the error between the predicted value and the actual value, calculated by applying the predetermined test face image to the trained feature vector extraction unit 229, becomes equal to or less than the reference value. The error between the predicted value and the actual value may be calculated using a cross entropy function.
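The search for the final margin angle can be sketched as a simple sweep. Here `evaluate_error` is a hypothetical callback that would train the extractor with a given margin and return the test error; the step size, start value, and the toy error curve are assumptions for illustration.

```python
def find_final_margin(evaluate_error, threshold,
                      start=1.0, stop=90.0, step=1.0):
    """Vary the margin angle until the test error is at or below threshold.

    Returns the first qualifying margin in degrees (kept above 0 and
    below 90), or None if no margin in the range meets the threshold.
    """
    margin = start
    while margin < stop:
        if evaluate_error(margin) <= threshold:
            return margin
        margin += step   # vary by the predetermined reference unit
    return None

# Toy error curve standing in for "train, then test on a test face image".
toy_error = lambda margin: 100.0 - 4.0 * margin
final_margin = find_final_margin(toy_error, threshold=20.0)
```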

When error reduction is performed through the error reduction unit 286 as described above, the same person can be classified accurately even when photographed in different environments or under different lighting, as shown in FIG. 12.

In the above embodiment, the face image extraction training unit 282, the feature vector extraction training unit 284, and the error reduction unit 286 that constitute the face recognition model training unit 280 may be implemented as software in the form of algorithms and installed on the central server 110.

Referring again to FIG. 1, an edge device 120 is placed at each specific location. Using the face recognition model 225 distributed by the central server 110, it recognizes the face of a target user who wishes to enter that location and authenticates the target user's access based on the recognition result.

In the present invention, face recognition and authentication of the target user are performed by the edge device 120 rather than by the central server 110. The reason is that, if the central server 110 performed them, face recognition and authentication could not be carried out whenever a failure occurred in the central server 110 or the network, and the expensive central server 110 would have to be expanded as the number of users grew.

Accordingly, the present invention applies edge computing so that the edge device 120 performs face recognition and authentication of the target user. The face recognition service can therefore be provided normally even if a failure occurs in the central server 110 or the network, which improves service reliability. In addition, the expensive central server 110 need not be expanded as the number of users increases, which reduces the cost of building the face recognition system 100.

Hereinafter, the configuration of the edge device 120 according to the present invention is described in more detail with reference to FIG. 13.

FIG. 13 is a block diagram schematically showing the configuration of an edge device according to a first embodiment of the present invention. As shown in FIG. 13, the edge device 120 according to the first embodiment includes a first photographing unit 1210, an input image generation unit 1250, a face recognition unit 1300, an authentication unit 1310, a face recognition model 1320, an array file update unit 1330, a memory 1340, and an interface unit 1350.

When a target user to be authenticated approaches, the first photographing unit 1210 photographs the target user to generate a photographed image, and transmits the generated photographed image to the input image generation unit 1250.

The input image generation unit 1250 generates input images to be used for face recognition from the photographed image of the target user transmitted from the first photographing unit 1210. Specifically, the input image generation unit 1250 down-samples or up-samples a single photographed image of the target user up to a predetermined level, thereby generating a plurality of target user images of different resolutions from that single photographed image.

For example, the input image generation unit 1250 may generate down-sampled target user images by applying a Gaussian pyramid to the target user image, or may generate up-sampled target user images by applying a Laplacian pyramid to the target user image.

Once the plurality of target user images of different resolutions have been generated, the input image generation unit 1250 slides a window of a predetermined pixel size across each target user image and takes the plurality of images obtained in this way as the input images. The input image generation unit 1250 then feeds the generated input images to the face recognition unit 1300.
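
The multi-resolution and sliding-window steps can be sketched as below. The sketch makes several assumptions that are not in the description: the image is a 2-D grayscale NumPy array, each pyramid step is approximated by 2x2 block averaging rather than a true Gaussian pyramid, and the window size, stride, and number of levels are placeholder values.

```python
import numpy as np

def downsample(image):
    """Halve the resolution by 2x2 block averaging (stand-in for one Gaussian-pyramid step)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    img = image[:h, :w].astype(float)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(image, levels):
    """Return the original image followed by `levels` successively down-sampled copies."""
    pyramid = [image.astype(float)]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def sliding_windows(image, size, stride):
    """Yield every size x size crop obtained by moving a window across the image."""
    h, w = image.shape
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            yield image[top:top + size, left:left + size]

def generate_input_images(image, levels=2, size=12, stride=4):
    """Collect window crops from every pyramid level (all parameter values are assumptions)."""
    return [crop
            for level in build_pyramid(image, levels)
            for crop in sliding_windows(level, size, stride)]
```

Because the window size is fixed while the image shrinks, the same window covers a larger portion of the scene at coarser levels, which is what lets a fixed-size detector find faces of different sizes.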

When the face recognition unit 1300 receives the input images of the target user from the input image generation unit 1250, it inputs them into the face recognition model 1320 distributed from the central server 110 to extract a target face image, and extracts a target feature vector from the extracted target face image. In particular, the face recognition model 1320 distributed from the central server 110 may be one whose error has been reduced through training by the error reduction unit 286 described above.

The face recognition model 1320 may also be updated at predetermined intervals. For example, whenever the central server 110 updates the face recognition model, the edge device 120 may receive the new face recognition model 1320 from the central server 110 and replace the previously distributed face recognition model 1320 with it.

The face recognition model 1320 used to extract the target face image and the target feature vector is the same as the face recognition model 225 shown in FIGS. 4 to 8, so a detailed description of it is omitted.

Likewise, the way the face recognition unit 1300 uses the face recognition model 1320 to extract the target face image and the target feature vector from the input images of the target user is the same as the way the face recognition unit 220 in the central server 110 uses the face recognition model 225 to extract a face image and feature vectors, so a detailed description of it is omitted.

The authentication unit 1310 compares the target feature vector obtained by the face recognition unit 1300 with the feature vectors contained in the array file received from the central server 110, and thereby authenticates whether the target user is a legitimate user permitted to enter the location. The method by which the authentication unit 1310 authenticates the target user is described in detail below.

First, for each array in the array file, the authentication unit 1310 subtracts the target feature vector from the feature vector of that array index by index and squares each difference to obtain first result values. It then sums the first result values calculated for the indices to obtain a second result value, and subtracts the second result value from a predetermined reference value to obtain a third result value.

The authentication unit 1310 determines that the user mapped to the array with the largest third result value among the arrays in the array file is the user most similar to the target user. If that third result value is greater than or equal to a predetermined threshold, the authentication unit 1310 authenticates the target user as a user with legitimate authority, and the target user may accordingly be permitted to enter the location.
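
The three result values and the threshold check can be sketched with NumPy as follows. The function name, the argument layout (one row per registered user), and the user identifiers are assumptions of this sketch; the default reference value 1 and threshold 0.15 are the example values given in the description.

```python
import numpy as np

def authenticate(array_file, user_ids, target, reference=1.0, threshold=0.15):
    """Return (user_id or None, best score) for a target feature vector.

    array_file: 2-D array with one row of feature values per registered user.
    user_ids:   identifiers mapped to the rows, in the same order.
    target:     1-D target feature vector with the same length as a row.
    """
    first = (array_file - target) ** 2   # first result: squared difference per index
    second = first.sum(axis=1)           # second result: sum over indices, per array
    third = reference - second           # third result: reference minus that sum
    best = int(np.argmax(third))         # array most similar to the target
    score = float(third[best])
    if score >= threshold:
        return user_ids[best], score
    return None, score
```

The second result is a squared Euclidean distance, so a larger third result means a closer match, and the threshold effectively caps the distance at which a match is accepted.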

In one embodiment, the threshold may be set differently according to the security level of the place where the edge device 120 is installed. For example, the threshold may be set high when the edge device 120 is installed in an area subject to a high security level, and set low when it is installed in an area subject to a low security level.

Hereinafter, an example is used to describe how the authentication unit 1310 compares the target feature vector obtained by the face recognition unit 1300 with the feature vectors contained in the array file to authenticate whether the target user is a legitimate user permitted to enter the corresponding floor.

FIG. 14 illustrates by way of example how the authentication unit authenticates a target user. As shown in FIG. 14, the array file holds one array per row, each array containing the feature vectors of one user. For example, the feature vectors of a first user are arranged sequentially in the first row, and the feature vectors of a second user are arranged sequentially in the second row. The feature vectors of each user are arranged within a single row in index order.

As shown in FIG. 14A, the authentication unit 1310 calculates, index by index, the difference between the feature vectors of each array of the array file 1410 and the target feature vectors. As shown in FIG. 14B, it squares the calculated differences to obtain the first result values, and as shown in FIG. 14C, it sums all the first result values of each array to obtain the second result value.

Then, as shown in FIG. 14D, the authentication unit 1310 subtracts the second result value from a predetermined reference value (for example, 1) to obtain the third result values, and determines that the user mapped to the array with the largest third result value, 0.310528, is the user most similar to the target user. Because this largest third result value exceeds the predetermined threshold (for example, 0.15), the authentication unit 1310 finally authenticates the target user as the user mapped to that array.

The way the authentication unit 1310 authenticates the target user can be expressed as Equation 2 below.

$$Z = R - \sum_{i=1}^{n} (X_i - Y_i)^2 \qquad \text{(Equation 2)}$$

In Equation 2, $Z$ denotes the third result value, $R$ denotes the predetermined reference value, $X_i$ denotes the feature vector at the i-th index among the n feature vectors of an array, and $Y_i$ denotes the target feature vector at the i-th index among the n target feature vectors.

The face recognition model 1320 is generated and distributed by the central server 110. Whenever the face recognition model 225 is trained and updated by the central server 110, the face recognition model 1320 is replaced with the updated face recognition model 225. The face recognition model 1320 may be received from the central server 110 through the interface unit 1350.

When an array file is received from the central server 110 through the interface unit 1350, the array file update unit 1330 loads it into the first memory 1342 so that the authentication unit 1310 can use it to authenticate the target user. In particular, the array file update unit 1330 according to the present invention can load the array file dynamically.

Specifically, if a new array file is received from the central server 110 while an array file is loaded in the first memory 1342, the array file update unit 1330 loads the new array file into the second memory 1344. Once loading of the new array file into the second memory 1344 is complete, it replaces the array file loaded in the first memory 1342 with the new array file loaded in the second memory 1344.

The array file update unit 1330 loads the array file dynamically in this way so that it can update to a new array file while the authentication unit 1310 is performing authentication of a target user, which allows the edge device 120 to perform face recognition in real time on the basis of the newly updated array file.

The array file used by the authentication unit 1310 is loaded in the first memory 1342, and a newly received array file is loaded into the second memory 1344. When loading of the new array file into the second memory 1344 is complete, the array file update unit 1330 replaces the array file recorded in the first memory 1342 with the new array file.
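
The two-memory update can be sketched as a double-buffered store. This is a hedged illustration only: the class and method names are hypothetical, the "replacement" is modeled as a reference swap under a lock (the description does not say how the replacement is implemented), and a single updater thread is assumed.

```python
import threading

class ArrayFileStore:
    """Double-buffered holder for the array file (names are illustrative assumptions)."""

    def __init__(self, initial):
        self._active = initial    # plays the role of the first memory
        self._staging = None      # plays the role of the second memory
        self._lock = threading.Lock()

    def current(self):
        """Array file the authentication step should read right now."""
        with self._lock:
            return self._active

    def update(self, load_new_file):
        """Load a new array file in the background buffer, then switch to it.

        load_new_file is a hypothetical callable that returns the fully loaded
        new array file. The slow load runs without holding the lock, so readers
        keep using the old file until the load has completed.
        """
        self._staging = load_new_file()
        with self._lock:
            self._active = self._staging
        self._staging = None
```

Readers call `current()` for each authentication, so an authentication already in progress finishes against the file it started with and the next one sees the new file.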

The interface unit 1350 mediates data transmission and reception between the edge device 120 and the central server 110. Specifically, the interface unit 1350 receives the face recognition model 1320 from the central server 110, and receives the array file from the central server 110 so that it is loaded into the first memory 1342 or the second memory 1344 through the array file update unit 1330. The interface unit 1350 also periodically transmits the authentication records of the authentication unit 1310 to the central server 110.

In one embodiment, the array file and the face recognition model 1320 may be updated at predetermined intervals through the interface unit 1350.

As described above, according to the present invention the edge device 120 stores only the face recognition model 1320 and the array file needed for face recognition, and does not store users' face images or personal information. Even if the edge device 120 is hacked, there is therefore no concern that users' personal information will be leaked, which strengthens security.

In the edge device 120 according to the first embodiment described above, the photographed image generated by the first photographing unit 1210 is input directly to the input image generation unit 1250. With this arrangement, however, even when the first photographing unit 1210 captures a photograph that contains the target user's face, it generates a normal photographed image and passes it to the input image generation unit 1250. As a result, someone other than the target user could be authenticated by presenting a photograph of the target user.

Hereinafter, an edge device 120 according to a second embodiment, which can solve the above problem, is described in detail.

FIG. 15 is a block diagram showing the configuration of an edge device according to a second embodiment of the present invention. The edge device according to the second embodiment shown in FIG. 15 differs from the edge device according to the first embodiment shown in FIG. 13 in that it further includes a second photographing unit 1510 and an authenticity determination unit 1520. For convenience, components that function in the same way as in the first embodiment are not described again; only the newly added second photographing unit 1510 and authenticity determination unit 1520, and the first photographing unit 1210 whose function changes because of the added components, are described below.

The first photographing unit 1210 photographs a subject to generate a photographed image, and transmits the generated photographed image to the authenticity determination unit 1520.

The second photographing unit 1510 photographs the subject to generate a depth image. The second photographing unit 1510 may photograph the subject at the same time as the first photographing unit 1210, or a predetermined time before or after the first photographing unit 1210 photographs the subject.

In one embodiment, the second photographing unit 1510 may be implemented as an IR camera capable of photographing the subject and generating a depth image.

The edge device 120 according to the second embodiment generates a depth image of the subject through the second photographing unit 1510 because the depth image takes a different form depending on whether the second photographing unit 1510 captures the subject's actual face or a photograph containing the subject's face.

For example, when the second photographing unit 1510 captures a photograph containing the subject's face, a depth image of the form shown in FIG. 16A is generated, whereas when it captures the subject's actual face, a depth image of the form shown in FIG. 16B is generated.

The second photographing unit 1510 transmits the generated depth image to the authenticity determination unit 1520.

Using the depth image transmitted from the second photographing unit 1510, the authenticity determination unit 1520 determines whether what the second photographing unit 1510 captured is a photograph or the subject's actual face.

Specifically, the authenticity determination unit 1520 extracts depth data from the depth image received from the second photographing unit 1510 and determines through binary classification whether the depth image shows the subject's actual face. In one embodiment, the authenticity determination unit 1520 may improve its accuracy in distinguishing actual faces from photographs through training based on a deep learning algorithm.

If the authenticity determination unit 1520 determines that the subject captured by the second photographing unit 1510 is an actual face, it transmits the photographed image received from the first photographing unit 1210 to the input image generation unit 1250. If it determines that the subject is not an actual face, it does not transmit the photographed image to the input image generation unit 1250. Instead, it may output an alarm message in text or voice form indicating that authentication has failed, or notify the system operator that an abnormal access attempt has occurred.
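
The gating behavior can be sketched as below. The description uses a trained binary classifier; this sketch substitutes a simple depth-relief heuristic as a stand-in, which is an assumption and not the patented method. The function names, the `forward` and `alert` callbacks, the depth unit, and the `min_relief` value are all hypothetical.

```python
import numpy as np

def is_real_face(depth_image, min_relief=10.0):
    """Stand-in binary classifier: a flat photograph shows little depth variation.

    depth_image is assumed to be a 2-D array of distances with 0 meaning
    "no reading". The description instead trains a deep-learning classifier.
    """
    valid = depth_image[depth_image > 0]
    if valid.size == 0:
        return False
    relief = np.percentile(valid, 95) - np.percentile(valid, 5)
    return bool(relief >= min_relief)

def gate(photo, depth_image, forward, alert):
    """Forward the photographed image only when the depth image looks like a real face."""
    if is_real_face(depth_image):
        forward(photo)   # e.g. hand the image to the input image generation step
        return True
    alert("authentication failed: depth image does not look like a real face")
    return False
```

Keeping the check in front of input image generation means a rejected capture never reaches the face recognition model at all.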

In this way, the present invention accurately determines whether the captured subject is an actual face or a photograph and prevents authentication from proceeding when a photograph has been captured. Attempts by an unauthorized user to be authenticated with another person's photograph can thus be blocked at the source, which improves security.

Referring again to FIG. 1, the user terminal 130 transmits a user image for newly registering a user to the central server 110 together with the user's identification information. In one embodiment, the user terminal 130 is equipped with a face registration agent (not shown) that can interwork with the central server 110. By running the face registration agent on the user terminal 130, the user can transmit a newly captured image of the user's face, or a previously captured image, to the central server 110 together with the user identification information.

In one embodiment, the user terminal 130 may request registration of a plurality of user images for each user. The plurality of images submitted for registration for each user may be photographs taken in different environments or under different lighting.

Any type of device may be used as the user terminal 130 as long as it can transmit a user image to the central server 110 and request user registration. For example, the user terminal 130 may be implemented as a smartphone, a notebook computer, a desktop computer, or a tablet PC.

Those skilled in the art to which the present invention pertains will understand that the present invention described above may be implemented in other specific forms without changing its technical spirit or essential features.

For example, the configuration of the central server shown in FIG. 2 and the configuration of the edge device shown in FIG. 13 may be implemented in the form of a program. When the central server and the edge device according to the present invention are implemented as programs, each component shown in FIGS. 2 and 13 is implemented as code, and the code implementing a specific function may be implemented as a single program or divided among a plurality of programs.

Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: face recognition system 110: central server
120: edge device 130: user terminal
210: user registration unit 215: input image generation unit
220: face recognition unit 225: face recognition model
230: array file generation unit 240: face recognition model training unit
250: interface unit

Claims

An edge device for face recognition, comprising:
a first photographing unit that acquires a photographed image of a target user to be authenticated;
a face recognition unit that inputs input images generated from the photographed image into a face recognition model, extracts a target face image from the input images, and generates a target feature vector from the extracted target face image;
an authentication unit that authenticates the target user by comparing the target feature vector with an array file composed of a plurality of arrays, each having feature vectors extracted from the face image of one of a plurality of users and identification information of that user; and
an interface unit that receives the array file and the face recognition model from a central server,
wherein the face recognition model is trained through an error reduction algorithm, and
wherein the error reduction algorithm arranges training images on a two-dimensional angular plane based on a plurality of feature vectors obtained by inputting the training images into the face recognition model, varies a margin angle added to a reference angle between training images belonging to different classes, calculates, for each varied margin angle, the probability that each training image belongs to the corresponding class, and trains the face recognition model based on the calculated probabilities.

The edge device of claim 1,
wherein the authentication unit
calculates, for each array, first result values by subtracting the target feature vector from the feature vector included in the array index by index and squaring each difference, and authenticates the target user using a second result value obtained by summing the first result values calculated for the indices within the array.

The edge device of claim 2,
wherein the authentication unit
calculates a third result value for each array by subtracting the second result value from a predetermined reference value, and determines that the user mapped to the array having the largest third result value among the plurality of arrays included in the array file is the user most similar to the target user.

The edge device of claim 2,
wherein the authentication unit
calculates a third result value for each array by subtracting the second result value from a predetermined reference value, and authenticates the target user when the calculated third result value is greater than or equal to a predetermined threshold value.

The edge device of claim 4,
wherein the threshold value is set differently according to the security level of the place where the edge device is installed.

The edge device of claim 1,
further comprising an array file update unit that, when a new array file is received from the central server through the interface unit while the array file is loaded in a first memory, loads the new array file into a second memory and, once loading of the new array file into the second memory is complete, replaces the array file loaded in the first memory with the new array file loaded in the second memory.

The edge device of claim 1,
further comprising an input image generation unit that generates a plurality of input images having different resolutions by up-sampling or down-sampling the photographed image captured by the first photographing unit.

The edge device of claim 1,
wherein the face recognition model comprises:
a face image extraction unit that extracts the target face image from the input images;
a plurality of face image processing units, each of which image-processes input data to generate output data; and
a feature vector generation unit that generates a predetermined number of feature vectors by merging the output data of the last face image processing unit among the plurality of face image processing units into a single layer,
wherein the target face image extracted by the face image extraction unit is input as the input data to the first face image processing unit among the plurality of face image processing units, and the output data of the n-th face image processing unit is input as the input data to the (n+1)-th face image processing unit.

The edge device of claim 8,
wherein the face image extraction unit extracts the target face image from the input images using two or more face detection units having neural networks of different depths, and
the two or more face detection units are arranged sequentially in order of increasing depth, so that the number of input images input to the n-th face detection unit is smaller than the number of input images input to the (n-1)-th face detection unit.

The edge device of claim 8,
wherein the face image extraction unit comprises:
a first face detection unit that generates feature maps of a plurality of input images using a first neural network having a first depth, and primarily selects, based on the feature maps, a plurality of first sub-input images including a face region from among the plurality of input images;
a second face detection unit that generates feature maps of the plurality of first sub-input images using a second neural network having a second depth deeper than the first depth, and secondarily selects, based on the feature maps generated through the second neural network, a plurality of second sub-input images including a face region from among the plurality of first sub-input images; and
a third face detection unit that generates feature maps of the plurality of second sub-input images using a third neural network having a third depth deeper than the second depth, and selects, based on the feature maps generated through the third neural network, the target face image including a face region from among the plurality of second sub-input images.
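
The staged selection recited above can be sketched as a cascade in which each stage filters the candidates passed on by the previous one, so deeper (more expensive) stages see fewer images. In this sketch the stages are plain callables standing in for the neural networks; `cascade_select` and the score-based stand-ins in the usage note are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def cascade_select(candidates: Sequence[T],
                   stages: Sequence[Callable[[T], bool]]) -> Tuple[List[T], List[int]]:
    """Run candidates through successive stages.

    Returns the survivors of the last stage and the number of candidates
    that entered each stage.
    """
    survivors = list(candidates)
    inputs_per_stage = []
    for contains_face in stages:
        inputs_per_stage.append(len(survivors))
        # Only candidates judged to contain a face move on to the next stage.
        survivors = [c for c in survivors if contains_face(c)]
    return survivors, inputs_per_stage
```

For example, with candidates represented by made-up face scores `[0.1, 0.4, 0.7, 0.95]` and three stages that accept scores above 0.3, 0.6, and 0.9 respectively, the stages receive 4, 3, and 2 candidates and only the last candidate survives.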

The edge device of claim 8,
wherein the face image extraction unit
further comprises a face image alignment unit that obtains landmark coordinates from the target face image and aligns the target face image by performing at least one of rotation, translation, enlargement, and reduction on the target face image so that the obtained landmark coordinates match predetermined reference landmark coordinates.
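
Rotation, translation, and scaling together form a similarity transform. As a simplified sketch, assume only two landmarks (for example the two eye centers) are used; the transform is then determined exactly and can be written with complex numbers as z -> a*z + b. The function names are hypothetical, and a practical aligner would more likely fit more landmarks by least squares.

```python
from typing import Tuple

Point = Tuple[float, float]


def fit_similarity(src: Tuple[Point, Point],
                   ref: Tuple[Point, Point]) -> Tuple[complex, complex]:
    """Find a, b such that a*z + b maps both source landmarks onto the references."""
    p0, p1 = (complex(*p) for p in src)
    q0, q1 = (complex(*q) for q in ref)
    if p0 == p1:
        raise ValueError("source landmarks must be distinct")
    a = (q1 - q0) / (p1 - p0)  # encodes rotation angle and scale factor
    b = q0 - a * p0            # encodes translation
    return a, b


def apply_similarity(a: complex, b: complex, point: Point) -> Point:
    """Map one pixel coordinate through the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)
```

Here `abs(a)` is the enlargement or reduction factor and the argument of `a` is the rotation angle; applying the transform to every pixel coordinate aligns the face image.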

The edge device of claim 8,
wherein the face image processing unit comprises:
a normalization unit that normalizes the input data;
a first convolution operation unit that generates a first feature map by applying a first convolution filter to the normalized input data; and
a non-linearization unit that imparts a nonlinear characteristic to the first feature map by outputting positive pixel values of the first feature map as they are and outputting negative pixel values with reduced magnitude.

The edge device of claim 12,
wherein the face image processing unit
further comprises a second convolution operation unit that generates a second feature map by applying a second convolution filter to the first feature map to which the nonlinear characteristic has been imparted, and
the first convolution filter and the second convolution filter are filters having the same size and different stride values.
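
The processing chain of normalization, first convolution, non-linearization, and second convolution can be sketched on a single-channel map as below. Several details are assumptions made for illustration: zero-mean, unit-variance normalization, a negative-side slope of 0.25, square filters without padding, and strides of 1 and 2.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def normalize(x: Matrix) -> Matrix:
    """Shift to zero mean and scale to unit variance (one possible normalization)."""
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return [[(v - mean) / std for v in row] for row in x]


def conv2d(x: Matrix, kernel: Matrix, stride: int) -> Matrix:
    """Apply a square filter without padding, moving `stride` pixels per step."""
    k = len(kernel)
    rows = range(0, len(x) - k + 1, stride)
    cols = range(0, len(x[0]) - k + 1, stride)
    return [[sum(kernel[i][j] * x[r + i][c + j] for i in range(k) for j in range(k))
             for c in cols] for r in rows]


def leaky(x: Matrix, slope: float = 0.25) -> Matrix:
    """Pass positive values unchanged; shrink the magnitude of negative values."""
    return [[v if v > 0 else slope * v for v in row] for row in x]


def process(x: Matrix, first_filter: Matrix, second_filter: Matrix) -> Matrix:
    first_map = leaky(conv2d(normalize(x), first_filter, stride=1))
    # Same filter size as the first convolution, but a different stride.
    return conv2d(first_map, second_filter, stride=2)
```

With a 7x7 input and 3x3 filters, the first feature map is 5x5 and the second feature map is 2x2, so in this sketch the larger stride is what reduces the spatial size.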

The edge device of claim 8,
wherein the face image processing unit comprises:
a sampling unit that sub-samples the generated feature map to reduce the dimension of the feature map;
a weight reflecting unit that applies a weight to the sub-sampled feature map; and
an upscaling unit that upscales the weighted feature map to the same dimension as the feature map input to the sampling unit.

The edge device of claim 14,
wherein the weight reflecting unit comprises:
a dimension reduction unit that reduces the dimension by connecting the sub-sampled feature maps into one layer;
a first non-linearization unit that imparts a nonlinear characteristic to the dimension-reduced feature map by outputting positive pixel values of the dimension-reduced feature map as they are and outputting negative pixel values as 0;
a dimension increasing unit that increases the dimension of the feature map to which the nonlinear characteristic has been imparted; and
a second non-linearization unit that imparts a nonlinear characteristic to the dimension-increased feature map by converging positive pixel values of the dimension-increased feature map to a predetermined value and outputting negative pixel values as 0.
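
One way to read the sampling, weight reflecting, and upscaling units is as a channel-weighting block in the style of squeeze-and-excitation. The sketch below assumes global average pooling as the sub-sampling, fully connected layers for dimension reduction and increase, ReLU as the first non-linearization, and a sigmoid as the second; none of these specific choices is confirmed by the claim text, and all names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def global_average(channels: List[Matrix]) -> List[float]:
    """Sub-sample each channel map down to a single value."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]


def dense(vec: List[float], weights: Matrix) -> List[float]:
    """Fully connected layer without bias; one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec: List[float]) -> List[float]:
    return [max(0.0, v) for v in vec]


def sigmoid(vec: List[float]) -> List[float]:
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def channel_weights(channels: List[Matrix],
                    w_reduce: Matrix, w_increase: Matrix) -> List[float]:
    squeezed = global_average(channels)          # sampling unit
    reduced = relu(dense(squeezed, w_reduce))    # reduction + first non-linearity
    return sigmoid(dense(reduced, w_increase))   # increase + second non-linearity


def upscale(weights: List[float], height: int, width: int) -> List[Matrix]:
    """Broadcast each channel weight back to the original spatial size."""
    return [[[w] * width for _ in range(height)] for w in weights]
```

In squeeze-and-excitation-style blocks the upscaled weights would typically multiply the original feature map channel by channel; the claim itself stops at the upscaling step, so that multiplication is left out here.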

delete

The edge device of claim 1,
wherein the array file and the face recognition model are updated at predetermined intervals.

The edge device of claim 1, further comprising:
a second photographing unit that photographs a photographing object and generates a depth image; and
an authenticity determination unit that determines, based on the depth image generated by the second photographing unit, whether the photographing object is an actual face or a photograph, and, when the photographing object is determined to be an actual face, causes the photographed image captured by the first photographing unit to be generated as the input image.
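
The claim does not say how the depth image is evaluated. One simple heuristic, shown purely as an assumption, is that a printed photograph is nearly flat while a real face has measurable depth relief. The threshold `min_relief_mm` and both function names are hypothetical.

```python
from typing import List, Optional

DepthImage = List[List[float]]  # depth per pixel, in millimetres (assumed unit)


def is_real_face(depth: DepthImage, min_relief_mm: float = 10.0) -> bool:
    """Treat the object as a real face if its depth varies enough to rule out a flat print."""
    values = [d for row in depth for d in row]
    relief = max(values) - min(values)
    return relief >= min_relief_mm


def select_input_image(photo: object, depth: DepthImage) -> Optional[object]:
    """Pass the captured image on only when the depth check succeeds."""
    return photo if is_real_face(depth) else None
```

A tilted photograph would also show a depth range, so a practical check would need something more robust, such as fitting a plane and measuring the residual.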