KR20230119574A

KR20230119574A - Method of operating neural network, training method of neural network, method of detecting whether biometric information is spoofed using neural network, and electronic device thereof

Info

Publication number: KR20230119574A
Application number: KR1020220041081A
Authority: KR
Inventors: 이재윤; 조기호; 김지환; 박성언; 한재준
Original assignee: 삼성전자주식회사
Priority date: 2022-02-07
Filing date: 2022-04-01
Publication date: 2023-08-16

Abstract

According to an embodiment, a method of operating a neural network including an input layer, a plurality of intermediate layers, and an output layer includes: generating a first intermediate vector by applying a first activation function to first nodes belonging to an arbitrary first intermediate layer adjacent to the input layer among the plurality of intermediate layers; transferring the first intermediate vector to second nodes belonging to a second intermediate layer adjacent to the output layer among the intermediate layers; generating a second intermediate vector by applying a second activation function to the second nodes; and applying the second intermediate vector to the output layer. A multiplier of the second activation function is determined by a first hyperparameter associated with an ascending slope of the second activation function and a second hyperparameter associated with a descending slope of the second activation function, such that a peak value of the second activation function is fixed.

Description

Method of operating a neural network, method of training a neural network, method of detecting whether biometric information is spoofed using a neural network, and electronic device thereof

The following disclosure relates to a method of operating a neural network, a method of training a neural network, a method of detecting whether biometric information is spoofed using a neural network, and an electronic device thereof.

Neural networks are becoming an essential component of modern machine learning. By assuming probability distributions over the parameters of a neural network, a distribution can be induced from the inputs of the network to its outputs. Users want neural networks to model more flexibly, but a neural network generally cannot make sound judgments about, for example, regions on which it was not trained.

According to an embodiment, a method of operating a neural network including an input layer, a plurality of intermediate layers, and an output layer includes: generating a first intermediate vector by applying a first activation function to first nodes belonging to an arbitrary first intermediate layer adjacent to the input layer among the plurality of intermediate layers; transferring the first intermediate vector to second nodes belonging to a second intermediate layer adjacent to the output layer among the intermediate layers; generating a second intermediate vector by applying a second activation function to the second nodes; and applying the second intermediate vector to the output layer, wherein a multiplier of the second activation function may be determined by a first hyperparameter associated with an ascending slope of the second activation function and a second hyperparameter associated with a descending slope of the second activation function, such that a peak value of the second activation function is fixed.

The dynamic range of the second activation function may be limited to [0, 1].

The second activation function is expressed by the following equation,

[Equation]

Here, a denotes the first hyperparameter associated with the ascending slope of the second activation function, b denotes the second hyperparameter associated with the descending slope of the second activation function, e denotes Euler's number, x denotes the input to the second nodes, and the step term denotes a Heaviside step function that sets the output of the second activation function to 0 when x is less than 0.
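The equation itself is not reproduced in this text. Purely for illustration, the sketch below uses one hypothetical function that is consistent with the verbal description: a rising factor x^a, a falling factor e^(-bx), a Heaviside gate, and a multiplier (e·b/a)^a chosen so that the peak value stays at 1 for any positive a and b. The functional form and the names `heaviside` and `peak_fixed_activation` are assumptions, not the equation of the disclosure.

```python
import math


def heaviside(x: float) -> float:
    """Heaviside step: 0 for x < 0, 1 otherwise."""
    return 0.0 if x < 0 else 1.0


def peak_fixed_activation(x: float, a: float = 2.0, b: float = 1.0) -> float:
    """Hypothetical bounded activation (assumed form, not the patent's equation).

    f(x) = (e*b/a)**a * x**a * exp(-b*x) * H(x), with a > 0 and b > 0.
    The maximum is at x = a/b, where the multiplier cancels the other
    factors and f(a/b) = 1.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if x < 0:
        return 0.0  # gated off by the Heaviside step
    multiplier = (math.e * b / a) ** a  # depends only on a and b
    return multiplier * (x ** a) * math.exp(-b * x) * heaviside(x)
```

Under this assumed form, a larger a makes the rise steeper, a larger b makes the decay faster, and the output remains within [0, 1] for every input.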

The first activation function may include any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a leaky ReLU function.

The neural network may include any one of a convolutional neural network (CNN) and a deep neural network (DNN).

According to an embodiment, a method of training a neural network including an input layer, a plurality of intermediate layers, and an output layer includes: extracting a first result value by applying a first activation function to intermediate nodes belonging to each of the plurality of intermediate layers; extracting a second result value by applying a second activation function, different from the first activation function, to additional nodes connected to intermediate nodes belonging to one or more arbitrary layers among the plurality of intermediate layers; and training the neural network based on a difference between the first result value and the second result value.

A multiplier of the second activation function may be determined by a first hyperparameter associated with an ascending slope of the second activation function and a second hyperparameter associated with a descending slope of the second activation function, such that a peak value of the second activation function is fixed.

The number of additional nodes may be one less than the number of intermediate nodes, and the additional nodes and the intermediate nodes may be fully connected.
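As a rough illustration of this wiring, the sketch below builds n - 1 additional nodes that are each connected to all n intermediate nodes and applies a caller-supplied second activation to each weighted sum. The helper names `make_additional_weights` and `additional_node_outputs`, the constant initial weight, and the absence of bias terms are assumptions for illustration; how the first and second result values are compared during training is not sketched here.

```python
from typing import Callable, List, Sequence


def make_additional_weights(n_intermediate: int, value: float = 0.1) -> List[List[float]]:
    """Return weights for n_intermediate - 1 additional nodes.

    Each row holds one weight per intermediate node, so every additional
    node is connected to every intermediate node (fully connected).
    """
    if n_intermediate < 2:
        raise ValueError("need at least two intermediate nodes")
    return [[value] * n_intermediate for _ in range(n_intermediate - 1)]


def additional_node_outputs(intermediate: Sequence[float],
                            weights: Sequence[Sequence[float]],
                            activation: Callable[[float], float]) -> List[float]:
    """Apply the second activation to each additional node's weighted sum."""
    return [activation(sum(w * h for w, h in zip(row, intermediate)))
            for row in weights]
```

For example, four intermediate nodes yield a 3 x 4 weight matrix and therefore three additional-node outputs.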

[Equation]

Here, a denotes the first hyperparameter associated with the ascending slope of the second activation function, b denotes the second hyperparameter associated with the descending slope of the second activation function, e denotes Euler's number, x denotes the input to the additional nodes, and the step term denotes a Heaviside step function that sets the output of the activation function to 0 when x is less than 0.

The first activation function may include any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a leaky ReLU function.

According to an embodiment, a method of training a neural network including an input layer, a plurality of intermediate layers, and an output layer includes: generating a first feature vector by propagating training data input to the input layer to first nodes that belong to a first intermediate layer adjacent to the input layer among the intermediate layers and that operate according to a first activation function; performing primary training of the neural network based on a difference between the first feature vector and a ground-truth vector corresponding to the training data; generating a second feature vector by propagating the first feature vector to second nodes that belong to a second intermediate layer adjacent to the output layer among the intermediate layers of the primarily trained neural network and that operate according to a second activation function; and performing secondary training of the primarily trained neural network based on a difference between an output value obtained by outputting the second embedding vector through the output layer and a ground-truth value corresponding to the training data.

[Equation]

Here, a denotes the first hyperparameter associated with the ascending slope of the second activation function, b denotes the second hyperparameter associated with the descending slope of the second activation function, e denotes Euler's number, x denotes the second feature vector, and the step term denotes a Heaviside step function that sets the output of the second activation function to 0 when x is less than 0.

The first activation function may include any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a leaky ReLU function.

According to an embodiment, a method of detecting whether biometric information is spoofed using a neural network includes: extracting one or more first feature vectors, using one or more pre-trained first classifiers, from a plurality of intermediate layers of the neural network, which detects whether the biometric information is spoofed from input data including a user's biometric information; detecting a first spoofing status of the biometric information based on the one or more first feature vectors; calculating a second score by applying an output vector output from the output layer to a pre-trained second classifier, depending on whether the first spoofing status is detected; and detecting a second spoofing status of the biometric information based on a score obtained by fusing a first score, calculated based on the one or more first feature vectors, with the second score, wherein at least one of the one or more first classifiers and the second classifier may be trained using an activation function for the neural network whose multiplier is determined by a first hyperparameter associated with an ascending slope of the activation function and a second hyperparameter associated with a descending slope of the activation function, such that a peak value of the activation function is fixed.
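The two-stage flow described above can be sketched as an early decision followed by score fusion. The sketch below is only one possible reading: it assumes that a higher score means "more likely live", that the first stage decides early when its score is beyond one of two thresholds, and that fusion is a weighted sum. The function name `detect_spoof`, the thresholds, and the fusion weight are all hypothetical.

```python
from typing import Callable


def detect_spoof(first_score: float,
                 second_score_fn: Callable[[], float],
                 live_threshold: float = 0.9,
                 spoof_threshold: float = 0.1,
                 fusion_weight: float = 0.5,
                 final_threshold: float = 0.5) -> str:
    """Return "live" or "spoof" using an assumed early-exit and fusion rule."""
    # First stage: decide immediately when the first score is conclusive.
    if first_score >= live_threshold:
        return "live"
    if first_score <= spoof_threshold:
        return "spoof"
    # Second stage: only now run the (more expensive) second classifier.
    second_score = second_score_fn()
    fused = fusion_weight * first_score + (1.0 - fusion_weight) * second_score
    return "live" if fused >= final_threshold else "spoof"
```

Passing the second classifier as a callable reflects the idea that the output-layer path is evaluated only when the first stage does not reach a decision.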

The dynamic range of the activation function may be controlled to [0, 1].

The activation function is expressed by the following equation,

[Equation]

Here, a denotes the first hyperparameter associated with the ascending slope of the activation function, b denotes the second hyperparameter associated with the descending slope of the activation function, e denotes Euler's number, x denotes the input data, and the step term denotes a Heaviside step function that sets the output of the activation function to 0 when x is less than 0.

The extracting of the one or more first feature vectors may include: extracting a (1-1)th feature vector from a first intermediate layer among the plurality of intermediate layers using a (1-1)th classifier among the one or more first classifiers; extracting a (1-2)th feature vector from a second intermediate layer subsequent to the first intermediate layer using a (1-2)th classifier among the one or more first classifiers; and extracting a feature vector that combines the (1-1)th feature vector and the (1-2)th feature vector.

The detecting of the first spoofing status of the biometric information may include: calculating the first score based on a similarity between the combined feature vector and at least one of a pre-provided enrolled feature vector and a pre-provided spoof feature vector; and classifying, using the one or more first classifiers, whether the first score is a score determined to indicate spoof information or a score determined to indicate live information.
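One way to picture the first score is sketched below. It assumes that the combination of the two feature vectors is a simple concatenation, that the similarity is cosine similarity, and that the score is the best similarity to the enrolled vectors minus the best similarity to the spoof vectors; none of these choices is stated in the text, and the names `cosine_similarity` and `first_score` are hypothetical. Both reference lists must be non-empty in this sketch.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    return dot / norm if norm else 0.0


def first_score(feat_1_1: Sequence[float],
                feat_1_2: Sequence[float],
                enrolled: Sequence[Sequence[float]],
                spoofed: Sequence[Sequence[float]]) -> float:
    """Score the combined feature vector against enrolled and spoof vectors."""
    combined: List[float] = list(feat_1_1) + list(feat_1_2)  # assumed: concatenation
    live_sim = max(cosine_similarity(combined, e) for e in enrolled)
    spoof_sim = max(cosine_similarity(combined, s) for s in spoofed)
    return live_sim - spoof_sim  # positive leans live, negative leans spoof
```

A classifier could then compare this score with a threshold to label it as indicating spoof or live information.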

The biometric information may include any one of the user's fingerprint, iris, and face.

According to an embodiment, an electronic device that detects whether biometric information is spoofed using a neural network includes: a sensor that captures input data including the biometric information of a user; a processor that extracts one or more first feature vectors, using one or more pre-trained first classifiers, from a plurality of intermediate layers of the neural network, which detects whether the biometric information is spoofed from the input data, detects a first spoofing status of the biometric information based on the one or more first feature vectors, calculates a second score by applying an output vector output from the output layer to a pre-trained second classifier depending on whether the first spoofing status is detected, and detects a second spoofing status of the biometric information based on a score obtained by fusing a first score, calculated based on the one or more first feature vectors, with the second score; and an output device that outputs at least one of the first spoofing status and the second spoofing status, wherein at least one of the one or more first classifiers and the second classifier is trained based on an activation function for the neural network whose multiplier is determined by a first hyperparameter associated with an ascending slope of the activation function and a second hyperparameter associated with a descending slope of the activation function, such that a peak value of the activation function is fixed.

[Equation]

Here, a denotes the first hyperparameter associated with the ascending slope of the activation function, b denotes the second hyperparameter associated with the descending slope of the activation function, e denotes Euler's number, x denotes the input to the additional nodes, and the step term denotes a Heaviside step function that sets the output of the activation function to 0 when x is less than 0.

FIG. 1 is a diagram illustrating an environment in which an electronic device including a neural network according to an embodiment is used.
FIG. 2 is a flowchart illustrating a method of operating a neural network according to an embodiment.
FIG. 3 is a diagram illustrating the structure of a neural network according to an embodiment.
FIG. 4 is a diagram for explaining the regions in which a neural network according to an embodiment distinguishes spoof information from live information.
FIG. 5 is a diagram for explaining the activation functions applied to the respective layers of a neural network according to an embodiment.
FIG. 6 is a flowchart illustrating a method of training a neural network according to an embodiment.
FIG. 7 is a diagram for explaining the training method of FIG. 6.
FIG. 8 is a flowchart illustrating a method of training a neural network according to another embodiment.
FIG. 9 is a diagram for explaining the training method of FIG. 8.
FIG. 10 is a flowchart illustrating a method of detecting whether biometric information is spoofed using a neural network according to an embodiment.
FIG. 11 is a diagram for explaining the structure and operation of the neural network of FIG. 10.
FIG. 12 is a block diagram of an electronic device that detects whether biometric information is spoofed using a neural network according to an embodiment.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be modified and implemented in various forms. Accordingly, actual implementations are not limited to the specific embodiments disclosed, and the scope of this specification includes modifications, equivalents, and substitutes that fall within the technical idea described in the embodiments.

Although terms such as "first" and "second" may be used to describe various components, these terms should be interpreted only as distinguishing one component from another. For example, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

When a component is described as being "connected" to another component, it should be understood that it may be directly connected or coupled to that other component, or that intervening components may be present.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the described features, numbers, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or possible addition of one or more other features, numbers, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this specification.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. In the description referring to the drawings, identical components are given identical reference numerals regardless of the figure, and redundant descriptions thereof are omitted.

FIG. 1 is a diagram illustrating an environment in which an electronic device including a neural network according to an embodiment is used. Referring to FIG. 1, an electronic device 100 including a sensor 110 that senses a user's biometric information (for example, a fingerprint) and an enrolled fingerprint database 120 including enrolled fingerprint images 121, 122, and 123 are shown according to an embodiment. For convenience of description, the case in which the user's biometric information is a fingerprint image is described below as an example, but the disclosure is not necessarily limited thereto. In addition to a fingerprint image, the biometric information may include various information such as an iris image, a palm print image, and a face image.

The electronic device 100 may obtain, through the sensor 110, an input fingerprint image 115 showing the user's fingerprint. The sensor 110 may be, for example, an ultrasonic fingerprint sensor, an optical fingerprint sensor, a capacitive fingerprint sensor, or an image sensor that captures the user's fingerprint, but is not necessarily limited thereto. The sensor 110 may be, for example, the sensor 1210 of FIG. 12.

Fingerprint enrollment may be performed for fingerprint recognition. The enrolled fingerprint images 121, 122, and 123 may be stored in advance in the enrolled fingerprint database 120 through a fingerprint enrollment process. To protect personal information, the enrolled fingerprint database 120 may store features or feature vectors extracted from the enrolled fingerprint images 121, 122, and 123 instead of storing the images themselves. The enrolled fingerprint database 120 may be stored in a memory included in the electronic device 100 (for example, the memory 1270 of FIG. 12), or in an external device (not shown) capable of communicating with the electronic device 100, such as a server, a local cache, or a cloud server.

When the input fingerprint image 115 for authentication is received, the electronic device 100 may authenticate the user of the input fingerprint image 115, or may detect whether the input fingerprint image 115 is spoofed, based on the similarity between the input fingerprint shown in the input fingerprint image 115 and the enrolled fingerprints shown in the enrolled fingerprint images 121 to 123. Here, "spoofing" refers to fake biometric information rather than live biometric information, and may be understood to encompass, for example, duplication, forgery, and falsification of biometric information.

As described in detail below, the electronic device 100 according to an embodiment may determine whether to authenticate an input fingerprint, or whether it is spoofed, using pre-provided live fingerprint features of an unspecified number of people, pre-provided spoof fingerprint features of an unspecified number of people, and/or the enrolled fingerprint features of the device user.

FIG. 2 is a flowchart illustrating a method of operating a neural network according to an embodiment. In the following embodiments, the operations may be performed sequentially, but are not necessarily performed sequentially. For example, the order of the operations may be changed, and at least two operations may be performed in parallel.

Referring to FIG. 2, a neural network according to an embodiment, which includes an input layer, a plurality of intermediate layers, and an output layer, may perform steps 210 to 240. The neural network may include, for example, any one of a convolutional neural network (CNN) and a deep neural network (DNN), but is not necessarily limited thereto.

In step 210, the neural network generates a first intermediate vector by applying a first activation function to first nodes belonging to an arbitrary first intermediate layer adjacent to the input layer among the plurality of intermediate layers. The first activation function may include, for example, any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a leaky ReLU function, but is not necessarily limited thereto.

The structure and operation of the neural network are described in more detail below with reference to FIGS. 3 to 5.

In step 220, the neural network transfers the first intermediate vector to second nodes belonging to a second intermediate layer adjacent to the output layer among the intermediate layers.

In step 230, the neural network generates a second intermediate vector by applying a second activation function to the second nodes. The second activation function is described in more detail below with reference to FIG. 5.

In step 240, the neural network applies the second intermediate vector to the output layer.
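Steps 210 to 240 can be sketched as a small forward pass. In the sketch below, ReLU stands in for the first activation function, and `bounded` is a hypothetical peak-normalized function standing in for the second activation function, since its equation is not reproduced in this text. The helper names `dense` and `forward`, the weight layout, and the use of bias terms are assumptions for illustration.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def relu(z: float) -> float:
    return max(0.0, z)


def bounded(z: float, a: float = 2.0, b: float = 1.0) -> float:
    """Hypothetical second activation with peak value 1 at z = a/b (assumed form)."""
    if z < 0:
        return 0.0
    return (math.e * b / a) ** a * (z ** a) * math.exp(-b * z)


def dense(inputs: Sequence[float], weights: Matrix, biases: Sequence[float]) -> List[float]:
    """Weighted sum per node: one weight row and one bias per output node."""
    return [sum(w * i for w, i in zip(row, inputs)) + bias
            for row, bias in zip(weights, biases)]


def forward(x: Sequence[float], w1: Matrix, b1: Sequence[float],
            w2: Matrix, b2: Sequence[float],
            w_out: Matrix, b_out: Sequence[float]) -> List[float]:
    first_intermediate = [relu(z) for z in dense(x, w1, b1)]     # step 210
    second_inputs = dense(first_intermediate, w2, b2)            # step 220
    second_intermediate = [bounded(z) for z in second_inputs]    # step 230
    return dense(second_intermediate, w_out, b_out)              # step 240
```

Because the stand-in second activation is bounded, the values reaching the output layer stay within [0, 1] regardless of how large the first intermediate vector becomes.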

FIG. 3 is a diagram illustrating the structure of a neural network according to an embodiment.

Referring to FIG. 3, a schematic structure of a deep neural network (DNN) 300, which is one example of a neural network, is shown. The deep neural network 300 may be trained through deep learning.

The deep neural network 300 may include a plurality of layers 310, 320, and 330, each composed of a plurality of nodes. The deep neural network 300 may include connection weights that connect the nodes included in each of the layers 310, 320, and 330 to nodes included in other layers. The electronic device may obtain the deep neural network 300 from an internal database stored in a memory (for example, see the memory 1270 of FIG. 12 below), or may obtain it by receiving it from an external server through an output device (for example, see the output device 1250 of FIG. 12).

예를 들어, 심층 신경망(300)은 선 형태의 에지들(edges)로 연결된 많은 수의 인공 뉴런들을 포함할 수 있다. 인공 뉴런은 도 3에서 원형의 도형으로 표시되며, 노드(node)라고 지칭될 수 있다. 인공 뉴런들은 연결 가중치(connection weight)를 가지는 에지들을 통해 상호 연결될 수 있다. 연결 가중치는 에지들이 갖는 특정한 값으로, '시냅스 가중치(synapse weight)', '가중치', 또는 '연결 강도'라고 지칭될 수 있다.For example, the deep neural network 300 may include a large number of artificial neurons connected by line-shaped edges. An artificial neuron is represented by a circular shape in FIG. 3 and may be referred to as a node. Artificial neurons may be interconnected through edges having a connection weight. The connection weight is a specific value that edges have, and may be referred to as 'synapse weight', 'weight', or 'connection strength'.

심층 신경망(300)은 예를 들어, 입력 레이어(input layer)(310), 히든 레이어(hidden layer)(320), 출력 레이어(output layer)(330)을 포함할 수 있다. 입력 레이어(310), 히든 레이어(320) 및 출력 레이어(330)는 복수 개의 노드들을 포함할 수 있다. 입력 레이어(110)에 포함된 노드는 입력 노드(input node)라고 지칭되고, 히든 레이어(320)에 포함된 노드는 히든 노드(hidden node)라고 지칭될 수 있다. 히든 레이어(320)는 입력 레이어(310)와 출력 레이어(330)의 중간에 위치한다는 점에서 '중간 레이어'라고 지칭할 수도 있다. 이하, '히든 레이어'와 '중간 레이어'는 동일한 의미로 이해될 수 있다.The deep neural network 300 may include, for example, an input layer 310, a hidden layer 320, and an output layer 330. The input layer 310, the hidden layer 320, and the output layer 330 may include a plurality of nodes. A node included in the input layer 110 may be referred to as an input node, and a node included in the hidden layer 320 may be referred to as a hidden node. The hidden layer 320 may be referred to as an 'intermediate layer' in that it is located in the middle of the input layer 310 and the output layer 330. Hereinafter, 'hidden layer' and 'middle layer' may be understood as the same meaning.

출력 레이어(330)에 포함된 노드는 출력 노드(output node)라고 지칭될 수 있다.A node included in the output layer 330 may be referred to as an output node.

The input layer 310 may receive input data for training or recognition and transfer it to the hidden layer 320. The output layer 330 may generate the output of the deep neural network 300 based on a signal received from the hidden layer 320. The hidden layer 320 is located between the input layer 310 and the output layer 330 and may transform the training input of the training data, transferred through the input layer 310, into values that are easier to predict. The input nodes included in the input layer 310 and the hidden nodes included in the hidden layer 320 may be connected to each other through connection lines having connection weights. The hidden nodes included in the hidden layer 320 and the output nodes included in the output layer 330 may likewise be connected to each other through connection lines having connection weights.

The hidden layer 320 may include a plurality of layers 320-1, ..., 320-N. For example, assuming that the hidden layer 320 includes a first hidden layer, a second hidden layer, and a third hidden layer, the output of a hidden node belonging to the first hidden layer may be connected to hidden nodes belonging to the second hidden layer. The output of a hidden node belonging to the second hidden layer may be connected to hidden nodes belonging to the third hidden layer.

The electronic device may input the outputs of the previous hidden nodes, included in the previous hidden layer, to the corresponding hidden layer through connection lines having connection weights. The electronic device may generate the outputs of the hidden nodes included in the hidden layer based on an activation function and on the values obtained by applying the connection weights to the outputs of the previous hidden nodes. According to an example, when the result of the activation function exceeds a threshold value of the current hidden node, an output may be fired to the next hidden node. In this case, the current hidden node may remain inactive, without firing a signal to the next hidden node, until a specific threshold activation strength is reached through the input vectors.
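The hidden-node computation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names, the choice of a sigmoid, the weight layout `weights[j][i]`, and the threshold value 0.5 are all assumptions introduced for the example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used here only as an example activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer_outputs(prev_outputs, weights, threshold=0.5):
    """Compute the outputs of one hidden layer.

    weights[j][i] is the (hypothetical) connection weight from previous
    node i to current node j. A node whose activation does not exceed
    `threshold` stays inactive and emits 0.0 to the next layer.
    """
    outputs = []
    for row in weights:
        # Weighted sum of the previous layer's outputs.
        z = sum(w * x for w, x in zip(row, prev_outputs))
        act = sigmoid(z)
        # Fire only when the activation exceeds the node's threshold.
        outputs.append(act if act > threshold else 0.0)
    return outputs


prev = [1.0, 0.5]
w = [[2.0, 1.0], [-3.0, 0.0]]
out = hidden_layer_outputs(prev, w)
```

In this example the first node receives a weighted sum of 2.5 and fires, while the second receives -3.0 and stays inactive.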

An electronic device according to an embodiment may train the deep neural network 300 through supervised learning. The electronic device may be implemented as a software module, a hardware module, or a combination thereof. Supervised learning is a technique in which a training input of training data and the corresponding training output are input together to the deep neural network 300, and the connection weights of the connection lines are updated so that output data corresponding to the training output is produced. Training data is data that includes pairs of a training input and a training output.

Although FIG. 3 represents the structure of the deep neural network 300 as a node structure, embodiments are not limited to such a node structure. Various data structures may be used to store the neural network in memory.

According to an embodiment, the electronic device may determine the parameters of the nodes through a gradient descent technique based on the loss back-propagated through the deep neural network 300 and on the output values of the nodes included in the deep neural network 300.

For example, the electronic device may update the connection weights between nodes through loss back-propagation learning. Loss back-propagation learning estimates a loss through forward computation on given training data, and then propagates the estimated loss in the reverse direction, starting from the output layer 330 toward the hidden layer 320 and the input layer 310, while updating the connection weights in the direction that reduces the loss.

Processing in the deep neural network 300 proceeds in the order of the input layer 310, the hidden layer 320, and the output layer 330, whereas in loss back-propagation training the connection weights may be updated in the order of the output layer 330, the hidden layer 320, and the input layer 310. One or more processors may use a buffer memory that stores a layer or a series of computation data in order to process the neural network in the desired direction.

The electronic device may define an objective function for measuring how close the currently set connection weights are to the optimum, keep changing the connection weights based on the result of the objective function, and perform training repeatedly. For example, the objective function may be a loss function for calculating the loss between the output value actually produced by the deep neural network 300 from the training input of the training data and the expected value that is desired to be output. The electronic device may update the connection weights in the direction that reduces the value of the loss function.
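The idea of repeatedly updating a connection weight in the direction that reduces a loss function can be illustrated with a deliberately small sketch. It fits a single weight by gradient descent on a mean squared loss; it does not back-propagate through multiple layers, and the function names, learning rate, and data are hypothetical choices made for the example.

```python
def mse(inputs, targets, w):
    """Mean squared loss between the outputs w * x and the expected values."""
    return sum((w * x - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)


def train_weight(inputs, targets, w=0.0, lr=0.1, steps=50):
    """Fit y = w * x by gradient descent on the mean squared loss."""
    n = len(inputs)
    for _ in range(steps):
        # Gradient of the loss with respect to w for the current weight.
        grad = sum(2.0 * (w * x - t) * x for x, t in zip(inputs, targets)) / n
        # Move the weight in the direction that reduces the loss.
        w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0]
ts = [2.0, 4.0, 6.0]  # training outputs generated by the weight 2.0
w_fit = train_weight(xs, ts)
```

The loss at the fitted weight is lower than the loss at the initial weight, which is the property the objective function is meant to monitor.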

The deep neural network 300 according to an embodiment is trained to determine whether information is real or spoofed, and is trained so that the output of the final output layer yields the optimal result. During training, however, each intermediate layer of the network, and not only the output layer, may also acquire the discriminative power to distinguish real information from spoof information. Because the output of an intermediate layer may also be discriminative, using it makes it possible to determine whether biometric information is spoofed before reaching the final output layer 330, which can shorten the execution time.

An electronic device according to an embodiment may use an intermediate layer that has discriminative power at a stage before the deep neural network 300 derives its final result, thereby improving the speed of spoof detection while minimizing the loss of detection accuracy. In addition, to compensate for the accuracy loss caused by using the output of an intermediate layer, the electronic device may use the results of networks that receive different images as inputs.
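A decision taken from an intermediate layer before the final output layer can be sketched as an early-exit loop. This is only an illustration under stated assumptions: the per-stage scores, the convention that a higher score means more likely real, and the cut-off values `low`, `high`, and 0.5 are hypothetical and are not taken from the embodiment.

```python
def detect_with_early_exit(stage_scores, low=0.2, high=0.8):
    """Return (label, depth) using the first sufficiently confident stage.

    stage_scores is an iterable of scores produced by successive stages
    (intermediate layers first, final output last). Passing a generator
    means later stages are never computed once an early decision is made.
    """
    last = None
    for depth, score in enumerate(stage_scores):
        last = (score, depth)
        if score <= low:
            return "spoof", depth  # confident early decision
        if score >= high:
            return "live", depth   # confident early decision
    if last is None:
        raise ValueError("at least one score is required")
    # No stage was confident: fall back to the final stage's score.
    score, depth = last
    return ("live" if score >= 0.5 else "spoof"), depth


early = detect_with_early_exit([0.9, 0.1])
late = detect_with_early_exit([0.5, 0.6, 0.7])
```

In the first call the decision is made at depth 0 and the remaining scores are never consulted; in the second call no stage is confident, so the final stage decides.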

FIG. 4 is a diagram for explaining the regions in which a neural network according to an embodiment distinguishes spoof information from real information. Referring to FIG. 4, a datagram 400 is shown in which regions 410, 420, 430, and 440 are divided according to the feature distribution classified by the neural network.

The neural network may have been trained using an unspecified number of items of real biometric information and an unspecified number of items of spoofed biometric information. A vector generated by the neural network embeds feature information of the biometric information and may be referred to as an 'embedding vector' or a 'feature vector'.

In the feature distribution shown in FIG. 4, the first region 410 may correspond to a region in which a feature or feature vector extracted from input data is classified as real information (live information), and the second region 420 may correspond to a region in which a feature or feature vector extracted from input data is classified as spoof information.

In addition, the third region 430 between the first region 410 and the second region 420 may correspond to the part that determines the discriminative power between real information and spoof information. For the sake of generalization performance, the third region 430 may correspond to a part that tolerates some errors so that over-fitting does not occur. The third region 430 may include a threshold range for clearly determining whether the score (the 'first score') computed by the neural network for the input data corresponds to real (live) information or to spoof information. The threshold range may be determined based on, for example, a first threshold corresponding to the maximum probability that the first score is determined to be spoof information and a second threshold corresponding to the minimum probability that the first score is determined to be real information, in the probability distribution of the first score.
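How a threshold range bounded by a first and a second threshold might be used can be sketched as below. The function name, the labels, and the assumption that a higher first score means more likely real are hypothetical; the text only states that the range is determined from the two thresholds.

```python
def classify_score(first_score, spoof_threshold, live_threshold):
    """Classify a score against a threshold range.

    spoof_threshold plays the role of the first threshold and
    live_threshold the role of the second threshold. Scores that fall
    strictly inside the range are left undecided.
    """
    if spoof_threshold > live_threshold:
        raise ValueError("spoof_threshold must not exceed live_threshold")
    if first_score <= spoof_threshold:
        return "spoof"
    if first_score >= live_threshold:
        return "live"
    return "uncertain"


labels = [classify_score(s, 0.3, 0.7) for s in (0.1, 0.5, 0.9)]
```

Scores inside the range correspond to the part of the third region 430 where some error is tolerated.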

The first region 410, the second region 420, and the third region 430 may correspond to in-distribution regions of the feature distribution, that is, regions corresponding to the data or feature vectors on which the neural network was trained.

In addition, the fourth region 440 of the datagram 400, which corresponds to the out-of-distribution (OOD) region of the feature distribution, may correspond to unseen data, in other words, to features that the neural network has never learned. In this case, it may be difficult for the neural network to determine whether a feature belonging to the fourth region 440 is spoof information or real information.

Augmentation or normalization may be used to make determinations for the fourth region 440. However, to handle any spoofed fingerprint well, it may be desirable for the neural network to at least determine that the fourth region 440 is an uncertain area.

Most methods for detecting an input in the fourth region 440, which corresponds to the out-of-distribution (OOD) region, formulate the problem by augmenting the classifier with knowledge of the OOD region or with a rejection class, or rely on specific assumptions. In contrast, an embodiment may handle an input in the OOD region by changing the activation function of the neural network.

FIG. 5 is a diagram for explaining the activation functions applied to the layers of a neural network according to an embodiment. Referring to FIG. 5, a neural network 500 according to an embodiment is shown.

In an embodiment, a first intermediate vector may be generated by applying a first activation function to first nodes belonging to an arbitrary first intermediate layer 520 adjacent to the input layer 510 among the plurality of intermediate layers of the neural network 500. The first activation function may include, for example, any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a Leaky ReLU function, but is not necessarily limited thereto.

The neural network 500 may transfer, through propagation, the first intermediate vector generated by the first intermediate layer 520 to second nodes belonging to a second intermediate layer 530 adjacent to the output layer 540 among the intermediate layers, and may generate a second intermediate vector by applying a second activation function to the second nodes. The neural network may output an estimation result by applying the second intermediate vector generated in the second intermediate layer 530 to the output layer 540.
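The flow from the first intermediate layer to the output layer can be sketched as follows. This is a minimal illustration, not the embodiment itself: the helper names, the weights, the linear output layer, and in particular `bounded_bump`, a hypothetical stand-in for the second activation function that is zero for non-positive inputs and peaks at 1, are assumptions.

```python
import math


def relu(z):
    """Example first activation function (ReLU)."""
    return max(0.0, z)


def bounded_bump(z):
    """Hypothetical stand-in for the second activation function.

    Zero for z <= 0, rises to a peak value of 1 at z = 1, then decays.
    """
    return z * math.exp(1.0 - z) if z > 0.0 else 0.0


def dense(inputs, weights, activation):
    """Apply one fully connected layer followed by an activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(x, w1, w2, w_out, first_act=relu, second_act=bounded_bump):
    first_vec = dense(x, w1, first_act)              # first intermediate vector
    second_vec = dense(first_vec, w2, second_act)    # second intermediate vector
    output = dense(second_vec, w_out, lambda z: z)   # linear output layer (assumed)
    return output, second_vec


output, second_vec = forward(
    [1.0, 2.0],
    [[0.5, 0.5], [1.0, -1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
)
```

With these example weights, every component of the second intermediate vector lies between 0 and 1 before it reaches the output layer.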

Under certain assumptions, any neural network with at least one intermediate layer may converge to a Gaussian process (GP) in the limit of infinite width. A Matérn activation function, a nonlinear activation function for neural networks that mimics the properties induced by the Matérn kernel widely used in Gaussian process (GP) models, may be used.

The Matérn activation function may have properties similar to those of the activation function of a Gaussian process (GP) model. Owing to its local stationarity property together with limited mean-square differentiability, the Matérn activation function may show good performance and uncertainty calibration capability in Bayesian deep learning tasks. In particular, local stationarity may help calibrate uncertainty in the out-of-distribution (OOD) region.

In an embodiment, for example, an activation function (the 'second activation function') that improves on the Matérn activation function derived from the Matérn kernel used in a Gaussian process (GP) may be used.

First, the Matérn activation function derived from the Matérn kernel is a nonlinear function and may be expressed, for example, as Equation 1 below.

Here, Γ(·) denotes the Gamma function, q is a constant, and ν and λ may be hyperparameters.

For example, when ν > 1/2, the nonlinear function according to Equation 1 may be smooth, continuous, and continuously differentiable. In contrast, when ν ≤ 1/2, the nonlinear function according to Equation 1 takes the form of an exponentially decaying step function and is therefore not smooth, which may be consistent with the properties of the Matérn kernel.

In addition, Equation 1 includes a Heaviside step function, which sets the output of the activation function to 0 when the input x is less than 0. Another term of Equation 1 may correspond to the part that represents the shape of the Matérn activation function.
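Equation 1 itself is not reproduced in this text. One form that is consistent with the symbol definitions above, and with the Matérn activation function as it appears in the literature on stationary activation functions, is sketched below. The symbols σ and Θ and the exact arrangement of the terms are assumptions and are not taken from the source.

```latex
\sigma(x) = \frac{q^{1/2}}{\Gamma\!\left(\nu + \tfrac{1}{2}\right)}\,\Theta(x)\,x^{\nu - 1/2}\,e^{-\lambda x}
```

Under this assumed form, q^{1/2}/Γ(ν + 1/2) plays the role of the multiplier, Θ(x) is the Heaviside step function, and x^{ν - 1/2} e^{-λx} determines the shape. For ν = 1/2 the expression reduces to an exponentially decaying step, which agrees with the behavior described for ν ≤ 1/2.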

The Matérn activation function according to Equation 1 improves on other activation functions for Bayesian neural networks in terms of stationarity, and may place no constraint on the multiplier.

The multiplier of Equation 1 is obtained through a complex derivation. However, its degrees of freedom may increase owing to, for example, the multiplier for white noise that appears midway through the derivation or the multiplier used in the Fourier transform. In the derivation, the multiplier of Equation 1 therefore may or may not be fixed to an arbitrary constant. In other words, the multiplier of Equation 1 may be regarded as a part with considerable variability.

When the variable multiplier is expressed as a constant, the dynamics of the neural network may be fixed to some extent by batch normalization and/or by the normalization property of the weights. However, when the neural network is fine-tuned, the multiplier must be modified substantially; if it is not modified, the dynamic range of the activation function is unlikely to remain between 0 and 1.

In an embodiment, Equation 1 may be modified into an activation function (the 'second activation function') expressed as Equation 2 below, by considering the degrees of freedom of the multiplier of Equation 1 and of the hyperparameters from the perspective of the neural network.

Here, a may denote a first hyperparameter associated with the ascending slope of the second activation function, and b may denote a second hyperparameter associated with the descending slope of the second activation function. The hyperparameters satisfy a, b > 0 and may be defined by the user.

Here, e may denote Euler's number, and x may denote the input to the activation function (for example, the input of the second nodes). The Heaviside step function term sets the output of the second activation function to 0 when the input x is less than 0.

In the second activation function according to an embodiment, the multiplier is determined by the first hyperparameter a, associated with the ascending slope of the second activation function, and the second hyperparameter b, associated with the descending slope of the second activation function, such that the peak value of the second activation function is fixed. The peak value of the second activation function may be fixed to, for example, 1, and the dynamic range of the second activation function may be limited to [0, 1], that is, to values between 0 and 1.
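Equation 2 is not reproduced in this text. The way a multiplier determined by a and b can fix the peak at 1 can nevertheless be illustrated with a short derivation. It assumes that the shape of the function is x^a e^{-bx} for x > 0, which is an assumption made for this sketch; the symbols f, C, and Θ are likewise introduced only for illustration.

```latex
\begin{aligned}
f(x) &= C\,x^{a} e^{-bx}\,\Theta(x), \qquad a, b > 0 \\
\frac{d}{dx}\ln f(x) &= \frac{a}{x} - b = 0 \;\Longrightarrow\; x^{*} = \frac{a}{b} \\
f(x^{*}) &= C\left(\frac{a}{b}\right)^{a} e^{-a} = 1 \;\Longrightarrow\; C = \left(\frac{e\,b}{a}\right)^{a} \\
f(x) &= \left(\frac{e\,b\,x}{a}\right)^{a} e^{-bx}\,\Theta(x)
\end{aligned}
```

Under this assumption the multiplier depends only on a, b, and Euler's number e, the peak is located at x = a/b with value 1, and the output stays within [0, 1] for every input.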

In Equation 2, the multiplier is constrained and normalized so that the maximum value of the activation function is 1, and the hyperparameters expressed as ν and λ in Equation 1 are expressed as a and b. With this normalization, using the second activation function according to Equation 2 may have the following advantages over using the activation function according to Equation 1.

i) Considering the dynamic range of the feature vector (or feature) output from the previous layer, which depends on the characteristics of the neural network 500, fixing the dynamic range of the second activation function to [0, 1] may provide a stable input to the next layer. In addition, ii) with inputs of a stable dynamic range normalized by the second activation function, the neural network may exhibit a fast convergence rate, which may improve the processing speed of the neural network 500.
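The bounded dynamic range can be checked numerically with a small sketch. The closed form used here, (e·b·x/a)^a · e^{-bx} for x > 0 and 0 otherwise, is an assumed peak-normalized form and is not confirmed by the source; the function name and the sample values of a and b are also hypothetical.

```python
import math


def second_activation(x, a, b):
    """Assumed peak-normalized activation: (e*b*x/a)**a * exp(-b*x) for x > 0.

    a controls the rising side and b the falling side. The expression is
    evaluated in log space, which avoids overflow for large a.
    """
    if a <= 0.0 or b <= 0.0:
        raise ValueError("a and b must be positive")
    if x <= 0.0:
        return 0.0  # Heaviside step: no output for non-positive input
    return math.exp(a * (1.0 + math.log(b * x / a)) - b * x)


a, b = 2.0, 4.0
peak = second_activation(a / b, a, b)
# Sample the function on a grid from -1.00 to 9.99 in steps of 0.01.
samples = [second_activation(i * 0.01, a, b) for i in range(-100, 1000)]
```

On this grid the sampled values never leave [0, 1], and the largest value occurs at x = a/b, which matches the behavior expected of an activation whose peak is fixed at 1.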

In addition, compared with the first activation function, the second activation function according to Equation 2 provides an uncertain decision for an input belonging to the out-of-distribution (OOD) region, for example, the fourth region 440 of FIG. 4, and may thereby lower the likelihood that the neural network 500 outputs an error for such an input.

FIG. 6 is a flowchart illustrating a method of training a neural network according to an embodiment, and FIG. 7 is a diagram for explaining the training method of FIG. 6. In the following embodiments, the operations may be performed sequentially, but are not necessarily performed sequentially. For example, the order of the operations may be changed, and at least two operations may be performed in parallel.

Referring to FIGS. 6 and 7, a training device according to an embodiment may train the neural network 700 through steps 610 to 630.

In step 610, the training device extracts a first result value obtained by applying a first activation function to the intermediate nodes belonging to each of the plurality of intermediate layers 720 and/or 730 of the neural network 700. The first activation function may include, for example, any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a Leaky ReLU function, but is not necessarily limited thereto.

In step 620, the training device extracts a second result value by applying a second activation function, different from the first activation function, to additional nodes 715 and/or 745 connected to intermediate nodes 711 and/or 741 belonging to one or more arbitrary layers 710 and/or 740 among the plurality of intermediate layers. Here, the number of additional nodes 715 and/or 745 is one less than the number of intermediate nodes 711 and/or 741 to which they are connected, and the additional nodes 715 and/or 745 and the intermediate nodes 711 and/or 741 may be fully connected.
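The relationship between the intermediate nodes and the additional nodes in step 620 can be sketched as follows. The function names, the weight values, and the `bump` function, a hypothetical stand-in for the second activation function, are assumptions; only the node count (one less than the number of connected intermediate nodes) and the full connection come from the text.

```python
import math


def bump(x):
    """Hypothetical stand-in for the second activation (peak 1 at x = 1)."""
    return x * math.exp(1.0 - x) if x > 0.0 else 0.0


def additional_node_outputs(intermediate_outputs, weights, activation=bump):
    """Compute the outputs of additional nodes attached to one layer.

    For n intermediate nodes there are n - 1 additional nodes, and each
    additional node is connected to every intermediate node, so `weights`
    must have n - 1 rows of n entries.
    """
    n = len(intermediate_outputs)
    if len(weights) != n - 1 or any(len(row) != n for row in weights):
        raise ValueError("expected (n - 1) rows of n weights for full connection")
    return [
        activation(sum(w * h for w, h in zip(row, intermediate_outputs)))
        for row in weights
    ]


hidden = [0.2, 0.7, 1.1]                         # outputs of three intermediate nodes
weights = [[0.5, 0.5, 0.5], [1.0, -1.0, 0.0]]    # two additional nodes, fully connected
extra = additional_node_outputs(hidden, weights)
```

The resulting values play the role of the second result value, which is then compared with the first result value during training.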

In the second activation function, the multiplier may be determined by the first hyperparameter associated with the ascending slope of the second activation function and the second hyperparameter associated with the descending slope of the second activation function, such that the peak value of the second activation function is fixed to, for example, 1. The second activation function may be expressed, for example, as Equation 2 described above, where x denotes the input to the additional nodes 715 and/or 745 and the Heaviside step function term sets the output of the second activation function to 0 when the input x is less than 0. The dynamic range of the second activation function may be limited to, for example, [0, 1].

In step 630, the training device trains the neural network based on the difference between the first result value extracted in step 610 and the second result value extracted in step 620. The training device may train the neural network so that the difference between the first result value and the second result value is minimized.

In an embodiment, the neural network 700 may be trained by connecting additional nodes 715 and/or 745, which correspond to additional decision neurons, to one or more arbitrary layers 710 and/or 740 among the intermediate layers of the neural network 700, applying the second activation function to them, and applying a decision loss between the first result value obtained with the first activation function and the second result value obtained with the second activation function. Designing an additional decision loss by connecting the additional nodes 715 and/or 745 to the one or more arbitrary layers 710 and/or 740 of the neural network 700 may yield the same result as directly applying, to those layers, a gradient obtained with the second activation function.

도 8은 다른 실시예에 따른 뉴럴 네트워크의 트레이닝 방법을 나타낸 흐름도이고, 도 9는 도 8의 트레이닝 방법을 설명하기 위한 도면이다. 8 is a flowchart illustrating a method for training a neural network according to another embodiment, and FIG. 9 is a diagram for explaining the training method of FIG. 8 .

In the following embodiments, the operations may be performed sequentially, but are not necessarily performed sequentially. For example, the order of the operations may be changed, and at least two operations may be performed in parallel.

Referring to FIGS. 8 and 9, a training device according to an embodiment may train the neural network 900 through steps 810 to 840. In FIG. 9, the neural network 900 shown above the dotted line corresponds to a network trained with the first activation function, and the neural network 900-1 shown below the dotted line may correspond to a network that has been fine-tuned after some of the first activation functions were replaced with the second activation function.

In step 810, the training device generates a first feature vector by propagating the training data 905, which is input to the input layer 910 of the neural network 900, to first nodes that belong to a first intermediate layer 923 adjacent to the input layer 910 among the intermediate layers and that operate according to the first activation function. The first activation function may include, for example, any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a leaky ReLU function.

In step 820, the training device performs primary training of the neural network 900 based on the difference between the first feature vector and the ground-truth vector corresponding to the training data 905. Here, the primary training may be referred to as "pre-training".

In step 830, the training device generates a second feature vector by propagating the first feature vector to second nodes that belong to a second intermediate layer 926 adjacent to the output layer 930 among the intermediate layers 920 of the primarily trained neural network 900-1 and that operate according to the second activation function. The second activation function may be expressed by Equation 2 described above, where x represents the second feature vector and the Heaviside step function term sets the output of the second activation function to 0 when the second feature vector (x) is less than 0. The dynamic range of the second activation function may be limited to [0, 1].
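Equation 2 itself is not reproduced in this passage, so the following is only a minimal sketch of one function that has the properties stated here: a peak value fixed at 1, a multiplier determined by a rising-slope hyperparameter and a falling-slope hyperparameter, a Heaviside gate for negative inputs, and a dynamic range of [0, 1]. The functional form, the name `peak_normalized_activation`, and the parameter names `alpha` and `beta` are assumptions made for illustration and should not be read as the patent's Equation 2.

```python
import math


def peak_normalized_activation(x: float, alpha: float = 2.0, beta: float = 1.0) -> float:
    """Hypothetical bump-shaped activation (an assumption, not Equation 2).

    f(x) = (e * beta / alpha) ** alpha * x ** alpha * exp(-beta * x) * H(x)

    alpha controls the rising slope and beta the falling slope. The
    multiplier (e * beta / alpha) ** alpha is chosen so that the maximum,
    reached at x = alpha / beta, is exactly 1 for any alpha, beta > 0.
    """
    if x <= 0.0:
        # Heaviside gate: the output is 0 for negative inputs (and 0 at x = 0).
        return 0.0
    multiplier = (math.e * beta / alpha) ** alpha
    return multiplier * (x ** alpha) * math.exp(-beta * x)


if __name__ == "__main__":
    for value in (-1.0, 0.5, 2.0, 10.0):
        print(value, peak_normalized_activation(value))
```

Because the output decays toward 0 for large inputs, a unit with this kind of activation responds weakly to inputs far from what it saw during training, which is the behavior the surrounding text associates with higher uncertainty.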

In step 840, the training device performs secondary training of the primarily trained neural network 900-1 based on the difference between the output value 940, obtained by outputting the second embedding vector generated in the second intermediate layer 926 through the output layer 930, and the ground-truth value corresponding to the training data 905. Here, the secondary training may be referred to as "fine-tuning".

The neural network 900 outputs the output value 940, which is the result of propagating the training data 905 in the forward direction, and the difference between the ground-truth value corresponding to the training data 905 (the expected output of the neural network 900) and the actual output value 940 may be calculated. The training device may train the neural network 900 by propagating the difference between the ground-truth value and the output value 940 in the backward direction and adjusting the weights of the neural network 900 so that the difference is minimized.
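The two-stage procedure of steps 810 to 840 can be sketched on a toy network as follows. This is an illustration under stated assumptions, not the patent's implementation: the network has one scalar unit per layer, the second activation `bump` is a hypothetical stand-in for Equation 2, and gradients are estimated by finite differences with a step that is accepted only when the loss decreases, in place of the backpropagation described above.

```python
import math
import random


def relu(z: float) -> float:
    return max(0.0, z)


def bump(z: float, alpha: float = 2.0, beta: float = 1.0) -> float:
    # Hypothetical peak-normalized activation standing in for Equation 2.
    if z <= 0.0:
        return 0.0
    return (math.e * beta / alpha) ** alpha * z ** alpha * math.exp(-beta * z)


def forward(params, x, second_act):
    w1, b1, w2, b2, w3, b3 = params
    h1 = relu(w1 * x + b1)         # first intermediate layer (first activation)
    h2 = second_act(w2 * h1 + b2)  # second intermediate layer, next to the output
    return w3 * h2 + b3            # output layer


def loss(params, data, second_act):
    # Mean squared difference between the output and the ground-truth value.
    return sum((forward(params, x, second_act) - y) ** 2 for x, y in data) / len(data)


def train(params, data, second_act, steps=200, lr=0.05, eps=1e-5):
    params = list(params)
    current = loss(params, data, second_act)
    for _ in range(steps):
        grads = []
        for i in range(len(params)):
            shifted = list(params)
            shifted[i] += eps
            grads.append((loss(shifted, data, second_act) - current) / eps)
        candidate = [p - lr * g for p, g in zip(params, grads)]
        candidate_loss = loss(candidate, data, second_act)
        if candidate_loss < current:
            params, current = candidate, candidate_loss  # accept the step
        else:
            lr *= 0.5                                    # otherwise shrink the step
    return params, current


random.seed(0)
data = [(x / 10.0, 1.0 if x > 5 else 0.0) for x in range(11)]
init = [random.uniform(0.5, 1.0) for _ in range(6)]

# Steps 810-820: pre-training with the first activation in every layer.
pre_params, pre_loss = train(init, data, relu)
# Steps 830-840: swap the activation next to the output layer, then fine-tune.
swap_loss = loss(pre_params, data, bump)
tuned_params, tuned_loss = train(pre_params, data, bump)
```

Fine-tuning starts from the pre-trained weights rather than from a fresh initialization, which mirrors the point that the network does not have to be trained anew for the task.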

According to an embodiment, even for a general neural network, accuracy may be improved without training the neural network anew for a task, by fine-tuning it with a second activation function that introduces non-linearity applied to the nodes of the layer adjacent to the output layer.

FIG. 10 is a flowchart illustrating a method of detecting whether biometric information is spoofed using a neural network according to an embodiment, and FIG. 11 is a diagram for explaining the structure and operation of the neural network of FIG. 10. In the following embodiments, the operations may be performed sequentially, but are not necessarily performed sequentially. For example, the order of the operations may be changed, and at least two operations may be performed in parallel.

Referring to FIGS. 10 and 11, an electronic device that detects whether biometric information is spoofed using the neural network 1100 according to an embodiment may perform the detection through steps 1010 to 1040. The neural network 1100 may be, for example, a convolutional neural network (CNN) or a deep neural network (DNN), but is not necessarily limited thereto. The neural network 1100 may have been trained to detect whether biometric information is spoofed with the second activation function applied to at least some of its intermediate layers.

In step 1010, the electronic device extracts one or more first feature vectors from a plurality of intermediate layers 1101, 1102, and 1103 of the neural network 1100, which detects whether biometric information is spoofed from input data 1105 including the biometric information of a user, using one or more pre-trained first classifiers 1120. The biometric information may include, for example, any one of the user's fingerprint, iris, and face, but is not necessarily limited thereto. The plurality of intermediate layers 1101, 1102, and 1103 may be, for example, convolutional layers, but are not necessarily limited thereto.

For example, the electronic device may extract a 1-1 feature vector from the first intermediate layer 1101 among the plurality of intermediate layers 1101, 1102, and 1103 using a 1-1 classifier 1120-1 among the one or more first classifiers 1120. The electronic device may extract a 1-2 feature vector from the second intermediate layer 1102, which follows the first intermediate layer 1101, using a 1-2 classifier 1120-2 among the one or more first classifiers 1120. The electronic device may extract a first feature vector 1140 obtained by combining the 1-1 feature vector and the 1-2 feature vector.

Alternatively, the electronic device may extract a 1-3 feature vector from the third intermediate layer 1103, which follows the second intermediate layer 1102, using a 1-3 classifier 1120-3 among the one or more first classifiers 1120. The electronic device may then extract a first feature vector 1140 obtained by combining the 1-1 feature vector, the 1-2 feature vector, and the 1-3 feature vector.

In step 1020, the electronic device may make a first determination of whether the biometric information is spoofed based on the one or more first feature vectors obtained in step 1010. For example, the electronic device may calculate a first score based on the similarity between at least one of an enrolled feature vector and a spoof feature vector stored in advance in the database 1150 and either the one or more first feature vectors obtained in step 1010 or the feature vector combined in step 1010. Here, the "similarity" is an index of how close the input data 1105 is to real biometric information; the higher the similarity value, the higher the probability that the input is real biometric information (e.g., a fingerprint or an iris).
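A minimal sketch of a similarity-based first score is given below. The text states only that the score is based on similarity to enrolled and spoof feature vectors; the use of cosine similarity, the rescaling to [0, 1], the best-match rule, and the ratio formula are assumptions made for illustration, as are the names `cosine_similarity` and `first_score`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def first_score(feature, enrolled_vectors, spoof_vectors, eps: float = 1e-12) -> float:
    """Hypothetical first score in [0, 1]; higher means closer to real data."""
    # Rescale cosine similarity from [-1, 1] to [0, 1], keep the best match.
    sim_real = max((1.0 + cosine_similarity(feature, v)) / 2.0 for v in enrolled_vectors)
    sim_spoof = max((1.0 + cosine_similarity(feature, v)) / 2.0 for v in spoof_vectors)
    # Ratio between the two similarities (assumed form).
    return sim_real / (sim_real + sim_spoof + eps)


enrolled = [[1.0, 0.0]]
spoof = [[0.0, 1.0]]
score_real_like = first_score([1.0, 0.0], enrolled, spoof)
score_spoof_like = first_score([0.0, 1.0], enrolled, spoof)
```

In this toy setup, a feature vector that matches the enrolled vector scores about 2/3 and one that matches the spoof vector scores about 1/3.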

The first score may also be called a "user-dependent similarity score" in that it is determined by a result trained for the user. The electronic device may use the one or more first classifiers 1120 to classify whether the first score is a score judged to indicate spoofed information or a score judged to indicate real information. Because the electronic device calculates the first score from the ratio between the similarities, using in-distribution data such as the enrolled feature vector and the spoof feature vector stored in advance in the database 1150, it can secure robustness in determining whether the biometric information is spoofed.

For example, when the first score corresponds to a feature in a region where it can be clearly determined whether the biometric information falls in the range judged to be real information or in the range judged to be spoofed information (e.g., the first region 410 and/or the second region 420 of FIG. 4), the electronic device may make the first spoofing determination immediately from the first score. That is, the electronic device may make an early decision (ED) that the input data 1105 is "real information" or "spoofed information". In contrast, when the first score corresponds to a feature in a region where it cannot be clearly determined where the biometric information belongs (e.g., the third region 430 and/or the fourth region 440 of FIG. 4), the electronic device does not make the first spoofing determination immediately from the first score, and may instead determine whether the biometric information is spoofed (the "second spoofing determination") by fusing the first score and the second score.
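The early-decision logic above can be sketched as a three-way test on the first score. The two threshold values and the function name `early_decision` are hypothetical; the text does not state how the clearly decidable regions of FIG. 4 map to score ranges.

```python
from typing import Optional


def early_decision(
    first_score: float,
    real_threshold: float = 0.8,   # assumed bound of the clearly-real range
    spoof_threshold: float = 0.2,  # assumed bound of the clearly-spoofed range
) -> Optional[str]:
    """Return an early decision, or None when the first score is ambiguous."""
    if first_score >= real_threshold:
        return "real"
    if first_score <= spoof_threshold:
        return "spoofed"
    # Ambiguous: defer to the fusion of the first and second scores.
    return None


for score in (0.95, 0.05, 0.5):
    print(score, early_decision(score))
```

Returning `None` rather than a forced label keeps the ambiguous case explicit, so the caller can decide to compute the second score only when it is needed.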

Unlike the classifier of a typical deep neural network, which is configured in an end-to-end structure, the one or more first classifiers 1120 may classify whether the biometric information is spoofed from feature vectors extracted from the intermediate layers 1101, 1102, and 1103 of the neural network 1100 while network inference is still being performed on the input data 1105 including the biometric information. The one or more first classifiers 1120 may be classifiers trained to classify an input image based on feature vectors. The one or more first classifiers 1120 may be composed of, for example, shallow deep neural networks (shallow DNNs), which require less computation than a deep neural network. Because the overhead of an early decision at an intermediate layer is small, the first spoofing determination can be made quickly without slowing down inference.

When the first spoofing determination is made by the one or more first classifiers 1120, which perform an early decision before the output vector is derived from the output layer 1104, the electronic device can detect whether the biometric information is spoofed without using the output vector, which shortens the time needed for the determination.

When determining whether biometric information is spoofed, the accuracy of the determination and the speed of detection may be in a trade-off relationship. For a fast determination, the electronic device uses the one or more first classifiers 1120 sequentially. When the detection confidence of the one or more first classifiers 1120 is high, the electronic device uses the first spoofing determination directly; when the detection confidence is low, it may determine whether the biometric information is spoofed (the second spoofing determination) by additionally using the second score calculated from the output vector by the second classifier 1130.

In step 1030, depending on whether the first spoofing determination was made in step 1020, the electronic device calculates a second score by applying the output vector output from the output layer 1104 to a pre-trained second classifier 1130. The second score may also be called an "image-dependent decision score" in that it is determined based on an image. When spoofing is determined from the second score alone, the probability of error may be very high because of the non-stationarity of unseen data, as in the fourth region 440 of FIG. 4. Accordingly, the electronic device may use the first score and the second score together to determine whether the biometric information is spoofed (the "second spoofing determination").

For example, when the first spoofing determination has been made from the first score, the electronic device may end the operation without performing step 1030. In contrast, when the feature corresponding to the first score falls, for example, in the third region 430 and/or the fourth region 440 of FIG. 4, so that it cannot be clearly determined whether the input is real information or spoofed information, the electronic device cannot decide immediately from the first score. In that case, the electronic device may make the second spoofing determination by using the second score calculated from the output vector together with the first score calculated in step 1020.

In step 1030, at least one of the one or more first classifiers 1120 and the second classifier 1130 may have been trained with an activation function whose multiplier is determined by a first hyperparameter associated with the rising slope of the activation function and a second hyperparameter associated with the falling slope of the activation function, such that the peak value of the activation function for the neural network 1100 is fixed. The one or more first classifiers 1120 and the second classifier 1130 may each be configured as, for example, a fully connected (FC) layer, but are not necessarily limited thereto. The dynamic range of the activation function may be limited to, for example, [0, 1]. The activation function may be expressed, for example, by Equation 2 described above. In Equation 2, x represents the input data 1105, and the Heaviside step function term sets the output of the activation function to 0 when the input data (x) is less than 0.

In step 1040, the electronic device makes the second spoofing determination from a score obtained by fusing the second score with the first score calculated from the one or more first feature vectors. For example, the electronic device may calculate the fused score as a weighted sum of the first score and the second score. When the fused score is greater than a threshold for the second spoofing determination, the electronic device may determine that the input data 1105 is "real information". When the fused score is less than or equal to the threshold for the second spoofing determination, the electronic device may determine that the input data 1105 is "spoofed information". The electronic device may thus make the second spoofing determination from its decision on the input data 1105.
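The weighted-sum fusion and threshold test of step 1040 can be sketched as follows. The weight of 0.5, the threshold of 0.5, and the function names are assumptions for illustration; the text specifies only a weighted sum compared against a threshold, with "greater than" mapping to real information.

```python
def fuse_scores(first_score: float, second_score: float, weight: float = 0.5) -> float:
    """Weighted sum of the two scores; `weight` is an assumed mixing factor."""
    return weight * first_score + (1.0 - weight) * second_score


def second_spoofing_determination(
    first_score: float,
    second_score: float,
    weight: float = 0.5,
    threshold: float = 0.5,  # assumed threshold for the second determination
) -> str:
    fused = fuse_scores(first_score, second_score, weight)
    # Strictly greater than the threshold means real; otherwise spoofed.
    return "real" if fused > threshold else "spoofed"


print(second_spoofing_determination(0.9, 0.7))  # fused score 0.8
print(second_spoofing_determination(0.5, 0.5))  # fused score equals the threshold
print(second_spoofing_determination(0.2, 0.3))  # fused score 0.25
```

A fused score exactly equal to the threshold is classified as spoofed, which follows the "less than or equal to" wording of the step.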

In an electronic device that detects whether biometric information is spoofed, out-of-distribution (OOD) inputs corresponding to spoofed information may occur frequently. The electronic device should reliably decide "spoofed information" for such OOD inputs. However, even though an OOD input is unseen data on which the neural network was never trained, an overconfidence error may occur in which the neural network judges the input over-confidently, as if it had been trained on it. When an OOD input is applied to the neural network, it may therefore be better to make an uncertain decision, such as "not decided", than to make a wrong decision about whether the input is spoofed.

In one embodiment, the uncertainty of the output for an out-of-distribution (OOD) input may be increased by applying the second activation function to the neural network.

The electronic device may reduce errors by outputting a final decision score that uses, together with the second score, the first score obtained by comparing the similarity between the input data 1105, which includes an image generated at the time of an authentication attempt, and the enrolled feature vector and spoof feature vector stored in advance in the database 1150.

More specifically, when an out-of-distribution (OOD) input is fed to the neural network, the first score can be calculated as a reasonable score because the similarity calculation described above is robust. The second score, however, is calculated purely from the output of the neural network 1100 and is therefore likely to suffer from an overconfidence error, so the decision score finally output from the neural network 1100 may also contain more errors.

In this situation, applying the second activation function, which introduces uncertainty, to the neural network prevents overconfidence errors on out-of-distribution (OOD) inputs. The contribution of the first score then becomes relatively larger, so fewer errors may occur in the final decision score.

FIG. 12 is a block diagram of an electronic device that detects whether biometric information is spoofed using a neural network according to an embodiment. Referring to FIG. 12, an electronic device 1200 according to an embodiment may include a sensor 1210, a processor 1230, an output device 1250, and a memory 1270. The sensor 1210, the processor 1230, the output device 1250, and the memory 1270 may be connected to one another through a communication bus 1205.

The electronic device 1200 may be implemented as at least a part of, for example, a mobile device such as a mobile phone, a smartphone, a PDA, a netbook, a tablet computer, or a laptop computer; a wearable device such as a smart watch, a smart band, or smart glasses; a computing device such as a desktop or a server; a home appliance such as a television, a smart television, or a refrigerator; a security device such as a door lock; a medical device; a robotics device; an Internet of Things (IoT) device; or a smart vehicle. The electronic device 1200 is not limited to these and may correspond to various other types of devices.

The sensor 1210 captures input data including the biometric information of a user. The biometric information may include, for example, the user's iris, fingerprint, and face, but is not necessarily limited thereto. The sensor 1210 may include, for example, an ultrasonic fingerprint sensor, an optical fingerprint sensor, a capacitive fingerprint sensor, a depth sensor, an iris sensor, an image sensor, and the like, but is not necessarily limited thereto. Any one of these sensors, or two or more of them, may be used as the sensor 1210. The biometric information sensed by the sensor 1210 may be, for example, the input fingerprint image 115 shown in FIG. 1, an iris image, or a face image.

The processor 1230 extracts one or more first feature vectors from a plurality of intermediate layers of a neural network that detects whether biometric information is spoofed from the input data, using one or more pre-trained first classifiers. The processor 1230 makes a first spoofing determination for the biometric information based on the one or more first feature vectors. Depending on whether the first spoofing determination was made, the processor 1230 calculates a second score by applying the output vector output from the output layer to a pre-trained second classifier. The processor 1230 makes a second spoofing determination for the biometric information from a score obtained by fusing the second score with the first score calculated from the one or more first feature vectors. At least one of the one or more first classifiers and the second classifier may have been trained based on an activation function whose multiplier is determined by a first hyperparameter associated with the rising slope of the activation function and a second hyperparameter associated with the falling slope of the activation function, such that the peak value of the activation function for the neural network is fixed. The activation function may be expressed, for example, by Equation 2 described above. In Equation 2, x represents the input to the additional nodes, and the Heaviside step function term sets the output of the activation function to 0 when the input (x) to the additional nodes is less than 0.

The processor 1230 executes executable instructions contained in the memory 1270. The processor 1230 may execute a program and control the electronic device 1200. Program code executed by the processor 1230 may be stored in the memory 1270.

The output device 1250 outputs at least one of the first spoofing determination and the second spoofing determination made by the processor 1230.

The memory 1270 may store the input data captured by the sensor 1210. The memory 1270 may store the first feature vector, the first score, and/or the second score obtained by the processor 1230. The memory 1270 may store the output vector. The memory 1270 may store the first spoofing determination and/or the second spoofing determination made by the processor 1230.

The memory 1270 may store various information generated during processing by the processor 1230. In addition, the memory 1270 may store various data and programs. The memory 1270 may include a volatile memory or a non-volatile memory. The memory 1270 may include a mass storage medium, such as a hard disk, to store various data.

In addition, the processor 1230 may perform at least one of the methods described above with reference to FIGS. 1 to 11, or a technique corresponding to at least one of those methods. The processor 1230 may be a hardware-implemented electronic device having circuitry with a physical structure for executing desired operations. For example, the desired operations may include code or instructions included in a program. The hardware-implemented electronic device 1200 may include, for example, a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), a processor core, a multi-core processor, a multiprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a neural processing unit (NPU), and the like.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using a general-purpose or special-purpose computer, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field-programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on a computer-readable recording medium.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on them. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

1200: electronic device
1205: communication bus
1210: sensor
1230: processor
1250: output device
1270: memory

Claims

A method of operating a neural network including an input layer, a plurality of intermediate layers, and an output layer, the method comprising:
generating a first intermediate vector by applying a first activation function to first nodes belonging to a first intermediate layer adjacent to the input layer among the plurality of intermediate layers;
transferring the first intermediate vector to second nodes belonging to a second intermediate layer adjacent to the output layer among the intermediate layers;
generating a second intermediate vector by applying a second activation function to the second nodes; and
applying the second intermediate vector to the output layer,
wherein a multiplier of the second activation function is determined by a first hyperparameter associated with a rising slope of the second activation function and a second hyperparameter associated with a falling slope of the second activation function, such that a peak value of the second activation function is fixed.

The method of claim 1,
wherein the dynamic range of the second activation function is limited to [0, 1].
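
As an illustration of a multiplier that fixes the peak value and keeps the output within [0, 1], the following minimal sketch uses a bi-exponential function, K(a, b) * (exp(-b*x) - exp(-a*x)) * H(x) with a > b > 0. This functional form, the function names, and the default values are assumptions made for illustration only; they are not taken from the claims, whose equation is not reproduced in this text.

```python
import math


def peak_location(a: float, b: float) -> float:
    """Input at which the assumed bi-exponential reaches its maximum."""
    # Setting the derivative -b*exp(-b*x) + a*exp(-a*x) to zero gives this x.
    return math.log(a / b) / (a - b)


def peak_normalized_activation(x: float, a: float = 4.0, b: float = 1.0) -> float:
    """Assumed stand-in activation, not the equation from the claims.

    a (hypothetical) controls how fast the curve rises, b how fast it falls.
    The multiplier depends only on a and b and scales the peak to exactly 1.
    """
    if not (a > b > 0):
        raise ValueError("this sketch assumes a > b > 0")
    if x < 0:
        return 0.0  # Heaviside step: the output is 0 for negative input
    x_peak = peak_location(a, b)
    multiplier = 1.0 / (math.exp(-b * x_peak) - math.exp(-a * x_peak))
    return multiplier * (math.exp(-b * x) - math.exp(-a * x))


if __name__ == "__main__":
    for a, b in [(4.0, 1.0), (10.0, 0.5)]:
        x_peak = peak_location(a, b)
        print(a, b, x_peak, peak_normalized_activation(x_peak, a, b))
```

Under this assumed form, changing a or b moves the peak location and reshapes the rising and falling slopes, while the multiplier keeps the peak value at 1, so the output stays within [0, 1].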

The method of claim 1,
wherein the second activation function is expressed by the following equation:
[mathematical expression]
where a denotes the first hyperparameter associated with the rising slope of the second activation function, b denotes the second hyperparameter associated with the falling slope of the second activation function, e denotes Euler's number, x denotes the input of the second nodes, and the equation includes a Heaviside step function that makes the output of the second activation function 0 when x is less than 0.

The method of claim 1,
wherein the first activation function includes any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a Leaky ReLU function.

The method of claim 1,
wherein the neural network includes any one of a convolutional neural network (CNN), a deep neural network (DNN), and a recurrent neural network (RNN).

A method of training a neural network including an input layer, a plurality of intermediate layers, and an output layer, the method comprising:
extracting a first result value obtained by applying a first activation function to intermediate nodes belonging to each of the plurality of intermediate layers;
extracting a second result value by applying a second activation function different from the first activation function to additional nodes connected to intermediate nodes belonging to at least one arbitrary layer among the plurality of intermediate layers; and
training the neural network based on a difference between the first result value and the second result value.

The method of claim 6,
wherein a multiplier of the second activation function is determined by a first hyperparameter associated with a rising slope of the second activation function and a second hyperparameter associated with a falling slope of the second activation function, such that a peak value of the second activation function is fixed.

The method of claim 6,
wherein the number of the additional nodes is one less than the number of the intermediate nodes, and
the additional nodes and the intermediate nodes are fully connected.

The method of claim 6,
wherein the second activation function is expressed by the following equation:
[mathematical expression]
where a denotes a first hyperparameter associated with a rising slope of the second activation function, b denotes a second hyperparameter associated with a falling slope of the second activation function, e denotes Euler's number, x denotes an input to the additional nodes, and the equation includes a Heaviside step function that makes the output of the second activation function 0 when x is less than 0.

The method of claim 6,
wherein the dynamic range of the second activation function is limited to [0, 1].

The method of claim 6,
wherein the first activation function includes any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a Leaky ReLU function.

A method of training a neural network including an input layer, a plurality of intermediate layers, and an output layer, the method comprising:
generating a first feature vector by propagating training data input to the input layer to first nodes that belong to a first intermediate layer adjacent to the input layer among the intermediate layers and that operate according to a first activation function;
performing first training of the neural network based on a difference between the first feature vector and a correct-answer vector corresponding to the training data;
generating a second feature vector by propagating the first feature vector to second nodes that belong to a second intermediate layer adjacent to the output layer among the intermediate layers of the first-trained neural network and that operate according to a second activation function; and
performing second training of the first-trained neural network based on a difference between an output value of the second embedding vector output through the output layer and a correct-answer value corresponding to the training data.

The method of claim 12,
wherein a multiplier of the second activation function is determined by a first hyperparameter associated with a rising slope of the second activation function and a second hyperparameter associated with a falling slope of the second activation function, such that a peak value of the second activation function is fixed.

The method of claim 12,
wherein the second activation function is expressed by the following equation:
[mathematical expression]
where a denotes a first hyperparameter associated with a rising slope of the second activation function, b denotes a second hyperparameter associated with a falling slope of the second activation function, e denotes Euler's number, x denotes the second feature vector, and the equation includes a Heaviside step function that makes the output of the second activation function 0 when x is less than 0.

The method of claim 12,
wherein the dynamic range of the second activation function is limited to [0, 1].

The method of claim 12,
wherein the first activation function includes any one of a step function, a sigmoid function, a hyperbolic tangent function, a ReLU function, and a Leaky ReLU function.

A method of detecting whether biometric information is spoofed using a neural network, the method comprising:
extracting, from input data including biometric information of a user, one or more first feature vectors from a plurality of intermediate layers of the neural network, which detects whether the biometric information is spoofed, by using one or more pretrained first classifiers;
detecting, as a first detection, whether the biometric information is spoofed based on the one or more first feature vectors;
calculating a second score by applying an output vector output from an output layer of the neural network to a pretrained second classifier, according to whether the first detection of spoofing is made; and
detecting, as a second detection, whether the biometric information is spoofed based on a fusion score of a first score calculated based on the one or more first feature vectors and the second score,
wherein at least one of the one or more first classifiers and the second classifier is trained based on an activation function whose multiplier is determined by a first hyperparameter associated with a rising slope of the activation function and a second hyperparameter associated with a falling slope of the activation function, such that a peak value of the activation function for the neural network is fixed.
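
The two-stage decision described in this claim can be sketched as follows. This is a minimal illustration under stated assumptions: scores lie in [0, 1] with higher meaning more likely genuine, the first detection is made by comparing the first score against two thresholds, and the fusion score is a weighted sum. The function name, the thresholds, and the weighting are all hypothetical and are not specified by the claim.

```python
from typing import Callable, Tuple


def detect_spoof(
    first_score: float,
    compute_second_score: Callable[[], float],
    accept_threshold: float = 0.9,   # assumed: at or above this, decide "live" early
    reject_threshold: float = 0.1,   # assumed: at or below this, decide "spoof" early
    fusion_weight: float = 0.5,      # assumed weight of the first score in the fusion
    fusion_threshold: float = 0.5,   # assumed decision threshold on the fused score
) -> Tuple[str, str]:
    """Return (label, stage), where stage is "first" or "second"."""
    # First detection: decide from the intermediate-layer score alone when it is
    # confident enough, so the second classifier does not need to run.
    if first_score >= accept_threshold:
        return "live", "first"
    if first_score <= reject_threshold:
        return "spoof", "first"

    # No early decision: compute the second score from the output layer and
    # fuse it with the first score (a weighted sum is assumed here).
    second_score = compute_second_score()
    fused = fusion_weight * first_score + (1.0 - fusion_weight) * second_score
    label = "live" if fused >= fusion_threshold else "spoof"
    return label, "second"


if __name__ == "__main__":
    print(detect_spoof(0.95, lambda: 0.0))
    print(detect_spoof(0.50, lambda: 0.8))
```

Passing the second score as a callable keeps the output-layer computation lazy, so it runs only when the first detection does not settle the result.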

The method of claim 17,
wherein the dynamic range of the activation function is controlled to be within [0, 1].

The method of claim 17,
wherein the activation function is expressed by the following equation:
[mathematical expression]
where a denotes a first hyperparameter associated with a rising slope of the activation function, b denotes a second hyperparameter associated with a falling slope of the activation function, e denotes Euler's number, x denotes the input data, and the equation includes a Heaviside step function that makes the output of the activation function 0 when x is less than 0.

The method of claim 17,
wherein extracting the one or more first feature vectors comprises:
extracting a 1-1 feature vector from a first intermediate layer among the plurality of intermediate layers by using a 1-1 classifier among the one or more first classifiers;
extracting a 1-2 feature vector from a second intermediate layer after the first intermediate layer by using a 1-2 classifier among the one or more first classifiers; and
extracting a combined feature vector obtained by combining the 1-1 feature vector and the 1-2 feature vector.

The method of claim 20,
wherein detecting, as the first detection, whether the biometric information is spoofed comprises:
calculating the first score based on a degree of similarity between the combined feature vector and at least one of a previously provided registered feature vector and a fake feature vector; and
classifying, by using the one or more first classifiers, whether the first score is a score determined to be fake information or a score determined to be real information.
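
One way to picture the combined feature vector and the similarity-based first score is the sketch below. It assumes that the 1-1 and 1-2 feature vectors are combined by concatenation, that similarity is cosine similarity, and that the first score is the best similarity to a registered vector minus the best similarity to a fake vector. None of these choices is stated in the claims, and all names are hypothetical.

```python
import math
from typing import List, Sequence


def combine(feature_1_1: Sequence[float], feature_1_2: Sequence[float]) -> List[float]:
    """Combine the two intermediate-layer feature vectors (concatenation assumed)."""
    return list(feature_1_1) + list(feature_1_2)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def first_score(
    combined: Sequence[float],
    registered: Sequence[Sequence[float]],
    fake: Sequence[Sequence[float]],
) -> float:
    """Assumed score: closeness to registered vectors minus closeness to fake vectors."""
    best_registered = max((cosine_similarity(combined, r) for r in registered), default=0.0)
    best_fake = max((cosine_similarity(combined, f) for f in fake), default=0.0)
    return best_registered - best_fake


if __name__ == "__main__":
    combined = combine([1.0, 0.0], [0.0, 1.0])
    print(first_score(combined, [[1.0, 0.0, 0.0, 1.0]], [[0.0, 1.0, 1.0, 0.0]]))
```

A higher score under these assumptions means the input looks more like the registered samples than like known fakes; a first classifier would then map that score to a real or fake decision.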

The method of claim 17,
wherein the biometric information includes any one of a fingerprint, an iris, and a face of the user.

A computer program stored in a computer-readable recording medium to execute, in combination with hardware, the method of any one of claims 1 to 22.

An electronic device for detecting whether biometric information is spoofed using a neural network, the electronic device comprising:
a sensor that captures input data including biometric information of a user;
a processor that extracts, from the input data, one or more first feature vectors from a plurality of intermediate layers of the neural network, which detects whether the biometric information is spoofed, by using one or more pretrained first classifiers, detects, as a first detection, whether the biometric information is spoofed based on the one or more first feature vectors, calculates a second score by applying an output vector output from an output layer of the neural network to a pretrained second classifier according to whether the first detection of spoofing is made, and detects, as a second detection, whether the biometric information is spoofed based on a fusion score of a first score calculated based on the one or more first feature vectors and the second score; and
an output device that outputs at least one of a result of the first detection and a result of the second detection,
wherein at least one of the one or more first classifiers and the second classifier is trained based on an activation function whose multiplier is determined by a first hyperparameter associated with a rising slope of the activation function and a second hyperparameter associated with a falling slope of the activation function, such that a peak value of the activation function for the neural network is fixed.

The electronic device of claim 24,
wherein the activation function is expressed by the following equation:
[mathematical expression]
where a denotes a first hyperparameter associated with a rising slope of the activation function, b denotes a second hyperparameter associated with a falling slope of the activation function, e denotes Euler's number, x denotes an input to the additional nodes, and the equation includes a Heaviside step function that makes the output of the activation function 0 when x is less than 0.