CN114708625A - Face recognition method and device - Google Patents
Face recognition method and device
- Publication number: CN114708625A
- Application number: CN202111360868.9A
- Authority: CN (China)
- Prior art keywords: feature, sample, class, quality score, neural network
- Legal status: Granted
Classifications
- G06F18/2321 — Pattern recognition; clustering; non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/253 — Pattern recognition; fusion techniques of extracted features
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
Abstract
The present disclosure relates to the technical field of artificial intelligence and provides a face recognition method and device. The method includes: extracting a first feature, a second feature and a third feature of each sample in an input image through a neural network model, and performing feature fusion on these three features to obtain a fused feature of each sample, from which a first quality score for each sample and a second quality score for the class center of each class are computed; computing, through the neural network model, the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs; training the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center; and performing face recognition with the trained neural network model.
Description
Technical Field

The present disclosure relates to the technical field of artificial intelligence, and in particular to a face recognition method and device.
Background

Existing face recognition algorithms generally treat all training samples equally: when training a face recognition model, they do not account for the training problems caused by samples of varying quality or by class centers of varying quality. Sample quality and class-center quality are in essence matters of image feature extraction and image feature quality estimation. In existing face recognition technology, the feature extraction task for face images and the quality estimation task for the features are independent of each other; the two tasks are not, and cannot be, linked. As a result, when training a face recognition model, image feature extraction and image feature quality estimation cannot reinforce each other.

In the course of realizing the concept of the present disclosure, the inventors found at least the following technical problem in the related art: in the training of a face recognition model, image feature extraction and image feature quality estimation cannot reinforce each other.
Summary of the Invention

In view of this, embodiments of the present disclosure provide a face recognition method and device to solve the problem in the prior art that, in the training of a face recognition model, image feature extraction and image feature quality estimation cannot reinforce each other.

A first aspect of the embodiments of the present disclosure provides a face recognition method, including: acquiring an input image, extracting a first feature, a second feature and a third feature of each sample in the input image through a neural network model, and computing, through the neural network model, the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, where the input image contains multiple classes, each class corresponds to one class center, and each class contains multiple samples; performing feature fusion on the first, second and third features of each sample to obtain a fused feature of each sample; performing an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain a first quality score for each sample; computing, from the quality scores of the samples in each class of the input image, a second quality score for the class center of each class; training the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center; and performing face recognition with the trained neural network model.

A second aspect of the embodiments of the present disclosure provides a face recognition device, including: a first computation module configured to acquire an input image, extract a first feature, a second feature and a third feature of each sample in the input image through a neural network model, and compute, through the neural network model, the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, where the input image contains multiple classes, each class corresponds to one class center, and each class contains multiple samples; a feature fusion module configured to perform feature fusion on the first, second and third features of each sample to obtain a fused feature of each sample; a second computation module configured to perform an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain a first quality score for each sample; a third computation module configured to compute, from the quality scores of the samples in each class of the input image, a second quality score for the class center of each class; a model training module configured to train the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center; and a face recognition module configured to perform face recognition with the trained neural network model.

A third aspect of the embodiments of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method when executing the computer program.

A fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects. The embodiments extract the first, second and third features of each sample in the input image through a neural network model and fuse these three features to obtain a fused feature of each sample, from which a first quality score for each sample and a second quality score for the class center of each class are computed; the cosine of the deviation angle between each sample and the class center of its class is computed through the neural network model; the neural network model is then trained with a loss function based on the first quality score of each sample, the second quality score of the corresponding class center, and that cosine; finally, the trained neural network model is used for face recognition. These technical means solve the prior-art problem that image feature extraction and image feature quality estimation cannot reinforce each other during the training of a face recognition model, and thereby improve the accuracy with which the face recognition model recognizes faces.
Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present disclosure more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present disclosure; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure;

Fig. 2 is a schematic flowchart of a face recognition method provided by an embodiment of the present disclosure;

Fig. 3 is a schematic flowchart of a method for performing face recognition with the trained neural network model, provided by an embodiment of the present disclosure;

Fig. 4 is a schematic structural diagram of a face recognition device provided by an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of a face recognition module provided by an embodiment of the present disclosure;

Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description

In the following description, specific details such as particular system structures and techniques are set forth for the purpose of explanation rather than limitation, in order to provide a thorough understanding of the embodiments of the present disclosure. However, it will be apparent to those skilled in the art that the present disclosure may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
A face recognition method and device according to embodiments of the present disclosure will be described in detail below with reference to the drawings.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 1, 2 and 3, a server 4, and a network 5.

Terminal devices 1, 2 and 3 may be hardware or software. When terminal devices 1, 2 and 3 are hardware, they may be various electronic devices that have a display screen and support communication with the server 4, including but not limited to smartphones, tablets, laptops and desktop computers; when terminal devices 1, 2 and 3 are software, they may be installed in such electronic devices. Terminal devices 1, 2 and 3 may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not limited in the embodiments of the present disclosure. Further, various applications may be installed on terminal devices 1, 2 and 3, such as data processing applications, instant messaging tools, social platform software, search applications and shopping applications.

The server 4 may be a server that provides various services, for example, a backend server that receives requests sent by terminal devices with which it has established communication connections; the backend server may receive and analyze the requests sent by the terminal devices and generate processing results. The server 4 may be a single server, a server cluster composed of several servers, or a cloud computing service center, which is not limited in the embodiments of the present disclosure.

It should be noted that the server 4 may be hardware or software. When the server 4 is hardware, it may be any of various electronic devices that provide services to terminal devices 1, 2 and 3. When the server 4 is software, it may be multiple pieces of software or software modules that provide services to terminal devices 1, 2 and 3, or a single piece of software or software module, which is not limited in the embodiments of the present disclosure.

The network 5 may be a wired network using coaxial cable, twisted pair or optical fiber, or a wireless network that interconnects communication devices without cabling, for example Bluetooth, Near Field Communication (NFC) or infrared, which is not limited in the embodiments of the present disclosure.

A user may establish a communication connection with the server 4 via the network 5 through terminal devices 1, 2 and 3 to receive or send information. It should be noted that the specific types, numbers and combinations of terminal devices 1, 2 and 3, the server 4 and the network 5 may be adjusted according to the actual needs of the application scenario, which is not limited in the embodiments of the present disclosure.
Fig. 2 is a schematic flowchart of a face recognition method provided by an embodiment of the present disclosure. The face recognition method of Fig. 2 may be executed by the terminal device or the server of Fig. 1. As shown in Fig. 2, the face recognition method includes:

S201: acquire an input image, extract a first feature, a second feature and a third feature of each sample in the input image through a neural network model, and compute, through the neural network model, the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, where the input image contains multiple classes, each class corresponds to one class center, and each class contains multiple samples;

S202: perform feature fusion on the first, second and third features of each sample to obtain a fused feature of each sample;

S203: perform an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain a first quality score for each sample;

S204: compute, from the quality scores of the samples in each class of the input image, a second quality score for the class center of each class;

S205: train the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center;

S206: perform face recognition with the trained neural network model.
It should be noted that the input image acquired in the embodiments of the present disclosure is used to train the neural network model. The input image contains multiple classes; each class corresponds to a positive class center and/or a negative class center, and each class contains multiple samples. In the field of face recognition, a class may be a person and a sample may be one picture of that person. The class center of a class can be understood as the average of the features of all pictures of one person; the positive class center of a class can be understood as the average of the features of that person's positive sample pictures; and the negative class center of a class can be understood as the average of the features of that person's negative sample pictures. A sample whose feature score is greater than the preset face recognition threshold is a positive sample, and a sample whose feature score is less than the preset threshold is a negative sample.
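As a concrete illustration of these definitions, the following is a minimal sketch of a class center computed as the normalized mean of one person's sample embeddings, together with the cosine of each sample's deviation angle from that center. The function name, tensor shapes and random data are illustrative assumptions, not part of the patent.

```python
import torch
import torch.nn.functional as F

def class_center(embeddings: torch.Tensor) -> torch.Tensor:
    # L2-normalized mean of one class's sample embeddings:
    # shape (num_samples, dim) -> (dim,)
    return F.normalize(embeddings.mean(dim=0), dim=0)

# Hypothetical data: 5 sample embeddings of one person, 128-dimensional.
samples = F.normalize(torch.randn(5, 128), dim=1)
center = class_center(samples)
# Cosine of the deviation angle between each sample and its class center.
cos_theta = samples @ center  # shape (5,)
```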
According to the technical solution provided by the embodiments of the present disclosure, the first, second and third features of each sample in the input image are extracted through a neural network model and fused to obtain a fused feature of each sample, from which a first quality score for each sample and a second quality score for the class center of each class are computed; the cosine of the deviation angle between each sample and the class center of its class is computed through the neural network model; the neural network model is then trained with a loss function based on the first quality score of each sample, the second quality score of the corresponding class center, and that cosine; and the trained model can then be used for face recognition. These technical means solve the prior-art problem that image feature extraction and image feature quality estimation cannot reinforce each other during the training of a face recognition model, and thereby improve the accuracy with which the face recognition model recognizes faces.
In step S202, performing feature fusion on the first, second and third features of each sample to obtain the fused feature of each sample includes: inputting the input image into the neural network model and outputting the first, second and third features of each sample in the input image from the second, third and fourth stages of the model respectively, where the neural network model has four stages; inputting the first feature of each sample into a first preset convolutional layer and outputting a fourth feature of each sample; inputting the second feature of each sample into a second preset convolutional layer and outputting a fifth feature; inputting the third feature of each sample into a third preset convolutional layer and outputting a sixth feature; concatenating the fourth, fifth and sixth features of each sample to obtain a concatenated feature of each sample, where the feature fusion processing includes this concatenation; and inputting the concatenated feature of each sample into a fourth preset convolutional layer and outputting the fused feature of each sample.

That a neural network model has four stages is well known to those skilled in the art and is not described in detail here.

Specifically, the first preset convolutional layer convolves the input first feature with a kernel of a first preset size, a downsampling stride of a first preset value and a first preset number of channels, yielding a fourth feature whose matrix dimensions are a first preset dimension; for example, convolving the first feature with a 3x3 kernel, a downsampling stride of 2 and 128 channels gives a fourth feature whose corresponding matrix has dimensions (14, 14, 128). The second preset convolutional layer convolves the input second feature with a kernel of a second preset size and a second preset number of channels, yielding a fifth feature whose matrix dimensions are a second preset dimension; for example, convolving the second feature with a 3x3 kernel and 128 channels gives a fifth feature whose corresponding matrix has dimensions (14, 14, 128). The third preset convolutional layer convolves the input third feature with a kernel of a third preset size and a third preset number of channels, then upsamples the result by a preset factor with linear interpolation, yielding a sixth feature whose matrix dimensions are a third preset dimension; for example, convolving the third feature with a 3x3 kernel and 128 channels and upsampling the result by a factor of 2 with linear interpolation gives a sixth feature whose corresponding matrix has dimensions (14, 14, 128). The fourth preset convolutional layer convolves the input concatenated feature with a kernel of a fourth preset size and a fourth preset number of channels, yielding a fused feature whose matrix dimensions are a fourth preset dimension; for example, a convolution with a 1x1 kernel and 128 channels gives a convolution result that is taken as the fused feature. It should be noted that the fourth preset convolutional layer may further normalize the convolution result and take the normalized result as the fused feature.

According to the technical solution provided by the embodiments of the present disclosure, the first, second and third features are processed through different preset convolutional layers to obtain the fourth, fifth and sixth features, from which the fused feature is obtained; this improves the similarity between the fused feature and the first, second and third features.
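The fusion pipeline described above can be sketched in PyTorch as follows. The stage output resolutions (28x28, 14x14, 7x7) follow from the dimensions given in the examples; the channel counts c2/c3/c4, the use of bilinear interpolation for the 2x linear-interpolation upsampling, and L2 normalization for the final normalization step are assumptions. This is a sketch of the described layout, not the patent's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionModule(nn.Module):
    """Project three stage outputs to 128 channels at a common 14x14
    resolution, concatenate them, and fuse with a 1x1 convolution."""
    def __init__(self, c2: int = 128, c3: int = 256, c4: int = 512):
        super().__init__()
        self.conv1 = nn.Conv2d(c2, 128, 3, stride=2, padding=1)  # 28x28 -> 14x14
        self.conv2 = nn.Conv2d(c3, 128, 3, stride=1, padding=1)  # 14x14 -> 14x14
        self.conv3 = nn.Conv2d(c4, 128, 3, stride=1, padding=1)  # 7x7, upsampled below
        self.conv4 = nn.Conv2d(3 * 128, 128, 1)                  # fuse the concatenation

    def forward(self, f1, f2, f3):
        x4 = self.conv1(f1)                                       # fourth feature
        x5 = self.conv2(f2)                                       # fifth feature
        x6 = F.interpolate(self.conv3(f3), scale_factor=2,
                           mode="bilinear", align_corners=False)  # sixth feature
        cat = torch.cat([x4, x5, x6], dim=1)                      # concatenated feature
        return F.normalize(self.conv4(cat), dim=1)                # fused feature

# Hypothetical stage outputs for a batch of one sample.
fused = FusionModule()(torch.randn(1, 128, 28, 28),
                       torch.randn(1, 256, 14, 14),
                       torch.randn(1, 512, 7, 7))
print(fused.shape)  # torch.Size([1, 128, 14, 14])
```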
In step S203, performing an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain the first quality score of each sample includes: performing an interactive computation on the first feature and the fused feature of each sample to obtain a shallow quality score for each sample, where the first quality score includes the shallow quality score. This includes: inputting the first feature into a fifth preset convolutional layer and outputting a seventh feature; flattening the first matrix corresponding to the seventh feature and the second matrix corresponding to the fused feature, and multiplying the flattened first matrix by the transpose of the flattened second matrix to obtain a first product; applying the normalized exponential function (softmax) to the first product to obtain a first computation result, and multiplying the first computation result by the second matrix to obtain a second product; multiplying the second product by a first parameter matrix and then by a second parameter matrix to obtain a third product; and applying the sigmoid function to the third product to obtain the shallow quality score.

The feature of a sample in the embodiments of the present disclosure may be a feature map. Since a feature map corresponds to a matrix, for ease of understanding each feature in the embodiments may be understood directly as a matrix.

Specifically, the fifth preset convolutional layer convolves the input first feature with a kernel of a fifth preset size and a fifth preset number of channels to obtain the seventh feature. For example, a convolution with a 1x1 kernel and 128 channels is applied to the first feature to obtain the seventh feature. The seventh feature is then flattened into a matrix A of shape (28x28, 128), and the second matrix, corresponding to the fused feature, is flattened into a matrix B of shape (14x14, 128), where 14x14 is the spatial size of the feature map and 128 is the number of channels. A is multiplied by the transpose of B; a softmax operation is applied to the product, i.e. the first product; the result of that operation is multiplied by B; this product, i.e. the second product, is then multiplied by an external parameter matrix W11 of shape (128, 64) and a matrix W12 of shape (64, 1) to obtain the third product; and the sigmoid function is applied to the third product to obtain the shallow quality score. It should be noted that after the sigmoid function is applied to the third product, the result may additionally be averaged and the average taken as the shallow quality score. The external parameter matrix W11 is the first parameter matrix and the external parameter matrix W12 is the second parameter matrix; both may be preset.

The above steps can be expressed by the following formula:

q_s = mean(sigmoid(softmax(A·Bᵀ)·B·W11·W12))

where q_s is the shallow quality score, ᵀ denotes the transpose operation, W11 is the first parameter matrix, W12 is the second parameter matrix, and mean is the averaging function.

Because the shallow quality score is derived from the output of the second stage of the neural network model, it can be understood as the weight of the sample's influence on the class center of the sample's class.
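A minimal sketch of this interactive computation follows. It also covers the mid-level and high-level quality scores described next, which differ only in which flattened feature serves as the query matrix (A, C or D) and which parameter matrices are used; the random parameter matrices and data here are assumptions.

```python
import torch

def quality_score(query: torch.Tensor, fused: torch.Tensor,
                  w_a: torch.Tensor, w_b: torch.Tensor) -> torch.Tensor:
    """Interactive quality score: attend the flattened query feature over
    the flattened fused feature, project, squash with sigmoid, average.
    query: (Nq, 128); fused: (196, 128); w_a: (128, 64); w_b: (64, 1)."""
    attn = torch.softmax(query @ fused.T, dim=-1)   # first product + softmax
    ctx = attn @ fused                              # second product
    score = torch.sigmoid(ctx @ w_a @ w_b)          # third product + sigmoid
    return score.mean()                             # scalar quality score

fused = torch.randn(14 * 14, 128)                        # matrix B
w11, w12 = torch.randn(128, 64), torch.randn(64, 1)      # preset parameter matrices
q_shallow = quality_score(torch.randn(28 * 28, 128), fused, w11, w12)  # matrix A
q_mid = quality_score(torch.randn(7 * 7, 128), fused,                  # matrix C
                      torch.randn(128, 64), torch.randn(64, 1))        # W21, W22
q_high = quality_score(fused, fused,                                   # matrix D
                       torch.randn(128, 64), torch.randn(64, 1))       # W31, W32
```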
In step S203, performing an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain the first quality score of each sample includes: performing an interactive computation on the third feature and the fused feature of each sample to obtain a mid-level quality score for each sample, where the first quality score includes the mid-level quality score. This includes: inputting the third feature into the fifth preset convolutional layer and outputting an eighth feature; flattening the third matrix corresponding to the eighth feature and the second matrix corresponding to the fused feature, and multiplying the flattened third matrix by the transpose of the flattened second matrix to obtain a fourth product; applying the softmax function to the fourth product to obtain a second computation result, and multiplying the second computation result by the second matrix to obtain a fifth product; multiplying the fifth product by a third parameter matrix and then by a fourth parameter matrix to obtain a sixth product; and applying the sigmoid function to the sixth product to obtain the mid-level quality score.

For example, a convolution with a 1x1 kernel and 128 channels is applied to the third feature to obtain the eighth feature. The eighth feature is then flattened into a matrix C of shape (7x7, 128), and the second matrix is flattened into the matrix B of shape (14x14, 128), where 14x14 is the spatial size of the feature map and 128 is the number of channels. C is multiplied by the transpose of B; a softmax operation is applied to the product, i.e. the fourth product; the result of that operation is multiplied by B; this product, i.e. the fifth product, is then multiplied by an external parameter matrix W21 of shape (128, 64) and a matrix W22 of shape (64, 1) to obtain the sixth product; and the sigmoid function is applied to the sixth product to obtain the mid-level quality score. It should be noted that after the sigmoid function is applied to the sixth product, the result may additionally be averaged and the average taken as the mid-level quality score. The external parameter matrix W21 is the third parameter matrix and the external parameter matrix W22 is the fourth parameter matrix; both may be preset.

The above steps can be expressed by the following formula:

q_m = mean(sigmoid(softmax(C·Bᵀ)·B·W21·W22))

where q_m is the mid-level quality score, W21 is the third parameter matrix, and W22 is the fourth parameter matrix.
In step S203, performing an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain the first quality score of each sample includes: performing an interactive computation on the fused feature of each sample to obtain a high-level quality score for each sample, where the first quality score includes the high-level quality score. This includes: inputting the fused feature into the fifth preset convolutional layer and outputting a ninth feature; flattening the fourth matrix corresponding to the ninth feature and the second matrix corresponding to the fused feature, and multiplying the flattened fourth matrix by the transpose of the flattened second matrix to obtain a seventh product; applying the softmax function to the seventh product to obtain a third computation result, and multiplying the third computation result by the second matrix to obtain an eighth product; multiplying the eighth product by a fifth parameter matrix and then by a sixth parameter matrix to obtain a ninth product; and applying the sigmoid function to the ninth product to obtain the high-level quality score.

For example, a convolution with a 1x1 kernel and 128 channels is applied to the fused feature to obtain the ninth feature. The ninth feature is then flattened into a matrix D of shape (14x14, 128), and the second matrix is flattened into the matrix B of shape (14x14, 128), where 14x14 is the spatial size of the feature map and 128 is the number of channels. D is multiplied by the transpose of B; a softmax operation is applied to the product, i.e. the seventh product; the result of that operation is multiplied by B; this product, i.e. the eighth product, is then multiplied by an external parameter matrix W31 of shape (128, 64) and a matrix W32 of shape (64, 1) to obtain the ninth product; and the sigmoid function is applied to the ninth product to obtain the high-level quality score. It should be noted that after the sigmoid function is applied to the ninth product, the result may additionally be averaged and the average taken as the high-level quality score. The external parameter matrix W31 is the fifth parameter matrix and the external parameter matrix W32 is the sixth parameter matrix; both may be preset.

The above steps can be expressed by the following formula:

q_h = mean(sigmoid(softmax(D·Bᵀ)·B·W31·W32))

where q_h is the high-level quality score, W31 is the fifth parameter matrix, and W32 is the sixth parameter matrix.
In step S204, computing the second quality score of the class center of each class from the quality scores of the samples in each class of the input image includes: computing a first queue score for each sample from the sample's mid-level and high-level quality scores, where the first quality score includes the shallow, mid-level and high-level quality scores; computing a second queue score for each sample from the sample's shallow quality score; and computing the second quality score of the class center of each class from the first and second queue scores of the samples in that class.

Specifically, the first queue score qi,1 of each sample is computed from its mid-level and high-level quality scores; the second queue score qi,2 is computed from its shallow quality score; and the second quality score γ of the class center of each class is computed from the queue scores qi,1 and qi,2 of the samples in the class, where i is the index of a sample within a class, R is the number of samples in the class, α is an adjustment parameter of the neural network model, which may be set to 0.2, and max(R) denotes the number of samples in the class with the most samples among all classes.
In step S205, training the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center includes: training the neural network model with a cross-entropy loss function based on the first quality score of each sample and the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, where the loss function includes the cross-entropy loss function; the first quality score includes the shallow, mid-level and high-level quality scores; and the class center includes a positive class center and a negative class center.

Specifically, the cross-entropy loss function L1 is a margin-based softmax cross entropy over the cosines of the angles between samples and class centers, where s is a scaling factor, which may be set to 64; θyi denotes the angle between a sample and its positive class center; θj denotes the angle between a sample and a negative class center; i is the index of a positive sample within a class; j is the index of a negative sample within a class; m0 may be taken as 0.35 and m1 as 0.25; N is the total number of all samples; and n is the total number of all negative samples.
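The exact formula for L1 is not preserved in this text. The sketch below implements a standard CosFace-style margin softmax consistent with the parameters named above (scale s, positive-class margin m0, negative-class margin m1); it should be read as an assumption rather than the patent's precise loss.

```python
import torch
import torch.nn.functional as F

def margin_softmax_loss(cos_pos: torch.Tensor, cos_neg: torch.Tensor,
                        s: float = 64.0, m0: float = 0.35,
                        m1: float = 0.25) -> torch.Tensor:
    """Assumed CosFace-style loss. cos_pos (N,) holds cos(theta_yi) for each
    sample's positive class center; cos_neg (N, n) holds the cosines to the
    negative class centers. The margins tighten the positive logit and
    inflate the negative ones before a scaled softmax cross entropy."""
    logits_pos = s * (cos_pos - m0)
    logits_neg = s * (cos_neg + m1)
    logits = torch.cat([logits_pos.unsqueeze(1), logits_neg], dim=1)
    # After concatenation the true class sits at column 0 for every row.
    target = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, target)

loss = margin_softmax_loss(torch.rand(8), torch.rand(8, 100) * 0.5)
```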
In step S205, training the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center includes: training the neural network model with a nearest-neighbor optimization loss function, where the loss function includes the nearest-neighbor optimization loss function. This includes: computing, for each sample, a first sum of the reciprocals of its shallow, mid-level and high-level quality scores, where the first quality score includes the shallow, mid-level and high-level quality scores; computing a nearest-neighbor optimization result with a nearest-neighbor optimization function, based on the second quality score of the class center of the class to which each sample belongs and the cosines of the deviation angles between each sample and the negative class centers of its class, where the class center includes a positive class center and a negative class center; computing, for each sample, a second sum of its shallow, mid-level and high-level quality scores, and multiplying the second sum by the sample's nearest-neighbor optimization result to obtain a tenth product; adding each sample's first sum to its tenth product to obtain a third sum; and training the neural network model with the third sum of each sample.

Specifically, the nearest-neighbor optimization loss function L2 is:

L2 = (1/q_s + 1/q_m + 1/q_h) + (q_s + q_m + q_h)·∑top-z cos(θj − γt)

where t may be set to 0.2, ∑top-z cos(θj − γt) is the nearest-neighbor optimization result computed by the nearest-neighbor optimization function, and z is a preset optimization number. Assuming z is 10, ∑top-10 cos(θj − γt) means that, within a class, the 10 negative samples with the highest similarity to the class's negative class center are used to optimize the neural network model.
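Under the reconstruction of L2 above, a per-sample sketch might look as follows; clamping the cosines before acos and ranking the negatives after the γt shift are assumptions.

```python
import torch

def neighbor_loss(q_s: torch.Tensor, q_m: torch.Tensor, q_h: torch.Tensor,
                  cos_neg: torch.Tensor, gamma: float,
                  t: float = 0.2, z: int = 10) -> torch.Tensor:
    """Nearest-neighbor optimization loss: reciprocal quality terms (first
    sum) plus the quality-weighted sum over the top-z hardest negative
    cosines, with angles shifted by the class-center quality gamma."""
    recip = 1.0 / q_s + 1.0 / q_m + 1.0 / q_h              # first sum
    shifted = torch.cos(torch.acos(cos_neg.clamp(-1, 1)) - gamma * t)
    top_z = torch.topk(shifted, k=min(z, cos_neg.numel())).values
    return recip + (q_s + q_m + q_h) * top_z.sum()         # third sum

loss = neighbor_loss(torch.tensor(0.8), torch.tensor(0.7),
                     torch.tensor(0.9), torch.rand(100), gamma=0.5)
```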
In an optional embodiment, when training the neural network model, it is also necessary to update or modify the cosine cosθp of the deviation angle between each sample and the positive class center of the class to which the sample belongs. In this step, cosθp is updated by one of four alternative margin-adjustment formulas built from the preset parameters m0, m1 and m2, where m2 and m1, like m0, are parameters that may be set in advance.

The value of the cosine cosθp depends on whether the neural network model judges the sample to be a positive or a negative sample. When a sample is input into the neural network model, the model can determine by itself whether it is a positive or a negative sample.
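The four update formulas themselves are not preserved in this text. The sketch below shows two common margin adjustments of the kind the parameters m0 and m2 suggest, purely as an assumption.

```python
import torch

def update_cos_theta_p(cos_p: torch.Tensor, is_positive: bool,
                       m0: float = 0.35, m2: float = 0.2) -> torch.Tensor:
    """Hypothetical margin update: an additive angular margin when the model
    judges the sample positive, a plain cosine offset otherwise."""
    if is_positive:
        return torch.cos(torch.acos(cos_p.clamp(-1, 1)) + m0)
    return cos_p - m2
```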
Fig. 3 is a schematic flowchart of a method for performing face recognition with the trained neural network model, provided by an embodiment of the present disclosure. The face recognition method of Fig. 3 may be executed by the terminal device or the server of Fig. 1.

As shown in Fig. 3, the face recognition method includes:
S301: upon detecting that a target user has entered a preset area, acquire a face image of the target user through an image acquisition device, and obtain the face prototype image corresponding to the face image from a face detection database;

S302: extract, through the neural network model, a tenth feature and an eleventh feature corresponding to the face prototype image and the face image respectively, and compute a first score and a second score corresponding to the tenth and eleventh features respectively;

S303: compute the Euclidean distance between the tenth feature and the eleventh feature;

S304: compute a transformed Euclidean distance from the first score, the second score and the Euclidean distance;

S305: if the transformed Euclidean distance is greater than a preset threshold, confirm that face recognition has succeeded.
According to the technical solution provided by the embodiments of the present disclosure, the tenth and eleventh features corresponding to the face prototype image and the face image can be extracted through the neural network model, the first and second scores corresponding to those features computed, and the transformed Euclidean distance then computed; when the transformed Euclidean distance is greater than the preset threshold, face recognition is confirmed as successful. These technical means solve the prior-art problem that the threshold used in face recognition is fixed, and thereby provide a face recognition scheme based on a dynamic threshold.
Specifically, the transformed Euclidean distance is:

D′ = D + EuclideanTransform(s1, s2)

where f1 and f2 denote the tenth and eleventh features respectively, s1 and s2 denote the first and second scores respectively, D is the Euclidean distance between f1 and f2, and D′ is the transformed Euclidean distance; β is adjustable and is generally taken as 1.2, and f() may be taken as the min() function.

As the above formula shows, the transformed Euclidean distance varies with the face image of the target user acquired or detected by the image acquisition device, so the face recognition threshold of the embodiments of the present disclosure changes dynamically.
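The inner form of the transform is not fully specified above; the sketch below assumes it contributes β·f(s1, s2) with f = min, as the legend suggests. The threshold value and the stubbed features are illustrative.

```python
import torch

def transformed_distance(f1: torch.Tensor, f2: torch.Tensor,
                         s1: float, s2: float, beta: float = 1.2) -> float:
    """Euclidean distance D plus an assumed quality-dependent transform."""
    d = torch.dist(f1, f2).item()      # Euclidean distance D between f1, f2
    return d + beta * min(s1, s2)      # assumed EuclideanTransform(s1, s2)

# Hypothetical usage with stubbed 128-dim features and quality scores.
prototype, probe = torch.randn(128), torch.randn(128)
d_prime = transformed_distance(prototype, probe, s1=0.9, s2=0.7)
recognized = d_prime > 2.0  # preset threshold; value illustrative
```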
All of the optional technical solutions above may be combined in any manner to form optional embodiments of the present application, which are not repeated here one by one.

The following are device embodiments of the present disclosure, which may be used to carry out the method embodiments of the present disclosure. For details not disclosed in the device embodiments of the present disclosure, please refer to the method embodiments of the present disclosure.
Fig. 4 is a schematic diagram of a face recognition device provided by an embodiment of the present disclosure. As shown in Fig. 4, the face recognition device includes:

a first computation module 401, configured to acquire an input image, extract a first feature, a second feature and a third feature of each sample in the input image through a neural network model, and compute, through the neural network model, the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, where the input image contains multiple classes, each class corresponds to one class center, and each class contains multiple samples;

a feature fusion module 402, configured to perform feature fusion on the first, second and third features of each sample to obtain a fused feature of each sample;

a second computation module 403, configured to perform an interactive computation on at least one of the first feature, the third feature and the fused feature of each sample to obtain a first quality score for each sample;

a third computation module 404, configured to compute, from the quality scores of the samples in each class of the input image, a second quality score for the class center of each class;

a model training module 405, configured to train the neural network model with a loss function based on the first quality score of each sample, the second quality score of the class center of the class to which the sample belongs, and the cosine of the deviation angle between each sample and that class center;

a face recognition module 406, configured to perform face recognition with the trained neural network model.
It should be noted that the input image acquired in the embodiments of the present disclosure is used to train the neural network model. The input image contains multiple classes; each class corresponds to a positive class center and/or a negative class center, and each class contains multiple samples. In the field of face recognition, a class may be a person and a sample may be one picture of that person. The class center of a class can be understood as the average of the features of all pictures of one person; the positive class center of a class can be understood as the average of the features of that person's positive sample pictures; and the negative class center of a class can be understood as the average of the features of that person's negative sample pictures. A sample whose feature score is greater than the preset face recognition threshold is a positive sample, and a sample whose feature score is less than the preset threshold is a negative sample.
According to the technical solution provided by the embodiments of the present disclosure, the first, second and third features of each sample in the input image are extracted through a neural network model and fused to obtain a fused feature of each sample, from which a first quality score for each sample and a second quality score for the class center of each class are computed; the cosine of the deviation angle between each sample and the class center of its class is computed through the neural network model; the neural network model is then trained with a loss function based on the first quality score of each sample, the second quality score of the corresponding class center, and that cosine; and the trained model can then be used for face recognition. These technical means solve the prior-art problem that image feature extraction and image feature quality estimation cannot reinforce each other during the training of a face recognition model, and thereby improve the accuracy with which the face recognition model recognizes faces.
可选地,特征融合模块402还被配置为将输入图像输入神经网络模型,分 别通过神经网络模型的第二阶段、第三阶段和第四阶段输出输入图像中每个样 本的第一特征、第二特征和第三特征,其中,神经网络模型具有四个阶段;将 每个样本的第一特征输入第一预设卷积层,输出每个样本的第四特征;将每个 样本的第二特征输入第二预设卷积层,输出每个样本的第五特征;将每个样本 的第三特征输入第三预设卷积层,输出每个样本的第六特征;对每个样本的第 四特征、第五特征和第六特征进行特征拼接处理,得到每个样本的拼接特征, 其中,特征融合处理包括特征拼接处理;将每个样本的拼接特征输入第四预设 卷积层,输出每个样本的融合特征。Optionally, the
神经网络模型具有四个阶段为本领域技术人员所公知,在此不再赘述。The neural network model has four stages, which are well known to those skilled in the art, and will not be repeated here.
具体地,第一预设卷积层对输入的第一特征进行卷积核为第一预设大小、 下采样的步长为第一预设步长和通道数为第一预设数目的卷积,得到了矩阵维 度为第一预设维度的第四特征;举例说明,对第一特征进行卷积核为3x3、下 采样步长为2和通道数为128的卷积,得到第四特征,第四特征或者说第四特 征对应的矩阵的维度是(14,14,128)。第二预设卷积层对输入的第二特征进 行卷积核为第二预设大小和通道数为第二预设数目的卷积,得到了矩阵维度为 第二预设维度的第五特征;举例说明,对第二特征进行卷积核为3x3、通道128 的卷积,得到第五特征,第五特征维度或者说第五特征对应的矩阵的维度是(14, 14,128)。第三预设卷积层对输入的第三特征进行卷积核为第三预设大小和通 道数为第三预设数目的卷积,第三预设卷积层再对该卷积结果做预设倍数的线 性插值上采样处理,得到了矩阵维度为第三预设维度的第六特征;举例说明, 对第三特征进行卷积核为3x3、通道128的卷积,再对该卷积结果做2倍线性 插值上采样,得到第六特征,第六特征维度或者说第六特征对应的矩阵的维度 是(14,14,128)。第四预设卷积层对输入的第四特征进行卷积核为第四预设大 小和通道数为第四预设数目的卷积,得到了矩阵维度为第四预设维度的融合特 征;举例说明,对第四特征进行卷积核为1x1、通道数为128的卷积计算,得到卷积计算结果,将该卷积计算结果作为融合特征。需要说明的是,第四预设 卷积层还可以再对卷积计算结果做归一化处理,将得到归一化处理结果作为融 合特征。Specifically, the first preset convolution layer performs a convolution kernel of the first preset size on the input first feature, the step size of the downsampling is the first preset step size, and the number of channels is the first preset number of volumes. product, the fourth feature whose matrix dimension is the first preset dimension is obtained; for example, the first feature is convolved with a convolution kernel of 3x3, a downsampling step size of 2 and a channel number of 128 to obtain the fourth feature , the dimension of the fourth feature or the matrix corresponding to the fourth feature is (14, 14, 128). The second preset convolution layer performs convolution on the input second feature with the convolution kernel being the second preset size and the number of channels being the second preset number to obtain the fifth feature whose matrix dimension is the second preset dimension For example, the second feature is convolved with a convolution kernel of 3x3 and a channel of 128 to obtain a fifth feature, and the dimension of the fifth feature or the dimension of the matrix corresponding to the fifth feature is (14, 14, 128). The third preset convolution layer performs convolution on the input third feature with the convolution kernel being the third preset size and the number of channels being the third preset number, and then the third preset convolution layer will do the convolution result. The linear interpolation upsampling processing of preset multiples obtains the sixth feature whose matrix dimension is the third preset dimension; for example, the convolution kernel is 3×3 and the convolution channel is 128 for the third feature, and then the convolution is performed. The result is upsampled by 2 times linear interpolation to obtain the sixth feature. The dimension of the sixth feature or the dimension of the matrix corresponding to the sixth feature is (14, 14, 128). The fourth preset convolution layer performs convolution on the input fourth feature whose convolution kernel is the fourth preset size and the number of channels is the fourth preset number, and obtains the fusion feature whose matrix dimension is the fourth preset dimension; For example, a convolution calculation with a convolution kernel of 1×1 and a channel number of 128 is performed on the fourth feature to obtain a convolution calculation result, and the convolution calculation result is used as a fusion feature. It should be noted that the fourth preset convolution layer can further normalize the convolution calculation result, and use the normalized result as a fusion feature.
According to the technical solution provided by the embodiments of the present disclosure, the first, second, and third features are processed by distinct preset convolution layers to obtain the fourth, fifth, and sixth features, from which the fused feature is derived; this improves the similarity between the fused feature and the first, second, and third features.
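By way of illustration, the fusion described above can be sketched in PyTorch as follows. This is a minimal sketch, not the patent's implementation: the input channel counts of the three stage features (c1, c2, c3) and the module wiring are assumptions; only the example kernel sizes, stride, channel number 128, and output shape (14, 14, 128) come from the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusionSketch(nn.Module):
    """Brings the three stage features to a common (14, 14) resolution with
    128 channels, concatenates them, and fuses with a 1x1 convolution."""
    def __init__(self, c1=64, c2=128, c3=256):  # stage channel counts: assumed
        super().__init__()
        self.conv1 = nn.Conv2d(c1, 128, 3, stride=2, padding=1)  # 28x28 -> 14x14
        self.conv2 = nn.Conv2d(c2, 128, 3, stride=1, padding=1)  # stays 14x14
        self.conv3 = nn.Conv2d(c3, 128, 3, stride=1, padding=1)  # 7x7, upsampled below
        self.fuse = nn.Conv2d(3 * 128, 128, 1)                   # 1x1 fusion conv

    def forward(self, f1, f2, f3):
        f4 = self.conv1(f1)                                      # fourth feature
        f5 = self.conv2(f2)                                      # fifth feature
        f6 = F.interpolate(self.conv3(f3), scale_factor=2,
                           mode='bilinear', align_corners=False) # sixth feature
        cat = torch.cat([f4, f5, f6], dim=1)                     # concatenated feature
        return self.fuse(cat)                                    # fused feature (N, 128, 14, 14)
```

The optional normalization of the 1x1 convolution output noted above could be appended after self.fuse.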
Optionally, the second calculation module 403 is further configured to perform an interactive calculation on each sample's first feature and fused feature to obtain the sample's shallow quality score, the first quality score including the shallow quality score. This comprises: feeding the first feature into a fifth preset convolution layer and outputting a seventh feature; flattening the first matrix corresponding to the seventh feature and the second matrix corresponding to the fused feature, and multiplying the flattened first matrix by the transpose of the flattened second matrix to obtain a first product; applying the normalized exponential function to the first product to obtain a first calculation result, and multiplying the first calculation result by the second matrix to obtain a second product; multiplying the second product by a first parameter matrix and then by a second parameter matrix to obtain a third product; and applying the sigmoid function to the third product to obtain the shallow quality score.
A sample's feature in the embodiments of the present disclosure may be a feature map; since a feature map corresponds to a matrix, for ease of understanding each feature in the embodiments may be read directly as a matrix.
Specifically, the fifth preset convolution layer applies to the input first feature a convolution whose kernel is of a fifth preset size and whose channel count is a fifth preset number, yielding the seventh feature. For example, a convolution with a 1x1 kernel and 128 channels applied to the first feature yields the seventh feature. The seventh feature is then flattened into a matrix A of shape (28x28, 128), and the fused feature into a matrix B of shape (14x14, 128), where 14x14 is the flattened spatial size of the feature map and 128 is the number of channels. A is multiplied by the transpose of B; the normalized exponential function softmax is applied to the resulting first product; the result of that operation is multiplied by B; the resulting second product is then multiplied by the external parameter matrices W11 of shape (128, 64) and W12 of shape (64, 1) to obtain the third product; and the sigmoid function is applied to the third product to obtain the shallow quality score. It should be noted that after the sigmoid function is applied to the third product, the result may further be averaged, with the mean taken as the shallow quality score.
The above steps can be expressed by the following formula:

φ1 = mean(sigmoid(softmax(A B^T) B W11 W12))

where φ1 is the shallow quality score, T is the symbol of the transpose operation, W11 is the first parameter matrix, W12 is the second parameter matrix, and mean is the averaging function.
Because the shallow quality score is derived from the output of the second stage of the neural network model, it can be understood as the weight of the sample's influence on the class center corresponding to the sample's class.
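As an illustration, the interactive calculation above can be sketched as follows; the same routine yields the mid-level and high-level scores below by substituting the matrices C and D for A. This is a minimal sketch: the softmax axis and the random initialization of the parameter matrices are assumptions not fixed by the text.

```python
import torch

def quality_score(q, fused, w1, w2):
    """q: flattened query matrix (A, C, or D), shape (hw, 128);
    fused: flattened fused-feature matrix B, shape (196, 128);
    w1: (128, 64); w2: (64, 1). Returns a scalar quality score."""
    attn = torch.softmax(q @ fused.T, dim=-1)  # softmax over the fused axis (assumed)
    mixed = attn @ fused                       # multiply the softmax result by B
    score = torch.sigmoid(mixed @ w1 @ w2)     # external parameter matrices, then sigmoid
    return score.mean()                        # averaged into the final score

# Shallow-score shapes from the running example (random stand-in inputs):
A = torch.randn(28 * 28, 128)                  # flattened seventh feature
B = torch.randn(14 * 14, 128)                  # flattened fused feature
W11, W12 = torch.randn(128, 64), torch.randn(64, 1)
phi1 = quality_score(A, B, W11, W12)           # shallow quality score
```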
Optionally, the second calculation module 403 is further configured to perform an interactive calculation on each sample's third feature and fused feature to obtain the sample's mid-level quality score, the first quality score including the mid-level quality score. This comprises: feeding the third feature into the fifth preset convolution layer and outputting an eighth feature; flattening the third matrix corresponding to the eighth feature and the second matrix corresponding to the fused feature, and multiplying the flattened third matrix by the transpose of the flattened second matrix to obtain a fourth product; applying the normalized exponential function to the fourth product to obtain a second calculation result, and multiplying the second calculation result by the second matrix to obtain a fifth product; multiplying the fifth product by a third parameter matrix and then by a fourth parameter matrix to obtain a sixth product; and applying the sigmoid function to the sixth product to obtain the mid-level quality score.
For example, a convolution with a 1x1 kernel and 128 channels applied to the third feature yields the eighth feature. The eighth feature is then flattened into a matrix C of shape (7x7, 128), and the fused feature into a matrix B of shape (14x14, 128), where 14x14 is the flattened spatial size of the feature map and 128 is the number of channels. C is multiplied by the transpose of B; the normalized exponential function softmax is applied to the resulting fourth product; the result of that operation is multiplied by B; the resulting fifth product is then multiplied by the external parameter matrices W21 of shape (128, 64) and W22 of shape (64, 1) to obtain the sixth product; and the sigmoid function is applied to the sixth product to obtain the mid-level quality score. As before, the sigmoid result may further be averaged, with the mean taken as the mid-level quality score.
The above steps can be expressed by the following formula:

φ2 = mean(sigmoid(softmax(C B^T) B W21 W22))

where φ2 is the mid-level quality score, W21 is the third parameter matrix, and W22 is the fourth parameter matrix.
Optionally, the second calculation module 403 is further configured to perform an interactive calculation on each sample's fused feature to obtain the sample's high-level quality score, the first quality score including the high-level quality score. This comprises: feeding the fused feature into the fifth preset convolution layer and outputting a ninth feature; flattening the fourth matrix corresponding to the ninth feature and the second matrix corresponding to the fused feature, and multiplying the flattened fourth matrix by the transpose of the flattened second matrix to obtain a seventh product; applying the normalized exponential function to the seventh product to obtain a third calculation result, and multiplying the third calculation result by the second matrix to obtain an eighth product; multiplying the eighth product by a fifth parameter matrix and then by a sixth parameter matrix to obtain a ninth product; and applying the sigmoid function to the ninth product to obtain the high-level quality score.
For example, a convolution with a 1x1 kernel and 128 channels applied to the fused feature yields the ninth feature. The ninth feature is then flattened into a matrix D of shape (14x14, 128), and the fused feature into a matrix B of shape (14x14, 128), where 14x14 is the flattened spatial size of the feature map and 128 is the number of channels. D is multiplied by the transpose of B; the normalized exponential function softmax is applied to the resulting seventh product; the result of that operation is multiplied by B; the resulting eighth product is then multiplied by the external parameter matrices W31 of shape (128, 64) and W32 of shape (64, 1) to obtain the ninth product; and the sigmoid function is applied to the ninth product to obtain the high-level quality score. As before, the sigmoid result may further be averaged, with the mean taken as the high-level quality score.
The above steps can be expressed by the following formula:

φ3 = mean(sigmoid(softmax(D B^T) B W31 W32))

where φ3 is the high-level quality score, W31 is the fifth parameter matrix, and W32 is the sixth parameter matrix.
Optionally, the third calculation module 404 is further configured to calculate each sample's first queue score from the sample's mid-level quality score and high-level quality score, the first quality score comprising the shallow, mid-level, and high-level quality scores; to calculate each sample's second queue score from the sample's shallow quality score; and to calculate the second quality score of the class center corresponding to each class from the first and second queue scores of the class's samples.
Specifically, the first queue score qi,1 is calculated from the sample's mid-level and high-level quality scores.
The second queue score qi,2 is calculated according to the following formula:

qi,2 = norm(φ1)
The second quality score γ of the class center corresponding to each class is then calculated from these queue scores, where i is the index of a sample within a class, R is the number of samples in the class, α is a tuning parameter of the neural network model, which may be set to 0.2, and max(R) denotes the sample count of the class with the most samples among all classes.
Optionally, the model training module 405 is further configured to train the neural network model through a cross-entropy loss function based on each sample's first quality score and the cosine of the deviation angle between each sample and the class center of the class to which the sample belongs, the loss function including the cross-entropy loss function. Here the first quality score comprises the shallow, mid-level, and high-level quality scores, and the class centers comprise positive class centers and negative class centers.
Specifically, in the cross-entropy loss function L1, s is a scaling factor, which may be set to 64; θyi denotes the angle between a sample and the positive class center; θj denotes the angle between a sample and a negative class center; i is the index of a positive sample within a class and j the index of a negative sample; m0 may take the value 0.35 and m1 the value 0.25; N is the total number of samples and n is the total number of negative samples.
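The exact expression of L1 is given by the formula in the original disclosure, which is not reproduced here; the sketch below assumes a standard additive-margin form suggested by these quantities (margin m0 applied to the positive cosine, margin m1 to the negatives, scale s) and should be read as an illustration rather than the patent's verbatim loss.

```python
import torch
import torch.nn.functional as F

def margin_cross_entropy(cos_pos, cos_neg, s=64.0, m0=0.35, m1=0.25):
    """cos_pos: (N,) cosines of angles theta_yi to the positive class centers;
    cos_neg: (N, n) cosines of angles theta_j to the negative class centers.
    The additive placement of m0 and m1 is an assumption."""
    logits_pos = s * (cos_pos - m0)              # tighten the positive angle
    logits_neg = s * (cos_neg + m1)              # push negatives away
    logits = torch.cat([logits_pos.unsqueeze(1), logits_neg], dim=1)
    target = torch.zeros(cos_pos.size(0), dtype=torch.long)  # index 0 = positive
    return F.cross_entropy(logits, target)
```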
Optionally, the model training module 405 is further configured to: calculate, for each sample, a first sum of the reciprocals of the shallow, mid-level, and high-level quality scores, the first quality score comprising these three scores; calculate a nearest-neighbor optimization result using a nearest-neighbor optimization function, based on the second quality score of the class center of the class to which the sample belongs and the cosine of the deviation angle between the sample and the negative class center corresponding to that class, the class centers comprising positive and negative class centers; calculate, for each sample, a second sum of the shallow, mid-level, and high-level quality scores, and multiply the sample's second sum by the sample's nearest-neighbor optimization result to obtain a tenth product; add the sample's first sum to the sample's tenth product to obtain a third sum; and train the neural network model through each sample's third sum.
Specifically, in the nearest-neighbor optimization loss function L2, t may be set to 0.2, and Σtop10 cos(θj − γt) is the nearest-neighbor optimization result computed by the nearest-neighbor optimization function; z is the preset optimization count. With z = 10, Σtop10 cos(θj − γt) means that, within a class, the neural network model is optimized using the ten negative samples having the highest similarity to the negative class center corresponding to that class.
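A minimal sketch of the top-z hard-negative term described above, assuming the shift γt is applied directly inside the cosine as written:

```python
import torch

def nearest_neighbor_term(cos_neg, gamma, t=0.2, z=10):
    """cos_neg: cosines of the angles theta_j between a sample and the
    negative class centers. Returns the sum of cos(theta_j - gamma * t)
    over the z negatives most similar to the negative class center."""
    theta = torch.acos(cos_neg.clamp(-1.0, 1.0))  # recover the angles
    shifted = torch.cos(theta - gamma * t)        # cos(theta_j - gamma t)
    flat = shifted.flatten()
    topz = torch.topk(flat, k=min(z, flat.numel())).values
    return topz.sum()
```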
Optionally, the model training module 405 is further configured to update or modify, during training of the neural network model, the cosine cosθp of the deviation angle between each sample and the positive class center of the class to which the sample belongs; this step updates cosθp through one of several alternative formulas.
Like m0, m2 and m1 are parameters that can be set in advance.
The value of the cosine cosθp depends on whether the neural network model judges a sample to be a positive or a negative sample. When a sample is fed into the neural network model, the model determines by itself whether the sample is positive or negative.
FIG. 5 is a schematic structural diagram of the face recognition module provided by an embodiment of the present disclosure. As shown in FIG. 5, the face recognition module includes:
a detection unit 501, configured to acquire a face image of a target user through an image acquisition device upon detecting that the target user has entered a preset area, and to retrieve the face prototype image corresponding to the face image from a face detection database;
a first calculation unit 502, configured to extract, through the neural network model, the tenth feature and the eleventh feature corresponding to the face prototype image and the face image respectively, and to calculate the first score and the second score corresponding to the tenth feature and the eleventh feature respectively;
a second calculation unit 503, configured to calculate the Euclidean distance between the tenth feature and the eleventh feature;
a third calculation unit 504, configured to calculate a Euclidean transform distance from the first score, the second score, and the Euclidean distance;
a confirmation unit 505, configured to confirm that face recognition has succeeded when the Euclidean transform distance is greater than a preset threshold.
According to the technical solution provided by the embodiments of the present disclosure, the tenth and eleventh features corresponding to the face prototype image and the face image can be extracted through the neural network model, the first and second scores corresponding to these features can be calculated, and the Euclidean transform distance can then be computed; face recognition is confirmed successful when the Euclidean transform distance exceeds the preset threshold. These technical means address the prior-art problem that the threshold used in face recognition is fixed, and thereby provide a face recognition scheme based on a dynamic threshold.
Specifically, the Euclidean transform distance is:
D' = D + EuclideanTransform(s1, s2)
f1 and f2 denote the tenth feature and the eleventh feature respectively, s1 and s2 denote the first score and the second score respectively, D is the Euclidean distance between f1 and f2, and D' is the Euclidean transform distance; β is adjustable and typically takes the value 1.2, and f may be taken as the min() function.
As the above formula shows, the Euclidean transform distance varies with the face image of the target user acquired or detected by the image acquisition device; the face recognition threshold in the embodiments of the present disclosure is therefore dynamic.
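A minimal sketch of this dynamic-threshold decision, assuming the Euclidean transform term combines the two scores as β·f(s1, s2) with f = min; the text names β (typically 1.2) and f but leaves their exact composition to the formula above, so this form is an assumption.

```python
import numpy as np

def euclidean_transform_distance(f1, f2, s1, s2, beta=1.2, f=min):
    """D' = D + beta * f(s1, s2): the feature distance D adjusted by the
    two quality scores. The beta * f(s1, s2) composition is assumed."""
    d = np.linalg.norm(f1 - f2)        # Euclidean distance between features
    return d + beta * f(s1, s2)

def face_recognized(f1, f2, s1, s2, threshold):
    """Recognition succeeds when the transformed distance exceeds the
    preset threshold, as stated in the text."""
    return euclidean_transform_distance(f1, f2, s1, s2) > threshold
```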
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present disclosure.
FIG. 6 is a schematic diagram of an electronic device 6 provided by an embodiment of the present disclosure. As shown in FIG. 6, the electronic device 6 of this embodiment includes a processor 601, a memory 602, and a computer program 603 stored in the memory 602 and executable on the processor 601. When the processor 601 executes the computer program 603, the steps of each of the above method embodiments are implemented; alternatively, when the processor 601 executes the computer program 603, the functions of the modules/units in each of the above apparatus embodiments are implemented.
Illustratively, the computer program 603 may be divided into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to complete the present disclosure. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments describing the execution of the computer program 603 in the electronic device 6.
The electronic device 6 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another electronic device. The electronic device 6 may include, but is not limited to, the processor 601 and the memory 602. Those skilled in the art will understand that FIG. 6 is merely an example of the electronic device 6 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device may also include input/output devices, network access devices, buses, and the like.
The processor 601 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 602 may be an internal storage unit of the electronic device 6, for example, a hard disk or memory of the electronic device 6. The memory 602 may also be an external storage device of the electronic device 6, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the electronic device 6. Further, the memory 602 may include both an internal storage unit and an external storage device of the electronic device 6. The memory 602 is used to store the computer program and other programs and data required by the electronic device; it may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is illustrated by example; in practical applications, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be considered beyond the scope of the present disclosure.
In the embodiments provided by the present disclosure, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the apparatus/electronic device embodiments described above are merely illustrative; the division into modules or units is only a logical functional division, and there may be other division schemes in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated units may be implemented in the form of hardware or in the form of software functional units.
If the integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present disclosure implements all or part of the processes in the methods of the above embodiments, which may also be completed by instructing relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, the steps of each of the above method embodiments can be implemented. The computer program may include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, or the like. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure.
Claims (10)

Priority application: CN202111360868.9A — Face recognition method and device; priority and filing date 2021-11-17.
Publications: CN114708625A, published 2022-07-05; CN114708625B, granted 2024-11-15. Family ID: 82166401.
Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- TA01: Transfer of patent application right. Effective date of registration: 2023-01-09. Applicant after: Shenzhen Xumi yuntu Space Technology Co., Ltd., Cable Information Transmission Building 25F2504, No. 3369 Binhai Avenue, Haizhu Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong, 518054. Applicant before: Shenzhen Jizhi Digital Technology Co., Ltd., No. 103, No. 1003, Nanxin Road, Nanshan Community, Nanshan Street, Nanshan District, Shenzhen, Guangdong.
- GR01: Patent grant