CN108681704A

CN108681704A - A face recognition system based on deep learning

Info

Publication number: CN108681704A
Application number: CN201810460351.9A
Authority: CN
Inventors: 曲秀杰; 魏天博; 彭程; 杜鹏
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-05-15
Filing date: 2018-05-15
Publication date: 2018-10-19

Abstract

A face recognition system based on deep learning, belonging to the technical field of deep learning and image processing. The system comprises a PC-side network training module, a system-on-chip and peripheral modules. The PC-side network training module includes data set processing, CNN training, initialization, gradient return, backpropagation and forward propagation units; the network structure is first constructed in the PC-side network training module, a sufficiently large face training set is collected, and a usable deep learning face recognition network is trained. The system-on-chip includes an image acquisition module, a FIFO control module, a deep learning face recognition module, a personnel information storage module and an image display module. The recognition speed can reach 400 FPS under a 50 MHz clock; the recognition accuracy is as high as 99.25%, which is comparable to that of the human eye; the training set covers various lighting conditions, so recognition can be completed under most lighting conditions, with good robustness.

Description

A face recognition system based on deep learning

Technical field

The invention relates to a face recognition system based on deep learning and belongs to the technical field of digital image processing.

Background

Human faces are highly non-rigid and contain a large number of details that reflect individual differences. Face recognition is the process of comparing a face image detected in a static image or dynamic video with the face images in a database to find the matching face. It is usually used for identity recognition and verification, and belongs to the field of biometric identification.

There are many traditional face recognition methods, which can be roughly divided into two categories. One category is local methods, such as recognition using local descriptors like Gabor features and local binary patterns (LBP). The other category is global methods, including classic face recognition algorithms such as subspace learning algorithms (the eigenface method, linear discriminant analysis) and manifold learning algorithms (locality preserving projection).

At present, research on face recognition at home and abroad focuses mainly on principal component analysis (PCA), the LBP algorithm and deep learning algorithms. These three approaches outperform other algorithms in terms of recognition success rate and implementation difficulty.

Deep learning offers strong robustness and a very high recognition rate in face recognition, and its recognition performance far exceeds that of other algorithms. At this stage, however, face recognition on FPGA platforms still mainly uses principal component analysis and the LBP algorithm.

Although these two algorithms are easier to implement on an FPGA platform than deep learning algorithms, their inherent limitations (poor robustness, low recognition rate, and time-consuming preprocessing) mean that, in both speed and accuracy, they cannot meet the technical requirements of frame-by-frame real-time recognition and adaptation to various lighting conditions. The literature shows that most published FPGA-based face recognition results achieve detection speeds below 10 FPS, and the fastest results at this stage reach only 25 FPS. Deep learning algorithms, by contrast, are mostly implemented on CPUs, GPUs and TPUs; there is relatively little research on FPGA platforms, and the use of FPGAs to implement deep learning networks for face recognition remains largely unexplored. Some researchers have made attempts in this direction, but no mature results have emerged.

The present invention implements a deep learning face recognition system on an FPGA platform. The system operates at a clock frequency of 50 MHz and reaches a recognition speed of 400 FPS, far exceeding existing results. The recognition accuracy is as high as 99.25%, comparable to that of the human eye. The training set covers a variety of lighting conditions, so recognition can be completed under most lighting conditions, with good robustness.

Summary of the invention

The purpose of the present invention is to address the poor real-time performance, low detection accuracy and poor robustness of face recognition technology on FPGA platforms by proposing a face recognition system based on deep learning.

A face recognition system based on deep learning consists of a PC-side network training module, a system-on-chip and peripheral modules:

The PC-side network training module includes a data set processing unit, a CNN training unit, an initialization unit, a gradient return unit, a backpropagation unit and a forward propagation unit. According to actual needs, the network structure is first built in the PC-side network training module, and a sufficiently large face training set is collected to train a usable deep learning face recognition network; the activation function used by the face recognition network is the sigmoid function;

The system-on-chip includes an image acquisition module, a FIFO control module, a deep learning face recognition module, a personnel information storage module and an image display module;

The peripheral modules include a camera sub-module, a VGA digital-to-analog conversion module and a local display;

The camera sub-module mainly includes multiple cameras and camera connection cables;

The modules of the face recognition system based on deep learning are connected as follows:

The multiple cameras in the camera sub-module connect their data channels to the image acquisition module through the camera connection cables; the image acquisition module is connected to the FIFO control module; the FIFO control module is connected to both the deep learning face recognition module and the image display module; the deep learning face recognition module is connected to the personnel information storage module, which is integrated in the FIFO control module and controlled by it, and which outputs its result to the image display module; the image display module then transmits the digital signal to the VGA digital-to-analog conversion module, whose output after digital-to-analog conversion goes to the local display;

The functions of the modules are as follows:

The multiple cameras capture images and transmit them to the image acquisition module;

The image acquisition module receives the input video stream data and transmits it to the FIFO control module;

The FIFO control module buffers the incoming video signal to ensure the integrity of the output image data, and downsamples the incoming images in real time to a pixel size suitable for network recognition, storing the result in the FIFO;

The deep learning face recognition module accelerates the face recognition function. Its input is the downsampled face image; after completing face recognition, the module outputs the recognition result;

The personnel information storage module stores personnel information. It receives the result from the deep learning face recognition module and outputs the specific information of the recognized person to the image display module. This module is integrated in the FIFO control module and is controlled by it;

The image display module receives the buffered video stream data, generates standard video timing, matches it with the corresponding pixel data, and outputs the result to the off-chip VGA digital-to-analog conversion module. The module also provides automatic data alignment, which overcomes the display misalignment caused by input video stream data lagging behind the locally generated timing;
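As a rough illustration of the real-time downsampling performed by the FIFO control module, the following Python sketch reduces one frame to the network input size by nearest-neighbour sampling. The sampling method, the function name `downsample_nearest` and the default sizes (a 540x480 frame and a 32x32 output, taken from the embodiment described later) are assumptions for illustration; the text does not state how the hardware downsamples.

```python
import numpy as np

def downsample_nearest(frame, out_h=32, out_w=32):
    """Pick one source pixel per output pixel (nearest-neighbour, assumed method).

    frame: 2-D grayscale array of shape (rows, cols), e.g. (480, 540).
    """
    in_h, in_w = frame.shape
    # Integer index maps: output row i reads source row floor(i * in_h / out_h).
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return frame[np.ix_(rows, cols)]

if __name__ == "__main__":
    frame = np.arange(480 * 540).reshape(480, 540)
    small = downsample_nearest(frame)
    print(small.shape)
```

Index-based sampling of this kind needs only counters and comparisons, which is one reason it is a plausible fit for a streaming pipeline; an averaging filter would be an equally valid alternative.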

The functions of the peripheral modules are as follows:

The camera captures the external scene and produces a stable digital video stream;

The VGA digital-to-analog conversion module receives the digital signal and the line and field synchronization signals from the image display module, converts the digital signal to analog and transmits it to the local display; the local display shows the captured images and the recognition results in real time;

The working process of the face recognition system based on deep learning comprises four parts: building the face recognition network, training the face recognition network, building the FPGA platform, and updating the network parameters. The specific steps are as follows:

Step 1. Build a convolutional neural network for face recognition in the PC-side network training module;

Step 2. Train the face recognition network with a pre-collected face library to obtain the trained network parameters;

Step 3. Build the face recognition network on the FPGA platform;

Step 4. Import the network parameters obtained from the training in Step 2 into the face recognition network built in Step 3.
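Step 4 implies that the parameters trained on the PC must be converted into a form the FPGA can load. A minimal sketch of one common approach is shown below: quantizing floating-point weights to signed fixed-point and writing them as hexadecimal words. The fixed-point format (16 bits with 8 fractional bits) and the function names `to_fixed_point` and `to_hex_lines` are assumptions for illustration only; the text does not specify the number format or the import mechanism.

```python
import numpy as np

def to_fixed_point(values, frac_bits=8, total_bits=16):
    """Quantize floats to signed fixed-point integers (assumed Q-format)."""
    scaled = np.round(np.asarray(values, dtype=np.float64) * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    # Saturate values that do not fit in the chosen word length.
    return np.clip(scaled, lo, hi).astype(np.int64)

def to_hex_lines(fixed, total_bits=16):
    """Format each value as a two's-complement hex word, one per line."""
    mask = (1 << total_bits) - 1
    width = total_bits // 4
    return [format(int(v) & mask, f"0{width}x") for v in np.asarray(fixed).ravel()]

if __name__ == "__main__":
    weights = [0.5, -0.5, 1000.0]
    print(to_hex_lines(to_fixed_point(weights)))
```

A file of such words could then be used to initialize on-chip memory; the word length trades memory use against quantization error.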

Beneficial effects

Compared with the prior art, the face recognition system based on deep learning has the following beneficial effects:

1. The prior art contains no design that combines deep learning with face recognition and implements it on an FPGA platform, whereas the present invention successfully implements a deep learning face recognition network on an FPGA platform;

2. Compared with existing face recognition technology, the recognition speed is improved: the processing speed of existing face recognition systems is generally below 25 FPS, whereas the present invention uses a CNN model from the field of deep learning, trained on the collected images, which greatly increases the image processing speed;

3. Compared with existing face recognition, the recognition accuracy is improved. Specifically, FPGA-based face recognition methods mostly use principal component analysis or the LBP algorithm, whose recognition accuracy reaches only about 80%, whereas the deep learning algorithm adopted by the present invention achieves a recognition rate above 99%, comparable to the accuracy of human eye recognition.

Description of the drawings

Fig. 1 is a structural block diagram of the face recognition system based on deep learning of the present invention;

Fig. 2 is a schematic diagram of the network structure in an embodiment of the face recognition system based on deep learning of the present invention.

Detailed description

The present invention is further illustrated and described in detail below with reference to the accompanying drawings and embodiments.

Embodiment 1

This embodiment describes the structure and specific implementation of the face recognition system based on deep learning according to the present invention.

Fig. 1 shows the structural block diagram of the face recognition system based on deep learning. This embodiment was built according to that block diagram.

This embodiment is implemented on a PGT180H device from Ziguang Tongchuang (紫光同创), and the image size processed by the whole system is 540x480 pixels.

First, a 6-layer convolutional neural network (CNN) was designed and trained in MATLAB in the PC-side network training module, using the CMU_PIE face database from Carnegie Mellon University together with some self-collected face data. On the test set, the 6-layer network achieved a recognition rate of 99.25% for the 20 people in the face database.

The 6-layer convolutional neural network consists of one input layer, two convolutional layers, two pooling layers and one fully connected layer. The input layer takes a 32x32-pixel image, which passes through 4 hidden layers and then the final fully connected layer, which outputs the result. The network uses the sigmoid function as its activation function;

The 4 hidden layers are the first convolutional layer, the first pooling layer, the second convolutional layer and the second pooling layer;
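The layer sequence described above can be sketched as a forward pass in NumPy. Only the layer order, the 32x32 input, the sigmoid activation and the 20 output classes come from the text; the 5x5 kernels, the 6 and 12 feature maps, the 2x2 mean pooling and all function and parameter names are assumptions chosen for illustration, and the random weights stand in for trained ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_valid(maps, kernels, bias):
    """'Valid' cross-correlation followed by sigmoid.

    maps: (C_in, H, W); kernels: (C_out, C_in, k, k); bias: (C_out,).
    """
    c_out, c_in, k, _ = kernels.shape
    h, w = maps.shape[1] - k + 1, maps.shape[2] - k + 1
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j) over all channels.
            out += np.einsum("oc,chw->ohw", kernels[:, :, i, j],
                             maps[:, i:i + h, j:j + w])
    return sigmoid(out + bias[:, None, None])

def mean_pool2(maps):
    """2x2 non-overlapping mean pooling (assumed pooling type)."""
    c, h, w = maps.shape
    return maps.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def forward(image, params):
    x = image[None, :, :]                                      # (1, 32, 32)
    x = mean_pool2(conv_valid(x, params["k1"], params["b1"]))  # (6, 14, 14)
    x = mean_pool2(conv_valid(x, params["k2"], params["b2"]))  # (12, 5, 5)
    scores = sigmoid(params["w_fc"] @ x.ravel() + params["b_fc"])  # (20,)
    return int(np.argmax(scores)), scores

def random_params(seed=0):
    """Placeholder weights with the assumed shapes (not trained values)."""
    rng = np.random.default_rng(seed)
    return {
        "k1": rng.normal(0, 0.1, (6, 1, 5, 5)), "b1": np.zeros(6),
        "k2": rng.normal(0, 0.1, (12, 6, 5, 5)), "b2": np.zeros(12),
        "w_fc": rng.normal(0, 0.1, (20, 300)), "b_fc": np.zeros(20),
    }

if __name__ == "__main__":
    person, scores = forward(np.zeros((32, 32)), random_params())
    print(person, scores.shape)
```

Under these assumed sizes the feature maps shrink from 32x32 to 28x28, 14x14, 10x10 and 5x5, so the fully connected layer sees 12 x 5 x 5 = 300 inputs and produces one score per enrolled person.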

The 6-layer network structure is built on the FPGA platform, as shown in Fig. 2, and the trained network parameters are imported.

The camera in this embodiment uses the MT9V034 grayscale image sensor, a global-shutter CMOS sensor that runs at 60 FPS at full resolution (752H x 480V). During initialization the system-on-chip configures its output to 540x480; the device then captures automatically and feeds continuous data (including field/line synchronization) to the image acquisition module of the system;

This embodiment builds a 6-layer convolutional neural network on the FPGA platform. Its structure is the same as that of the network trained on the PC: one input layer, two convolutional layers, two pooling layers and one fully connected layer, with the sigmoid activation function implemented as a 6-segment piecewise fit. At a clock frequency of 50 MHz, the system completes face recognition within 2.4 ms, for a set of 20 people. After the algorithm finishes, the recognition result is passed to the FIFO control module, which controls the personnel information storage module to output the personnel information;
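The 6-segment piecewise fit of the sigmoid mentioned above can be illustrated with a piecewise-linear approximation. The sketch below uses two constant end segments and four linear segments; the breakpoints at -5, -2.5, 0, 2.5 and 5 and the names `KNOTS` and `sigmoid_pwl` are assumptions, since the text gives only the number of segments and not their positions or coefficients.

```python
import numpy as np

# Assumed breakpoints: 4 linear segments between them, plus 2 constant
# segments outside [-5, 5], for 6 segments in total.
KNOTS = np.array([-5.0, -2.5, 0.0, 2.5, 5.0])
KNOT_VALUES = 1.0 / (1.0 + np.exp(-KNOTS))

def sigmoid_pwl(x):
    """Piecewise-linear sigmoid; np.interp clamps to the end values outside the knots."""
    return np.interp(x, KNOTS, KNOT_VALUES)

def max_abs_error(lo=-8.0, hi=8.0, n=3201):
    """Largest deviation from the exact sigmoid on a sample grid."""
    xs = np.linspace(lo, hi, n)
    exact = 1.0 / (1.0 + np.exp(-xs))
    return float(np.max(np.abs(sigmoid_pwl(xs) - exact)))

if __name__ == "__main__":
    print(sigmoid_pwl(0.0), max_abs_error())
```

In hardware, each segment reduces to one multiply and one add selected by comparators, which avoids computing an exponential; a fit with optimized breakpoints would have lower error than the evenly spaced ones assumed here.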

The FIFO control module transmits the buffered real-time images captured by the camera, together with the personnel information, to the image display module. The image display module generates the corresponding line and field synchronization signals and outputs them to the peripheral VGA digital-to-analog conversion module. The VGA digital-to-analog conversion module converts the digital signal to an analog signal and drives the VGA display, completing the real-time display of the captured images and of the recognition results.
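The generation of line and field synchronization signals by the image display module can be modelled with two counters. The sketch below assumes the widely used 640x480 at 60 Hz VGA timing (800 pixel clocks per line, 525 lines per frame, 25.175 MHz pixel clock); the text does not state which display mode the embodiment uses, and the function names are hypothetical.

```python
# Assumed 640x480 @ 60 Hz timing, in pixel clocks (horizontal) and lines (vertical).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK   # 800
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK   # 525

def vga_signals(h_count, v_count):
    """Return (hsync_active, vsync_active, visible) for the given counter values.

    'Active' means the counter is inside the sync pulse; the electrical
    polarity of the pulse is a separate choice not modelled here.
    """
    hsync_active = H_VISIBLE + H_FRONT <= h_count < H_VISIBLE + H_FRONT + H_SYNC
    vsync_active = V_VISIBLE + V_FRONT <= v_count < V_VISIBLE + V_FRONT + V_SYNC
    visible = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync_active, vsync_active, visible

def frame_rate(pixel_clock_hz=25_175_000):
    """Frames per second implied by the pixel clock and the total frame size."""
    return pixel_clock_hz / (H_TOTAL * V_TOTAL)

if __name__ == "__main__":
    print(vga_signals(0, 0), round(frame_rate(), 2))
```

Pixel data from the FIFO is only consumed while `visible` is true, which is where alignment between the locally generated timing and the buffered video stream matters.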

Operation flow of the present invention: the system is powered on and the bit file is loaded to complete system configuration. When a target enters the image acquisition area, the recognition result is displayed in real time on the local display.

Provided image acquisition is fast enough, the system of this embodiment, built as a 20-person face recognition network with a 50 MHz system clock, completes face recognition within 2.4 ms, giving a maximum recognition speed of 400 FPS. Because of the frame rate limit of the MT9V034 grayscale image sensor, this embodiment achieves real-time face recognition at 60 FPS with no delay, which improves real-time performance relative to existing designs (below 25 FPS). The recognition accuracy is as high as 99.25%, comparable to that of the human eye and far above current designs based on principal component analysis and the LBP algorithm. The embodiment was tested under different lighting environments and completed face recognition accurately in all of them, showing strong robustness to lighting.
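The relationship between the quoted figures can be checked with simple arithmetic. The sketch below uses only the numbers stated above (50 MHz clock, 2.4 ms per recognition, 60 FPS sensor); the function name `recognition_budget` and its structure are illustrative assumptions.

```python
def recognition_budget(clock_hz=50_000_000, latency_us=2400, sensor_fps=60):
    """Derive the cycle budget and frame rates from the stated figures."""
    cycles_per_frame = clock_hz * latency_us // 1_000_000   # clock cycles in 2.4 ms
    peak_fps = 1_000_000 / latency_us                        # recognitions per second
    delivered_fps = min(peak_fps, sensor_fps)                # limited by the sensor
    return {
        "cycles_per_frame": cycles_per_frame,
        "peak_fps": peak_fps,
        "delivered_fps": delivered_fps,
    }

if __name__ == "__main__":
    print(recognition_budget())
```

A 2.4 ms latency corresponds to 120,000 clock cycles and roughly 416 recognitions per second, which is consistent with the stated 400 FPS as a rounded-down figure; the 60 FPS sensor is the limiting factor in the embodiment.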

The above is a preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in this embodiment and the accompanying drawings. All equivalents or modifications made without departing from the spirit disclosed by the present invention fall within the scope of protection of the present invention.

Claims

1. A face recognition system based on deep learning, characterized in that it comprises a PC-side network training module, a system-on-chip and peripheral modules;

wherein the PC-side network training module includes a data set processing unit, a CNN training unit, an initialization unit, a gradient return unit, a backpropagation unit and a forward propagation unit; according to actual needs, the network structure is first built in the PC-side network training module, and a sufficiently large face training set is collected to train a usable deep learning face recognition network, wherein the activation function used by the face recognition network is the sigmoid function;

the system-on-chip includes an image acquisition module, a FIFO control module, a deep learning face recognition module, a personnel information storage module and an image display module;

the peripheral modules include a camera sub-module, a VGA digital-to-analog conversion module and a local display;

wherein the camera sub-module mainly includes multiple cameras and camera connection cables;

the modules of the face recognition system based on deep learning are connected as follows:

the multiple cameras in the camera sub-module connect their data channels to the image acquisition module through the camera connection cables; the image acquisition module is connected to the FIFO control module; the FIFO control module is connected to both the deep learning face recognition module and the image display module; the deep learning face recognition module is connected to the personnel information storage module, which is integrated in the FIFO control module and controlled by it, and which outputs its result to the image display module; the image display module then transmits the digital signal to the VGA digital-to-analog conversion module, whose output after digital-to-analog conversion goes to the local display;

the functions of the modules are as follows:

the multiple cameras capture images and transmit them to the image acquisition module;

the image acquisition module receives the input video stream data and transmits it to the FIFO control module;

the FIFO control module buffers the incoming video signal to ensure the integrity of the output image data, and downsamples the incoming images in real time to a pixel size suitable for network recognition, storing the result in the FIFO;

the deep learning face recognition module accelerates the face recognition function; its input is the downsampled face image, and after completing face recognition the module outputs the recognition result;

the personnel information storage module stores personnel information; it receives the result from the deep learning face recognition module and outputs the specific information of the recognized person to the image display module; this module is integrated in the FIFO control module and is controlled by it;

the image display module receives the buffered video stream data, generates standard video timing, matches it with the corresponding pixel data, and outputs the result to the off-chip VGA digital-to-analog conversion module; the module also provides automatic data alignment, which overcomes the display misalignment caused by input video stream data lagging behind the locally generated timing;

the functions of the peripheral modules are as follows:

the camera captures the external scene and produces a stable digital video stream;

the VGA digital-to-analog conversion module receives the digital signal and the line and field synchronization signals from the image display module, converts the digital signal to analog and transmits it to the local display; the local display shows the captured images and the recognition results in real time;

The working process of the face recognition system based on deep learning comprises four parts: building the face recognition network, training the face recognition network, building the FPGA platform, and updating the network parameters. The specific steps are as follows:

Step 1. Build a convolutional neural network for face recognition in the PC-side network training module;

Step 2. Train the face recognition network with a pre-collected face library to obtain the trained network parameters;

Step 3. Build the face recognition network on the FPGA platform;

Step 4. Import the network parameters obtained from the training in Step 2 into the face recognition network built in Step 3.