CN106204779A - Classroom attendance checking method based on multi-face data acquisition strategy and deep learning - Google Patents

Classroom attendance checking method based on multi-face data acquisition strategy and deep learning

Info

Publication number
CN106204779A
Authority
CN
China
Prior art keywords
face
convolution
model
formula
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610504632.0A
Other languages
Chinese (zh)
Other versions
CN106204779B (en)
Inventor
裴炤
张艳宁
彭亚丽
马苗
尚海星
苏艺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Normal University filed Critical Shaanxi Normal University
Priority to CN201610504632.0A priority Critical patent/CN106204779B/en
Publication of CN106204779A publication Critical patent/CN106204779A/en
Application granted granted Critical
Publication of CN106204779B publication Critical patent/CN106204779B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07C TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C1/00 Registering, indicating or recording the time of events or elapsed time, e.g. time-recorders for work people
    • G07C1/10 Registering, indicating or recording the time of events or elapsed time, e.g. time-recorders for work people together with the recording, indicating or registering of other data, e.g. of signs of identity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a classroom attendance checking method based on a multi-face data acquisition strategy and deep learning, which solves the technical problem of the poor recognition rate of existing attendance checking methods based on face recognition. The technical scheme uses the AdaBoost algorithm and a skin color model to perform multi-target detection and extraction. Only one video of all the faces participating in attendance needs to be shot; the faces in the video sequence are detected and extracted to complete the establishment of the face database. The face recognition method based on deep learning builds on the deep convolutional neural network LeNet-5: a simplified LeNet-5 model learns the face features under the different scenes in the face database, and new feature representations are obtained through multilayer nonlinear transformation. These new features eliminate intra-class variations such as illumination, noise, posture and expression as far as possible, while retaining the inter-class variations produced by identity differences, thereby improving the recognition rate of the face recognition method in actual complex scenes.

Description

Classroom attendance checking method based on multi-face data acquisition strategy and deep learning
Technical Field
The invention relates to an attendance checking method based on face recognition, and in particular to a classroom attendance checking method based on a multi-face data acquisition strategy and deep learning.
Background
The document "A protocol of Automated Attendance System Using imaging processing, International Journal of Advanced Research in Computer and communication Engineering, Vol.5, Issue 4, April2016, p 501-505" discloses an Attendance method based on face recognition. The method adopts a traditional principal component analysis method to identify the detected human face. After the attendance checking person enters the attendance checking system, the system judges whether the face data of the attendance checking person exists in the database or not, if yes, the attendance checking person is directly identified, and the detection result is added into the database. If the face data does not exist, the face data is required to be acquired firstly. The method needs to acquire face data of the attendance one by one before recognition. However, in practical situations, the number of attendance checking personnel is large, and if the face data is collected one by one, a large amount of time is consumed, and the data collection efficiency is low. Moreover, the method requires the attendance to autonomously complete the acquisition of the face data, and the quality of the acquired face data is difficult to ensure. In addition, the method has the advantages of simple background, stable illumination and single facial expression during face recognition, however, in actual attendance, many attendance personnel have, the changes of background, illumination, posture, expression and the like are very complex, and the traditional face recognition method based on principal component analysis has poor recognition rate under the actual complex condition.
Disclosure of Invention
In order to overcome the poor recognition rate of the existing attendance checking method based on face recognition, the invention provides a classroom attendance checking method based on a multi-face data acquisition strategy and deep learning. The method uses the AdaBoost algorithm and a skin color model to perform multi-target detection and extraction. Only one video needs to be shot of all the faces participating in attendance, and the faces in the video sequence are detected and extracted to complete the establishment of the face database. This solves the problem that face data acquisition in actual attendance is time-consuming, labor-intensive and hard to perform in a unified way, making it easier to acquire massive face data. In addition, the face recognition method based on deep learning builds on the deep convolutional neural network LeNet-5: a simplified LeNet-5 model is applied to learn the face features of different scenes in the face database, and new feature representations are obtained through multilayer nonlinear transformation. The new features remove intra-class variations such as illumination, noise, posture and expression as far as possible, and retain the inter-class variations produced by different identities, thereby improving the recognition rate of the method in actual complex scenes.
The technical scheme adopted by the invention for solving the technical problems is as follows: a classroom attendance checking method based on multi-face data acquisition strategies and deep learning is characterized by comprising the following steps:
(a) Acquiring face data.
A 30-second video sequence is shot of the faces of the persons to be collected. During video shooting, the collector formulates a series of rules to simulate the changes the face may undergo in actual attendance, including expression changes such as smiling and frowning, and action changes such as opening the mouth, raising the head, lowering the head, and changing the face orientation. The persons to be collected perform these expression and action changes during shooting as required by the collector.
(b) Performing multi-face detection using the AdaBoost algorithm combined with a skin color model.
The AdaBoost algorithm is combined with a skin color model: the face position is located by the AdaBoost algorithm, and the skin color model is then used to verify the candidate face. The method comprises the following steps:
firstly, a classifier for face detection is generated by using an Adaboost algorithm, and preliminary face detection is carried out.
Secondly, the preliminarily determined face regions are checked with the skin color model: the pixels in the image are compared against a standard skin color to distinguish the skin regions from the non-skin regions. Three color spaces are used when setting the standard skin color range: the RGB color space, the HSV color space, and the YCbCr color space.
Two RGB standard skin color models are set. Threshold range of model one: G > 40, B > 20, R > G, R > B, MAX(R,G,B) - MIN(R,G,B) > 15; threshold range of model two: R > 220, |R - G| < 15, R > G, R > B.
The RGB color is converted into HSV color using formulas (1), (2), (3) and (4), and the HSV standard skin color threshold range is set to 0 < H < 50, 0.23 < S < 0.68.
$H = \begin{cases} H_i, & B \le G \\ 360^\circ - H_i, & B > G \end{cases}$  (1)
In the formula, H represents a hue. Wherein,
$H_i = \arccos\left( \dfrac{ \frac{1}{2}\left[ (R-G) + (R-B) \right] }{ \sqrt{ (R-G)^2 + (R-B)(G-B) } } \right)$  (2)
where R is the value of the red channel, G is the value of the green channel, and B is the value of the blue channel.
$S = \dfrac{ \max(R,G,B) - \min(R,G,B) }{ \max(R,G,B) }$  (3)
Wherein S is saturation.
$V = \dfrac{ \max(R,G,B) }{ 255 }$  (4)
Wherein V is lightness.
The RGB color is converted into YCbCr color using formula (5), and the YCbCr standard skin color threshold range is set to Y > 20, 135 < Cr < 180, 85 < Cb < 135.
$\begin{cases} Y = 0.299R + 0.587G + 0.114B \\ C_b = (B - Y) \times 0.564 + 128 \\ C_r = (R - Y) \times 0.713 + 128 \end{cases}$  (5)
Where Y is the luminance component, Cb is the blue chrominance component, and Cr is the red chrominance component.
(c) Constraining the range of the face center position.
Twenty frames of images are extracted from the collected video sequence, and the interval g between frames is calculated using formula (6):
$g = \dfrac{t \times f}{20}$  (6)
Wherein t is the length of the shot video, and f is the number of frames per second of the shot video.
Secondly, after the extracted 20 frames of images are detected through an AdaBoost algorithm and a skin color model, the face coordinates obtained through detection are stored, and the average value of the center coordinates of different faces is calculated by using formulas (7) and (8)
$x_c = \dfrac{x_r - x_l}{2}$  (7)
$y_c = \dfrac{y_r - y_l}{2}$  (8)
where $(x_r, y_r)$ are the coordinates of the lower right corner of the detected face, and $(x_l, y_l)$ are the coordinates of the upper left corner.
Comparing the calculated average value with the face center coordinates in the actual image to obtain the error range of the face center coordinates, and adding constraint conditions according to the error range
$x_c - m \le x_{c\_real} \le x_c + m$  (9)
$y_c - n \le y_{c\_real} \le y_c + n$  (10)
where $(x_{c\_real}, y_{c\_real})$ are the face center coordinates obtained by actual detection.
(d) Extracting and processing the detected faces to complete the establishment of the face database.
Each detected face is extracted and converted into a 28 × 28 pixel grayscale face image. The processed face images are stored according to the different constraint conditions, completing the establishment of the actual attendance face database.
(e) Training the model.
The 28 × 28 pixel grayscale face images in the face database are input as training data into the deep convolutional neural network model for repeated iterative training, completing the training of the model. The specific training process is as follows:
the training process is divided into two steps: forward propagation and backward propagation.
The purpose of forward propagation is to feed the training data into the network to obtain the excitation response. This stage comprises the convolutional layers and the downsampling layers.
① First, the convolutional layers are processed: the convolution features extracted by convolutional layer l are obtained by applying formula (11) at layer l.
$x_j^l = f\left( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l \right)$  (11)
where $x_j^l$ is the convolution feature of convolutional layer $l$, $M_j$ is the selected set of input feature maps, $k_{ij}^l$ is a convolution kernel on convolutional layer $l$, and $b_j^l$ is the bias on convolutional layer $l$; the convolution feature of layer $l$ is obtained through the activation function $f$.
After the convolution features of the convolutional layer are obtained, formula (12) is applied to downsample them, i.e. to aggregate statistics of the features at different positions.
$x_j^l = f\left( \beta_j^l \, \mathrm{down}(x_j^{l-1}) + b_j^l \right)$  (12)
where down(·) is the downsampling function, β is the multiplicative bias, and b is the additive bias.
② Backward propagation: the weights and biases are adjusted by minimizing the residual. This stage likewise covers the convolutional layers and the downsampling layers.
For a convolutional layer whose next layer is a downsampling layer, the residual is calculated using formula (13):
$\delta_j^l = \beta_j^{l+1} \left( f'(u_j^l) \circ \mathrm{up}(\delta_j^{l+1}) \right)$  (13)
where up(·) is an upsampling function and $\circ$ denotes the multiplication of corresponding elements in the matrices.
where
$u^l = W^l x^{l-1} + b^l$  (14)
$x^l = f(u^l)$  (15)
wherein f is an activation function.
From the residual $\delta_j^l$ obtained above, the gradient of the bias b is calculated using formula (16).
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (16)
In the formula, u and v represent coordinate values in the feature map.
Define $(p_i^{l-1})_{uv}$ as the $n \times n$ pixel block that is multiplied element by element during convolution; the convolution kernel gradient is then calculated by applying formula (17).
$\dfrac{\partial E}{\partial k_{ij}^l} = \sum_{u,v} \left( \delta_j^l \right)_{uv} \left( p_i^{l-1} \right)_{uv}$  (17)
The value at position $(u, v)$ of the output convolution feature map is the result of multiplying, element by element, the $n \times n$ pixel block at position $(u, v)$ in the previous layer with the convolution kernel $k_{ij}$.
Next, the downsampling layers are processed. When the layer following a downsampling layer is a convolutional layer, the downsampled feature map residual is calculated by applying formula (18):
$\delta_j^l = f'(u_j^l) \circ \mathrm{conv2}\left( \delta_j^{l+1}, \mathrm{rot180}(k_j^{l+1}), \mathrm{'full'} \right)$  (18)
where $\mathrm{rot180}(k_j^{l+1})$ denotes the convolution kernel matrix rotated by 180°, i.e. with its elements swapped across the diagonal; conv2 is a full convolution function, and 'full' indicates that vacant positions are padded with 0 in the full convolution.
After the residual is obtained, the gradient of the bias b is calculated by applying formula (19).
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (19)
In the formula, u and v represent coordinate values in the feature map.
Define $d_j^l = \mathrm{down}(x_j^{l-1})$; using the residual found, the gradient of the multiplicative bias $\beta$ is calculated by applying formula (20):
$\dfrac{\partial E}{\partial \beta_j} = \sum_{u,v} \left( \delta_j^l \circ d_j^l \right)_{uv}$  (20)
③ The output layer classifies the features. It is in essence a classifier; a softmax classifier is adopted to handle the multi-class problem.
(f) The trained model is tested using the test data.
Acquisition of test data.
In application, images are shot of all the collected persons. The faces appearing in the images are detected and processed into 28 × 28 pixel grayscale face images, and the processed images are stored as test data.
Secondly, the test data are input into the trained model; the model recognizes the input test data and outputs the corresponding face labels from the training set, completing face recognition.
(g) Multiple experiments are performed by adjusting the number of layers of the convolutional neural network, the number of convolution kernels in each layer of the network, and the learning rate, i.e. repeating step (e) and step (f); the experimental results are compared, the model parameters with the highest recognition rate are selected, and the parameters and the trained model are saved.
The invention has the beneficial effects that: the method uses the AdaBoost algorithm and a skin color model to perform multi-target detection and extraction. Only one video needs to be shot of all the faces participating in attendance, and the faces in the video sequence are detected and extracted to complete the establishment of the face database. This solves the problem that face data acquisition in actual attendance is time-consuming, labor-intensive and hard to perform in a unified way, making it easier to acquire massive face data. In addition, the face recognition method based on deep learning builds on the deep convolutional neural network LeNet-5: a simplified LeNet-5 model is applied to learn the face features of different scenes in the face database, and new feature representations are obtained through multilayer nonlinear transformation. The new features remove intra-class variations such as illumination, noise, posture and expression as far as possible, and retain the inter-class variations produced by different identities, thereby improving the recognition rate of the method in actual complex scenes.
The present invention will be described in detail with reference to the following embodiments.
Detailed Description
The classroom attendance checking method based on the multi-face data acquisition strategy and deep learning specifically comprises the following steps:
1. Acquiring face data.
The method completes face data acquisition by shooting video of the persons to be collected, obtaining a video sequence of about 30 seconds.
During video shooting, the collector formulates a series of rules to simulate the changes the face may undergo in actual attendance, including expression changes such as smiling and frowning, and action changes such as opening the mouth, raising the head, lowering the head, and changing the face orientation. The persons to be collected change expressions and actions during shooting according to the collector's instructions.
2. Multi-face detection using the AdaBoost algorithm combined with a skin color model.
The invention combines the AdaBoost algorithm with a skin color model: the face position is located by the AdaBoost algorithm, and the skin color model is then used to verify the candidate face, which greatly reduces the false detection rate during face detection. The main implementation method comprises the following steps:
firstly, a classifier for face detection is generated by using an Adaboost algorithm, and preliminary face detection is carried out.
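As an illustration of this step, OpenCV's pretrained Haar cascade, which is itself an AdaBoost-trained classifier, can stand in for the classifier the patent trains; the detector parameters below are tuning assumptions, not values from the patent.

```python
import cv2

def detect_faces(image_bgr):
    """Preliminary multi-face detection with an AdaBoost-trained cascade;
    returns candidate face rectangles as (x, y, w, h)."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # scaleFactor and minNeighbors are assumed tuning values.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```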
Secondly, the preliminarily determined face regions are checked with the skin color model: the pixels in the image are compared against a standard skin color to distinguish the skin regions from the non-skin regions. Three color spaces are used when setting the standard skin color range: the RGB color space, the HSV color space, and the YCbCr color space.
Two RGB standard skin color models are set. Threshold range of model one: G > 40, B > 20, R > G, R > B, MAX(R,G,B) - MIN(R,G,B) > 15; threshold range of model two: R > 220, |R - G| < 15, R > G, R > B.
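A minimal per-pixel sketch of these two RGB models, assuming 8-bit channel values:

```python
def is_skin_rgb(r, g, b):
    """A pixel is skin-colored if it satisfies either RGB model above."""
    model_one = (g > 40 and b > 20 and r > g and r > b
                 and max(r, g, b) - min(r, g, b) > 15)
    model_two = r > 220 and abs(r - g) < 15 and r > g and r > b
    return model_one or model_two
```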
The RGB color is converted into HSV color using formulas (1), (2), (3) and (4), and the HSV standard skin color threshold range is set to 0 < H < 50, 0.23 < S < 0.68.
$H = \begin{cases} H_i, & B \le G \\ 360^\circ - H_i, & B > G \end{cases}$  (1)
In the formula, H represents a hue. Wherein,
$H_i = \arccos\left( \dfrac{ \frac{1}{2}\left[ (R-G) + (R-B) \right] }{ \sqrt{ (R-G)^2 + (R-B)(G-B) } } \right)$  (2)
where R is the value of the red channel, G is the value of the green channel, and B is the value of the blue channel.
$S = \dfrac{ \max(R,G,B) - \min(R,G,B) }{ \max(R,G,B) }$  (3)
Wherein S is saturation.
$V = \dfrac{ \max(R,G,B) }{ 255 }$  (4)
Wherein V is lightness.
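A sketch of the HSV conversion and check, following formulas (1)-(4) as reconstructed above; the arccos form of formula (2) is the standard hue formula and is assumed here:

```python
import math

def rgb_to_hsv_patent(r, g, b):
    """Convert an RGB pixel (8-bit channels) to (H, S, V) per formulas (1)-(4);
    H is in degrees."""
    mx, mn = max(r, g, b), min(r, g, b)
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    # Guard the arccos argument against division by zero and float drift.
    h_i = math.degrees(math.acos(max(-1.0, min(1.0, num / den)))) if den else 0.0
    h = h_i if b <= g else 360.0 - h_i        # formula (1)
    s = (mx - mn) / mx if mx else 0.0         # formula (3)
    v = mx / 255.0                            # formula (4)
    return h, s, v

def is_skin_hsv(r, g, b):
    """HSV skin thresholds: 0 < H < 50 and 0.23 < S < 0.68."""
    h, s, _ = rgb_to_hsv_patent(r, g, b)
    return 0 < h < 50 and 0.23 < s < 0.68
```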
The RGB color is converted into YCbCr color using formula (5), and the YCbCr standard skin color threshold range is set to Y > 20, 135 < Cr < 180, 85 < Cb < 135.
$\begin{cases} Y = 0.299R + 0.587G + 0.114B \\ C_b = (B - Y) \times 0.564 + 128 \\ C_r = (R - Y) \times 0.713 + 128 \end{cases}$  (5)
Where Y is the luminance component, Cb is the blue chrominance component, and Cr is the red chrominance component.
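A sketch of the YCbCr check from formula (5), together with one way of combining the three color-space tests using the checks sketched above; the patent does not state how the three models are combined, so requiring all three to pass is an assumption:

```python
def is_skin_ycbcr(r, g, b):
    """Convert a pixel with formula (5) and test the YCbCr skin thresholds."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) * 0.564 + 128
    cr = (r - y) * 0.713 + 128
    return y > 20 and 135 < cr < 180 and 85 < cb < 135

def is_skin(r, g, b):
    # Assumed combination rule: a pixel counts as skin only if it passes
    # the RGB, HSV and YCbCr checks defined in the sketches above.
    return is_skin_rgb(r, g, b) and is_skin_hsv(r, g, b) and is_skin_ycbcr(r, g, b)
```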
3. Constraining the range of the face center position.
When the face data is actually collected, the expression and the action of the face are specified by the collector, so that the amplitude of the face change in the video is small, and the appearance range of the face is easy to determine.
Twenty frames of images are extracted from the collected video sequence, and the interval g between frames is calculated using formula (6):
$g = \dfrac{t \times f}{20}$  (6)
Wherein t is the length of the shot video, and f is the number of frames per second of the shot video.
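A sketch of the frame sampling with OpenCV; it assumes the whole video is sampled so that g from formula (6) spaces the 20 frames evenly:

```python
import cv2

def extract_frames(video_path, n_frames=20):
    """Sample n_frames frames at interval g = t * f / 20 (formula (6))."""
    cap = cv2.VideoCapture(video_path)
    f = cap.get(cv2.CAP_PROP_FPS)                    # frames per second
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    t = total / f                                    # video length in seconds
    g = int(t * f / n_frames)                        # formula (6)
    frames = []
    for i in range(n_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * g)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames
```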
Secondly, after the extracted 20 frames of images are detected through an AdaBoost algorithm and a skin color model, the coordinates of the detected human face are stored, and the average value of the central coordinates of different human faces is calculated by using formulas (7) and (8)
$x_c = \dfrac{x_r - x_l}{2}$  (7)
$y_c = \dfrac{y_r - y_l}{2}$  (8)
where $(x_r, y_r)$ are the coordinates of the lower right corner of the detected face, and $(x_l, y_l)$ are the coordinates of the upper left corner.
Comparing the calculated average value with the face center coordinates in the actual image to obtain the error range of the face center coordinates, and adding constraint conditions according to the error range
$x_c - m \le x_{c\_real} \le x_c + m$  (9)
$y_c - n \le y_{c\_real} \le y_c + n$  (10)
where $(x_{c\_real}, y_{c\_real})$ are the face center coordinates obtained by actual detection.
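A sketch of the averaging and constraint check of formulas (7)-(10), keeping the patent's written form of (7)-(8); the margins m and n are the empirically measured error ranges:

```python
def center_of(box):
    """Formulas (7)-(8): box is (xl, yl, xr, yr), the upper-left and
    lower-right corners of a detected face."""
    xl, yl, xr, yr = box
    return (xr - xl) / 2, (yr - yl) / 2

def make_center_filter(boxes, m, n):
    """Average the centers of the faces detected in the 20 frames and return
    a predicate enforcing the error-range constraints (9)-(10)."""
    centers = [center_of(b) for b in boxes]
    xc = sum(c[0] for c in centers) / len(centers)
    yc = sum(c[1] for c in centers) / len(centers)
    def accept(box):
        xc_real, yc_real = center_of(box)
        return xc - m <= xc_real <= xc + m and yc - n <= yc_real <= yc + n
    return accept
```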
4. Extracting and processing the detected faces to complete the establishment of the face database.
Each detected face is extracted and converted into a 28 × 28 pixel grayscale image. The processed face images are stored according to the different constraint conditions, completing the establishment of the actual attendance face database.
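A sketch of this extraction step with OpenCV, assuming detections in (x, y, width, height) form:

```python
import cv2

def face_to_sample(image_bgr, box):
    """Crop a detected face and convert it to the 28 x 28 grayscale image
    stored in the attendance face database."""
    x, y, w, h = box
    face = image_bgr[y:y + h, x:x + w]
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    return cv2.resize(gray, (28, 28))
```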
5. Training the model.
The 28 × 28 pixel face images in the face database are input as training data into the deep convolutional neural network model for repeated iterative training, completing the training of the model. The specific training process is as follows:
the training process is mainly divided into two steps: forward propagation and backward propagation.
The purpose of forward propagation is to feed the training data into the network to obtain the excitation response. This stage comprises the convolutional layers and the downsampling layers.
① First, the convolutional layers are processed: the convolution features extracted by convolutional layer l are obtained by applying formula (11) at layer l.
$x_j^l = f\left( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l \right)$  (11)
where $x_j^l$ is the convolution feature of convolutional layer $l$, $M_j$ is the selected set of input feature maps, $k_{ij}^l$ is a convolution kernel on convolutional layer $l$, and $b_j^l$ is the bias on convolutional layer $l$; the convolution feature of layer $l$ is obtained through the activation function $f$.
After the convolution features of the convolutional layer are obtained, formula (12) is applied to downsample them, i.e. to aggregate statistics of the features at different positions. The resulting aggregated statistical features not only have a much lower dimensionality than using all of the extracted features, but also improve the results and are less prone to overfitting.
$x_j^l = f\left( \beta_j^l \, \mathrm{down}(x_j^{l-1}) + b_j^l \right)$  (12)
where down(·) is the downsampling function, β is the multiplicative bias, and b is the additive bias.
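A NumPy sketch of one forward pass through a convolutional layer (formula (11)) and a downsampling layer (formula (12)); the sigmoid activation, the 2 x 2 mean pooling, and taking M_j to be all input maps are assumptions, since the patent does not fix them:

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def conv_layer(prev_maps, kernels, biases, f=sigmoid):
    """Formula (11): output map j sums the convolutions of the input maps
    with kernels k_ij, adds bias b_j, then applies f.
    kernels[j][i] is the 2-D kernel linking input map i to output map j."""
    out = []
    for j, b_j in enumerate(biases):
        u = sum(convolve2d(x_i, kernels[j][i], mode="valid")
                for i, x_i in enumerate(prev_maps))
        out.append(f(u + b_j))
    return out

def down(x):
    """The down(.) of formula (12): mean over non-overlapping 2 x 2 blocks."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def downsample_layer(prev_maps, betas, biases, f=sigmoid):
    """Formula (12): x_j = f(beta_j * down(x_j_prev) + b_j)."""
    return [f(beta * down(x) + b) for x, beta, b in zip(prev_maps, betas, biases)]
```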
② Backward propagation: the weights and biases are adjusted by minimizing the residual. This stage likewise covers the convolutional layers and the downsampling layers.
For a convolutional layer whose next layer is a downsampling layer, the residual is calculated using formula (13):
$\delta_j^l = \beta_j^{l+1} \left( f'(u_j^l) \circ \mathrm{up}(\delta_j^{l+1}) \right)$  (13)
where up(·) is an upsampling function and $\circ$ denotes the multiplication of corresponding elements in the matrices.
where
$u^l = W^l x^{l-1} + b^l$  (14)
$x^l = f(u^l)$  (15)
wherein f is an activation function.
From the residual $\delta_j^l$ obtained above, the gradient of the bias b can be calculated by applying formula (16).
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (16)
In the formula, u and v represent coordinate values in the feature map.
Define $(p_i^{l-1})_{uv}$ as the $n \times n$ pixel block that is multiplied element by element during convolution; the convolution kernel gradient is then calculated by applying formula (17).
$\dfrac{\partial E}{\partial k_{ij}^l} = \sum_{u,v} \left( \delta_j^l \right)_{uv} \left( p_i^{l-1} \right)_{uv}$  (17)
The value at position $(u, v)$ of the output convolution feature map is the result of multiplying, element by element, the $n \times n$ pixel block at position $(u, v)$ in the previous layer with the convolution kernel $k_{ij}$.
Next, the downsampling layers are processed. When the layer following a downsampling layer is a convolutional layer, the downsampled feature map residual is calculated by applying formula (18):
$\delta_j^l = f'(u_j^l) \circ \mathrm{conv2}\left( \delta_j^{l+1}, \mathrm{rot180}(k_j^{l+1}), \mathrm{'full'} \right)$  (18)
where $\mathrm{rot180}(k_j^{l+1})$ denotes the convolution kernel matrix rotated by 180°, i.e. with its elements swapped across the diagonal; conv2 is a full convolution function, and 'full' indicates that vacant positions are padded with 0 in the full convolution.
After the residual is obtained, the gradient of the bias b is again calculated by applying formula (19).
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (19)
In the formula, u and v represent coordinate values in the feature map.
Define $d_j^l = \mathrm{down}(x_j^{l-1})$; using the residual found, the gradient of the multiplicative bias $\beta$ is calculated by applying formula (20):
$\dfrac{\partial E}{\partial \beta_j} = \sum_{u,v} \left( \delta_j^l \circ d_j^l \right)_{uv}$  (20)
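A NumPy sketch of the backward-propagation quantities in formulas (13) and (16)-(19), under the same sigmoid and 2 x 2 pooling assumptions as the forward-pass sketch:

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid_prime(u):
    s = 1.0 / (1.0 + np.exp(-u))
    return s * (1.0 - s)

def up(d):
    """The up(.) of formula (13): replicate each element over a 2 x 2 block."""
    return np.kron(d, np.ones((2, 2)))

def conv_layer_delta(u_j, delta_next, beta_next):
    """Formula (13): residual of a conv layer whose next layer downsamples."""
    return beta_next * (sigmoid_prime(u_j) * up(delta_next))

def down_layer_delta(u_j, delta_next, k_next):
    """Formula (18): residual of a downsampling layer whose next layer
    convolves; rot180 becomes np.rot90(k, 2) and 'full' pads with zeros."""
    return sigmoid_prime(u_j) * convolve2d(delta_next, np.rot90(k_next, 2), mode="full")

def bias_grad(delta_j):
    """Formulas (16)/(19): sum the residual over all (u, v) positions."""
    return delta_j.sum()

def kernel_grad(x_prev_i, delta_j):
    """Formula (17): accumulate residual-weighted input patches, written as
    a valid convolution with 180-degree rotations."""
    return np.rot90(convolve2d(x_prev_i, np.rot90(delta_j, 2), mode="valid"), 2)
```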
③ The output layer classifies the features. It is in essence a classifier; a softmax classifier is adopted to handle the multi-class problem.
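A minimal sketch of the softmax decision at the output layer:

```python
import numpy as np

def softmax(z):
    """Turn output-layer activations into class probabilities."""
    e = np.exp(z - z.max())      # shift for numerical stability
    return e / e.sum()

def predict_label(z):
    """The predicted face label is the class with the largest probability."""
    return int(np.argmax(softmax(z)))
```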
6. The trained model is tested using the test data.
Acquisition of test data
In practice, images are taken of all students during class. The faces appearing in the pictures are detected and processed into 28 × 28 pixel grayscale face images, and the processed images are stored as test data.
Secondly, the test data are input into the trained model; the model recognizes the input test data and outputs the corresponding face labels from the training set, completing face recognition.
7. Multiple experiments are performed by adjusting parameters such as the number of layers of the convolutional neural network, the number of convolution kernels in each layer of the network, and the learning rate, i.e. repeating steps 5 and 6; the experimental results are compared, the model parameters with the highest recognition rate are selected, and the parameters and the trained model are saved.
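A sketch of this parameter search as a simple grid loop; train_model and evaluate are caller-supplied placeholders for steps 5 and 6, and the candidate grids are illustrative assumptions rather than values from the patent:

```python
import itertools

def select_best_model(train_model, evaluate, train_data, test_data):
    """Repeat steps 5 and 6 over a parameter grid and keep the best model."""
    best_params, best_model, best_rate = None, None, -1.0
    for n_layers, n_kernels, lr in itertools.product(
            (2, 3), (6, 12, 16), (0.1, 0.5, 1.0)):    # assumed candidates
        model = train_model(train_data, n_layers, n_kernels, lr)   # step 5
        rate = evaluate(model, test_data)                          # step 6
        if rate > best_rate:
            best_params, best_model, best_rate = (n_layers, n_kernels, lr), model, rate
    return best_params, best_model, best_rate
```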
In summary, the invention provides a classroom attendance method based on a multi-face data acquisition strategy and deep learning. The multi-face data acquisition strategy solves the problem that massive face data are difficult to acquire one by one in actual attendance, greatly improving face acquisition efficiency. Meanwhile, a large number of face images from complex real environments are used to train a deep learning model, which learns new features from them; these new features remove intra-class variations such as illumination, noise, posture and expression as far as possible while retaining the inter-class variations produced by different identities, thereby solving the problem of the poor face recognition rate of traditional face recognition methods in actual complex scenes.

Claims (1)

1. A classroom attendance checking method based on a multi-face data acquisition strategy and deep learning is characterized by comprising the following steps:
(a) acquiring face data;
shooting a 30-second video sequence of the faces of the persons to be collected; during video shooting, the collector formulates a series of rules to simulate the changes the face may undergo in actual attendance, including expression changes such as smiling and frowning, and action changes such as opening the mouth, raising the head, lowering the head, and changing the face orientation; the persons to be collected perform these expression and action changes during shooting as required by the collector;
(b) performing multi-face detection using the AdaBoost algorithm combined with a skin color model;
combining the AdaBoost algorithm with a skin color model: locating the face position through the AdaBoost algorithm, and then performing a skin color check on the candidate face using the skin color model, comprising the following steps:
firstly, generating a classifier for face detection by using an Adaboost algorithm, and carrying out primary face detection;
checking the preliminarily determined face regions with the skin color model, comparing the pixels in the image against a standard skin color to distinguish the skin regions from the non-skin regions; three color spaces are used when setting the standard skin color range: the RGB color space, the HSV color space, and the YCbCr color space;
setting two RGB standard skin color models; threshold range of model one: G > 40, B > 20, R > G, R > B, MAX(R,G,B) - MIN(R,G,B) > 15; threshold range of model two: R > 220, |R - G| < 15, R > G, R > B;
converting the RGB color into HSV color using formulas (1), (2), (3) and (4), and setting the HSV standard skin color threshold range to 0 < H < 50, 0.23 < S < 0.68;
$H = \begin{cases} H_i, & B \le G \\ 360^\circ - H_i, & B > G \end{cases}$  (1)
wherein H is a hue; wherein,
$H_i = \arccos\left( \dfrac{ \frac{1}{2}\left[ (R-G) + (R-B) \right] }{ \sqrt{ (R-G)^2 + (R-B)(G-B) } } \right)$  (2)
wherein R is the value of the red channel, G is the value of the green channel, and B is the value of the blue channel;
$S = \dfrac{ \max(R,G,B) - \min(R,G,B) }{ \max(R,G,B) }$  (3)
wherein S is saturation;
$V = \dfrac{ \max(R,G,B) }{ 255 }$  (4)
wherein V is lightness;
converting the RGB color into YCbCr color using formula (5), and setting the YCbCr standard skin color threshold range to Y > 20, 135 < Cr < 180, 85 < Cb < 135;
$\begin{cases} Y = 0.299R + 0.587G + 0.114B \\ C_b = (B - Y) \times 0.564 + 128 \\ C_r = (R - Y) \times 0.713 + 128 \end{cases}$  (5)
wherein Y is a luminance component, Cb is a blue chrominance component, and Cr is a red chrominance component;
(c) constraining the range of the face center position;
extracting 20 frames of images from the collected video sequence, and calculating the interval g between frames using formula (6):
$g = \dfrac{t \times f}{20}$  (6)
Wherein t is the length of the shot video, and f is the frame number of the shot video per second;
secondly, after the extracted 20 frames of images are detected through an AdaBoost algorithm and a skin color model, the face coordinates obtained through detection are stored, and the average value of the center coordinates of different faces is calculated by using formulas (7) and (8)
$x_c = \dfrac{x_r - x_l}{2}$  (7)
$y_c = \dfrac{y_r - y_l}{2}$  (8)
in the formula, $(x_r, y_r)$ are the coordinates of the lower right corner of the detected face, and $(x_l, y_l)$ are the coordinates of the upper left corner;
comparing the calculated average value with the face center coordinates in the actual image to obtain the error range of the face center coordinates, and adding constraint conditions according to the error range
$x_c - m \le x_{c\_real} \le x_c + m$  (9)
$y_c - n \le y_{c\_real} \le y_c + n$  (10)
in the formula, $(x_{c\_real}, y_{c\_real})$ are the face center coordinates obtained by actual detection;
(d) extracting and processing the detected face to complete the establishment of a face database;
extracting each detected face and converting it into a 28 × 28 pixel grayscale face image; storing the processed face images according to the different constraint conditions to complete the establishment of the actual attendance face database;
(e) training a model;
inputting the 28 × 28 pixel grayscale face images in the face database as training data into the deep convolutional neural network model for repeated iterative training to finish the training of the model; the specific training process is as follows:
the training process is divided into two steps: forward propagation and backward propagation;
the purpose of forward propagation is to feed the training data into the network to obtain the excitation response; this stage comprises the convolutional layers and the downsampling layers;
① processing the convolutional layers: applying formula (11) at convolutional layer l to obtain the convolution features extracted by layer l;
$x_j^l = f\left( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l \right)$  (11)
in the formula, $x_j^l$ is the convolution feature of convolutional layer $l$, $M_j$ is the selected set of input feature maps, $k_{ij}^l$ is a convolution kernel on convolutional layer $l$, and $b_j^l$ is the bias on convolutional layer $l$; the convolution feature of layer $l$ is obtained through the activation function $f$;
after the convolution characteristics of the convolution layer are obtained, the formula (12) is applied to carry out downsampling processing on the convolution characteristics, namely, aggregation statistics is carried out on the characteristics at different positions;
$x_j^l = f\left( \beta_j^l \, \mathrm{down}(x_j^{l-1}) + b_j^l \right)$  (12)
wherein down(·) is a downsampling function, β is a multiplicative bias, and b is an additive bias;
② backward propagation: adjusting the weights and biases by minimizing the residual; this stage likewise covers the convolutional layers and the downsampling layers;
for a convolutional layer whose next layer is a downsampling layer, calculating the residual using formula (13):
$\delta_j^l = \beta_j^{l+1} \left( f'(u_j^l) \circ \mathrm{up}(\delta_j^{l+1}) \right)$  (13)
wherein up(·) is an upsampling function and $\circ$ represents the multiplication of corresponding elements in a matrix;
wherein
$u^l = W^l x^{l-1} + b^l$  (14)
$x^l = f(u^l)$  (15)
wherein f is an activation function;
from the obtained residual $\delta_j^l$, calculating the gradient of the bias b by applying formula (16);
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (16)
in the formula, u and v represent coordinate values in the feature map;
defining $(p_i^{l-1})_{uv}$ as the $n \times n$ pixel block multiplied element by element during convolution, and calculating the convolution kernel gradient by applying formula (17);
$\dfrac{\partial E}{\partial k_{ij}^l} = \sum_{u,v} \left( \delta_j^l \right)_{uv} \left( p_i^{l-1} \right)_{uv}$  (17)
the value at position $(u, v)$ of the output convolution feature map is the result of multiplying, element by element, the $n \times n$ pixel block at position $(u, v)$ in the previous layer with the convolution kernel $k_{ij}$;
processing the downsampling layers: when the layer following a downsampling layer is a convolutional layer, calculating the downsampled feature map residual by applying formula (18):
$\delta_j^l = f'(u_j^l) \circ \mathrm{conv2}\left( \delta_j^{l+1}, \mathrm{rot180}(k_j^{l+1}), \mathrm{'full'} \right)$  (18)
in the formula, $\mathrm{rot180}(k_j^{l+1})$ represents the convolution kernel matrix rotated by 180°, i.e. with the matrix elements exchanged across the diagonal; conv2 is a full convolution function, and 'full' indicates that vacant positions are padded with 0 in the full convolution;
after the residual is obtained, calculating the gradient of the bias b by applying formula (19);
$\dfrac{\partial E}{\partial b_j} = \sum_{u,v} \left( \delta_j^l \right)_{uv}$  (19)
in the formula, u and v represent coordinate values in the feature map;
defining $d_j^l = \mathrm{down}(x_j^{l-1})$, and calculating the gradient of the multiplicative bias $\beta$ from the residual found by applying formula (20): $\dfrac{\partial E}{\partial \beta_j} = \sum_{u,v} \left( \delta_j^l \circ d_j^l \right)_{uv}$  (20)
③ classifying the features in the output layer; this is in essence a classifier, and a softmax classifier is adopted to handle the multi-class problem;
(f) testing the trained model by using the test data;
firstly, acquiring test data;
in application, images are shot of all the collected persons; the faces appearing in the images are detected and processed into 28 × 28 pixel grayscale face images, and the processed images are stored as test data;
inputting the test data into the trained model, which identifies the input test data and outputs the corresponding face labels from the training set, completing face recognition;
(g) performing multiple experiments by adjusting the number of layers of the convolutional neural network, the number of convolution kernels in each layer of the network, and the learning rate, i.e. repeating step (e) and step (f); comparing the experimental results, selecting the model parameters with the highest recognition rate, and saving the parameters and the trained model.
CN201610504632.0A 2016-06-30 2016-06-30 Classroom attendance checking method based on multi-face data acquisition strategy and deep learning Active CN106204779B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610504632.0A CN106204779B (en) 2016-06-30 2016-06-30 Classroom attendance checking method based on multi-face data acquisition strategy and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610504632.0A CN106204779B (en) 2016-06-30 2016-06-30 Classroom attendance checking method based on multi-face data acquisition strategy and deep learning

Publications (2)

Publication Number Publication Date
CN106204779A true CN106204779A (en) 2016-12-07
CN106204779B CN106204779B (en) 2018-08-31

Family

ID=57462734

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610504632.0A Active CN106204779B (en) 2016-06-30 2016-06-30 Classroom attendance checking method based on multi-face data acquisition strategy and deep learning

Country Status (1)

Country Link
CN (1) CN106204779B (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778589A (en) * 2016-12-09 2017-05-31 厦门大学 A kind of masked method for detecting human face of robust based on modified LeNet
CN106960185A (en) * 2017-03-10 2017-07-18 陕西师范大学 The Pose-varied face recognition method of linear discriminant depth belief network
CN107292278A (en) * 2017-06-30 2017-10-24 哈尔滨理工大学 A kind of face identification device and its recognition methods based on Adaboost algorithm
CN108073917A (en) * 2018-01-24 2018-05-25 燕山大学 A kind of face identification method based on convolutional neural networks
CN108228872A (en) * 2017-07-21 2018-06-29 北京市商汤科技开发有限公司 Facial image De-weight method and device, electronic equipment, storage medium, program
CN108416797A (en) * 2018-02-27 2018-08-17 鲁东大学 A kind of method, equipment and the storage medium of detection Behavioral change
CN108427921A (en) * 2018-02-28 2018-08-21 辽宁科技大学 A kind of face identification method based on convolutional neural networks
CN108830980A (en) * 2018-05-22 2018-11-16 重庆大学 Security protection integral intelligent robot is received in Study of Intelligent Robot Control method, apparatus and attendance
CN108875654A (en) * 2018-06-25 2018-11-23 深圳云天励飞技术有限公司 A kind of face characteristic acquisition method and device
CN109460974A (en) * 2018-10-29 2019-03-12 广州皓云原智信息科技有限公司 A kind of attendance checking system based on gesture recognition
CN109766813A (en) * 2018-12-31 2019-05-17 陕西师范大学 Dictionary learning face identification method based on symmetrical face exptended sample
CN110263618A (en) * 2019-04-30 2019-09-20 阿里巴巴集团控股有限公司 The alternative manner and device of one seed nucleus body model
CN110276263A (en) * 2019-05-24 2019-09-24 长江大学 A kind of face identification system and recognition methods
CN110313894A (en) * 2019-04-15 2019-10-11 四川大学 Arrhythmia cordis sorting algorithm based on convolutional neural networks
CN110728225A (en) * 2019-10-08 2020-01-24 北京联华博创科技有限公司 High-speed face searching method for attendance checking
CN110852704A (en) * 2019-10-22 2020-02-28 佛山科学技术学院 Attendance checking method, system, equipment and medium based on dense micro face recognition
CN111507227A (en) * 2020-04-10 2020-08-07 南京汉韬科技有限公司 Multi-student individual segmentation and state autonomous identification method based on deep learning
CN111814704A (en) * 2020-07-14 2020-10-23 陕西师范大学 Full convolution examination room target detection method based on cascade attention and point supervision mechanism
CN111881876A (en) * 2020-08-06 2020-11-03 桂林电子科技大学 Attendance checking method based on single-order anchor-free detection network
CN113450369A (en) * 2021-04-20 2021-09-28 广州铁路职业技术学院(广州铁路机械学校) Classroom analysis system and method based on face recognition technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102223520A (en) * 2011-04-15 2011-10-19 北京易子微科技有限公司 Intelligent face recognition video monitoring system and implementation method thereof
CN104573679A (en) * 2015-02-08 2015-04-29 天津艾思科尔科技有限公司 Deep learning-based face recognition system in monitoring scene

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102223520A (en) * 2011-04-15 2011-10-19 北京易子微科技有限公司 Intelligent face recognition video monitoring system and implementation method thereof
CN104573679A (en) * 2015-02-08 2015-04-29 天津艾思科尔科技有限公司 Deep learning-based face recognition system in monitoring scene

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
汪济民: "Research on Face Detection and Gender Recognition Based on Convolutional Neural Networks" (基于卷积神经网络的人脸检测和性别识别研究), China Master's Theses Full-text Database, Information Science and Technology Series (《中国优秀硕士学位论文全文数据库 信息科技辑》) *
裴炤: "A Multi-face Detection Attendance System Based on AdaBoost + Skin Color Model" (基于AdaBoost+肤色模型的多人脸检测考勤系统), Electronic Products World (《电子产品世界》) *
赵男男: "Multi-pose Face Detection Based on the AdaBoost Algorithm and a Skin Color Model" (基于AdaBoost算法与肤色模型的多姿态人脸检测), Computer Engineering and Science (《计算机工程与科学》) *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778589A (en) * 2016-12-09 2017-05-31 厦门大学 A kind of masked method for detecting human face of robust based on modified LeNet
CN106960185B (en) * 2017-03-10 2019-10-25 陕西师范大学 The Pose-varied face recognition method of linear discriminant deepness belief network
CN106960185A (en) * 2017-03-10 2017-07-18 陕西师范大学 The Pose-varied face recognition method of linear discriminant depth belief network
CN107292278A (en) * 2017-06-30 2017-10-24 哈尔滨理工大学 A kind of face identification device and its recognition methods based on Adaboost algorithm
US11132581B2 (en) 2017-07-21 2021-09-28 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for face image deduplication and storage medium
CN108228872A (en) * 2017-07-21 2018-06-29 北京市商汤科技开发有限公司 Facial image De-weight method and device, electronic equipment, storage medium, program
WO2019015682A1 (en) * 2017-07-21 2019-01-24 北京市商汤科技开发有限公司 Dynamic facial image warehousing method and apparatus, electronic device, medium, and program
US11409983B2 (en) 2017-07-21 2022-08-09 Beijing Sensetime Technology Development Co., Ltd Methods and apparatuses for dynamically adding facial images into database, electronic devices and media
CN108228871A (en) * 2017-07-21 2018-06-29 北京市商汤科技开发有限公司 Facial image dynamic storage method and device, electronic equipment, medium, program
CN108073917A (en) * 2018-01-24 2018-05-25 燕山大学 A kind of face identification method based on convolutional neural networks
CN108416797A (en) * 2018-02-27 2018-08-17 鲁东大学 A kind of method, equipment and the storage medium of detection Behavioral change
CN108427921A (en) * 2018-02-28 2018-08-21 辽宁科技大学 A kind of face identification method based on convolutional neural networks
CN108830980A (en) * 2018-05-22 2018-11-16 重庆大学 Security protection integral intelligent robot is received in Study of Intelligent Robot Control method, apparatus and attendance
CN108875654A (en) * 2018-06-25 2018-11-23 深圳云天励飞技术有限公司 A kind of face characteristic acquisition method and device
CN109460974B (en) * 2018-10-29 2021-09-07 广州皓云原智信息科技有限公司 Attendance system based on gesture recognition
CN109460974A (en) * 2018-10-29 2019-03-12 广州皓云原智信息科技有限公司 A kind of attendance checking system based on gesture recognition
CN109766813A (en) * 2018-12-31 2019-05-17 陕西师范大学 Dictionary learning face identification method based on symmetrical face exptended sample
CN110313894A (en) * 2019-04-15 2019-10-11 四川大学 Arrhythmia cordis sorting algorithm based on convolutional neural networks
CN110263618B (en) * 2019-04-30 2023-10-20 创新先进技术有限公司 Iteration method and device of nuclear body model
CN110263618A (en) * 2019-04-30 2019-09-20 阿里巴巴集团控股有限公司 The alternative manner and device of one seed nucleus body model
CN110276263B (en) * 2019-05-24 2021-05-14 长江大学 Face recognition system and recognition method
CN110276263A (en) * 2019-05-24 2019-09-24 长江大学 A kind of face identification system and recognition methods
CN110728225B (en) * 2019-10-08 2022-04-19 北京联华博创科技有限公司 High-speed face searching method for attendance checking
CN110728225A (en) * 2019-10-08 2020-01-24 北京联华博创科技有限公司 High-speed face searching method for attendance checking
CN110852704A (en) * 2019-10-22 2020-02-28 佛山科学技术学院 Attendance checking method, system, equipment and medium based on dense micro face recognition
CN110852704B (en) * 2019-10-22 2023-04-25 佛山科学技术学院 Attendance checking method, system, equipment and medium based on dense micro face recognition
CN111507227A (en) * 2020-04-10 2020-08-07 南京汉韬科技有限公司 Multi-student individual segmentation and state autonomous identification method based on deep learning
CN111507227B (en) * 2020-04-10 2023-04-18 南京汉韬科技有限公司 Multi-student individual segmentation and state autonomous identification method based on deep learning
CN111814704A (en) * 2020-07-14 2020-10-23 陕西师范大学 Full convolution examination room target detection method based on cascade attention and point supervision mechanism
CN111881876A (en) * 2020-08-06 2020-11-03 桂林电子科技大学 Attendance checking method based on single-order anchor-free detection network
CN111881876B (en) * 2020-08-06 2022-04-08 桂林电子科技大学 Attendance checking method based on single-order anchor-free detection network
CN113450369A (en) * 2021-04-20 2021-09-28 广州铁路职业技术学院(广州铁路机械学校) Classroom analysis system and method based on face recognition technology

Also Published As

Publication number Publication date
CN106204779B (en) 2018-08-31

Similar Documents

Publication Publication Date Title
CN106204779B (en) Classroom attendance checking method based on multi-face data acquisition strategy and deep learning
CN108537743B (en) Face image enhancement method based on generation countermeasure network
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN110399821B (en) Customer satisfaction acquisition method based on facial expression recognition
CN108268859A (en) A kind of facial expression recognizing method based on deep learning
CN107844797A (en) A kind of method of the milking sow posture automatic identification based on depth image
CN110827304B (en) Traditional Chinese medicine tongue image positioning method and system based on deep convolution network and level set method
CN109359681A (en) A kind of field crop pest and disease disasters recognition methods based on the full convolutional neural networks of improvement
CN107633229A (en) Method for detecting human face and device based on convolutional neural networks
CN110929687B (en) Multi-user behavior recognition system based on key point detection and working method
CN101908153B (en) Method for estimating head postures in low-resolution image treatment
Aydogdu et al. Comparison of three different CNN architectures for age classification
CN106778785A (en) Build the method for image characteristics extraction model and method, the device of image recognition
CN110032932B (en) Human body posture identification method based on video processing and decision tree set threshold
CN111160194B (en) Static gesture image recognition method based on multi-feature fusion
CN111667400A (en) Human face contour feature stylization generation method based on unsupervised learning
CN110969171A (en) Image classification model, method and application based on improved convolutional neural network
CN113486700A (en) Facial expression analysis method based on attention mechanism in teaching scene
CN111507227B (en) Multi-student individual segmentation and state autonomous identification method based on deep learning
CN110163567A (en) Classroom roll calling system based on multitask concatenated convolutional neural network
CN113205002B (en) Low-definition face recognition method, device, equipment and medium for unlimited video monitoring
CN113610046B (en) Behavior recognition method based on depth video linkage characteristics
CN109902613A (en) A kind of human body feature extraction method based on transfer learning and image enhancement
CN112883867A (en) Student online learning evaluation method and system based on image emotion analysis
CN104361357A (en) Photo set classification system and method based on picture content analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant