CN114155573A - Human species identification method and device based on SE-ResNet network and computer storage medium - Google Patents

Human species identification method and device based on SE-ResNet network and computer storage medium Download PDF

Info

Publication number
CN114155573A
CN114155573A (application CN202111305054.5A)
Authority
CN
China
Prior art keywords
layer
resnet network
face
race
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111305054.5A
Other languages
Chinese (zh)
Inventor
虞志媛
杨立成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hongmu Intelligent Technology Co ltd
Original Assignee
Shanghai Hongmu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hongmu Intelligent Technology Co ltd filed Critical Shanghai Hongmu Intelligent Technology Co ltd
Priority to CN202111305054.5A priority Critical patent/CN114155573A/en
Publication of CN114155573A publication Critical patent/CN114155573A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a race identification method based on an SE-ResNet network, comprising the following steps: acquiring real race image data as original data, detecting the face, correcting it by rotation and padding, and scaling it to a uniform size; removing featureless images in which the face is in profile, the head is lowered, or the facial-feature region is occluded over a large area; performing diversity enhancement on the image data; subtracting the per-channel mean from the RGB channels of the image data and then classifying and labeling it; adding SE residual modules to ResNet50 to build an SE-ResNet network and training it; and selecting a picture to be identified, inputting it into the trained SE-ResNet network, and performing classification to obtain the result. By detecting and processing the acquired image data and discarding samples without obvious features before training, the SE-ResNet network achieves good recognition speed and accuracy.

Description

Human species identification method and device based on SE-ResNet network and computer storage medium
Technical Field
The invention relates to a human race identification method, in particular to a human race identification method and device based on an SE-ResNet network and a computer storage medium.
Background
Most existing ethnicity recognition distinguishes the four major ethnic groups. In personnel management work, because people of different ethnic backgrounds have different characteristics and customs, their respective living and working habits should be respected, which requires identifying them. The prior art focuses on distinguishing the four major groups; for example, patent applications 202010996916.2 and 201811372085.0 address the problem that existing face datasets consist mostly of European and American face data, so that recognition results for other groups are poor. However, each group contains many subtypes, and existing recognition methods are not accurate enough to recognize a particular subtype.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a race identification method based on an SE-ResNet network, which solves the problem of low accuracy when classifying a single subtype within the four major ethnic groups.
The technical scheme of the invention is as follows: a race identification method based on an SE-ResNet network comprises the following steps:
S1, acquiring real race image data as original data, detecting the face, correcting the face by rotation and padding according to the facial features, and scaling it to a uniform size;
S2, removing featureless images in which the face is in profile, the head is lowered, or the facial-feature region is occluded over a large area;
S3, randomly adjusting the brightness, contrast, definition and sharpness of the image data to increase its diversity, and then applying Gaussian blur to the images;
S4, subtracting the per-channel mean from the RGB channels of the image data, and then labeling the dataset by class to distinguish general-race faces from specific-race faces, the general race being every race other than the specific race;
S5, establishing an SE-ResNet network which sequentially comprises a first convolution module, a second convolution module, a first pooling layer, a first SE residual module, a third convolution module, a second pooling layer, a second SE residual module, a third SE residual module, a fourth convolution module, a third pooling layer, fourth to ninth SE residual modules, a fifth convolution module, a fourth pooling layer, tenth to twelfth SE residual modules, a first fully connected layer, a second fully connected layer and a softmax layer. Each of the first to fifth convolution modules consists of a convolution layer plus an activation layer. Each SE residual module comprises a main path and a side path, both connected to an eltwise layer: the main path is, in order, a first convolution layer plus activation layer, a second convolution layer plus activation layer, an average pooling layer, a fully connected layer, an activation function, a fully connected layer and a Sigmoid, then connects to the eltwise layer, while the side path connects directly to the eltwise layer;
S6, training the SE-ResNet network with the dataset obtained in step S4;
S7, selecting a picture to be identified, inputting it into the SE-ResNet network trained in step S6, and performing classification to obtain the result.
Further, in step S4 the labeled dataset is divided into a test set and a training set, and step S6 comprises: S6-1, training the SE-ResNet network with the training set, then testing the trained network with the original data corresponding to the test set, and assembling an optimized training set from the test-set data corresponding to original data with correct and incorrect test results; and S6-2, performing optimization training on the network trained in step S6-1 using the optimized training set.
Further, during the training of step S6-1 the optimizer is SGD, the loss function is the cross-entropy loss, the initial learning rate is 0.001, 100000 epochs are trained, the learning strategy is multistep (the learning rate decays to 0.0001 at epoch 20000 and to 0.00001 at epoch 40000), and the momentum is 0.99. During the optimization training of step S6-2 the optimizer is Adam, the loss function is the cross-entropy loss, the initial learning rate is 0.0001, 50000 epochs are trained, the learning strategy is step (every 5000 epochs the learning rate decays to 50% of its previous value), the momentum is 0.99, and the learning rate and bias learning rate of the second fully connected layer in the SE-ResNet network are each multiplied by 10 during optimization to generate the network model. The result confidence threshold is set to 0.8 for the classification in step S7.
Further, during the optimization training of step S6-2, the test-set data corresponding to original data with incorrect test results in step S6-1 makes up 40% to 50% of the optimized training set.
Further, the ratio of general-race data to specific-race data in the test set, the training set and the optimized training set is 1:1.
Further, step S2 specifically comprises: S2-1, classifying and locating the original data with a first neural network, the classes being frontal face, profile and lowered head, and the located points being the brow head, middle and tail of the left and right eyebrows; the inner corner, middle and outer corner of the left and right eyes; the nose tip, both sides of the nostrils and the nose base; both mouth corners and the upper, middle and lower lip; and the upper and lower points where the left and right ears join the facial contour; S2-2, comparing the left ear-to-left eye distance with the right ear-to-right eye distance to compute the yaw angle, and comparing the y coordinates of the two ears with those of the two eyes to compute the pitch angle; S2-3, using a second neural network to detect whether the facial-feature region is occluded over a large area; S2-4, removing featureless images judged to be 90-degree profiles, heads lowered by 70 degrees or more, or large-area occlusions of the facial-feature region. The first neural network is a modified VGG network: a convolution layer with kernel size 5 and stride 1 plus an activation layer, a pooling layer, three blocks of a convolution layer with kernel size 3 and stride 1 plus an activation layer and a pooling layer, and a fully connected layer computing the results, which a slice layer splits into the classes and the coordinate points. The second neural network is a modified VGG network: a convolution layer with kernel size 7 and stride 4 plus an activation layer, a pooling layer, a convolution layer with kernel size 3 and stride 1 plus an activation layer, a pooling layer, and a fully connected layer computing the classification result.
Further, the random adjustment in step S3 randomly selects whether each of brightness, contrast, definition and sharpness is adjusted and, for each selected parameter, randomly chooses forward or backward adjustment, the magnitudes of forward and backward adjustment being the same.
The invention also provides a race recognition device based on the SE-ResNet network, comprising a processor and a memory in which a computer program is stored; when executed by the processor, the computer program implements the above race identification method based on the SE-ResNet network.
The invention also provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the above race identification method based on the SE-ResNet network.
The technical scheme provided by the invention has the following advantages. The SE-ResNet network is constructed for the practical requirement of identifying race from images, improving recognition speed and reducing labor. Faces in the image data are detected in advance and inferior data taken at extreme angles or with occluded facial features is eliminated before forming the training data, reducing its influence on the training result; a VGG network classifies and locates the images and, combined with the angle computation, improves the efficiency and accuracy of data elimination. An ideal recognition model is obtained through preliminary training plus optimization training, and a reasonable confidence threshold further improves recognition accuracy.
Drawings
FIG. 1 is a schematic diagram of a training process of an SE-ResNet network adopted by the race identification method based on the SE-ResNet network.
Fig. 2 is a schematic diagram of a SE-ResNet network structure.
FIG. 3 is a schematic diagram of the structure of SE residual modules in the SE-ResNet network structure.
Detailed Description
The present invention is further described in the following examples, which are intended to be illustrative only and not to be limiting as to the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications within the scope of the following claims.
Referring to FIG. 1, the race identification method based on the SE-ResNet network of this embodiment comprises the following steps:
S1, acquiring real race image data and generating a dataset. Specifically, the race dataset used in the invention was captured in the applicant's actual projects: 60,000 general faces and 60,000 specific-race faces in total, covering different cameras, time periods, illumination conditions, scenes, genders and age groups. MTCNN is used to detect the face and locate five landmark coordinates (the two eyes, the nose tip and the two mouth corners), which are matched one by one against preset facial-feature coordinates (left eye 30.2946, 51.6963; right eye 65.5318, 51.5014; nose tip 48.0252, 71.7366; left mouth corner 33.5493, 92.3655; right mouth corner 62.7299, 92.2041), so that the face is corrected by rotation and scaling and the facial features of every picture lie at the same positions. Blank regions left after rotation are filled with black pixels, and the face is uniformly scaled to 96 x 112;
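The alignment in S1 (warping the five detected landmarks onto the preset template coordinates) can be sketched as a least-squares 2D similarity transform. This is an illustrative reconstruction, not code from the patent; the function name `similarity_transform` is ours, and the actual warp (e.g. `cv2.warpAffine`) is only mentioned in a comment.

```python
import numpy as np

# Template landmark positions from the patent (left eye, right eye,
# nose tip, left mouth corner, right mouth corner) for a 96x112 face.
TEMPLATE = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
])

def similarity_transform(src):
    """Least-squares similarity transform (rotation + scale + shift)
    mapping the 5 detected landmarks `src` onto TEMPLATE.
    Solves u = a*x - c*y + tx, v = c*x + a*y + ty for (a, c, tx, ty)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, TEMPLATE):
        A.append([x, -y, 1, 0]); b.append(u)
        A.append([y,  x, 0, 1]); b.append(v)
    a, c, tx, ty = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)[0]
    return np.array([[a, -c, tx], [c, a, ty]])  # 2x3 affine matrix

# The 2x3 matrix can then be passed to an affine-warp routine with a
# 96x112 output; blank corners after rotation are filled with black.
```

With identical source and template points the result is (approximately) the identity transform, which makes the sketch easy to sanity-check.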
S2, removing unsatisfactory face data. The faces are divided as required into general faces and specific-race faces at a ratio of 1:1; mixed-race faces are classified according to which race the face leans toward. A calibrated, trained modified VGG network is used, whose structure is a convolution layer with kernel size 5 and stride 1 plus an activation layer, a pooling layer, three blocks of a convolution layer with kernel size 3 and stride 1 plus an activation layer and a pooling layer, and a fully connected layer computing the results; a slice layer splits the result into 3 classes and 50 values. The 3 classes are 0 for a frontal face, 1 for a 90-degree profile and 2 for a head lowered by 75 degrees or more, and the 50 values form 25 coordinate points: the brow head, middle and tail of the left and right eyebrows; the inner corner, middle and outer corner of the left and right eyes; the nose tip, both sides of the nostrils and the nose base; both mouth corners and the upper, middle and lower lip; and the upper and lower points where the left and right ears join the facial contour. The input is a 40 x 40 grayscale image. Beforehand, 50,000 faces of each posture were selected, the above coordinate points were annotated, and the network was trained; the trained network then produces the facial-feature coordinate points for an input face image. The right-eye width ratio is computed as the x-axis distance from the right eye to the right ear divided by (the x-axis distance from the left eye to the left ear plus the x-axis distance from the right eye to the right ear), giving a result in [0, 1]; subtracting 0.5 and multiplying by 2 maps it to [-1, 1], where [-1, 0) indicates the right side of the face and (0, 1] the left side, and multiplying the absolute value by 90 gives the yaw angle, so the direction and angle of the face are obtained. Using y = kx + b with the two points on the top edge of both ears, the coefficients k1 and b1 are computed; the two points on the bottom edge of both ears give k2 and b2 in the same way. Setting x to the x coordinate of the left eye, y1 and y2 are computed from the two lines and averaged as y0; with y the y coordinate of the left eye, the pitch is n(y - y0)/|y2 - y1| with n = 30, normalized to the range [-90, 90] by clamping the extremes; a value greater than 0 means the head is raised and a value less than 0 means it is lowered.
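The yaw and pitch computations of step S2-2 can be condensed into a short sketch. The sign conventions and clamping are reconstructed from the garbled translation, so treat them as assumptions; the function names are ours.

```python
def yaw_angle(left_eye_x, left_ear_x, right_eye_x, right_ear_x):
    """Yaw from the ratio of eye-to-ear x distances (patent step S2-2).
    Result in [-90, 90]: negative ~ right profile, positive ~ left."""
    d_l = abs(left_ear_x - left_eye_x)
    d_r = abs(right_ear_x - right_eye_x)
    ratio = d_r / (d_l + d_r)          # in [0, 1]
    return (ratio - 0.5) * 2 * 90      # map to [-90, 90]

def pitch_angle(eye_x, eye_y, ear_top_pts, ear_bot_pts, n=30):
    """Pitch from the ear lines y = k*x + b evaluated at the eye.
    > 0 means head raised, < 0 means head lowered (clamped to +-90)."""
    def line(p1, p2):
        k = (p2[1] - p1[1]) / (p2[0] - p1[0])
        return k, p1[1] - k * p1[0]
    k1, b1 = line(*ear_top_pts)        # line through ear-top points
    k2, b2 = line(*ear_bot_pts)        # line through ear-bottom points
    y1, y2 = k1 * eye_x + b1, k2 * eye_x + b2
    y0 = (y1 + y2) / 2                 # reference height at the eye's x
    a = n * (eye_y - y0) / abs(y2 - y1)
    return max(-90.0, min(90.0, a))
```

A symmetric face (equal eye-to-ear distances, eye midway between the ear lines) yields yaw 0 and pitch 0, matching the frontal-face case in the text.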
A second trained modified VGG network is also used: a convolution layer with kernel size 7 and stride 4 plus an activation layer, a pooling layer, a convolution layer with kernel size 3 and stride 1 plus an activation layer, a pooling layer, two convolution layers with kernel size 3 and stride 1 plus activation layers, a pooling layer, and a fully connected layer computing the classification result. Its input is a three-channel image of size 96 x 60: the face detected and corrected by the open-source MTCNN network and uniformly scaled to 96 x 112 is cropped from height 62 to the bottom of the image, giving a 96 x 60 partial face containing only the region below the nose. 50,000 faces obtained in this way, labeled 0 for masked and 1 for normal, were used to train the network to detect whether the facial-feature region is occluded over a large area. Using these two networks, face data judged to be a 90-degree profile or a head lowered by 70 degrees or more, and face data without obvious features because of large-area occlusion of the facial features, are removed from the original dataset;
S3, performing data enhancement on the images. OpenCV is used to adjust brightness, contrast, definition and sharpness; four binary random numbers in {0, 1} determine which of the four dimensions are adjusted (for example, if the contrast random number is 0 the contrast is left unchanged, and if it is 1 the contrast is adjusted), with the adjustment amplitude drawn at random: the brightness, contrast, definition and sharpness amplitudes are each in [0.5, 1.5]. This increases the diversity of the samples.
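The coin-flip-plus-amplitude scheme above can be sketched for the two pixel-wise adjustments. Definition and sharpness need filtering operations and are omitted; the function name `jitter` and the use of numpy instead of OpenCV are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(img):
    """Random photometric jitter as in S3: independent coin flips decide
    which dimensions are adjusted, with amplitude drawn from [0.5, 1.5].
    Only brightness and contrast are shown in this sketch."""
    out = img.astype(np.float32)
    if rng.integers(0, 2):                      # brightness switch
        out *= rng.uniform(0.5, 1.5)            # scale all pixels
    if rng.integers(0, 2):                      # contrast switch
        f = rng.uniform(0.5, 1.5)
        out = (out - out.mean()) * f + out.mean()  # stretch about the mean
    return np.clip(out, 0, 255).astype(np.uint8)
```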
Gaussian blur is then applied using the Gaussian function

G(x, y) = (1 / (2 * pi * sigma^2)) * exp(-(x^2 + y^2) / (2 * sigma^2)),

with a chosen value of sigma (3 in this embodiment). An image weight matrix is computed from it, and each pixel in the neighborhood is multiplied by its weight to obtain the Gaussian-blurred value of the center point. In this way every image is Gaussian-blurred after the brightness, contrast, definition and sharpness adjustment, enhancing the generalization ability of the samples.
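The weight matrix described above is just a normalized sampling of the 2D Gaussian; a minimal sketch (the kernel size 7 is our assumption, the patent only fixes sigma = 3):

```python
import numpy as np

def gaussian_kernel(size=7, sigma=3.0):
    """Weight matrix from G(x, y) = 1/(2*pi*sigma^2) *
    exp(-(x^2 + y^2) / (2*sigma^2)), normalised to sum to 1."""
    ax = np.arange(size) - size // 2            # e.g. [-3 .. 3]
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return k / k.sum()                          # weights sum to 1
```

Convolving each pixel's neighborhood with this kernel (a weighted sum) gives the blurred value of the center point, which is exactly the per-pixel operation the text describes.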
S4, preprocessing the images: they are unified into three-channel color images and their brightness is reduced; specifically, the per-channel means 104, 117 and 123 are subtracted from the B, G and R channels respectively, the means having been computed channel by channel over the face images of S2. Subtracting the mean removes what the images have in common and highlights individual differences; removing the average brightness also reduces the influence of illumination on the data to a certain extent. The dataset is then labeled by class in the form (n, l), where n is the picture path, l = 0 represents a general face and l = 1 to N-1 represent faces of the specific races; 10% is randomly drawn as the test set and 90% as the training set, and the dataset is randomly shuffled.
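The mean subtraction and the 90/10 split can be sketched directly; the helper names are ours, and only the channel means are taken from the patent.

```python
import numpy as np
import random

BGR_MEAN = np.array([104.0, 117.0, 123.0])  # per-channel means from the patent

def preprocess(img_bgr):
    """Subtract the per-channel BGR mean so that what all face images
    share (e.g. average brightness) is removed and differences remain."""
    return img_bgr.astype(np.float32) - BGR_MEAN

def split_dataset(samples, test_frac=0.1, seed=0):
    """Shuffle (path, label) pairs and split 90% train / 10% test."""
    rnd = random.Random(seed)
    samples = samples[:]
    rnd.shuffle(samples)
    n_test = int(len(samples) * test_frac)
    return samples[n_test:], samples[:n_test]
```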
S5, referring to FIGS. 2 and 3, an SE-ResNet network is constructed from a ResNet50 network and SE residual modules. It sequentially comprises a first convolution module, a second convolution module, a first pooling layer, a first SE residual module, a third convolution module, a second pooling layer, a second SE residual module, a third SE residual module, a fourth convolution module, a third pooling layer, fourth to ninth SE residual modules, a fifth convolution module, a fourth pooling layer, tenth to twelfth SE residual modules, a first fully connected layer, a second fully connected layer and a softmax layer. Different numbers of SE residual modules are inserted after the last four of the five convolution modules of the ResNet50 network, and every convolution module of the ResNet50 network comprises a convolution layer and an activation layer. The SE residual module comprises a main path that is, in order, a first convolution layer plus activation layer, a second convolution layer plus activation layer, an average pooling layer, a fully connected layer, an activation function (ReLU), a fully connected layer and a Sigmoid, then connected to the eltwise layer, and a side path connected directly to the eltwise layer. The newly built SE-ResNet network is trained on the training set.
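The squeeze-and-excitation branch of the SE residual module (average pooling, FC, ReLU, FC, Sigmoid, channel-wise rescaling) can be written as a short numpy forward pass. This is a sketch of the standard SE mechanism the patent names, not the patent's own code; weight shapes and the reduction ratio are assumptions.

```python
import numpy as np

def se_scale(feat, w1, b1, w2, b2):
    """Forward pass of the SE branch on a feature map `feat` of shape
    (C, H, W): global average pool -> FC -> ReLU -> FC -> Sigmoid ->
    per-channel rescaling. Weight shapes: w1 (C/r, C), w2 (C, C/r)."""
    z = feat.mean(axis=(1, 2))                  # squeeze: (C,)
    h = np.maximum(w1 @ z + b1, 0)              # excitation FC + ReLU
    s = 1 / (1 + np.exp(-(w2 @ h + b2)))        # FC + Sigmoid -> (C,)
    return feat * s[:, None, None]              # channel-wise rescale

# In the patent's SE residual module the main path is conv+ReLU,
# conv+ReLU, then this SE branch; the eltwise layer then adds the
# unscaled side-path input to the rescaled main-path output.
```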
S6-1, the training parameters are: input size 96 x 112, batch size 64, SGD optimizer with a cross-entropy loss function, initial learning rate 0.001, 100000 training epochs, multistep learning strategy in which the learning rate decays to 10% of its previous value at epoch 20000 and again at epoch 40000, and momentum 0.99.
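The multistep schedule of S6-1 reduces to a small function; the function name is ours, the numbers are the patent's (0.001, then 0.0001 from epoch 20000, then 0.00001 from epoch 40000).

```python
def multistep_lr(epoch, base_lr=0.001, milestones=(20000, 40000), gamma=0.1):
    """Multistep schedule from S6-1: the learning rate is multiplied by
    0.1 at each milestone, giving 0.001 -> 0.0001 -> 0.00001."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```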
S6-2, if the accuracy in a newly deployed scene falls below 80% in actual use, new samples are obtained by testing the model generated in step S6-1 on that scene's camera; 20,000 samples are selected according to the test results, 40% incorrectly and 60% correctly classified, with a 1:1 ratio between the two classes. The images are processed as in steps S1 to S4, and the model generated in step S6-1 is then optimized with input size 96 x 112, batch size 64, Adam optimizer, cross-entropy loss function, initial learning rate 0.0001, 50000 epochs, step learning strategy in which every 5000 epochs the learning rate decays to 50% of its previous value, momentum 0.99, and the learning rate and bias learning rate of the last fully connected layer (the second fully connected layer) each multiplied by 10, generating the optimized model.
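The step schedule used for the optimization phase is likewise one line; the function name is ours, the constants are the patent's.

```python
def step_lr(epoch, base_lr=0.0001, step=5000, gamma=0.5):
    """Step schedule from S6-2: every 5000 epochs the learning rate
    decays to 50% of its previous value."""
    return base_lr * gamma ** (epoch // step)
```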
S7, actual use: pictures collected by the camera are selected, MTCNN performs face detection and angle correction, the two modified VGG networks screen out faces without obvious features (90-degree profiles and large-area occlusions), the means [104, 117, 123] are subtracted from the image channels, and the face is fed to the SE-ResNet network optimized in step S6-2 with the result confidence threshold set to 0.8 to obtain the race classification result. The precision (the number of samples correctly identified as the race divided by the number of samples classified as the race) was 99.9%, and the recall (the number of samples correctly identified as the race divided by the total number of samples of the race) was 85%.
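The final accept-or-reject decision with the 0.8 confidence threshold can be sketched as a softmax followed by a threshold check; the function name and the "-1 means undecided" convention are ours.

```python
import math

def classify(logits, threshold=0.8):
    """Softmax over the network outputs; the top class is accepted only
    when its probability reaches the confidence threshold, otherwise
    the sample is left undecided (returns -1)."""
    m = max(logits)                              # for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1
```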
It should be noted that the particular methods of the embodiments described above may form a computer program product, and the computer program product embodied herein may therefore be stored on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM and optical storage). The invention may be implemented in hardware, in software, or in a combination of the two, or as a computer device comprising at least one processor and a memory, the memory storing a computer program implementing the steps of the above flow and the processor being configured to execute that program so as to perform the method of the embodiments described above.

Claims (9)

1. A race identification method based on an SE-ResNet network, characterized by comprising the following steps:
S1, acquiring real race image data as original data, detecting the face with an MTCNN model, correcting the face by rotation and padding according to the facial features, and scaling it to a uniform size;
S2, removing featureless images in which the face is in profile, the head is lowered, or the facial-feature region is occluded over a large area;
S3, adjusting the brightness, contrast, definition and sharpness of the image data to increase its diversity, and then applying Gaussian blur to the images;
S4, subtracting the per-channel mean from the RGB channels of the image data, and then labeling the dataset by class to distinguish general-race faces from specific-race faces, the general race being every race other than the specific race;
S5, establishing an SE-ResNet network which sequentially comprises a first convolution module, a second convolution module, a first pooling layer, a first SE residual module, a third convolution module, a second pooling layer, a second SE residual module, a third SE residual module, a fourth convolution module, a third pooling layer, fourth to ninth SE residual modules, a fifth convolution module, a fourth pooling layer, tenth to twelfth SE residual modules, a first fully connected layer, a second fully connected layer and a softmax layer. Each of the first to fifth convolution modules consists of a convolution layer plus an activation layer. Each SE residual module comprises a main path and a side path, both connected to an eltwise layer: the main path is, in order, a first convolution layer plus activation layer, a second convolution layer plus activation layer, an average pooling layer, a fully connected layer, an activation function, a fully connected layer and a Sigmoid, then connects to the eltwise layer, while the side path connects directly to the eltwise layer;
S6, training the SE-ResNet network with the dataset obtained in step S4;
S7, selecting a picture to be identified, inputting it into the SE-ResNet network trained in step S6, and performing classification to obtain the result.
2. The race identification method based on the SE-ResNet network as claimed in claim 1, wherein in step S4 the labeled dataset is divided into a test set and a training set, and step S6 comprises: S6-1, training the SE-ResNet network with the training set, then testing the trained network with the original data corresponding to the test set, and assembling an optimized training set from the test-set data corresponding to original data with correct and incorrect test results; and S6-2, performing optimization training on the network trained in step S6-1 using the optimized training set.
3. The race identification method based on the SE-ResNet network as claimed in claim 2, wherein during the training of step S6-1 the optimizer is SGD, the loss function is the cross-entropy loss, the initial learning rate is 0.001, 100000 epochs are trained, the learning strategy is multistep (the learning rate decays to 0.0001 at epoch 20000 and to 0.00001 at epoch 40000), and the momentum is 0.99; and during the optimization training of step S6-2 the optimizer is Adam, the loss function is the cross-entropy loss, the initial learning rate is 0.0001, 50000 epochs are trained, the learning strategy is step (every 5000 epochs the learning rate decays to 50% of its previous value), the momentum is 0.99, and the learning rate and bias learning rate of the second fully connected layer in the SE-ResNet network are each multiplied by 10 during optimization to generate the network model.
4. The race identification method based on the SE-ResNet network as claimed in claim 2, wherein during the optimization training of step S6-2 the test-set data corresponding to original data with incorrect test results in step S6-1 makes up 40% to 50% of the optimized training set.
5. The race identification method based on the SE-ResNet network as claimed in claim 2, wherein the ratio of general-race data to specific-race data in the test set, the training set and the optimized training set is 1:1.
6. The race recognition method based on the SE-ResNet network as claimed in claim 1, wherein step S2 specifically comprises: S2-1, classifying and locating faces in the original data with a first neural network, the classification distinguishing frontal faces, profile faces and lowered heads, and the locating yielding coordinate points for the head, middle and tail of the left and right eyebrows; the inner canthus, eye center and outer canthus of the left and right eyes; the top of the nose, both sides of the nose wings and the nose tip; the two corners and the upper-, middle- and lower-lip points of the mouth; and the upper and lower points where the left and right ears join the facial contour; S2-2, computing the yaw angle of the face by comparing the left ear-to-left eye distance with the right ear-to-right eye distance, and computing the pitch angle by comparing the y coordinates of the two ears with those of the two eyes; S2-3, identifying with a second neural network whether the facial-feature regions are occluded over a large area; S2-4, removing feature-poor images judged to be 90° profile faces, heads lowered by 70° or more, or faces with large-area occlusion of the facial-feature regions; the first neural network is a modified VGG network whose base is a convolutional layer with kernel size 5 and stride 1 followed by an activation layer and a pooling layer, then three groups each consisting of a convolutional layer with kernel size 3 and stride 1, an activation layer and a pooling layer, and finally a fully-connected layer that computes the result, which a slice layer splits into the classification and the coordinate points; the second neural network is a modified VGG network whose base is a convolutional layer with kernel size 7 and stride 4 followed by an activation layer and a pooling layer, then a convolutional layer with kernel size 3 and stride 1 followed by an activation layer and a pooling layer, and finally a fully-connected layer that computes the classification result.
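The landmark-based pose estimates of step S2-2 can be sketched as below. The claims state only which distances and y coordinates are compared; the exact mapping to degrees (the `acos`/`atan2` formulas here) is an illustrative assumption:

```python
import math

def _dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def estimate_yaw(left_ear, left_eye, right_ear, right_eye):
    """Yaw (left-right turn): compare the left ear-to-eye distance with the
    right one.  Equal distances mean a frontal face; the more one side
    foreshortens, the larger the turn."""
    d_left = _dist(left_ear, left_eye)
    d_right = _dist(right_ear, right_eye)
    if d_left == 0 or d_right == 0:
        return 90.0  # one side fully foreshortened: treat as full profile
    ratio = min(d_left, d_right) / max(d_left, d_right)
    return math.degrees(math.acos(ratio))  # 0 degrees when symmetric

def estimate_pitch(left_ear, right_ear, left_eye, right_eye):
    """Pitch (up-down tilt): compare the mean y of the ears with the mean y
    of the eyes, normalized by the eye span (image y grows downward)."""
    ear_y = (left_ear[1] + right_ear[1]) / 2.0
    eye_y = (left_eye[1] + right_eye[1]) / 2.0
    eye_span = abs(right_eye[0] - left_eye[0]) or 1.0
    return math.degrees(math.atan2(eye_y - ear_y, eye_span))
```

Images whose estimated yaw reaches a full profile or whose pitch indicates a head lowered by 70° or more would then be discarded in step S2-4.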
7. The race recognition method based on the SE-ResNet network as claimed in claim 1, wherein the random adjustment in step S3 randomly selects whether each of brightness, contrast, saturation and sharpness is adjusted, and for each selected parameter randomly selects a forward or a backward adjustment, the forward and backward adjustments having the same magnitude.
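The claim-7 augmentation policy can be sketched as follows. The parameter list and the multiplicative-factor representation are illustrative assumptions (the original fourth parameter is garbled in the translation; "saturation" is a guess), and in practice the factors would be fed to an image library such as Pillow's `ImageEnhance`:

```python
import random

def random_adjustments(params=("brightness", "contrast", "saturation", "sharpness"),
                       magnitude=0.2, rng=None):
    """For each parameter, independently decide whether to adjust it at all,
    then pick a forward (+magnitude) or backward (-magnitude) adjustment of
    equal size.  Returns a dict of multiplicative factors (1.0 = unchanged)."""
    rng = rng or random.Random()
    factors = {}
    for p in params:
        if rng.random() < 0.5:  # randomly select whether to adjust at all
            sign = 1 if rng.random() < 0.5 else -1  # forward or backward
            factors[p] = 1.0 + sign * magnitude
        else:
            factors[p] = 1.0
    return factors
```

Because the forward and backward magnitudes are equal, every factor is one of `1 - magnitude`, `1.0`, or `1 + magnitude`.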
8. A race recognition apparatus based on the SE-ResNet network, comprising a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, implements the race recognition method based on the SE-ResNet network according to any one of claims 1 to 7.
9. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the race recognition method based on the SE-ResNet network according to any one of claims 1 to 7.
CN202111305054.5A 2021-11-05 2021-11-05 Human species identification method and device based on SE-ResNet network and computer storage medium Pending CN114155573A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111305054.5A CN114155573A (en) 2021-11-05 2021-11-05 Human species identification method and device based on SE-ResNet network and computer storage medium

Publications (1)

Publication Number Publication Date
CN114155573A true CN114155573A (en) 2022-03-08

Family

ID=80459255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111305054.5A Pending CN114155573A (en) 2021-11-05 2021-11-05 Human species identification method and device based on SE-ResNet network and computer storage medium

Country Status (1)

Country Link
CN (1) CN114155573A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710831A (en) * 2018-04-24 2018-10-26 华南理工大学 A machine-vision-based face recognition algorithm for small data sets
CN109255340A (en) * 2018-10-29 2019-01-22 东北大学 A face recognition method fusing multiple improved VGG networks
CN112052772A (en) * 2020-08-31 2020-12-08 福建捷宇电脑科技有限公司 Face occlusion detection algorithm
CN112818967A (en) * 2021-04-16 2021-05-18 杭州魔点科技有限公司 Child identity recognition method based on face recognition and head-and-shoulder recognition
CN113052227A (en) * 2021-03-22 2021-06-29 山西三友和智慧信息技术股份有限公司 Pulmonary tuberculosis identification method based on SE-ResNet
WO2021169641A1 (en) * 2020-02-28 2021-09-02 深圳壹账通智能科技有限公司 Face recognition method and system


Similar Documents

Publication Publication Date Title
CN110222787B (en) Multi-scale target detection method and device, computer equipment and storage medium
Zhao et al. Multi-focus image fusion with a natural enhancement via a joint multi-level deeply supervised convolutional neural network
CN110363116B (en) Irregular human face correction method, system and medium based on GLD-GAN
CN109284738B (en) Irregular face correction method and system
CN111160269A (en) Face key point detection method and device
CN110287790B (en) Learning state hybrid analysis method oriented to static multi-user scene
US6917703B1 (en) Method and apparatus for image analysis of a gabor-wavelet transformed image using a neural network
JP2021517330A (en) A method for identifying an object in an image and a mobile device for carrying out the method.
US11893789B2 (en) Deep neural network pose estimation system
CN114241548A (en) Small target detection algorithm based on improved YOLOv5
CN110738161A (en) face image correction method based on improved generation type confrontation network
CN111445410A (en) Texture enhancement method, device and equipment based on texture image and storage medium
CN109725721B (en) Human eye positioning method and system for naked eye 3D display system
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN112819772A (en) High-precision rapid pattern detection and identification method
CN111291701B (en) Sight tracking method based on image gradient and ellipse fitting algorithm
CN112381061B (en) Facial expression recognition method and system
CN112541422A (en) Expression recognition method and device with robust illumination and head posture and storage medium
CN111209873A (en) High-precision face key point positioning method and system based on deep learning
CN110991256A (en) System and method for carrying out age estimation and/or gender identification based on face features
CN106529441A (en) Fuzzy boundary fragmentation-based depth motion map human body action recognition method
CN107194364B (en) Huffman-LBP multi-pose face recognition method based on a divide-and-conquer strategy
CN109784215B (en) In-vivo detection method and system based on improved optical flow method
CN111553250B (en) Accurate facial paralysis degree evaluation method and device based on face characteristic points
CN112861855A (en) Group-raising pig instance segmentation method based on confrontation network model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination