CN107944363A - Face image processing method, system and server - Google Patents

Face image processing method, system and server

Info

Publication number
CN107944363A
Authority
CN
China
Prior art keywords
classification
data
convolutional neural
classified
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711131120.5A
Other languages
Chinese (zh)
Other versions
CN107944363B (en)
Inventor
杨帆
张志伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201711131120.5A priority Critical patent/CN107944363B/en
Publication of CN107944363A publication Critical patent/CN107944363A/en
Application granted granted Critical
Publication of CN107944363B publication Critical patent/CN107944363B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06V40/168 Human faces: Feature extraction; Face representation
    • G06F18/214 Pattern recognition: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/045 Neural networks: Combinations of networks
    • G06N3/084 Learning methods: Backpropagation, e.g. using gradient descent
    • G06V40/172 Human faces: Classification, e.g. identification
    • G06V40/178 Human faces: estimating age from face image; using age information for improving recognition
    • G06V40/179 Human faces: metadata assisted face recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention discloses a face image processing method, device, and server, comprising the following steps: acquiring a face image to be classified; inputting the face image into a convolutional neural network model constructed with a loss function, where the loss function performs coefficient relaxation processing on the data to be classified output by the convolutional neural network model, so as to enlarge the classification interface of the data to be classified; and acquiring the classification data output by the convolutional neural network model, and performing content understanding on the face image according to the classification data. Before the face image is classified, coefficient relaxation processing is performed on the to-be-classified data features extracted by the convolutional neural network model. With coefficient relaxation processing, the model can be trained under stricter conditions, so that the classification boundary is significantly enlarged and the content-understanding accuracy of the convolutional neural network model is greatly improved.

Description

Face image processing method, system and server
Technical Field
The embodiment of the invention relates to the field of image processing, in particular to a method, a system and a server for processing a face image.
Background
Face recognition is a technology that uses a computer to process, analyze, and understand face images in order to recognize targets and objects in various face images. Face recognition can be applied in many fields, such as security and finance, and the face recognition process is generally divided into three stages: face detection, face alignment, and face feature extraction and comparison, among which face feature extraction is the key technology of face recognition.
With the development of deep learning technology, convolutional neural networks have become a powerful tool for extracting face features. For a convolutional neural network with a fixed model, the core problem is how to design a loss function that can effectively supervise the training of the network, so that the network gains the ability to extract face features. The prior art mainly uses the Softmax cross-entropy loss function: it trains the network's feature extraction ability, the last layer of the network is taken as the representation of a face, the face data is mapped into a cosine space, and face similarity is judged by comparing the cosine-space distances of different faces, with the distances of the same person being closer and the distances of different persons being farther.
However, the inventors found in their research that feature extraction with the Softmax cross-entropy loss function is a non-end-to-end method: it is simple and easy to implement, but it does not optimize the existing model to the maximum extent, that is, it does not guarantee the maximum classification boundary between different classes; the classification boundary is not obvious enough, and the content-understanding accuracy cannot be improved further.
Disclosure of Invention
The embodiment of the invention provides a face image processing method, system, and server capable of enlarging the classification interface of the data to be classified.
In order to solve the above technical problem, the embodiment of the present invention adopts the following technical solution: a face image processing method comprising the following steps:
acquiring a face image to be classified;
inputting the face image into a convolutional neural network model constructed with a loss function, wherein the loss function performs coefficient relaxation treatment on the data to be classified output by the convolutional neural network model so as to increase a classification interface of the data to be classified;
and acquiring classification data output by the convolutional neural network model, and performing content understanding on the face image according to the classification data.
Specifically, the coefficient relaxation processing includes the following step:
performing same-scale reduction processing on the data to be classified output by the fully connected layer of the convolutional neural network model, so as to enlarge the classification interface of the data to be classified.
Specifically, the forward propagation of the convolutional neural network model is characterized by:
L = -log(p_i)
defining a function:
where i denotes the class of the input image, j and t denote classification classes different from i, k denotes the coefficient relaxation parameter, f(x) denotes the face feature extracted by the convolutional neural network model, w_i, w_j, and w_t denote the weights of the i-th, j-th, and t-th classes respectively, and N denotes the number of classes to be classified.
Specifically, the characteristics of the back propagation of the convolutional neural network model are described as follows:
defining a function:
where i denotes the class of the input image, j and t denote classification classes different from i, k denotes the coefficient relaxation parameter, f(x) denotes the face feature extracted by the convolutional neural network model, w_i, w_j, and w_t denote the weights of the i-th, j-th, and t-th classes respectively, and N denotes the number of classes to be classified.
Specifically, the convolutional neural network model is formed by training through the following steps:
acquiring training sample data marked with classification judgment information;
inputting the training sample data into a convolutional neural network model to obtain model classification information of the training sample data;
comparing, through the loss function, the model classification information of different samples in the training sample data with the classification judgment information, to judge whether the model classification information is consistent with the classification judgment information;
and when the model classification information is inconsistent with the classification judgment information, repeatedly and cyclically updating the weights in the convolutional neural network model, until the comparison result is consistent with the classification judgment information.
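The train-compare-update loop in the steps above can be sketched as follows. The perceptron-style model, the learning rate, and the epoch budget are purely illustrative stand-ins for the patent's convolutional neural network; the point is the loop structure: forward pass, compare the model classification with the labelled classification, and keep updating the weights until the two agree.

```python
import numpy as np

def train_until_consistent(samples, labels, lr=0.1, max_epochs=1000):
    """Illustrative stand-in for the patent's training loop: forward pass,
    compare predicted class with the labelled class, and keep updating the
    weights until the two agree (or an epoch budget runs out)."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=samples.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        preds = (samples @ w + b > 0).astype(int)
        if np.array_equal(preds, labels):   # comparison step: consistent, so stop
            return w, b, True
        for x, y, p in zip(samples, labels, preds):
            if p != y:                      # inconsistent, so update the weights
                w += lr * (y - p) * x
                b += lr * (y - p)
    return w, b, False

# Linearly separable toy data standing in for labelled training samples.
X = np.array([[0.0, 1.0], [1.0, 2.0], [3.0, -1.0], [4.0, 0.0]])
y = np.array([0, 0, 1, 1])
w, b, converged = train_until_consistent(X, y)
```

On separable data like this the loop terminates with predictions matching the labels, mirroring the "repeat until consistent" condition in the claim.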
Specifically, the step of comparing, through the loss function, whether the model classification information of different samples in the training sample data is consistent with the classification judgment information includes the following steps:
performing coefficient parameterization on the data to be classified output by the fully connected layer of the convolutional neural network model, so as to synchronously reduce the data to be classified;
comparing the coefficient-parameterized data to be classified with the boundary values of a preset first classification value interval, and determining the interval position of the coefficient-parameterized data to be classified within the first classification value interval;
determining the model classification information of different samples in the training sample data according to the classification result corresponding to the interval position;
and judging whether the model classification information is consistent with the classification judgment information.
Specifically, after the step of acquiring the face image to be classified, the method further includes the following steps:
inputting the face image into the convolutional neural network model, where the convolutional neural network model extracts the image features of the face image to form the data to be classified;
and performing coefficient relaxation processing on the data to be classified, and classifying the data to be classified when the coefficient-relaxed data to be classified is larger than a preset classification threshold.
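As a minimal illustration of the threshold test in the step above (the relaxation coefficient and threshold values here are invented for the example, not taken from the patent):

```python
def classify_after_relaxation(score, k=0.6, threshold=0.5):
    """Accept a classification only when the score, scaled down by the
    relaxation coefficient 0 < k < 1, still clears the unchanged threshold.
    This is a stricter test than comparing the raw score with the threshold."""
    return score * k > threshold

# A raw score of 1.0 still passes after relaxation (0.6 > 0.5), while a raw
# score of 0.7 would pass the plain test (0.7 > 0.5) but fails the relaxed
# one (0.42 < 0.5), so borderline cases are rejected.
```

Rejecting those borderline cases is what the claim means by enlarging the classification interface.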
Specifically, the content understanding of the face image comprises: performing gender identification, age judgment, facial attractiveness scoring, or face similarity comparison on the face image.
In order to solve the above technical problem, an embodiment of the present invention further provides a face image processing system, where the face image processing system includes:
the acquisition module is used for acquiring a face image to be classified;
the processing module is used for inputting the face image into a convolutional neural network model constructed with a loss function, and the loss function performs coefficient relaxation processing on the data to be classified output by the convolutional neural network model so as to increase a classification interface of the data to be classified;
and the classification module is used for acquiring the classification data output by the convolutional neural network model and understanding the content of the face image according to the classification data.
Specifically, the face image processing system further includes:
and the first processing submodule is used for carrying out the same-scale reduction processing on the data to be classified output by the full connection layer of the convolutional neural network model so as to increase the classification interface of the data to be classified.
Specifically, the forward propagation of the convolutional neural network model is characterized by:
L = -log(p_i)
defining a function:
where i denotes the class of the input image, j and t denote classification classes different from i, k denotes the coefficient relaxation parameter, f(x) denotes the face feature extracted by the convolutional neural network model, w_i, w_j, and w_t denote the weights of the i-th, j-th, and t-th classes respectively, and N denotes the number of classes to be classified.
Specifically, the characteristics of the back propagation of the convolutional neural network model are described as follows:
defining a function:
where i denotes the class of the input image, j and t denote classification classes different from i, k denotes the coefficient relaxation parameter, f(x) denotes the face feature extracted by the convolutional neural network model, w_i, w_j, and w_t denote the weights of the i-th, j-th, and t-th classes respectively, and N denotes the number of classes to be classified.
Specifically, the face image processing system further includes:
the first acquisition submodule is used for acquiring training sample data marked with classification judgment information;
the first classification submodule is used for inputting the training sample data into a convolutional neural network model to obtain model classification information of the training sample data;
the first comparison pair module is used for comparing whether the model classification information of different samples in the training sample data is consistent with the classification judgment information or not through a loss stopping function;
and the second processing submodule is used for repeatedly and circularly updating the weight in the convolutional neural network model when the model classification information is inconsistent with the classification judgment information, and ending when the comparison result is consistent with the classification judgment information.
Specifically, the face image processing system further includes:
the first calculation submodule is used for performing coefficient parameterization on the data to be classified output by the fully connected layer of the convolutional neural network model, so as to synchronously reduce the data to be classified;
the second comparison submodule is used for comparing the data to be classified subjected to coefficient parameterization with a boundary value in a preset first classification value interval and determining the interval position of the data to be classified subjected to coefficient parameterization in the first classification value interval;
the second classification submodule is used for determining model classification information of different samples in the training sample data according to the classification result corresponding to the interval position;
and the first judgment submodule is used for judging whether the model classification information is consistent with the classification judgment information.
Specifically, the face image processing system further includes:
the third classification submodule is used for inputting the face image into the convolutional neural network model, and the convolutional neural network model extracts the image characteristics of the face image to form the data to be classified;
and the third processing submodule is used for carrying out coefficient relaxation processing on the data to be classified, and classifying the data to be classified when the data to be classified subjected to coefficient relaxation processing is larger than a preset classification threshold value.
Specifically, the content understanding of the face image includes: performing gender identification, age judgment, facial attractiveness scoring, or face similarity comparison on the face image.
In order to solve the above technical problem, an embodiment of the present invention further provides a server, including:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the above-described face image processing method.
The embodiments of the invention have the following beneficial effects: before the face image is classified, coefficient relaxation processing is performed on the to-be-classified data features of the face image extracted by the convolutional neural network model, that is, the data to be classified is scaled down in the same proportion, which enlarges the classification interface of the classification data. With coefficient relaxation processing, the convolutional neural network model can be trained under stricter conditions, so that the classification boundary is significantly enlarged and the content-understanding accuracy of the convolutional neural network model is greatly improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic view of a basic flow chart of a face image processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a basic flow of a convolutional neural network model training method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for performing coefficient relaxation processing on parameters during convolutional neural network model training according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a method for implementing a convolutional neural network model according to an embodiment of the present invention;
FIG. 5 is a block diagram of the basic structure of a face image processing system according to an embodiment of the present invention;
fig. 6 is a block diagram of a basic structure of a server according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
In some of the flows described in the present specification and claims and in the above figures, a number of operations are included that occur in a particular order, but it should be clearly understood that these operations may be performed out of order or in parallel as they occur herein, with the order of the operations being indicated as 101, 102, etc. merely to distinguish between the various operations, and the order of the operations by themselves does not represent any order of performance. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
It should be noted that the basic structure of a convolutional neural network includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to the local receptive field of the previous layer, and the feature of that local receptive field is extracted; once the local feature is extracted, its positional relation to other features is also determined. The other is the feature mapping layer: each computation layer of the network is composed of a plurality of feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps are shift-invariant. In addition, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network is followed by a computation layer for local averaging and secondary extraction; this characteristic two-stage feature extraction structure reduces the feature resolution.
Convolutional neural networks are mainly used to recognize two-dimensional patterns that are invariant to displacement, scaling, and other forms of distortion. Because the feature detection layer of a convolutional neural network learns from the training data, explicit feature extraction is avoided when the network is used; instead, learning is performed implicitly from the training data. Moreover, because the neurons on one feature mapping plane share the same weights, the network can learn in parallel, which is a great advantage of convolutional networks over networks in which neurons are fully connected to each other.
The convolutional neural network model in this embodiment adopts the Inception_v2 model of GoogLeNet, but is not limited to it: depending on the application scenario, the convolutional neural network model can also adopt the Inception_v3 or Inception_v4 model.
Referring to fig. 1, fig. 1 is a basic flow chart of the face image processing method according to the present embodiment.
As shown in fig. 1, a method for processing a face image includes the following steps:
s1100, obtaining a face image to be classified;
the method for acquiring the face image comprises two methods of acquiring and extracting video data of the stored image in real time. The real-time acquisition is mainly used for real-time application (such as judgment of age, gender, color value, similarity and the like of a user) of an intelligent terminal (a mobile phone, a tablet personal computer and monitoring equipment). The extracted and stored image video data is mainly used for further processing the stored image and video data, and can also be used for the intelligent terminal to apply historical photos.
S1200, inputting the face image into a convolutional neural network model with a loss function, wherein the loss function performs coefficient relaxation processing on to-be-classified data output by the convolutional neural network model to increase a classification interface of the to-be-classified data;
the convolutional neural network model has been trained to converge when processing the face image, and has been enabled to process the face image as expected by a specific training mode.
In the embodiment, the convolutional neural network model performs feature extraction on the input face image to obtain the most expressive feature capable of representing the face image, and forms to-be-classified data on the full-connection layer of the convolutional neural network model.
In this embodiment, the Softmax cross-entropy loss function is used to preprocess the data to be classified. The preprocessing consists of coefficient relaxation processing, that is, scaling down the data to be classified output by the fully connected layer of the convolutional neural network model in the same proportion, so as to enlarge the classification interface of the data to be classified. The specific operation is to multiply the data to be processed by a relaxation coefficient greater than 0 and less than 1. The relaxation coefficient is obtained through repeated test verification; one workable scheme is: set a target classification accuracy for the convolutional neural network model, train the model with different candidate relaxation coefficients, record the time each trained model takes to reach that accuracy, and take the coefficient used by the model with the shortest training time as the relaxation coefficient.
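The coefficient-selection scheme just described (train with each candidate coefficient and keep the one that reaches the target accuracy fastest) can be sketched as follows. The candidate values and the timing function are invented for illustration; in practice `time_to_target_accuracy` would wrap a real training run.

```python
def pick_relaxation_coefficient(candidates, time_to_target_accuracy):
    """Return the relaxation coefficient whose training run reached the target
    classification accuracy in the shortest time.

    `time_to_target_accuracy(k)` is a stand-in for actually training the
    convolutional neural network with coefficient k and timing the run."""
    return min(candidates, key=time_to_target_accuracy)

# Hypothetical timings (in minutes) from three training runs.
measured = {0.3: 120.0, 0.6: 45.0, 0.9: 80.0}
best_k = pick_relaxation_coefficient(sorted(measured), measured.__getitem__)
```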
If a piece of data, after being scaled down, is still larger than the classification boundary value (or still falls within the boundary interval), then the original, unscaled data must also be larger than the boundary value (or fall within it). Coefficient relaxation processing is applied to the data to be classified while the classification boundary value itself is unchanged; the coefficient-relaxed data is therefore reduced relative to the classification boundary value, which in effect enlarges the classification interface of the data to be classified. At the same time, processing the data to be classified in this way trains the convolutional neural network model under a more demanding convergence condition, so that the model's classification is more accurate.
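The implication stated above, that any value which still exceeds the boundary after being shrunk by a coefficient 0 < k < 1 must also have exceeded it before, can be checked mechanically for positive boundary values (the sample values below are invented for the check):

```python
def relaxed_test(x, boundary, k):
    """The stricter, coefficient-relaxed comparison: k*x against the boundary."""
    return k * x > boundary

# For every case where the relaxed test passes, the plain test passes too,
# so relaxation can only shrink the accepted region, i.e. a larger margin.
cases = [(x, b, k) for x in (0.2, 0.6, 1.0, 2.5)
                   for b in (0.5, 1.0)
                   for k in (0.3, 0.7, 0.9)]
violations = [c for c in cases if relaxed_test(*c) and not c[0] > c[1]]
```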
S1300, obtaining classification data output by the convolutional neural network model, and understanding the contents of the face image according to the classification data.
After the data to be classified is screened by the loss function, the coefficient-relaxed data to be classified is classified in the classification layer of the convolutional neural network model.
The classification layer classifies the data to be classified according to a preset classification standard and outputs the classification data. The classification data output by the classification layer is one or more numerical values, and content understanding of the face image is realized by comparing the classification data with a classification threshold. For example, when the content understanding of the face image is face similarity matching, a similarity threshold is preset and the numerical value of the classification data is compared with it; when the comparison result is greater than the threshold, the face image and the reference image belong to the same person, and otherwise they do not.
Content understanding includes, but is not limited to, gender identification, age judgment, facial attractiveness scoring, or face similarity comparison. The classification data represents the main recognizable features in the face image; by comparing these features with preset classification standards, the gender, age, and attractiveness of the face image can be judged. The similarity between two face images can also be calculated by comparing the cosine-space (cos) distances of the two images' classification data.
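The cosine-space comparison described above might look like the following sketch; the feature vectors and the 0.8 threshold are invented for the example:

```python
import numpy as np

def faces_match(feat_a, feat_b, threshold=0.8):
    """Compare two face feature vectors by cosine similarity; declare the two
    images the same person when the similarity clears the threshold."""
    cos = feat_a @ feat_b / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b))
    return cos >= threshold

a = np.array([1.0, 0.2, 0.0])
b = np.array([0.9, 0.25, 0.05])   # near-duplicate direction of a: same person
c = np.array([-0.5, 1.0, 0.3])    # unrelated direction: different person
```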
In this embodiment, before the face image is classified, coefficient relaxation processing is performed on the to-be-classified data features of the face image extracted by the convolutional neural network model, that is, the data to be classified is scaled down in the same proportion, which enlarges the classification interface of the classification data. With coefficient relaxation processing, the convolutional neural network model can be trained under stricter conditions, so that the classification boundary is significantly enlarged and the content-understanding accuracy of the convolutional neural network model is greatly improved.
Specifically, in this embodiment, the loss function adopted by the convolutional neural network model is the Softmax cross-entropy loss function. After coefficient relaxation is applied to the loss function, its forward propagation formula is as follows:
the forward propagation of the convolutional neural network model is characterized by:
L = -log(p_i)

defining the function:

p_i = exp(k·f(x)·w_i) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i in the summation, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
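A sketch of one plausible reading of this relaxed forward pass, in NumPy, under the assumption that the relaxation scales only the true-class logit f(x)·w_i; the function name, class count, and feature dimension are illustrative:

```python
import numpy as np

def relaxed_softmax_loss(f, W, i, k=0.5):
    """Cross-entropy where the true-class logit f·w_i is scaled by k (0 < k < 1),
    so training must satisfy the stricter condition k·f·w_i > f·w_t."""
    logits = W @ f            # f(x)·w_t for every class t
    logits[i] *= k            # relax (shrink) only the true-class logit
    logits -= logits.max()    # numerical stability; softmax is shift-invariant
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[i])

# Toy example: 3 classes, 4-dimensional feature (values are illustrative).
rng = np.random.default_rng(0)
f = rng.normal(size=4)
W = rng.normal(size=(3, 4))

loss_relaxed = relaxed_softmax_loss(f, W, i=1, k=0.5)
loss_plain = relaxed_softmax_loss(f, W, i=1, k=1.0)  # k = 1 recovers plain Softmax
```

Setting k = 1 reduces the expression to the ordinary Softmax cross-entropy, which makes the relaxation easy to toggle during experiments.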
During training of the convolutional neural network model, a comparison between the output classification data and the expected classification judgment information triggers fine-tuning of the model's weights; that is, the convolutional neural network model is propagated backward, and its weights are fine-tuned by computing partial derivatives, with the specific formulas as follows:
the back propagation of the convolutional neural network model is characterized as:
defining the functions:

∂L/∂w_i = k·(p_i - 1)·f(x)

∂L/∂w_j = p_j·f(x) (j ≠ i), with p_j = exp(f(x)·w_j) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
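The back-propagation step can be sanity-checked numerically. Assuming the relaxed loss scales the true-class logit by k, the analytic partial derivatives below are verified against a central finite difference; all names and sizes are illustrative:

```python
import numpy as np

def relaxed_loss(W, f, i, k):
    """Assumed relaxed Softmax cross-entropy: true-class logit scaled by k."""
    z = W @ f
    z[i] *= k
    z -= z.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[i]), p

def grad_W(W, f, i, k):
    """Analytic partial derivatives dL/dW used in the weight fine-tuning step."""
    _, p = relaxed_loss(W, f, i, k)
    g = np.outer(p, f)             # dL/dw_t = p_t * f(x) for t != i
    g[i] = k * (p[i] - 1.0) * f    # dL/dw_i = k * (p_i - 1) * f(x)
    return g

rng = np.random.default_rng(1)
f = rng.normal(size=4)
W = rng.normal(size=(3, 4))
analytic = grad_W(W, f, i=0, k=0.5)

# Central finite-difference check of a single weight entry.
eps = 1e-6
Wp, Wm = W.copy(), W.copy()
Wp[0, 0] += eps
Wm[0, 0] -= eps
numeric = (relaxed_loss(Wp, f, 0, 0.5)[0] - relaxed_loss(Wm, f, 0, 0.5)[0]) / (2 * eps)
```

If the analytic and numeric values agree to several decimal places, the gradient used for fine-tuning matches the assumed forward formula.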
In some preferred embodiments, k has a value of 0.5. The value of k is determined from experimental data; implementation results show that when k = 0.5, the classification accuracy of the convolutional neural network model is highest and the time required to train the model is shortest.
Coefficient relaxation is illustrated with an example. Let f(x)*w_i denote the score of the data to be classified for its true class i, and f(x)*w_j the score for a competing class j; the data to be classified passes the screening of the loss function if and only if f(x)*w_i > f(x)*w_j.

Suppose the relaxed criterion holds:

k*f(x)*w_i > f(x)*w_j (0 < k < 1.0)

Since 0 < k < 1.0, a positive true-class score satisfies f(x)*w_i > k*f(x)*w_i, and therefore:

f(x)*w_i > f(x)*w_j

Here f(x) represents the features extracted from the image by the deep learning network, which are fed into the Softmax loss-function classifier; because the image belongs to the i-th class, f(x)*w_i > f(x)*w_j should hold. If the loss function can satisfy k*f(x)*w_i > f(x)*w_j (0 < k < 1.0), then f(x)*w_i > f(x)*w_j must hold, i.e., the classification task is completed. This method performs coefficient relaxation on Softmax.
Because the coefficient satisfies 0 < k < 1.0, the result of Softmax is relaxed and the classification interfaces between different classes are enlarged; this improves the classification accuracy of the model on samples that are difficult to classify, i.e., improves the robustness of the model.
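The implication described above — a sample that clears the stricter k-scaled criterion necessarily clears the ordinary one when the true-class score is positive — can be checked with assumed scores:

```python
k = 0.5           # coefficient relaxation parameter, 0 < k < 1.0
score_i = 3.2     # f(x)·w_i: assumed positive score for the true class i
score_j = 1.4     # f(x)·w_j: assumed score for a competing class j

passes_relaxed = k * score_i > score_j   # stricter criterion enforced in training
passes_plain = score_i > score_j         # ordinary classification criterion

# With 0 < k < 1 and score_i > 0 we have score_i > k*score_i, so
# k*score_i > score_j implies score_i > score_j: clearing the relaxed
# test guarantees clearing the plain one, i.e. a wider margin.
```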
Specifically, the training method of the convolutional neural network model in this embodiment is as follows:
referring to fig. 2, fig. 2 is a schematic diagram of a basic flow of a convolutional neural network model training method according to the present embodiment.
As shown in fig. 2, the convolutional neural network model is formed by training through the following steps:
s2100, acquiring training sample data marked with classification judgment information;
the training sample data is the unit of the whole training set, and the training set is composed of a plurality of training sample training data.
The training sample data is composed of face data and classification judgment information for marking the face data.
The classification judgment information is the manual judgment of the training sample data, made according to the training objective of the convolutional neural network model using generally accepted judgment standards and factual states; in other words, it is the target that people expect the convolutional neural network model to output. For example, if the face image data in a training sample is identified as showing the same person as a pre-stored target face image, the classification judgment information of that face image is calibrated as identical to the pre-stored target face image.
S2200, inputting the training sample data into a convolutional neural network model to obtain model classification information of the training sample data;
Input the training sample set into the convolutional neural network model in sequence, and obtain the model classification information output by the last fully connected layer of the convolutional neural network model.
The model classification information is the excitation data output by the convolutional neural network model for the input face image. Before the convolutional neural network model has been trained to convergence, the classification reference information is a highly discrete numerical value; after the model has been trained to convergence, the classification reference information is relatively stable data.
S2300, comparing model classification information of different samples in the training sample data with the classification judgment information by a loss stopping function to judge whether the model classification information is consistent with the classification judgment information;
the stop-loss function is a detection function for detecting whether the model classification information in the convolutional neural network model is consistent with expected classification judgment information or not. When the output result of the convolutional neural network model is inconsistent with the expected result of the classification judgment information, the weights in the convolutional neural network model need to be corrected so that the output result of the convolutional neural network model is the same as the expected result of the classification judgment information.
And S2400, when the model classification information is inconsistent with the classification judgment information, repeatedly and circularly updating the weight in the convolutional neural network model until the comparison result is consistent with the classification judgment information, and ending.
When the output result of the convolutional neural network model is inconsistent with the expected result of the classification judgment information, the weights in the convolutional neural network model need to be corrected so that the output result of the convolutional neural network model is the same as the expected result of the classification judgment information.
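Steps S2100–S2400 can be sketched as a training loop; the two-class toy data, relaxation parameter, and learning rate below are hypothetical, with the gradient of the relaxed Softmax cross-entropy standing in for the stop-loss function:

```python
import numpy as np

K, LR, EPOCHS = 0.5, 0.5, 100

# S2100: training sample data labelled with classification judgment information
# (hypothetical, linearly separable two-class toy data).
features = np.array([[2.0, 0.0], [1.5, 0.5], [-2.0, 0.0], [-1.5, -0.5]])
labels = [0, 0, 1, 1]

W = np.zeros((2, 2))  # weights of the classification layer

def predict(W, f):
    # S2200: model classification information for one sample.
    return int(np.argmax(W @ f))

for _ in range(EPOCHS):
    mismatched = False
    for f, y in zip(features, labels):
        z = W @ f
        z[y] *= K                              # coefficient relaxation on the true class
        p = np.exp(z - z.max())
        p /= p.sum()
        # S2300: compare model output with the classification judgment information.
        if predict(W, f) != y:
            mismatched = True
        # S2400: back-propagate and fine-tune the weights (gradient of -log p[y]).
        g = np.outer(p, f)
        g[y] = K * (p[y] - 1.0) * f
        W -= LR * g
    if not mismatched:                         # stop once all comparisons agree
        break
```

On this separable toy set, the loop drives the per-sample comparisons to agreement and then terminates, mirroring the repeat-until-consistent structure of S2400.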
Specifically, in the training process, a stop-loss function is required to perform coefficient relaxation processing on data to be classified.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for performing coefficient relaxation processing on parameters during convolutional neural network model training according to the present embodiment.
As shown in fig. 3, step S2300 includes the steps of:
s2310, carrying out coefficient parameterization on data to be classified output by the convolutional neural network model full-connection layer, and synchronously reducing the data to be classified;
The training sample data is input into the convolutional neural network model in sequence, the data to be classified output by the fully connected layer of the convolutional neural network model is obtained, and the data to be classified is scaled synchronously, i.e., multiplied by a coefficient greater than zero and less than 1.
S2320, comparing the data to be classified subjected to coefficient parameterization with a boundary value in a preset first classification value interval, and determining an interval position of the data to be classified subjected to coefficient parameterization in the first classification value interval;
Compare the data to be classified after coefficient relaxation processing with the preset first classification value interval.

The first classification value interval is set according to the expected classification result: when the classification task is similarity comparison, the first classification value interval is a single classification threshold; when the task is race classification, it consists of 3 different value intervals; when the task is gender classification, it consists of 2 different value intervals; and when the task is color value scoring, it consists of a number of contiguous value intervals.

Compare the data to be classified after coefficient relaxation processing with the preset first classification value interval to obtain the specific position of the processed data within the first classification value interval.
S2330, determining model classification information of different samples in the training sample data according to the classification result corresponding to the interval position;
Acquire the corresponding classification result according to the specific position of the data to be classified after coefficient relaxation processing within the first classification value interval. For example, when the classification task is similarity comparison and the data to be classified after coefficient relaxation processing is greater than the classification threshold, the classification result is that the training sample data is similar to the reference image; otherwise, the two are not similar.
The model classification information is the classification result of the model.
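Steps S2310–S2330 might look like the following sketch, with hypothetical interval boundaries for a color value scoring task; the bands and labels are invented for illustration:

```python
import bisect

K = 0.5  # coefficient relaxation parameter

# Hypothetical first classification value interval for color value scoring:
# three boundary values splitting [0, 1] into four contiguous bands.
BOUNDARIES = [0.25, 0.5, 0.75]
LABELS = ["low", "medium", "high", "very high"]

def interval_position(raw_value, k=K):
    """S2310: relax the value; S2320: locate it among the boundary values."""
    relaxed = k * raw_value
    return bisect.bisect_right(BOUNDARIES, relaxed)

def model_classification(raw_value):
    """S2330: map the interval position to the model classification information."""
    return LABELS[interval_position(raw_value)]
```

For instance, a raw value of 0.6 relaxes to 0.3, which falls in the second band.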
S2340, determining whether the classification reference information matches the classification determination information.
Compare the obtained model classification information with the manually expected classification judgment information to check whether they are consistent. For example, if the model classification information judges that the training sample data is similar to the reference image while the manually preset expectation is that they are not similar, the classification result is determined not to meet the preset expectation.
In some embodiments, the convolutional neural network model of the embodiments of the present invention is used for face similarity comparison.
Referring to fig. 4, fig. 4 is a diagram illustrating an embodiment of a convolutional neural network model according to the present embodiment.
As shown in fig. 4, step S1100 is followed by the following steps:
s1110, inputting the face image into the convolutional neural network model, wherein the convolutional neural network model extracts image features of the face image to form the data to be classified;
In this step, the face image is compared for similarity with the reference image to confirm whether the face image to be classified is homologous with the reference image, i.e., whether the two photos show the same person.
And inputting the facial image to be classified into the convolutional neural network model to obtain the data to be classified of the facial image.
S1120, performing coefficient relaxation processing on the data to be classified, and classifying the data to be classified when the data to be classified subjected to coefficient relaxation processing is larger than a preset classification threshold.
Perform coefficient relaxation processing on the data to be classified and compare the processed data with a preset classification threshold. The classification threshold is a specific value obtained from experimental data and used for the comparison that decides whether the face image to be classified matches the reference image. For example, suppose the value range of the data to be classified after coefficient relaxation processing is 0 to 1 and the classification threshold is 0.5: when the relaxed value is greater than 0.5, the face image under test and the reference image are homologous; when the relaxed value is less than 0.5, the face image to be classified differs from, or has a different source than, the reference image.
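A minimal sketch of this threshold decision; the 0.5 values for k and the threshold follow the embodiment, while the raw scores and value range are invented for illustration:

```python
K = 0.5           # coefficient relaxation parameter (per the embodiment)
THRESHOLD = 0.5   # preset classification threshold (per the embodiment)

def classify(raw_score, k=K, threshold=THRESHOLD):
    """Apply coefficient relaxation (scale by k), then compare with the threshold.
    Only samples whose relaxed score still clears the threshold are accepted,
    which keeps accepted samples well away from the decision boundary."""
    relaxed = k * raw_score
    return "same source" if relaxed > threshold else "different source"

# A borderline raw score of 0.6 no longer passes once relaxed (0.3 < 0.5),
# while a confident raw score of 1.2 still does (0.6 > 0.5).
```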
Performing coefficient relaxation processing on the data to be classified enlarges the classification interface of the data, reduces borderline errors that occur when data lies near the classification threshold, and improves classification accuracy.
In order to solve the above technical problem, an embodiment of the present invention further provides a face image processing system. Referring to fig. 5, fig. 5 is a block diagram of a basic structure of the face image processing system according to the present embodiment.
As shown in fig. 5, the face image processing system includes: the device comprises an acquisition module, a processing module and a classification module. The acquisition module is used for acquiring a face image to be classified; the processing module is used for inputting the face image into a convolutional neural network model constructed with a loss function, and the loss function performs coefficient relaxation processing on the data to be classified output by the convolutional neural network model so as to increase the classification interface of the data to be classified; the classification module is used for acquiring classification data output by the convolutional neural network model and understanding the content of the face image according to the classification data.
In some embodiments, the face image processing system further comprises: and the first processing submodule is used for carrying out the same-scale reduction processing on the data to be classified output by the full connection layer of the convolutional neural network model so as to increase the classification interface of the data to be classified.
In some embodiments, the forward propagation of the convolutional neural network model is characterized by:
L = -log(p_i)

defining the function:

p_i = exp(k·f(x)·w_i) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i in the summation, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
In some embodiments, the back propagation of the convolutional neural network model is characterized as:
defining the functions:

∂L/∂w_i = k·(p_i - 1)·f(x)

∂L/∂w_j = p_j·f(x) (j ≠ i), with p_j = exp(f(x)·w_j) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
In some embodiments, the face image processing system further comprises: the device comprises a first obtaining submodule, a first classification submodule, a first comparison submodule and a second processing submodule. The first acquisition submodule is used for acquiring training sample data marked with classification judgment information; the first classification submodule is used for inputting training sample data into the convolutional neural network model to obtain model classification information of the training sample data; the first comparison sub-module is used for comparing model classification information of different samples in the training sample data with classification judgment information through a loss stopping function to judge whether the model classification information is consistent with the classification judgment information; and the second processing submodule is used for repeatedly and circularly iterating and updating the weight in the convolutional neural network model when the model classification information is inconsistent with the classification judgment information, and ending when the comparison result is consistent with the classification judgment information.
In some embodiments, the face image processing system further comprises: the device comprises a first calculating submodule, a second comparing submodule, a second classifying submodule and a first judging submodule. The first calculation submodule is used for carrying out coefficient parameterization on data to be classified output by the convolution neural network model full-connection layer, so that the data to be classified is synchronously reduced; the second comparison submodule is used for comparing the data to be classified subjected to coefficient parameterization with a boundary value in a preset first classification value interval and determining the interval position of the data to be classified subjected to coefficient parameterization in the first classification value interval; the second classification submodule is used for determining model classification information of different samples in the training sample data according to the classification result corresponding to the interval position; the first judgment submodule is used for judging whether the classification reference information is consistent with the classification judgment information.
In some embodiments, the face image processing system further comprises: a third classification submodule and a third processing submodule. The third classification submodule is used for inputting the face image into the convolutional neural network model, and the convolutional neural network model extracts the image characteristics of the face image to form data to be classified; the third processing submodule is used for carrying out coefficient relaxation processing on the data to be classified, and when the data to be classified subjected to coefficient relaxation processing is larger than a preset classification threshold value, the data to be classified is classified.
In some embodiments, the content understanding of the face image comprises: and performing gender identification, age judgment, color value scoring or human face similarity comparison on the human face image.
The embodiment also provides a server. Referring to fig. 6, fig. 6 is a schematic diagram of a basic structure of a server according to the present embodiment.
As shown in fig. 6, the server includes: one or more processors 3110 and memory 3120; one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to:
acquiring a face image to be classified;
inputting the face image into a convolutional neural network model constructed with a loss function, wherein the loss function performs coefficient relaxation treatment on the data to be classified output by the convolutional neural network model so as to increase a classification interface of the data to be classified;
and obtaining classification data output by the convolutional neural network model, and performing content understanding on the face image according to the classification data.
Before classifying the face image, the server performs coefficient relaxation processing on the to-be-classified data features of the face image extracted by the convolutional neural network model; that is, the data to be classified is scaled by a common factor, which enlarges the classification interface of the classification data. Coefficient relaxation forces the convolutional neural network model to train under a stricter condition, noticeably enlarging the classification boundary and substantially improving the content-understanding accuracy of the convolutional neural network model.
It should be noted that in this embodiment, all the programs for implementing the face image processing method in this embodiment are stored in the memory of the server, and the processor can call the programs in the memory to execute all the functions listed in the face image processing method. The functions realized by the server are described in detail in the face image processing method in this embodiment, and are not described herein again.
It should be noted that the description and drawings show preferred embodiments of the present invention, but the invention may be embodied in many different forms and is not limited to the embodiments described herein; these embodiments are not intended as additional limitations on the present disclosure, and are provided so that the understanding of the disclosure will be more thorough. Moreover, the above technical features may be combined with one another to form various embodiments not listed above, all of which are regarded as within the scope of the invention described in the specification; further, modifications and variations will occur to those skilled in the art in light of the foregoing description, and all such modifications and variations are intended to be covered as falling within the true spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A face image processing method is characterized by comprising the following steps:
acquiring a face image to be classified;
inputting the face image into a convolutional neural network model constructed with a loss function, wherein the loss function performs coefficient relaxation treatment on the data to be classified output by the convolutional neural network model so as to increase a classification interface of the data to be classified;
and acquiring classification data output by the convolutional neural network model, and performing content understanding on the face image according to the classification data.
2. The method according to claim 1, wherein the coefficient relaxation process specifically includes the steps of:
and carrying out the same-scale reduction processing on the data to be classified output by the full connection layer of the convolutional neural network model so as to increase the classification interface of the data to be classified.
3. The method of processing a human face image according to claim 1, wherein the forward propagation of the convolutional neural network model is characterized by:
L = -log(p_i)

defining the function:

p_i = exp(k·f(x)·w_i) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i in the summation, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
4. The method of processing a human face image according to claim 1, wherein the feature of back propagation of the convolutional neural network model is described as:
defining the functions:

∂L/∂w_i = k·(p_i - 1)·f(x)

∂L/∂w_j = p_j·f(x) (j ≠ i), with p_j = exp(f(x)·w_j) / ( exp(k·f(x)·w_i) + Σ_{t=1, t≠i}^{N} exp(f(x)·w_t) )

wherein i represents the class of the input image, j represents a classification class different from i, t indexes the classification classes different from i, k represents the coefficient relaxation parameter, f(x) represents the face feature extracted by the convolutional neural network model, w_i represents the weight of the i-th class, w_j represents the weight of the j-th class, w_t represents the weight of the t-th class, and N represents the number of classes to be classified.
5. The method for processing the human face image according to claim 1, wherein the convolutional neural network model is formed by training through the following steps:
acquiring training sample data marked with classification judgment information;
inputting the training sample data into a convolutional neural network model to obtain model classification information of the training sample data;
comparing model classification information of different samples in the training sample data with the classification judgment information through a loss stopping function to judge whether the model classification information is consistent with the classification judgment information;
and when the model classification information is inconsistent with the classification judgment information, repeatedly and circularly updating the weight in the convolutional neural network model until the comparison result is consistent with the classification judgment information, and ending.
6. The method according to claim 5, wherein the step of comparing, by means of a stop-loss function, whether the model classification information of different samples in the training sample data is consistent with the classification judgment information includes the following steps:
carrying out coefficient parameterization on the data to be classified output by the full connection layer of the convolutional neural network model, and synchronously reducing the data to be classified;
comparing the data to be classified subjected to coefficient parameterization with a boundary value in a preset first classification value interval, and determining the interval position of the data to be classified subjected to coefficient parameterization in the first classification value interval;
determining model classification information of different samples in the training sample data according to the classification result corresponding to the interval position;
and judging whether the classification reference information is consistent with the classification judgment information or not.
7. The face image processing method according to claim 1, wherein after the step of acquiring the face image to be classified, the method further comprises the following steps:
inputting the face image into the convolutional neural network model, and extracting the image features of the face image by the convolutional neural network model to form the data to be classified;
and performing coefficient relaxation processing on the data to be classified, and classifying the data to be classified when the data to be classified subjected to coefficient relaxation processing is larger than a preset classification threshold value.
8. The method for processing the face image according to any one of claims 1 to 7, wherein the content understanding of the face image comprises: and performing gender identification, age judgment, color value scoring or human face similarity comparison on the human face image.
9. A face image processing system, characterized in that the face image processing system comprises:
the acquisition module is used for acquiring a face image to be classified;
the processing module is used for inputting the face image into a convolutional neural network model constructed with a loss function, and the loss function performs coefficient relaxation processing on to-be-classified data output by the convolutional neural network model so as to increase a classification interface of the to-be-classified data;
and the classification module is used for acquiring the classification data output by the convolutional neural network model and understanding the content of the face image according to the classification data.
10. A server, comprising:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of facial image processing of any of claims 1-8.
CN201711131120.5A 2017-11-15 2017-11-15 Face image processing process, system and server Active CN107944363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711131120.5A CN107944363B (en) 2017-11-15 2017-11-15 Face image processing process, system and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711131120.5A CN107944363B (en) 2017-11-15 2017-11-15 Face image processing process, system and server

Publications (2)

Publication Number Publication Date
CN107944363A true CN107944363A (en) 2018-04-20
CN107944363B CN107944363B (en) 2019-04-26

Family

ID=61931294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711131120.5A Active CN107944363B (en) 2017-11-15 2017-11-15 Face image processing process, system and server

Country Status (1)

Country Link
CN (1) CN107944363B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299690A (en) * 2018-09-21 2019-02-01 浙江中正智能科技有限公司 A method of video real-time face accuracy of identification can be improved
CN110009059A (en) * 2019-04-16 2019-07-12 北京字节跳动网络技术有限公司 Method and apparatus for generating model
CN110490242A (en) * 2019-08-12 2019-11-22 腾讯医疗健康(深圳)有限公司 Training method, eye fundus image classification method and the relevant device of image classification network
CN116702014A (en) * 2023-08-03 2023-09-05 中电科新型智慧城市研究院有限公司 Population identification method, device, terminal equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104346622A (en) * 2013-07-31 2015-02-11 富士通株式会社 Convolutional neural network classifier, and classifying method and training method thereof
CN104408470A (en) * 2014-12-01 2015-03-11 中科创达软件股份有限公司 Gender detection method based on average face preliminary learning
CN106022317A (en) * 2016-06-27 2016-10-12 北京小米移动软件有限公司 Face identification method and apparatus
US20160307072A1 (en) * 2015-04-17 2016-10-20 Nec Laboratories America, Inc. Fine-grained Image Classification by Exploring Bipartite-Graph Labels
CN106096538A (en) * 2016-06-08 2016-11-09 中国科学院自动化研究所 Face identification method based on sequencing neural network model and device
CN106384080A (en) * 2016-08-31 2017-02-08 广州精点计算机科技有限公司 Apparent age estimating method and device based on convolutional neural network
CN106649886A (en) * 2017-01-13 2017-05-10 深圳市唯特视科技有限公司 Method for searching for images by utilizing depth monitoring hash of triple label
US20170169315A1 (en) * 2015-12-15 2017-06-15 Sighthound, Inc. Deeply learned convolutional neural networks (cnns) for object localization and classification
CN107301640A (en) * 2017-06-19 2017-10-27 太原理工大学 Method for small pulmonary nodule detection based on convolutional neural network object detection

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299690A (en) * 2018-09-21 2019-02-01 浙江中正智能科技有限公司 Method for improving real-time face recognition accuracy in video
CN110009059A (en) * 2019-04-16 2019-07-12 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN110009059B (en) * 2019-04-16 2022-03-29 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN110490242A (en) * 2019-08-12 2019-11-22 腾讯医疗健康(深圳)有限公司 Training method of image classification network, fundus image classification method and related equipment
CN110490242B (en) * 2019-08-12 2024-03-29 腾讯医疗健康(深圳)有限公司 Training method of image classification network, fundus image classification method and related equipment
CN116702014A (en) * 2023-08-03 2023-09-05 中电科新型智慧城市研究院有限公司 Population identification method, device, terminal equipment and storage medium

Also Published As

Publication number Publication date
CN107944363B (en) 2019-04-26

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN108108807B (en) Learning type image processing method, system and server
Hoang Ngan Le et al. Robust hand detection and classification in vehicles and in the wild
CN107818314A (en) Face image processing method, device and server
CN109002766B (en) Expression recognition method and device
CN107679513B (en) Image processing method and device and server
CN107633204A (en) Face occlusion detection method, apparatus and storage medium
WO2019033525A1 (en) Au feature recognition method, device and storage medium
Barnouti Improve face recognition rate using different image pre-processing techniques
CN107886062A (en) Image processing method, system and server
CN107944363A (en) Face image processing method, system and server
CN105956570B (en) Smiling face recognition method based on lip features and deep learning
CN113761259A (en) Image processing method and device and computer equipment
CN111325237B (en) Image recognition method based on attention interaction mechanism
Divya et al. Facial expression recognition by calculating euclidian distance for eigen faces using PCA
CN110135505A (en) Image classification method, device, computer equipment and computer readable storage medium
Li et al. Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes
CN111401343B (en) Method for identifying attributes of people in image and training method and device for identification model
CN110717407A (en) Face recognition method, device and storage medium based on lip-reading password
Rasel et al. An efficient framework for hand gesture recognition based on histogram of oriented gradients and support vector machine
Omaia et al. 2D-DCT distance based face recognition using a reduced number of coefficients
CN113033587A (en) Image recognition result evaluation method and device, electronic equipment and storage medium
KR101334858B1 (en) Automatic butterfly species identification system and method, and portable terminal having automatic butterfly species identification function using the same
Kota et al. Principal component analysis for gesture recognition using systemc
Yun et al. Disguised‐Face Discriminator for Embedded Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant