WO2021083241A1 - Face image quality evaluation method, feature extraction model training method, image processing system, computer-readable medium, and wireless communication terminal - Google Patents

Face image quality evaluation method, feature extraction model training method, image processing system, computer-readable medium, and wireless communication terminal

Info

Publication number
WO2021083241A1
WO2021083241A1 (PCT/CN2020/124546)
Authority
WO
WIPO (PCT)
Prior art keywords
layer
face
convolution
image
processing
Prior art date
Application number
PCT/CN2020/124546
Other languages
English (en)
French (fr)
Inventor
颜波
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2021083241A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation

Definitions

  • The embodiments of the present application relate to the field of map construction and image recognition and, more specifically, to a face image quality evaluation method, a feature extraction model training method, a face image quality evaluation apparatus, a feature extraction model training apparatus, an image processing system, a computer-readable medium, and a wireless communication terminal.
  • The embodiments of the present application provide a face image quality evaluation method, a feature extraction model training method, a face image quality evaluation apparatus, a feature extraction model training apparatus, an image processing system, a computer-readable medium, and a wireless communication terminal, which facilitate fast evaluation of face quality.
  • A face image quality evaluation method includes: obtaining a to-be-processed image containing a face; detecting the to-be-processed image to obtain the corresponding face image; inputting the face image into a trained feature extraction model based on a mobile face recognition network and performing feature extraction on the face image to obtain feature data; and inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.
  • A feature extraction model training method includes: obtaining a sample image containing a human face in response to an image processing instruction of an image service system; inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result; inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer; performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result; performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image; and inputting the face feature data into a loss function model to calculate a loss parameter, and optimizing based on the loss parameter to iteratively train the feature extraction model.
  • A face image quality evaluation apparatus includes: a to-be-processed image acquisition module for obtaining a to-be-processed image containing a face; a face image extraction module for detecting the to-be-processed image to obtain the corresponding face image; a face feature data extraction module for inputting the face image into a trained feature extraction model based on a mobile face recognition network and performing feature extraction on the face image to obtain feature data; and a face quality scoring module for inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.
  • A feature extraction model training apparatus includes: a sample data acquisition module for obtaining a sample image containing a human face in response to an image processing instruction of an image service system; a first convolution result generation module for inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result; a second convolution result generation module for inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer; a third convolution result generation module for performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result; a face feature data generation module for performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image; and an iterative training module for inputting the face feature data into a loss function model to calculate a loss parameter and optimizing based on the loss parameter to iteratively train the feature extraction model.
  • An image processing system includes: a service module for obtaining images to be processed; and an image processing module for executing, in response to a service processing instruction issued by the service module, the face image quality evaluation method of any of the above embodiments to obtain a scoring result for the image to be processed.
  • A wireless communication terminal includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method of the first or second aspect.
  • A computer-readable medium stores computer software instructions used to execute the method of the first or second aspect, including the programs designed to execute the above aspects.
  • The names of the wireless communication terminal, the positioning system, and so on do not limit the devices themselves; in actual implementation these devices may appear under other names. As long as a device's functions are similar to those of this application, it falls within the scope of the claims of this application and their equivalents.
  • Fig. 1 shows a schematic diagram of a face image quality evaluation method according to an embodiment of the present application.
  • Fig. 2 shows a schematic structural diagram of a feature extraction model based on a mobile face recognition network according to an embodiment of the present application.
  • FIG. 3 shows a schematic diagram of the structure of a bottleneck structure layer with stride 1 in an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of the structure of a bottleneck structure layer with stride 2 in an embodiment of the present application.
  • FIG. 5 shows a schematic diagram of the overall architecture of a face image quality evaluation model according to an embodiment of the present application.
  • Fig. 6 shows a schematic diagram of a training method of a feature extraction model according to an embodiment of the present application.
  • Fig. 7 shows a schematic diagram of the composition of a face image quality evaluation apparatus according to an embodiment of the present application.
  • FIG. 8 shows a schematic diagram of the composition of a training device for a feature extraction model according to an embodiment of the present application.
  • Fig. 9 shows a schematic diagram of the composition of an image processing system according to an embodiment of the present application.
  • Fig. 10 shows a schematic block diagram of a computer system of a wireless communication terminal according to an embodiment of the present application.
  • This exemplary embodiment provides a face image quality evaluation method whose model is smaller in scale and can be applied to smart terminal devices such as mobile phones and tablet computers.
  • Fig. 1 shows a schematic diagram of a method for evaluating the quality of a face image according to an embodiment of the present application. As shown in Figure 1, the method includes some or all of the following:
  • S11 Obtain a to-be-processed image containing a face;
  • S12 Detect the to-be-processed image to obtain the corresponding face image;
  • S13 Input the face image into a trained feature extraction model based on a mobile face recognition network, and perform feature extraction on the face image to obtain feature data;
  • S14 Input the feature data into the first fully connected layer and the second fully connected layer that are arranged in sequence for processing, so as to obtain the face quality score of the face image.
  • the above-mentioned smart terminal device may be a smart terminal such as a mobile phone or a tablet computer equipped with a camera module. Users can use the camera module of the terminal device to take pictures and obtain images to be processed containing human faces. Alternatively, the user can also take pictures through an external camera component to obtain a to-be-processed image containing a human face. Alternatively, the image to be processed from other devices can also be received via a wired or wireless network.
  • Since the obtained image may contain background, noise, and the like, the image to be processed may be preprocessed to obtain the corresponding face image. Specifically, the above S12 can be implemented through the following steps:
  • S121 Perform face detection on the to-be-processed image to obtain a face area;
  • S122 Perform face key point detection on the face area to obtain the key points of the face area;
  • S123 Perform alignment processing on the face area based on the key points of the face area to obtain an aligned face image.
  • For example, a trained face detection model can be used to perform face detection on the image to be processed and determine the face area, and a trained face key point detection model can be used to perform key point detection on the face area and extract the face key point information. A preset similarity transformation matrix is then used to transform the face area to the standard face.
  • The similarity transformation matrix may take the following form:

$$\begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$$

  • The upper-left 2×2 block is the rotation part; t_x and t_y are translation factors. The transform has 4 degrees of freedom, namely rotation, x-direction translation, y-direction translation, and scaling factor s. For the face area, length ratios, included angles, and circle centers all remain unchanged before and after the similarity transformation.
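  • As a non-authoritative illustration, the sketch below shows how such a 4-degree-of-freedom similarity transform might be built and applied with NumPy and OpenCV; the 112×112 "standard face" landmark template and the function names are assumptions, not values taken from the patent.

    import numpy as np
    import cv2

    def similarity_matrix(theta, s, tx, ty):
        """4-DOF similarity transform: rotation theta, uniform scale s,
        translations (tx, ty); the upper-left 2x2 block is rotation*scale."""
        c, d = s * np.cos(theta), s * np.sin(theta)
        return np.array([[c, -d, tx],
                         [d,  c, ty]], dtype=np.float32)

    # Illustrative 5-point "standard face" template for a 112x112 crop
    # (eye centers, nose tip, mouth corners); the values are assumptions.
    STANDARD_5PTS = np.array([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                              [41.5, 92.4], [70.7, 92.2]], dtype=np.float32)

    def align_face(image, landmarks):
        """Estimate the similarity transform mapping detected key points
        onto the standard face, then warp the face region with it."""
        M, _ = cv2.estimateAffinePartial2D(   # similarity (4 DOF) estimate
            np.asarray(landmarks, dtype=np.float32), STANDARD_5PTS)
        return cv2.warpAffine(image, M, (112, 112))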
  • In other exemplary embodiments of the present disclosure, a single model can also be used for both face detection and key point detection; for example, the Hyper Face model performs face detection, key point localization, and head pose estimation.
  • A mobile face recognition network (Mobile Face Nets) model may be trained in advance. Specifically, the training can include the following steps:
  • S21 Obtain raw data and preprocess the raw data to obtain sample data.
  • S22 Input the sample data into a convolutional layer and a depthwise convolutional layer that are arranged in sequence for successive convolution processing to obtain a first convolution result.
  • S23 Input the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result; where n > 5 and n is a positive integer.
  • S24 Perform convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer that are arranged in sequence to obtain a third convolution result.
  • S25 Perform fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample data.
  • Face image data of multiple people in different scenes can be collected as raw data, e.g., images of the face in different states such as different poses, occlusions, and expressions, as well as images under different imaging parameters such as contrast, resolution, brightness, lighting, and background.
  • After the raw data is obtained, the trained face detection and face key point detection models perform face detection and key point detection, and the faces are then transformed to the standard face by the similarity transformation; that is, the raw data is preprocessed as in the above embodiment to obtain the sample data.
  • The sample data is then input into an improved mobile face recognition network. Compared with the prior art, the improved network of this embodiment includes a different number of bottleneck layers, a different bottleneck structure, and an improved last layer. Referring to Fig. 2, the improved network may include, arranged in sequence: a first convolutional layer (3×3 kernel, stride 2), a depthwise convolutional layer (3×3 kernel, stride 1), six consecutive bottleneck structure layers, a second convolutional layer, a linear global depthwise convolutional layer, and a fully connected layer.
  • Among the six consecutive bottleneck structure layers, the stride and the number of repetitions of each layer are configured by level: the preset stride of odd-numbered bottleneck layers is P and that of even-numbered bottleneck layers is Q, where P > Q and both P and Q are positive integers. For example, with P = 2 and Q = 1, the first bottleneck layer uses stride s = 2 with n = 1 repetition; the second, s = 1 with n = 4; the third, s = 2 with n = 1; the fourth, s = 1 with n = 6; the fifth, s = 2 with n = 1; and the sixth, s = 1 with n = 2.
  • When configured with stride s = 1 (see FIG. 3), the bottleneck structure includes, arranged in sequence, a first convolutional layer, a depthwise convolutional layer, a second convolutional layer, a Squeeze-and-Excitation Network (SE-Net) layer, and a summation (add) layer.
  • The first convolutional layer has a 1×1 kernel and uses the PReLU (Parametric Rectified Linear Unit) activation function; the depthwise convolutional layer has a 3×3 kernel and uses PReLU activation; the second convolutional layer has a 1×1 kernel and uses a linear activation function.
  • The initial input is fed to the first convolutional layer for convolution; the output of the first convolutional layer is fed to the depthwise convolutional layer; the output of the depthwise convolutional layer is fed to the second convolutional layer; the output of the second convolutional layer is fed to the squeeze-and-excitation layer for channel-weight allocation; finally, the output of the squeeze-and-excitation layer and the initial input are summed in the add layer to give the bottleneck layer's final output.
  • When configured with stride s = 2 (see FIG. 4), the bottleneck structure includes, arranged in sequence, a first convolutional layer (1×1 kernel, PReLU activation), a depthwise convolutional layer (3×3 kernel, PReLU activation, stride 2), a second convolutional layer (1×1 kernel, linear activation), and a squeeze-and-excitation layer; no residual connection is used.
  • Because the stride-1 bottleneck layers use a residual structure, the second bottleneck layer reuses the residual connection across its repeated operations, which effectively alleviates the vanishing-gradient problem caused by deepening the neural network and is thus more conducive to model learning and convergence.
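  • The following is a minimal PyTorch sketch of such a bottleneck block. It assumes, as MobileFaceNets-style networks typically do (though the patent does not say so), batch normalization after each convolution and a channel expansion factor; the class names, channel widths, and SE reduction ratio are illustrative assumptions.

    import torch
    from torch import nn

    class SEBlock(nn.Module):
        """Squeeze-and-excitation: global average pooling followed by two
        1x1 convolutions and a sigmoid, yielding a 1x1xC vector of
        per-channel importance weights that rescales the input."""
        def __init__(self, channels, reduction=4):  # reduction ratio assumed
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, 1),
                nn.PReLU(),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid())

        def forward(self, x):
            return x * self.fc(self.pool(x))

    class Bottleneck(nn.Module):
        """1x1 conv (PReLU) -> 3x3 depthwise conv (PReLU) -> 1x1 conv
        (linear) -> SE layer; the residual add is applied only when
        stride == 1 and the input/output channel counts match."""
        def __init__(self, in_ch, out_ch, stride=1, expansion=2):
            super().__init__()
            mid = in_ch * expansion
            self.use_residual = stride == 1 and in_ch == out_ch
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, mid, 1, bias=False),
                nn.BatchNorm2d(mid), nn.PReLU(),
                nn.Conv2d(mid, mid, 3, stride=stride, padding=1,
                          groups=mid, bias=False),
                nn.BatchNorm2d(mid), nn.PReLU(),
                nn.Conv2d(mid, out_ch, 1, bias=False),  # linear activation
                nn.BatchNorm2d(out_ch),
                SEBlock(out_ch))

        def forward(self, x):
            y = self.block(x)
            return x + y if self.use_residual else y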
  • By modifying the bottleneck structure to add a squeeze-and-excitation (SE block) layer, the network can take into account that the importance of each channel may differ. Adding an importance weight to each channel and then multiplying it by the channel's original values increases the feature representation capability of each channel, avoiding the defect of prior-art structures that treat every channel as equally important.
  • For the squeeze-and-excitation layer, the input is the initial feature map and the output is a 1×1×C vector serving as the importance weight of each channel. The network automatically learns the importance of each channel during training, thereby enhancing its feature extraction and expression capabilities and improving model performance.
  • The convolutional layer in S24 has a 1×1 kernel, and the linear global depthwise convolutional layer has a 7×7 kernel. The last layer is a fully connected layer whose final output is a 128-dimensional vector; using a fully connected last layer reduces the dimensionality of the global depthwise convolution output while keeping the computation small.
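  • A sketch of the full feature extractor following this layer stack is given below. It reuses the Bottleneck class (and its SEBlock) from the sketch above, and the channel widths are illustrative assumptions rather than values taken from the patent.

    from torch import nn

    class MobileFaceFeatureExtractor(nn.Module):
        """3x3 conv (s=2) -> 3x3 depthwise conv (s=1) -> six bottleneck
        layers (strides 2,1,2,1,2,1 repeated 1,4,1,6,1,2 times) ->
        1x1 conv -> 7x7 linear global depthwise conv -> FC, 128-d output.
        Assumes the Bottleneck class from the previous sketch is in scope."""
        def __init__(self, embedding_dim=128):
            super().__init__()
            self.stem = nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(64), nn.PReLU(),
                nn.Conv2d(64, 64, 3, stride=1, padding=1, groups=64,
                          bias=False),
                nn.BatchNorm2d(64), nn.PReLU())
            layers, in_ch = [], 64
            # (out_channels, stride, repetitions) per bottleneck layer
            for out_ch, s, reps in [(64, 2, 1), (64, 1, 4), (128, 2, 1),
                                    (128, 1, 6), (128, 2, 1), (128, 1, 2)]:
                for _ in range(reps):
                    layers.append(Bottleneck(in_ch, out_ch, stride=s))
                    in_ch = out_ch
            self.bottlenecks = nn.Sequential(*layers)
            self.head = nn.Sequential(
                nn.Conv2d(in_ch, 512, 1, bias=False),            # 1x1 conv
                nn.BatchNorm2d(512), nn.PReLU(),
                nn.Conv2d(512, 512, 7, groups=512, bias=False),  # linear GDConv
                nn.BatchNorm2d(512),
                nn.Flatten(),
                nn.Linear(512, embedding_dim))                   # 128-d vector

        def forward(self, x):  # x: (N, 3, 112, 112) aligned face crops
            return self.head(self.bottlenecks(self.stem(x)))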
  • A normalization layer, for example one based on the L2 norm, may be set after the above improved mobile-face-recognition-network feature extraction model.
  • The L2 normalization can take the following form:

$$x_i' = \frac{x_i}{\sqrt{\sum_{k=1}^{K} x_k^2}}$$

  • where x is an element of the output vector of the feature extraction model and K is the length of the vector (here K = 128).
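  • As a minimal illustration, the same normalization in PyTorch:

    import torch
    import torch.nn.functional as F

    embedding = torch.randn(8, 128)                  # batch of raw 128-d features
    normalized = F.normalize(embedding, p=2, dim=1)  # x_i / sqrt(sum_k x_k^2)
    assert torch.allclose(normalized.norm(dim=1), torch.ones(8), atol=1e-6)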
  • During model training, after the face features are obtained, they can be input into the ArcFace loss function model to calculate the model's loss.
  • The ArcFace loss function can take the following form:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}$$

  • where L is the total loss, N is the number of samples, n is the number of classes, s and m are hyperparameters (this exemplary embodiment uses s = 64 and m = 0.5), and θ is the angle between the face feature and the weight of each class.
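  • A compact PyTorch sketch of this loss follows; the class name and random weight initialization are assumptions, but the computation follows the formula above.

    import torch
    from torch import nn
    import torch.nn.functional as F

    class ArcFaceLoss(nn.Module):
        """Additive angular margin loss with scale s and margin m."""
        def __init__(self, embedding_dim, num_classes, s=64.0, m=0.5):
            super().__init__()
            self.s, self.m = s, m
            self.weight = nn.Parameter(torch.randn(num_classes, embedding_dim))

        def forward(self, embeddings, labels):
            # cos(theta) between L2-normalized features and class weights
            cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
            theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
            # add the angular margin m only on the ground-truth class
            one_hot = F.one_hot(labels, cosine.size(1)).bool()
            logits = torch.where(one_hot, torch.cos(theta + self.m), cosine)
            return F.cross_entropy(self.s * logits, labels)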
  • After the total loss is obtained, it can be propagated by back-propagation to the embedding layer and then to the feature extraction model based on the mobile face recognition network. The Adam optimization algorithm is then used to optimize the model with an initial learning rate of 0.1, which is gradually decayed according to the training data and the number of training steps, finally yielding an improved mobile-face-recognition-network-based feature extraction model that can recognize faces accurately and in real time.
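  • A sketch of this optimization loop follows; the epoch count, decay schedule, class count, and the train_loader of aligned face crops are all assumptions, since the patent only specifies Adam with an initial learning rate of 0.1 that gradually decreases.

    import torch

    # backbone / arcface come from the sketches above;
    # train_loader: assumed DataLoader yielding (faces, identity labels)
    backbone = MobileFaceFeatureExtractor()
    arcface = ArcFaceLoss(embedding_dim=128, num_classes=10000)
    optimizer = torch.optim.Adam(
        list(backbone.parameters()) + list(arcface.parameters()), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10,
                                                gamma=0.1)  # gradual decay

    for epoch in range(30):
        for faces, identities in train_loader:  # aligned 112x112 crops
            optimizer.zero_grad()
            loss = arcface(backbone(faces), identities)
            loss.backward()   # loss flows back through the embedding layer
            optimizer.step()
        scheduler.step()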
  • The feature extraction model can run on smart mobile terminal devices.
  • After training is complete, the face image detected from the image to be processed can be input into the feature extraction model to extract the corresponding feature data.
  • For example, the aligned face image is input into the feature extraction model and processed in sequence by the model's convolutional layer, depthwise convolutional layer, six consecutive bottleneck structure layers, convolutional layer, linear global depthwise convolutional layer, and fully connected layer, followed by normalization, finally outputting the feature vector of the face image.
  • In S14, face image quality scores may be labeled in advance. Specifically, one standard face image of each subject is first selected as the reference image; the cosine similarity between the subject's other face images and the reference is then computed, and the similarity value is taken as the quality score of each face image.
  • When the feature extraction model is strong enough, the similarity is proportional to the face quality score: with the reference image as the high-quality image, the higher the quality of another image of the same person, the higher its similarity to the reference; conversely, lower similarity indicates lower face image quality.
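  • A sketch of this labeling step, under the assumption that scores are clamped to the [0, 1] range used by the scoring head described below:

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def label_quality_scores(backbone, reference_img, other_imgs):
        """Score each image of one subject by its cosine similarity to
        the subject's reference (standard) image."""
        ref = F.normalize(backbone(reference_img.unsqueeze(0)), dim=1)
        others = F.normalize(backbone(other_imgs), dim=1)
        # dot product of unit vectors = cosine similarity, one per image
        return (others @ ref.squeeze(0)).clamp(0.0, 1.0)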
  • Two fully connected layers may be set after the above normalization layer to serve as the quality evaluation model.
  • The number of neurons in the first fully connected layer can be configured to be one half of the face feature (embedding) dimension, with a ReLU activation; the second fully connected layer has one neuron with a sigmoid activation and outputs a face quality score between 0 and 1, thereby mapping the face feature space to the face quality score space.
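  • A minimal sketch of this two-layer scoring head (the class name is an assumption):

    import torch
    from torch import nn

    class QualityHead(nn.Module):
        """FC(128 -> 64) + ReLU, then FC(64 -> 1) + sigmoid, mapping a
        face embedding to a quality score in (0, 1)."""
        def __init__(self, embedding_dim=128):
            super().__init__()
            self.fc1 = nn.Linear(embedding_dim, embedding_dim // 2)
            self.fc2 = nn.Linear(embedding_dim // 2, 1)

        def forward(self, embedding):
            h = torch.relu(self.fc1(embedding))
            return torch.sigmoid(self.fc2(h)).squeeze(-1)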
  • The loss function of the quality evaluation model can adopt the MSE (mean squared error) loss, which can take the following form:

$$L_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$$

  • where $\hat{y}_i$ is the face quality score predicted by the model and $y_i$ is the labeled face quality score. Using the labeled face quality scores as training samples, the quality evaluation model is trained in a supervised manner.
  • After computing the MSE loss, the loss is propagated to the fully connected layers by the back-propagation algorithm, and the Adam algorithm is used to optimize the two fully connected layers of the quality evaluation model, with the initial learning rate set to 0.01 and gradually decayed according to the training data and the number of training steps.
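  • A sketch of this second training stage follows; the quality_loader of (faces, labeled scores), the epoch count, and the decay schedule are assumptions, and backbone/QualityHead come from the sketches above.

    import torch
    from torch import nn
    import torch.nn.functional as F

    head = QualityHead(embedding_dim=128)
    backbone.eval()                        # feature extractor stays fixed
    for p in backbone.parameters():
        p.requires_grad_(False)

    optimizer = torch.optim.Adam(head.parameters(), lr=0.01)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5,
                                                gamma=0.1)  # gradual decay
    mse = nn.MSELoss()

    for epoch in range(20):
        for faces, target_scores in quality_loader:  # cosine-similarity labels
            with torch.no_grad():
                emb = F.normalize(backbone(faces), dim=1)
            loss = mse(head(emb), target_scores)
            optimizer.zero_grad()
            loss.backward()                # only the two FC layers update
            optimizer.step()
        scheduler.step()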
  • After optimization is complete, the quality evaluation model can be used to obtain the corresponding face quality score for any face image.
  • For the above feature extraction model and quality evaluation model, the network weights of the feature extraction model can be kept fixed during this training.
  • Referring to Fig. 5, a complete face image quality evaluation model is formed by adding two fully connected layers after the feature extraction model and using them for face quality scoring, so that facial feature extraction and face quality scoring are completed within the same network, which fully guarantees the performance and generality of the model.
  • The feature extraction model is built on the mobile face recognition network model; by modifying the model structure, the bottleneck layer configuration, and the bottleneck internals, the feature extraction process is improved so that the feature extraction model is smaller, more accurate, and faster, and its size and running time meet the requirements for deployment on mobile terminals, enabling real-time, accurate evaluation of face image quality on the mobile side.
  • The above model can be applied to the face recognition systems of mobile devices such as smartphones and tablet computers. For example, selecting a high-quality face image from a photo sequence as the input to a face recognition system can significantly improve the system's efficiency and performance; or, applied to a camera's snapshot and burst-shooting functions, the face quality evaluation model can help users pick satisfactory photos more conveniently.
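  • For instance, a burst-selection helper might look like the following sketch (the function and variable names are assumptions):

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def pick_best_face(backbone, head, burst):
        """Score every aligned face crop in a burst of shape
        (N, 3, 112, 112) and return the index of the best one."""
        emb = F.normalize(backbone(burst), dim=1)
        scores = head(emb)                 # quality in (0, 1) per image
        return int(scores.argmax()), scores

    # usage: best_idx, scores = pick_best_face(backbone, head, burst_tensor)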
  • The training method of the feature extraction model described above may include the following steps:
  • S31 Obtain a sample image containing a human face in response to an image processing instruction of an image service system;
  • S32 Input the sample image into a convolutional layer and a depthwise convolutional layer that are arranged in sequence for successive convolution processing to obtain a first convolution result;
  • S33 Input the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result; where n > 5 and n is a positive integer;
  • S34 Perform convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer that are arranged in sequence to obtain a third convolution result;
  • S35 Perform fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image;
  • S36 Input the face feature data into a loss function model to calculate a loss parameter, and optimize based on the loss parameter to iteratively train the feature extraction model.
  • The above image service system may be a business system for processing face recognition tasks, for example a business system for station-entrance recognition, a business system for processing surveillance images, or an access control system; this disclosure does not specifically limit the content of the business system.
  • After the face feature data corresponding to the sample image is obtained, the method further includes training a scoring model, including:
  • S41 Input the face feature data into a first fully connected layer and a second fully connected layer that are continuously set for processing, so as to obtain a face quality score of the sample image;
  • S42 Input the face quality score into a scoring loss function to obtain a scoring loss parameter, and optimize based on the above scoring loss parameter to iteratively train a scoring model.
  • Among the n successively arranged bottleneck structure layers of the feature extraction model, the preset stride corresponding to odd-numbered bottleneck layers is P and that corresponding to even-numbered bottleneck layers is Q, where P > Q and both P and Q are positive integers.
  • The method further includes: configuring the number of repetitions of each bottleneck structure layer based on its level among the n consecutive bottleneck structure layers.
  • After the first convolution result is input into the bottleneck structure layer, the method includes: using the bottleneck layer's first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.
  • The sequence numbers of the above processes do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of this application.
  • The face image quality evaluation method according to the embodiments of the application has been described in detail above. The face image quality evaluation apparatus according to the embodiments of the application is described below with reference to the accompanying drawings; the technical features described in the method embodiments apply to the following apparatus embodiments.
  • FIG. 7 shows a schematic block diagram of a face image quality evaluation device 70 according to an embodiment of the present application. As shown in Fig. 7, the device 70 includes:
  • the to-be-processed image acquisition module 701 can be used to acquire a to-be-processed image containing a human face.
  • the face image extraction module 702 may be used to detect the to-be-processed image to obtain a corresponding face image.
  • the face feature data extraction module 703 may be used to input the face image into a trained feature extraction model based on a mobile face recognition network, and perform feature extraction on the face image to obtain feature data.
  • the face quality scoring module 704 may be used to input the feature data into the first fully connected layer and the second fully connected layer that are arranged in sequence for processing, so as to obtain the face quality score of the face image.
  • the to-be-processed image acquisition module 701 may include:
  • the face area recognition module is used to perform face detection on the to-be-processed image to obtain a face area.
  • the key point detection module is used to perform face key point detection on the face area to obtain the key points of the face area.
  • the alignment processing module is configured to perform alignment processing on the face area based on the key points of the face area to obtain an aligned face image.
  • the device 70 further includes:
  • the standardization processing module is used to perform standardization processing on the characteristic data to obtain the characteristic data after the standardization processing.
  • the device 70 further includes:
  • the original data processing unit is used to obtain original data and preprocess the original data to obtain sample data.
  • the first convolution processing unit is configured to input the sample data into the convolution layer and the depth convolution layer that are continuously set to perform continuous convolution processing to obtain the first convolution result.
  • the bottleneck structure processing unit is configured to input the first convolution result into n consecutively set bottleneck structure layers to perform continuous convolution processing to obtain the second convolution result; where n>5, and is a positive integer.
  • the second convolution processing unit is configured to perform convolution processing on the second convolution result by using the convolution layer and the linear global depth convolution layer that are continuously set to obtain the third convolution result.
  • the fully connected processing unit is configured to perform fully connected processing on the third convolution result by using a fully connected layer to obtain face feature data corresponding to the sample data.
  • the device 70 further includes:
  • the stride configuration module is used to configure, among the n successively arranged bottleneck structure layers of the feature extraction model, a preset stride P for odd-numbered bottleneck layers and a preset stride Q for even-numbered bottleneck layers; where P > Q, and both P and Q are positive integers.
  • the device 70 further includes:
  • the repetition number configuration module is configured to configure the execution repetition number of each bottleneck structure layer based on the level of each bottleneck structure layer in the n consecutive bottleneck structure layers.
  • the bottleneck structure layer may use its first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.
  • FIG. 8 shows a schematic block diagram of a training device 80 for a feature extraction model according to an embodiment of the present application. As shown in Fig. 8, the device 80 includes:
  • the sample data acquisition module 801 is configured to acquire a sample image containing a human face in response to an image processing instruction of the image service system.
  • the first convolution result generation module 802 is configured to input the sample image into the convolution layer and the depth convolution layer that are continuously set to perform continuous convolution processing to obtain the first convolution result.
  • the second convolution result generating module 803 is configured to input the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain the second convolution result; where n > 5 and n is a positive integer.
  • the third convolution result generation module 804 is configured to perform convolution processing on the second convolution result by using the convolution layer and the linear global depth convolution layer that are continuously set to obtain the third convolution result.
  • the face feature data generating module 805 is configured to perform a fully connected process on the third convolution result by using a fully connected layer to obtain face feature data corresponding to the sample image.
  • the iterative training module 806 is configured to input the face feature data into a loss function model to calculate loss parameters, and perform optimization based on the loss parameters to iteratively train the feature extraction model.
  • the device 80 may further include:
  • the scoring unit is configured to input the face feature data into the first fully connected layer and the second fully connected layer that are continuously set for processing, so as to obtain the face quality score of the sample image.
  • the iterative training unit is used to input the face quality score into the scoring loss function to obtain the scoring loss parameter, and optimize it based on the scoring loss parameter to iteratively train the scoring model.
  • Among the n successively arranged bottleneck structure layers of the feature extraction model, the preset stride corresponding to odd-numbered bottleneck layers is P and that corresponding to even-numbered bottleneck layers is Q; where P > Q, and both P and Q are positive integers.
  • the device 80 may further include:
  • the repetition number configuration module is configured to configure the execution repetition number of each bottleneck structure layer based on the level of each bottleneck structure layer in the n consecutive bottleneck structure layers.
  • the bottleneck structure layer may use its first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.
  • FIG. 9 shows a schematic block diagram of an image processing system 900 according to an embodiment of the present application. As shown in Figure 9, the system 900 includes:
  • the service module 901 is used to obtain images to be processed.
  • the image processing module 902 is configured to execute a face image quality evaluation method in response to a business processing instruction issued by the business module to obtain a scoring result of the image to be processed.
  • the model training module 903 is configured to execute the training method of the feature extraction model in response to the image processing instruction issued by the service module, so as to obtain the feature extraction model.
  • the above-mentioned business module may be a related business application in application scenarios such as a monitoring system, a security inspection system, or an access control system.
  • the service module can collect and store to-be-processed images containing faces in real time.
  • the units, modules, and other operations and/or functions of the face image quality evaluation apparatus 70, the feature extraction model training apparatus 80, and the image processing system 900 respectively implement the corresponding procedures of the face image quality evaluation method and the feature extraction model training method; for brevity, they are not repeated here.
  • Although several modules or units of the apparatus are mentioned in the detailed description above, this division is not mandatory: the features and functions of two or more modules or units described above may be embodied in one module or unit, and conversely the features and functions of one module or unit described above may be further divided among multiple modules or units.
  • Fig. 10 shows a schematic structural diagram of a computer system suitable for implementing a wireless communication terminal according to an embodiment of the present invention.
  • the computer system 1000 includes a central processing unit (CPU) 1001, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1002 or a program loaded from the storage part 1008 into a random access memory (RAM) 1003. The RAM 1003 also stores various programs and data required for system operation.
  • the CPU 1001, the ROM 1002, and the RAM 1003 are connected to each other through a bus 1004.
  • An input/output (Input/Output, I/O) interface 1005 is also connected to the bus 1004.
  • the following components are connected to the I/O interface 1005: an input part 1006 including a keyboard, a mouse, etc.; an output part 1007 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc. and a speaker; a storage part 1008 including a hard disk, etc.; and a communication part 1009 including a network interface card such as a LAN (Local Area Network) card or a modem.
  • the communication section 1009 performs communication processing via a network such as the Internet.
  • the drive 1010 is also connected to the I/O interface 1005 as needed.
  • a removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read from it can be installed into the storage part 1008 as needed.
  • an embodiment of the present invention includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication part 1009, and/or installed from the removable medium 1011. When the computer program is executed by the central processing unit (CPU) 1001, the various functions defined in the system of this application are executed.
  • the computer-readable medium shown in the embodiment of the present invention may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or a combination of any of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein.
  • This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the above.
  • each block in the flowchart or block diagram may represent a module, program segment, or part of code, which contains one or more executable instructions for realizing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings; for example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram or flowchart, and combinations of blocks in the block diagram or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present invention may be implemented in software or hardware, and the described units may also be provided in a processor; the names of these units do not, under certain circumstances, limit the units themselves.
  • this application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist alone without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the embodiments; for example, the electronic device can implement the steps shown in FIG. 1.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiment described above is only illustrative; the division of units is only a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the unit described as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may also be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of this application disclose a face image quality evaluation method, a feature extraction model training method, an image processing system, a computer-readable medium, and a wireless communication terminal. The evaluation method includes: obtaining a to-be-processed image containing a face; detecting the to-be-processed image to obtain the corresponding face image; inputting the face image into a trained feature extraction model based on a mobile face recognition network and performing feature extraction on the face image to obtain feature data; and inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image. The method, apparatus, system, wireless communication terminal, and computer-readable medium of the embodiments enable fast evaluation of face image quality while ensuring the accuracy of the quality evaluation results.

Description

Face image quality evaluation method, feature extraction model training method, image processing system, computer-readable medium, and wireless communication terminal

This application claims priority to Chinese Patent Application No. 201911055879.9, entitled "Face image quality evaluation method and apparatus, computer-readable medium, and communication terminal", filed on October 31, 2019, the entire contents of which are incorporated herein by reference.
Technical Field

The embodiments of this application relate to the field of map construction and image recognition and, more specifically, to a face image quality evaluation method, a feature extraction model training method, a face image quality evaluation apparatus, a feature extraction model training apparatus, an image processing system, a computer-readable medium, and a wireless communication terminal.

Background

Existing feature-engineering-based image matching methods and deep-learning-based methods have certain problems and shortcomings when evaluating face quality. For example, the score labeling of face images has to be done manually, which takes a great deal of time and effort and is somewhat subjective; moreover, many factors affect face quality, and manual labeling cannot comprehensively account for all of them, leading to inaccurate labels and, in turn, a less accurate model. In addition, more and more models need to run on smart mobile terminal devices, which imposes stricter requirements on model size and performance; existing face quality evaluation methods can hardly meet these requirements in either model size or running time.
Summary

In view of this, the embodiments of this application provide a face image quality evaluation method, a feature extraction model training method, a face image quality evaluation apparatus, a feature extraction model training apparatus, an image processing system, a computer-readable medium, and a wireless communication terminal, which facilitate fast evaluation of face quality.

In a first aspect, a face image quality evaluation method is provided, including: obtaining a to-be-processed image containing a face; detecting the to-be-processed image to obtain the corresponding face image; inputting the face image into a trained feature extraction model based on a mobile face recognition network and performing feature extraction on the face image to obtain feature data; and inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.

In a second aspect, a feature extraction model training method is provided, including: obtaining a sample image containing a human face in response to an image processing instruction of an image service system; inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result; inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer; performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result; performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image; and inputting the face feature data into a loss function model to calculate a loss parameter, and optimizing based on the loss parameter to iteratively train the feature extraction model.

In a third aspect, a face image quality evaluation apparatus is provided, including: a to-be-processed image acquisition module for obtaining a to-be-processed image containing a face; a face image extraction module for detecting the to-be-processed image to obtain the corresponding face image; a face feature data extraction module for inputting the face image into a trained feature extraction model based on a mobile face recognition network and performing feature extraction on the face image to obtain feature data; and a face quality scoring module for inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.

In a fourth aspect, a feature extraction model training apparatus is provided, including: a sample data acquisition module for obtaining a sample image containing a human face in response to an image processing instruction of an image service system; a first convolution result generation module for inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result; a second convolution result generation module for inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer; a third convolution result generation module for performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result; a face feature data generation module for performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image; and an iterative training module for inputting the face feature data into a loss function model to calculate a loss parameter and optimizing based on the loss parameter to iteratively train the feature extraction model.

In a fifth aspect, an image processing system is provided, including: a service module for obtaining images to be processed; and an image processing module for executing, in response to a service processing instruction issued by the service module, the face image quality evaluation method of any of the above embodiments to obtain a scoring result for the image to be processed.

In a sixth aspect, a wireless communication terminal is provided, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method of the first or second aspect.

In a seventh aspect, a computer-readable medium is provided for storing computer software instructions used to execute the method of the first or second aspect, including the programs designed to execute the above aspects.

In this application, the names of the wireless communication terminal, the positioning system, and so on do not limit the devices themselves; in actual implementation these devices may appear under other names. As long as a device's functions are similar to those of this application, it falls within the scope of the claims of this application and their equivalents.

These and other aspects of this application will be clearer and easier to understand from the description of the following embodiments.
Brief Description of the Drawings

Fig. 1 is a schematic diagram of the face image quality evaluation method of an embodiment of this application.
Fig. 2 is a schematic diagram of the architecture of the feature extraction model based on the mobile face recognition network of an embodiment of this application.
Fig. 3 is a schematic diagram of the architecture of a bottleneck structure layer with stride 1 in an embodiment of this application.
Fig. 4 is a schematic diagram of the architecture of a bottleneck structure layer with stride 2 in an embodiment of this application.
Fig. 5 is a schematic diagram of the overall architecture of the face image quality evaluation model of an embodiment of this application.
Fig. 6 is a schematic diagram of the feature extraction model training method of an embodiment of this application.
Fig. 7 is a schematic diagram of the composition of the face image quality evaluation apparatus of an embodiment of this application.
Fig. 8 is a schematic diagram of the composition of the feature extraction model training apparatus of an embodiment of this application.
Fig. 9 is a schematic diagram of the composition of the image processing system of an embodiment of this application.
Fig. 10 is a schematic block diagram of the computer system of the wireless communication terminal of an embodiment of this application.
Detailed Description

The technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings.

In the related art, when an environment map is constructed by collecting visual images, existing solutions consider only traditional image features, which have poor noise resistance and a low localization success rate. Moreover, if changes in lighting or seasonal changes alter the environment features in the constructed map, localization may fail. In addition, existing solutions mostly use only the two-dimensional feature information of visual images when building maps, so localization lacks degrees of freedom and is not robust. A method is therefore needed that overcomes the above shortcomings of the prior art.

It should be understood that the technical solutions of the embodiments of this application can be applied to image processing.

In existing deep-learning-based face quality evaluation methods, on the one hand, the score labeling of face images during training is done manually, which takes a great deal of time and effort and is somewhat subjective. On the other hand, the factors affecting face quality are manifold and may include face pose, occlusion, contrast, resolution, lighting, and background; manual labeling cannot comprehensively account for all of these factors, which affects the accuracy of the face evaluation model's results. Furthermore, more and more image evaluation methods need to run on smart mobile terminal devices such as smartphones and tablet computers, imposing stricter requirements on model size and performance that existing face quality evaluation methods can hardly meet; a lighter face quality evaluation model is needed.

To address the above shortcomings of the prior art, this exemplary embodiment provides a face image quality evaluation method whose model is smaller in scale and can be applied to smart terminal devices such as mobile phones and tablet computers.

Fig. 1 is a schematic diagram of a face image quality evaluation method of an embodiment of this application. As shown in Fig. 1, the method includes some or all of the following:

S11, obtaining a to-be-processed image containing a face;
S12, detecting the to-be-processed image to obtain the corresponding face image;
S13, inputting the face image into a trained feature extraction model based on a mobile face recognition network, and performing feature extraction on the face image to obtain feature data;
S14, inputting the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.

Specifically, the above smart terminal device may be a mobile phone, tablet computer, or other smart terminal equipped with a camera module. The user can take photos with the terminal's own camera module to obtain to-be-processed images containing faces; alternatively, the user can take photos with an external camera component, or to-be-processed images sent by other devices can be received over a wired or wireless network.

Optionally, in the embodiments of this application, since the obtained to-be-processed image may contain background, noise, and so on, it may be preprocessed to obtain the corresponding face image. Specifically, the above S12 can be implemented through the following steps:

S121, performing face detection on the to-be-processed image to obtain a face area;
S122, performing face key point detection on the face area to obtain the key points of the face area;
S123, aligning the face area based on its key points to obtain the aligned face image.

For example, a trained face detection model can be used to perform face detection on the to-be-processed image and determine the face area, and a trained face key point detection model can be used to perform key point detection on the face area and extract the face key point information. A preset similarity transformation matrix is then used to transform the face area to the standard face. For example, the similarity transformation matrix may take the following form:

$$\begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$$

where the upper-left 2×2 block is the rotation part, and t_x and t_y are translation factors; the transform has 4 degrees of freedom, namely rotation, x-direction translation, y-direction translation, and scaling factor s.

For the face area image, length ratios, included angles, and circle centers all remain unchanged before and after the similarity transformation.

In addition, the above face detection model and face key point detection model can be implemented with conventional techniques, which this disclosure does not specifically limit. Alternatively, in other exemplary embodiments of this disclosure, a single model can perform both face detection and key point detection; for example, the Hyper Face model performs face detection, key point localization, and head pose estimation.
Optionally, in the embodiments of this application, a mobile face recognition network (Mobile Face Nets) model may be trained in advance. Specifically, this can include the following steps:

S21, obtaining raw data and preprocessing the raw data to obtain sample data;
S22, inputting the sample data into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result;
S23, inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer;
S24, performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result;
S25, performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample data.

Specifically, face image data of multiple people in different scenes can be collected as raw data; for example, images of the face itself in different states, such as different face poses, occlusions, and expressions. Images under different imaging parameters can also be collected: for the image sensor, different contrast, resolution, or brightness can be configured; for the capture environment, different lighting, positions, backgrounds, and so on.

After the raw data is obtained, the trained face detection and face key point detection models can be used to perform face detection and face key point detection, and the faces are then transformed to the standard face by the similarity transformation; that is, the raw data is preprocessed with the method of the above embodiment to obtain the sample data.

Specifically, referring to Fig. 2, the above convolutional layer has a 3×3 kernel and stride s = 2, and the above depthwise convolutional layer has a 3×3 kernel and stride s = 1.

Specifically, the aligned sample data can then be input into the improved mobile face recognition network. Compared with the prior art, the improved mobile face recognition network of this embodiment includes a different number of bottleneck layers, a different bottleneck structure, and an improved last layer. Specifically, referring to Fig. 2, the improved network may include, arranged in sequence: a first convolutional layer, a depthwise convolutional layer, six consecutive bottleneck structure layers, a second convolutional layer, a linear global depthwise convolutional layer, and a fully connected layer.

Among the six consecutive bottleneck structure layers, the stride and the number of repetitions of each layer are configured based on the six-layer bottleneck structure. For example, the preset stride of odd-numbered bottleneck layers is P and that of even-numbered bottleneck layers is Q, where P > Q and both P and Q are positive integers. For example, P = 2 and Q = 1 can be configured: the first bottleneck layer uses stride s = 2 with n = 1 repetition; the second, stride s = 1 with n = 4 repetitions; the third, stride s = 2 with n = 1; the fourth, stride s = 1 with n = 6; the fifth, stride s = 2 with n = 1; and the sixth, stride s = 1 with n = 2.

For bottleneck layers configured with stride s = 1, referring to Fig. 3, the bottleneck structure includes, arranged in sequence, a first convolutional layer, a depthwise convolutional layer, a second convolutional layer, a Squeeze-and-Excitation Network (SE-Net) layer, and a summation (add) layer. The first convolutional layer has a 1×1 kernel and uses the PReLU (Parametric Rectified Linear Unit) activation function; the depthwise convolutional layer has a 3×3 kernel and uses PReLU activation; the second convolutional layer has a 1×1 kernel and uses a linear activation function. The initial input is fed to the first convolutional layer for convolution; the output of the first convolutional layer is fed to the depthwise convolutional layer; the output of the depthwise convolutional layer is fed to the second convolutional layer; the output of the second convolutional layer is fed to the squeeze-and-excitation layer for channel-weight allocation; the output of the squeeze-and-excitation layer and the initial input are then summed in the add layer to give the bottleneck layer's final output.

When a bottleneck layer is configured with stride s = 2, referring to Fig. 4, the bottleneck structure includes, arranged in sequence, a first convolutional layer, a depthwise convolutional layer, a second convolutional layer, and a squeeze-and-excitation layer. The first convolutional layer has a 1×1 kernel with PReLU activation; the depthwise convolutional layer has a 3×3 kernel with PReLU activation and stride 2; the second convolutional layer has a 1×1 kernel with a linear activation function.

By configuring the first bottleneck layer with stride s = 2 and n = 1 repetition and the second with stride s = 1 and n = 4 repetitions, and by giving the stride-1 bottleneck layers a residual structure while the stride-2 layers use none, the second bottleneck layer reuses the residual structure during its repeated operations, which effectively alleviates the vanishing-gradient problem brought by deepening the neural network and is thus more conducive to model learning and convergence.

In addition, by modifying the bottleneck structure to add a squeeze-and-excitation (SE block) layer, the network can effectively take into account that the importance of each channel may differ. Adding an importance weight to each channel and then multiplying it by the channel's original values increases the feature representation capability of each channel, avoiding the defect of prior-art structures that treat every channel as equally important. For the squeeze-and-excitation layer, the input is the initial feature map and the output is a 1×1×C vector serving as the importance weight of each channel; the network automatically learns the importance of each channel during training, thereby enhancing its feature extraction and expression capabilities and improving model performance.

Specifically, referring to Fig. 2, the convolutional layer in S24 has a 1×1 kernel, and the linear global depthwise convolutional layer has a 7×7 kernel. The last layer is a fully connected layer, and the final output is a 128-dimensional vector. Setting the last layer as a fully connected layer reduces the dimensionality of the linear global depthwise convolution output while keeping the computation small, and practical runs have verified that it effectively improves the model's accuracy.
Optionally, in the embodiments of this application, a normalization layer may be set after the above improved feature extraction model based on the mobile face recognition network, for example a normalization layer based on the L2 norm.

After the above improved feature extraction model extracts features from the sample data to obtain the training face feature data corresponding to each piece of sample data, the L2 norm is used to normalize the training face feature data, giving the final normalized face feature (embedding). For example, the L2 normalization can take the following form:

$$x_i' = \frac{x_i}{\sqrt{\sum_{k=1}^{K} x_k^2}}$$

where x is an element of the output vector of the feature extraction model and K is the length of the vector; as described in the above embodiment, K = 128.

Optionally, in the embodiments of this application, during model training, after the face features are obtained, they can be input into the ArcFace loss function model to calculate the model's loss. Specifically, the ArcFace loss function can take the following form:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}$$

where L is the total loss, N is the number of samples, n is the number of classes, s and m are hyperparameters, and θ is the angle between the face feature and the weight of each class. In this exemplary embodiment, s = 64 and m = 0.5.

Specifically, after the total loss is obtained, it can be propagated by the back-propagation algorithm to the embedding layer and then to the feature extraction model based on the mobile face recognition network. The Adam optimization algorithm is then used to optimize the model, with the initial learning rate set to 0.1 and gradually decayed according to the training data and the number of training steps, finally yielding an improved feature extraction model based on the mobile face recognition network that can recognize faces accurately and in real time. The feature extraction model can run on smart mobile terminal devices.
After the feature extraction model based on the mobile face recognition network is trained, the face image detected from the image to be processed can be input into the feature extraction model and the corresponding feature data extracted.

For example, when the trained feature extraction model is used to extract features from the face image corresponding to the image to be processed, the aligned face image can be input into the feature extraction model and processed in sequence by the model's convolutional layer, depthwise convolutional layer, six consecutive bottleneck structure layers, convolutional layer, linear global depthwise convolutional layer, and fully connected layer, followed by normalization, finally outputting the feature vector of the face image.

Optionally, in the embodiments of this application, in S14, face image quality scores may be labeled in advance. Specifically, one standard face image of each subject can first be selected as the reference image; the cosine similarity between the subject's other face images and the reference is then computed, and the similarity value is taken as the quality score of each face image.

When the above feature extraction model is strong enough, the similarity is proportional to the face quality score: with the reference image as the high-quality image, when other face images of the same person are compared with the reference image, the higher an image's quality, the higher its similarity; conversely, lower similarity indicates lower face image quality.

Optionally, in the embodiments of this application, after the feature extraction model, two fully connected layers may be set after the above normalization layer to serve as the quality evaluation model. Specifically, the number of neurons in the first fully connected layer can be configured to be one half of the face feature (embedding) dimension, with a ReLU activation function; the second fully connected layer has one neuron with a sigmoid activation function, and its output is a face quality score between 0 and 1, thereby mapping the face feature space to the face quality score space.

Based on the above face image score labels as training samples, the quality evaluation model is trained in a supervised manner. The loss function of the quality evaluation model can adopt the MSE (mean squared error) loss, which can take the following form:

$$L_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$$

where $\hat{y}_i$ is the face quality score predicted by the model and $y_i$ is the labeled face quality score.

After the MSE loss is computed, the loss function is propagated to the fully connected layers by the back-propagation algorithm, and the Adam algorithm is used to optimize the two fully connected layers of the quality evaluation model, with the initial learning rate set to 0.01 and gradually decayed according to the training data and the number of training steps. After optimization, the quality evaluation model can be used to obtain the corresponding face quality score for any face image.

Optionally, in the embodiments of this application, for the above feature extraction model based on the mobile face recognition network and the quality evaluation model, the network weights of the feature extraction model can be kept fixed during training.

Optionally, in the embodiments of this application, referring to Fig. 5, a complete face image quality evaluation model is formed by adding two fully connected layers after the feature extraction model and using them for face quality scoring, so that facial feature extraction and face quality scoring are completed within the same network, which fully guarantees the performance and generality of the model. In addition, the feature extraction model is built on the mobile face recognition network model; by modifying the model structure, the bottleneck layer configuration, and the specific structure of the bottleneck layers, the feature extraction process is improved so that the feature extraction model is smaller, more accurate, and faster, and its size and running time meet the requirements for deployment on mobile terminals, enabling real-time, accurate evaluation of face image quality on the mobile side. The above model can be applied to the face recognition systems of mobile devices such as smartphones and tablet computers; for example, selecting a high-quality face image from a photo sequence as the input to a face recognition system can significantly improve the efficiency and performance of the face recognition system, or, applied to a camera's snapshot and burst-shooting functions, the face quality evaluation model can help users pick satisfactory photos more conveniently.
Optionally, in the embodiments of this application, referring to Fig. 6, a feature extraction model training method is also provided. As shown in Fig. 6, the training method can include the following steps:

S31, obtaining a sample image containing a human face in response to an image processing instruction of an image service system;
S32, inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result;
S33, inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer;
S34, performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result;
S35, performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image;
S36, inputting the face feature data into a loss function model to calculate a loss parameter, and optimizing based on the loss parameter to iteratively train the feature extraction model.

For example, the above image service system may be a business system for processing face recognition tasks, such as a business system for station-entrance recognition, a business system for processing surveillance images, or an access control system; this disclosure does not specifically limit the content of the business system.

Optionally, in the embodiments of this application, after the face feature data corresponding to the sample image is obtained, the method further includes training a scoring model, including:

S41, inputting the face feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the sample image;
S42, inputting the face quality score into a scoring loss function to obtain a scoring loss parameter, and optimizing based on the scoring loss parameter to iteratively train the scoring model.

Specifically, among the n successively arranged bottleneck structure layers of the feature extraction model, the preset stride of odd-numbered bottleneck layers is P and that of even-numbered layers is Q, where P > Q and both P and Q are positive integers.

Optionally, in the embodiments of this application, the method further includes: configuring the number of repetitions of each bottleneck structure layer based on its level among the n consecutive bottleneck structure layers.

Optionally, in the embodiments of this application, after the first convolution result is input into the bottleneck structure layer, the method includes: using the bottleneck layer's first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.

The specific training process of the feature extraction model training method has been described in detail in the above face image quality evaluation method and is not repeated in this embodiment.

It should be understood that the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between objects, indicating that three relationships may exist: A and/or B may mean A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.

It should also be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of this application.
The face image quality evaluation method according to the embodiments of this application has been described in detail above. The face image quality evaluation apparatus according to the embodiments of this application is described below with reference to the accompanying drawings; the technical features described in the method embodiments apply to the following apparatus embodiments.

Fig. 7 shows a schematic block diagram of the face image quality evaluation apparatus 70 of an embodiment of this application. As shown in Fig. 7, the apparatus 70 includes:

a to-be-processed image acquisition module 701, which can be used to obtain a to-be-processed image containing a face;
a face image extraction module 702, which can be used to detect the to-be-processed image to obtain the corresponding face image;
a face feature data extraction module 703, which can be used to input the face image into a trained feature extraction model based on a mobile face recognition network and perform feature extraction on the face image to obtain feature data;
a face quality scoring module 704, which can be used to input the feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the face image.

Optionally, in the embodiments of this application, the to-be-processed image acquisition module 701 may include:
a face area recognition module for performing face detection on the to-be-processed image to obtain a face area;
a key point detection module for performing face key point detection on the face area to obtain the key points of the face area;
an alignment processing module for aligning the face area based on its key points to obtain the aligned face image.

Optionally, in the embodiments of this application, the apparatus 70 further includes:
a normalization processing module for normalizing the feature data to obtain normalized feature data.

Optionally, in the embodiments of this application, the apparatus 70 further includes:
a raw data processing unit for obtaining raw data and preprocessing the raw data to obtain sample data;
a first convolution processing unit for inputting the sample data into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result;
a bottleneck structure processing unit for inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer;
a second convolution processing unit for performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result;
a fully connected processing unit for performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample data.

Optionally, in the embodiments of this application, the apparatus 70 further includes:
a stride configuration module for configuring, among the n successively arranged bottleneck structure layers of the feature extraction model, a preset stride P for odd-numbered bottleneck layers and a preset stride Q for even-numbered bottleneck layers, where P > Q and both P and Q are positive integers.

Optionally, in the embodiments of this application, the apparatus 70 further includes:
a repetition count configuration module for configuring the number of repetitions of each bottleneck structure layer based on its level among the n consecutive bottleneck structure layers.

Optionally, in the embodiments of this application, the bottleneck structure layer may use its first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.

Fig. 8 shows a schematic block diagram of the feature extraction model training apparatus 80 of an embodiment of this application. As shown in Fig. 8, the apparatus 80 includes:

a sample data acquisition module 801 for obtaining a sample image containing a human face in response to an image processing instruction of an image service system;
a first convolution result generation module 802 for inputting the sample image into a convolutional layer and a depthwise convolutional layer arranged in sequence for successive convolution processing to obtain a first convolution result;
a second convolution result generation module 803 for inputting the first convolution result into n successively arranged bottleneck structure layers for successive convolution processing to obtain a second convolution result, where n > 5 and n is a positive integer;
a third convolution result generation module 804 for performing convolution processing on the second convolution result using a convolutional layer and a linear global depthwise convolutional layer arranged in sequence to obtain a third convolution result;
a face feature data generation module 805 for performing fully connected processing on the third convolution result using a fully connected layer to obtain the face feature data corresponding to the sample image;
an iterative training module 806 for inputting the face feature data into a loss function model to calculate a loss parameter and optimizing based on the loss parameter to iteratively train the feature extraction model.

Optionally, in the embodiments of this application, the apparatus 80 may further include:
a scoring unit for inputting the face feature data into a first fully connected layer and a second fully connected layer arranged in sequence for processing, so as to obtain the face quality score of the sample image;
an iterative training unit for inputting the face quality score into a scoring loss function to obtain a scoring loss parameter and optimizing based on the scoring loss parameter to iteratively train the scoring model.

Optionally, in the embodiments of this application, among the n successively arranged bottleneck structure layers of the feature extraction model, the preset stride of odd-numbered bottleneck layers is P and that of even-numbered layers is Q, where P > Q and both P and Q are positive integers.

Optionally, in the embodiments of this application, the apparatus 80 may further include:
a repetition count configuration module for configuring the number of repetitions of each bottleneck structure layer based on its level among the n consecutive bottleneck structure layers.

Optionally, in the embodiments of this application, the bottleneck structure layer may use its first convolutional layer, depthwise convolutional layer, second convolutional layer, and squeeze-and-excitation network layer to sequentially perform convolution, depthwise convolution, convolution, and channel-weight allocation on the first convolution result to obtain the second convolution result.

Fig. 9 shows a schematic block diagram of the image processing system 900 of an embodiment of this application. As shown in Fig. 9, the system 900 includes:

a service module 901 for obtaining images to be processed;
an image processing module 902 for executing, in response to a service processing instruction issued by the service module, the face image quality evaluation method to obtain a scoring result for the image to be processed;
a model training module 903 for executing, in response to an image processing instruction issued by the service module, the feature extraction model training method to obtain the feature extraction model.

Optionally, in the embodiments of this application, the above service module may be a business application in scenarios such as a surveillance system, a security inspection system, or an access control system; the service module can collect and store to-be-processed images containing faces in real time.

It should be understood that the units, modules, and other operations and/or functions of the face image quality evaluation apparatus 70, the feature extraction model training apparatus 80, and the image processing system 900 according to the embodiments of this application respectively implement the corresponding procedures of the face image quality evaluation method and the feature extraction model training method; for brevity, they are not repeated here.

It should be noted that although several modules or units of the apparatus are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of this disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
FIG. 10 shows a schematic structural diagram of a computer system suitable for implementing the wireless communication terminal of the embodiments of the present invention.
It should be noted that the computer system 1000 of the electronic device shown in FIG. 10 is merely an example, and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 10, the computer system 1000 includes a central processing unit (CPU) 1001, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage section 1008 into a random access memory (RAM) 1003. The RAM 1003 also stores various programs and data required for system operation. The CPU 1001, the ROM 1002 and the RAM 1003 are connected to one another via a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse and the like; an output section 1007 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker and the like; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN (local area network) card or a modem. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read therefrom is installed into the storage section 1008 as needed.
In particular, according to the embodiments of the present invention, the process described below with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 1009, and/or installed from the removable medium 1011. When the computer program is executed by the central processing unit (CPU) 1001, the various functions defined in the system of the present application are executed.
It should be noted that the computer-readable medium shown in the embodiments of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus or device. In the present invention, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium; the computer-readable medium can send, propagate or transmit a program used by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wired and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented by a dedicated hardware-based system that executes the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the description of the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be provided in a processor. The names of these units do not, in some cases, constitute a limitation on the units themselves.
As another aspect, the present application further provides a computer-readable medium. The computer-readable medium may be included in the electronic device described in the above embodiments, or may exist alone without being assembled into the electronic device. The computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to implement the method described in the following embodiments; for example, the electronic device can implement the steps shown in FIG. 1.
In addition, the above accompanying drawings are merely schematic illustrations of the processing included in the methods according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the above drawings does not indicate or limit the chronological order of that processing; it is also easy to understand that the processing may be executed, for example, synchronously or asynchronously in a plurality of modules.
Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely schematic; for example, the division of the units is merely a logical function division, and there may be other ways of division in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto; any changes or substitutions that can readily be conceived by a person skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. A face image quality evaluation method, characterized by comprising:
    obtaining a to-be-processed image containing a face;
    detecting the to-be-processed image to obtain a corresponding face image;
    inputting the face image into a trained feature extraction model based on a mobile face recognition network, and performing feature extraction on the face image to obtain feature data;
    inputting the feature data into a first fully connected layer and a second fully connected layer that are arranged in succession for processing, so as to obtain a face quality score of the face image.
  2. The method according to claim 1, wherein the detecting the to-be-processed image to obtain the corresponding face image comprises:
    performing face detection on the to-be-processed image to obtain a face region;
    performing face key point detection on the face region to obtain key points of the face region;
    performing alignment processing on the face region based on the key points of the face region, so as to obtain an aligned face image.
  3. The method according to claim 2, wherein the performing alignment processing on the face region based on the key points of the face region to obtain the aligned face image comprises:
    transforming the face region into a standard face by using a preset similarity transformation matrix, wherein the length ratio parameters, included angle parameters and circle center parameters corresponding to the standard face and the face region remain unchanged.
  4. The method according to claim 1, wherein, after the feature data is obtained, the method further comprises:
    performing normalization processing on the feature data to obtain normalized feature data.
  5. The method according to claim 1, wherein the method further comprises: pre-training the feature extraction model based on the mobile face recognition network, comprising:
    obtaining raw data, and preprocessing the raw data to obtain sample data;
    inputting the sample data into a convolution layer and a depthwise convolution layer that are arranged in succession for successive convolution processing to obtain a first convolution result;
    inputting the first convolution result into n bottleneck structure layers arranged in succession for successive convolution processing to obtain a second convolution result, where n>5 and n is a positive integer;
    performing convolution processing on the second convolution result by using a convolution layer and a linear global depthwise convolution layer that are arranged in succession to obtain a third convolution result;
    performing fully connected processing on the third convolution result by using a fully connected layer to obtain face feature data corresponding to the sample data.
  6. The method according to claim 5, wherein, among the n bottleneck structure layers arranged in succession in the feature extraction model, a preset stride corresponding to odd-numbered bottleneck structure layers is P, and a preset stride corresponding to even-numbered bottleneck structure layers is Q, where P>Q, and both P and Q are positive integers.
  7. The method according to claim 5 or 6, wherein the method further comprises:
    configuring the number of execution repetitions of each bottleneck structure layer based on the level of that bottleneck structure layer among the n successive bottleneck structure layers.
  8. The method according to claim 6, wherein the method further comprises: configuring Q=1, with the bottleneck structure comprising a first convolution layer, a depthwise convolution layer, a second convolution layer, a squeeze-and-excitation network layer and a summation calculation layer arranged in sequence; and
    configuring P=2, with the bottleneck structure comprising a first convolution layer, a depthwise convolution layer, a second convolution layer and a squeeze-and-excitation network layer arranged in sequence.
  9. The method according to claim 5, wherein, after the first convolution result is input into the bottleneck structure layer, the method comprises:
    using the first convolution layer, the depthwise convolution layer, the second convolution layer and the squeeze-and-excitation network layer provided in the bottleneck structure layer to sequentially perform convolution, depthwise convolution, convolution and channel weight assignment processing on the first convolution result, so as to obtain the second convolution result.
  10. The method according to claim 5, wherein the mobile face recognition network comprises, arranged in sequence: a first convolution layer, a depthwise convolution layer, six successive bottleneck structure layers, a second convolution layer, a linear global depthwise convolution layer and a fully connected layer.
  11. The method according to claim 1, wherein, after the performing feature extraction on the face image to obtain the feature data, the method further comprises:
    performing normalization processing on the feature data by using a normalization processing layer to obtain normalized feature data.
  12. A training method of a feature extraction model, characterized by comprising:
    obtaining a sample image containing a face in response to an image processing instruction of an image service system;
    inputting the sample image into a convolution layer and a depthwise convolution layer that are arranged in succession for successive convolution processing to obtain a first convolution result;
    inputting the first convolution result into n bottleneck structure layers arranged in succession for successive convolution processing to obtain a second convolution result, where n>5 and n is a positive integer;
    performing convolution processing on the second convolution result by using a convolution layer and a linear global depthwise convolution layer that are arranged in succession to obtain a third convolution result;
    performing fully connected processing on the third convolution result by using a fully connected layer to obtain face feature data corresponding to the sample image;
    inputting the face feature data into a loss function model to calculate a loss parameter, and performing optimization based on the loss parameter to iteratively train the feature extraction model.
  13. The method according to claim 12, wherein, after the face feature data corresponding to the sample image is obtained, the method further comprises: inputting the face feature data into a scoring model to train the scoring model, comprising:
    inputting the face feature data into a first fully connected layer and a second fully connected layer that are arranged in succession for processing, so as to obtain a face quality score of the sample image;
    inputting the face quality score into a scoring loss function to obtain a scoring loss parameter, and performing optimization based on the scoring loss parameter to iteratively train the scoring model.
  14. The method according to claim 12, wherein, among the n bottleneck structure layers arranged in succession in the feature extraction model, a preset stride corresponding to odd-numbered bottleneck structure layers is P, and a preset stride corresponding to even-numbered bottleneck structure layers is Q, where P>Q, and both P and Q are positive integers.
  15. The method according to claim 12 or 13, wherein the method further comprises:
    configuring the number of execution repetitions of each bottleneck structure layer based on the level of that bottleneck structure layer among the n successive bottleneck structure layers.
  16. The method according to claim 12, wherein, after the first convolution result is input into the bottleneck structure layer, the method comprises:
    using the first convolution layer, the depthwise convolution layer, the second convolution layer and the squeeze-and-excitation network layer provided in the bottleneck structure layer to sequentially perform convolution, depthwise convolution, convolution and channel weight assignment processing on the first convolution result, so as to obtain the second convolution result.
  17. An image processing system, characterized by comprising:
    a service module, configured to obtain a to-be-processed image;
    an image processing module, configured to execute the face image quality evaluation method according to any one of claims 1 to 11 in response to a service processing instruction issued by the service module, so as to obtain a scoring result of the to-be-processed image.
  18. The system according to claim 17, wherein the system further comprises:
    a model training module, configured to execute the training method of the feature extraction model according to any one of claims 12 to 16 in response to an image processing instruction issued by the service module, so as to obtain the feature extraction model.
  19. A computer-readable medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the face image quality evaluation method according to any one of claims 1 to 11, or the training method of the feature extraction model according to any one of claims 12 to 16, is implemented.
  20. A wireless communication terminal, characterized by comprising:
    one or more processors; and
    a storage apparatus, configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the face image quality evaluation method according to any one of claims 1 to 11, or the training method of the feature extraction model according to any one of claims 12 to 16.