WO2022142445A1 - Model training method, and image quality evaluation method and apparatus - Google Patents

Model training method, and image quality evaluation method and apparatus

Info

Publication number
WO2022142445A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
evaluated
training
real
image quality
Prior art date
Application number
PCT/CN2021/116766
Other languages
French (fr)
Chinese (zh)
Inventor
Yu Wenhai (于文海)
Guo Wei (郭伟)
Original Assignee
China UnionPay Co., Ltd. (中国银联股份有限公司)
Priority date
Filing date
Publication date
Application filed by China UnionPay Co., Ltd. (中国银联股份有限公司)
Publication of WO2022142445A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30168 Image quality inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Definitions

  • The invention belongs to the field of computers and, in particular, relates to a model training method, and an image quality evaluation method and apparatus.
  • Image quality assessment is mainly divided into full-reference quality assessment, reduced-reference (semi-reference) quality assessment, and no-reference quality assessment.
  • Face image quality assessment is affected by individual differences in facial features, including but not limited to hairstyle, glasses, and makeup, which lead to large variations in image content; it is therefore a no-reference quality assessment problem.
  • Among no-reference quality assessment methods, most current approaches still need to use subjective quality scores to train the quality assessment model.
  • The training process of an existing image quality assessment model mainly includes collecting image data, manually cleaning and labeling the collected data, detecting the region of interest with a detection model and expanding the boundary margin to retain an object region with complete content, and then feeding the object regions and the human-annotated quality labels into a deep learning network for training.
  • Image quality assessment model training needs to collect a large amount of image data and label each image with a quality score, which is a huge workload.
  • Because of the individual subjectivity of the people who perform the labeling and the differences in the richness of the content of the images themselves, it is difficult to formulate a unified standard for the labeling work.
  • When different people observe the same image, differences in cognition lead to differences in the quality levels they assign to it. Data collection and annotation for quality assessment has therefore always been a difficult problem in face image quality assessment.
  • the present invention provides the following solutions.
  • In a first aspect, a method for training an image quality assessment model includes: acquiring a real image sample set, wherein the real image sample set includes multiple real image samples; using the real image sample set to iteratively train a pre-built generative adversarial network, and collecting multiple fake image sample sets generated by the generative network of the generative adversarial network in multiple iteration rounds; generating a first training sample library consisting of the real image sample set and the multiple fake image sample sets, and automatically grading and labeling each first training sample in the first training sample library according to multiple preset image quality levels, to obtain a labeled first training sample library; and training a preset multi-classification network with the first training sample library to obtain the image quality assessment model.
  • Automatically grading and labeling each first training sample in the first training sample library includes: labeling the real image samples included in the real image sample set as the highest image quality level; and, according to the iteration round corresponding to each fake image sample set, labeling the multiple fake image samples included in that set with the corresponding image quality level, wherein a higher iteration round corresponds to a higher image quality level.
  • Automatically grading and labeling each first training sample in the first training sample library may also include: labeling the multiple real image samples included in the real image sample set as the highest image quality level; and calculating the Fréchet distance between each fake image sample set and the real image sample set and, according to the calculation result, labeling the multiple fake image samples included in each fake image sample set with the corresponding image quality level, wherein a smaller Fréchet distance corresponds to a higher image quality level.
  • Automatically grading and labeling each first training sample in the first training sample library may also include: labeling the multiple real image samples included in the real image sample set as the highest image quality level; and calculating the mean square error (MSE) between each fake image sample and a real image sample and labeling each fake image sample with the corresponding image quality level according to the MSE value, wherein a lower MSE value corresponds to a higher image quality level.
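As a minimal sketch of this MSE-based labeling rule (the MSE thresholds below are illustrative assumptions, not values from the description):

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean square error between two equally sized images."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def mse_to_level(value: float, thresholds=(10.0, 50.0, 200.0, 500.0, 1000.0)) -> str:
    """Map an MSE value to one of six levels; a lower MSE means higher quality."""
    levels = ["I", "II", "III", "IV", "V", "VI"]
    for level, t in zip(levels, thresholds):
        if value < t:
            return level
    return levels[-1]

real = np.zeros((4, 4))
fake = np.full((4, 4), 2.0)      # differs from `real` by 2 at every pixel
assert mse(real, fake) == 4.0    # 2^2 averaged over all pixels
assert mse_to_level(300.0) == "IV"
```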
  • Acquiring the real image sample set further includes: collecting a plurality of real images and performing the following preprocessing operations on them: determining a region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • the real image sample is a face image
  • the object detection algorithm is a face detection algorithm
  • The method further includes: removing non-frontal face pictures from the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm.
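A minimal sketch of this preprocessing, assuming the detector has already returned an ROI box; the nearest-neighbour resize merely stands in for whatever size-normalization the implementation actually uses:

```python
import numpy as np

def crop_roi(img: np.ndarray, box: tuple) -> np.ndarray:
    """Crop an H*W image to the (x, y, w, h) box returned by a detector."""
    x, y, w, h = box
    return img[y:y + h, x:x + w]

def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour size normalization (a stand-in for any resampler)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]

img = np.arange(100.0).reshape(10, 10)
roi = crop_roi(img, (2, 2, 6, 6))    # 6*6 face region at (2, 2)
sample = resize_nearest(roi, 8, 8)   # normalize every sample to 8*8
assert roi.shape == (6, 6)
assert sample.shape == (8, 8)
```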
  • Using the first training sample library to train the preset multi-classification network to obtain the image quality assessment model includes: acquiring each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; performing row-direction filtering on each first training sample to obtain a first filtered image; performing column-direction filtering on each first training sample to obtain a second filtered image; splicing each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and obtaining the plurality of second training samples corresponding to the plurality of first training samples and inputting them into the preset multi-classification network for iterative training, to obtain the image quality evaluation model.
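The splicing step can be sketched as follows; the description does not name the row/column filters, so a simple 1-D gradient kernel is assumed here:

```python
import numpy as np

KERNEL = np.array([1.0, 0.0, -1.0])   # assumed 1-D filter; the patent does not specify one

def filter_rows(img: np.ndarray) -> np.ndarray:
    """Row-direction filtering: convolve each row with the 1-D kernel."""
    return np.apply_along_axis(lambda r: np.convolve(r, KERNEL, mode="same"), 1, img)

def filter_cols(img: np.ndarray) -> np.ndarray:
    """Column-direction filtering: convolve each column with the 1-D kernel."""
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, mode="same"), 0, img)

def splice(img: np.ndarray) -> np.ndarray:
    """Stack the sample and its two filtered images into a 3-channel input."""
    return np.stack([img, filter_rows(img), filter_cols(img)], axis=0)

sample = np.random.rand(32, 32)
merged = splice(sample)               # the labeled "second training sample" input
assert merged.shape == (3, 32, 32)
```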
  • the preset multi-classification network is a ResNet network
  • The preset multi-classification network uses a cross-entropy function as its loss function and a softmax function for classification.
  • The method further includes: pre-constructing a generative adversarial network, the generative adversarial network including a generative network and a discriminative network; the generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization layer and ReLU activation function layer after each of the plurality of convolutional layers.
  • The generative network is used to receive random noise and generate fake image samples.
  • The discriminative network includes multiple convolutional layers, with a LeakyReLU activation function layer and a pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; the discriminative network is used to judge whether real image samples and fake image samples are authentic.
  • The method further includes: the loss function of the generative network uses a cross-entropy function.
  • In a second aspect, an image quality evaluation method includes: receiving an image to be evaluated; and using an image quality evaluation model trained by the method of the first aspect to perform image quality evaluation on the image to be evaluated, so as to confirm that the image to be evaluated belongs to one of a plurality of preset image quality levels.
  • the image to be evaluated is a face image to be evaluated
  • the image quality evaluation model is used to perform quality evaluation on the face image
  • The method further includes: after receiving the image to be evaluated, using a face detection algorithm to determine the region of interest (ROI) in the face image to be evaluated, and cropping the face image to be evaluated according to the determined ROI; normalizing the size of the cropped face image to be evaluated according to the size of the first training samples; and using a key point detection algorithm and/or a pose estimation algorithm to determine whether the size-normalized face image to be evaluated is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model is used to evaluate the image quality of the size-normalized image to be evaluated.
  • Using the image quality evaluation model to perform image quality assessment on the image to be evaluated includes: performing row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; performing column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and inputting the combined image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model, to determine which of the plurality of preset image quality levels the image to be evaluated belongs to.
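Putting the two preceding bullets together, the inference path might look like the sketch below; `is_frontal` and `model` are hypothetical stand-ins for the key-point/pose check and the trained classifier, and `np.gradient` stands in for the unspecified filters:

```python
import numpy as np

LEVELS = ["I", "II", "III", "IV", "V", "VI"]

def evaluate(img, is_frontal, model):
    """Return a quality level for `img`, or None if the face is not frontal."""
    if not is_frontal(img):
        return None                    # non-frontal faces stop the evaluation
    gx = np.gradient(img, axis=1)      # stand-in row-direction filter
    gy = np.gradient(img, axis=0)      # stand-in column-direction filter
    merged = np.stack([img, gx, gy], axis=0)
    return LEVELS[model(merged)]       # model returns a class index 0..5

img = np.random.rand(16, 16)
assert evaluate(img, lambda _: False, lambda _: 0) is None
assert evaluate(img, lambda _: True, lambda _: 2) == "III"
```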
  • A model training device includes: an acquisition module for acquiring a real image sample set, wherein the real image sample set includes a plurality of real image samples; a generative adversarial network module for iteratively training a pre-built generative adversarial network with the real image sample set and collecting multiple fake image sample sets generated by the generative network of the generative adversarial network in multiple iteration rounds; an automatic labeling module for generating a first training sample library consisting of the real image sample set and the multiple fake image sample sets and automatically grading and labeling each first training sample of the first training sample library according to a plurality of preset image quality levels, so as to obtain a labeled first training sample library; and a model training module for training the preset multi-classification network with the first training sample library to obtain the image quality evaluation model.
  • The automatic labeling module is further used to: label the real image samples included in the real image sample set as the highest image quality level; and, according to the iteration round corresponding to each fake image sample set, label the multiple fake image samples included in that set with the corresponding image quality level, where a higher iteration round corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set as the highest image quality level; and calculate the Fréchet distance between each fake image sample set and the real image sample set, labeling, according to the calculation results, the multiple fake image samples included in each fake image sample set with the corresponding image quality level, wherein a smaller Fréchet distance corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set as the highest image quality level; and calculate the mean square error (MSE) between each fake image sample and a real image sample, labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
  • The acquisition module is further configured to: collect multiple real images and perform the following preprocessing operations on them: determine a region of interest (ROI) in each real image with an object detection algorithm and crop each real image according to the determined ROI; and normalize the size of the multiple real images to obtain the real image sample set.
  • the real image sample is a face image
  • the object detection algorithm is a face detection algorithm
  • The acquiring module is further configured to: remove non-frontal face pictures from the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm.
  • The model training module is further configured to: obtain each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; perform row-direction filtering on each first training sample to obtain a first filtered image; perform column-direction filtering on each first training sample to obtain a second filtered image; splice each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and obtain the plurality of second training samples corresponding to the plurality of first training samples and input them into the preset multi-classification network for iterative training, to obtain the image quality assessment model.
  • the preset multi-classification network is a ResNet network
  • The preset multi-classification network uses a cross-entropy function as its loss function and a softmax function for classification.
  • The generative adversarial network module is further configured to construct the generative adversarial network in advance, the generative adversarial network including a generative network and a discriminative network; the generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and ReLU activation function after each of the plurality of convolutional layers, and is used to receive random noise and generate fake image samples.
  • The discriminative network includes multiple convolutional layers, with a LeakyReLU activation function layer and a pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer.
  • the loss function of the generative network employs a cross-entropy function.
  • An image quality evaluation device includes: a receiving module for receiving the image to be evaluated; and an evaluation module for performing image quality evaluation on the image to be evaluated with the trained image quality evaluation model, to confirm that the image to be evaluated belongs to one of several preset image quality levels.
  • the image to be evaluated is a face image to be evaluated
  • the image quality evaluation model is used to perform quality evaluation on the face image
  • The evaluation module is further configured to: after receiving the image to be evaluated, use a face detection algorithm to determine the region of interest (ROI) in the face image to be evaluated, and crop the face image to be evaluated according to the determined ROI; normalize the size of the cropped face image to be evaluated according to the size of the first training samples; and use the key point detection algorithm and/or the pose estimation algorithm to determine whether the size-normalized face image to be evaluated is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model is used to evaluate the image quality of the size-normalized image to be evaluated.
  • The evaluation module is further configured to: perform row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; perform column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and input the combined image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model, to determine which of the plurality of preset image quality levels the image to be evaluated belongs to.
  • A model training apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method of the first aspect.
  • An image quality assessment apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method of the second aspect.
  • A computer-readable storage medium stores a program which, when executed by a multi-core processor, causes the multi-core processor to execute the method of the first aspect and/or the second aspect.
  • FIG. 1 is a schematic flowchart of a model training method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a generative adversarial network according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a generation network according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a discrimination network according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of splicing a first training sample with the corresponding first filtered image and second filtered image according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a model training apparatus according to another embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an image quality assessment apparatus according to an embodiment of the present invention.
  • Embodiments of the present invention provide a model training method, an image quality assessment method, and an apparatus.
  • the inventive concept of the model training method is first introduced.
  • Embodiments of the present invention provide a model training method for training an image quality assessment model. Specifically, a real image sample set including multiple real image samples is first obtained; a pre-built generative adversarial network is iteratively trained with the real image sample set, and the multiple fake image sample sets generated by the generative network in multiple iteration rounds are collected; a first training sample library consisting of the real image sample set and the multiple fake image sample sets is then generated.
  • Each first training sample in the first training sample library can then be automatically graded and labeled according to the multiple preset image quality levels to obtain a labeled first training sample library; the first training sample library is further used to train the preset multi-classification network to obtain an image quality evaluation model; finally, the trained image quality evaluation model is used to evaluate the image quality of the image to be evaluated.
  • This embodiment only needs to collect a small number of clear real image samples to generate a large number of fake image samples of different quality levels, completes the labeling during the generation process, avoids manual intervention, reduces labor costs, and improves the quality of data labeling.
  • The training of the image quality assessment model is thus completed at a lower cost.
  • FIG. 1 is a schematic flowchart of a model training method 100 according to an embodiment of the present application, which is used to evaluate the quality of an image.
  • From the perspective of the device, the execution subject may be one or more electronic devices; from the perspective of the program, the execution body may be the programs mounted on these electronic devices.
  • the method 100 may include:
  • Step 101 Obtain a real image sample set, wherein the real image sample set includes a plurality of real image samples.
  • Step 101 may further include: collecting multiple real images and performing the following preprocessing operations on them: using an object detection algorithm to determine the region of interest (ROI) in each real image and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • the real image may be image data for a specific object, such as a face image, an animal image, a vehicle image, and the like.
  • Object detection algorithms are used to detect target objects from real images to obtain regions of interest (ROIs).
  • the real image sample is a face image
  • the object detection algorithm is a face detection algorithm
  • The method may further include: removing non-frontal face pictures from the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm. In this way, non-frontal face pictures are prevented from adversely affecting the subsequent training.
  • Step 102 Iteratively train the pre-built generative adversarial network using the real image sample set, and collect the multiple fake image sample sets generated by the generative network of the generative adversarial network in multiple iteration rounds.
  • The process of iteratively training the pre-built generative adversarial network is shown in FIG. 2.
  • the generative and discriminative networks have opposite goals: the discriminative network tries to distinguish fake images from real images, while the generative network tries to produce images that look real enough to fool the discriminative network.
  • Each training iteration can be divided into two stages. In the first stage, the discriminative network is trained: a batch of real images is sampled from the real image sample set D1, and the generative network receives random noise R and generates fake image samples R'. The real image samples and the fake image samples R' form a training batch, where the labels of the fake image samples are set to 0 (fake) and the labels of the real image samples are set to 1 (true), and the discriminative network is trained on this labeled batch using a binary cross-entropy loss. Backpropagation at this stage only optimizes the weights of the discriminative network.
  • In the second stage, the generative network is trained: the generative network first generates another batch of fake image samples, and the discriminative network is again used to judge whether each image is a fake image sample or a real image sample; in this stage all labels are set to 1 (true image sample). In other words, the discriminative network is expected to wrongly judge the fake image samples produced by the generative network as true.
  • In this stage the weights of the discriminative network are fixed, so backpropagation only affects the weights of the generative network.
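The two-stage update can be sketched with a toy PyTorch loop on 1-D "images"; the tiny linear networks and hyper-parameters are illustrative assumptions, not the patent's architecture:

```python
import torch
from torch import nn

torch.manual_seed(0)
dim, batch = 8, 16
G = nn.Linear(4, dim)                                # toy generative network
D = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())   # toy discriminative network
bce = nn.BCELoss()
opt_g = torch.optim.SGD(G.parameters(), lr=0.05)
opt_d = torch.optim.SGD(D.parameters(), lr=0.05)

for _ in range(10):
    # Stage 1: train D on real samples (label 1) and detached fakes (label 0).
    real = torch.randn(batch, dim) + 3.0             # stand-in real batch from D1
    fake = G(torch.randn(batch, 4)).detach()         # R' generated from noise R
    x = torch.cat([real, fake])
    y = torch.cat([torch.ones(batch, 1), torch.zeros(batch, 1)])
    loss_d = bce(D(x), y)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Stage 2: train G with all labels set to 1, so D is pushed to call fakes real.
    # Only opt_g steps here, so D's weights stay fixed in this stage.
    fake = G(torch.randn(batch, 4))
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

assert torch.isfinite(loss_d) and torch.isfinite(loss_g)
```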
  • Step 102 further includes: pre-constructing a generative adversarial network, where the generative adversarial network includes a generative network and a discriminative network; the generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization layer and ReLU activation function layer after each of the plurality of convolutional layers.
  • the generative network is used to generate fake image samples based on random noise.
  • The discriminative network includes multiple convolutional layers, with a LeakyReLU activation function layer and a pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; the discriminative network is used to judge whether real image samples and fake image samples are authentic.
  • The input of the generator network is 20-dimensional random noise; the first layer is a linear mapping, which maps the input to 1*3*(H*2)*(W*2) four-dimensional data.
  • The second layer is a convolution operation: the output of the first layer is convolved with a 50*3*3 kernel, with stride 1 and padding 1.
  • The third layer is a convolution operation: the output of the second layer is convolved with a 25*3*3 kernel, with stride 1 and padding 1.
  • The fourth layer is a convolution operation: the output of the third layer is convolved with a 16*3*3 kernel, with stride 2 and padding 1.
  • The fifth layer is a convolution operation: the output of the fourth layer is convolved with a 16*3*3 kernel, with stride 1 and padding 1.
  • The sixth layer is a convolution operation: the output of the fifth layer is convolved with an 8*3*3 kernel, with stride 1 and padding 1.
  • The seventh layer is a convolution operation: the output of the sixth layer is convolved with a 3*3*3 kernel, with stride 1 and padding 1.
  • A BatchNormalization layer and a ReLU activation function layer are added after the output of each of the above layers.
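Read literally, these layers can be assembled as the following PyTorch sketch (20-dimensional noise in, a 3*H*W image out; applying BatchNorm/ReLU even after the final layer is an assumption taken from the "after each layer" wording):

```python
import torch
from torch import nn

class Generator(nn.Module):
    def __init__(self, h: int = 16, w: int = 16, noise_dim: int = 20):
        super().__init__()
        self.h, self.w = h, w
        self.fc = nn.Linear(noise_dim, 3 * (h * 2) * (w * 2))  # layer 1: linear mapping
        chans = [3, 50, 25, 16, 16, 8, 3]                      # channel plan, layers 2-7
        strides = [1, 1, 2, 1, 1, 1]                           # stride 2 only at layer 4
        blocks = []
        for c_in, c_out, s in zip(chans, chans[1:], strides):
            blocks += [nn.Conv2d(c_in, c_out, 3, stride=s, padding=1),
                       nn.BatchNorm2d(c_out), nn.ReLU()]
        self.convs = nn.Sequential(*blocks)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        x = self.fc(z).view(-1, 3, self.h * 2, self.w * 2)     # 1*3*(2H)*(2W)
        return self.convs(x)                                   # stride-2 layer -> (N, 3, H, W)

g = Generator(h=16, w=16)
out = g(torch.randn(1, 20))
assert out.shape == (1, 3, 16, 16)
```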
  • The loss function of the generative network adopts a cross-entropy function. Specifically, the loss is the cross-entropy between the discriminative network's predictions on the fake image samples and the true labels.
  • The input of the discriminative network is the real image sample set D1 and the fake image sample set R'.
  • The label of the real image sample set D1 is set to 1 (true).
  • The label of the fake image sample set R' is set to 0 (fake).
  • In the first layer, the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The second layer is a convolution operation: the output of the first layer is convolved with a 32*3*3 kernel, with stride 1 and padding 1; the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The third layer is a convolution operation: the output of the second layer is convolved with a 16*3*3 kernel, with stride 1 and padding 1; the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The fourth layer consists of 2 fully connected layers: the output of the third layer is mapped to 1*1024 dimensions and processed by the LeakyReLU activation function, then the 1*1024 dimensions are mapped to 1*1 dimension; finally a sigmoid activation function is applied to obtain a probability between 0 and 1, so as to perform binary classification.
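A PyTorch sketch of this discriminative network follows; the first convolution's filter count and the 32*32 input size are not given above and are assumed here:

```python
import torch
from torch import nn

class Discriminator(nn.Module):
    """Three conv blocks (LeakyReLU + 2*2 avg-pool each), then FC 1024 -> 1 -> sigmoid."""
    def __init__(self, first_chans: int = 64):       # assumed first-layer filter count
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, first_chans, 3, stride=1, padding=1), nn.LeakyReLU(),
            nn.AvgPool2d(2, stride=2),
            nn.Conv2d(first_chans, 32, 3, stride=1, padding=1), nn.LeakyReLU(),
            nn.AvgPool2d(2, stride=2),
            nn.Conv2d(32, 16, 3, stride=1, padding=1), nn.LeakyReLU(),
            nn.AvgPool2d(2, stride=2),
        )
        # For an assumed 32*32 input, three poolings leave 16 channels * 4 * 4 = 256 features.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, 1024), nn.LeakyReLU(),
            nn.Linear(1024, 1), nn.Sigmoid(),        # probability between 0 and 1
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

d = Discriminator()
p = d(torch.randn(2, 3, 32, 32))
assert p.shape == (2, 1)
assert ((p >= 0) & (p <= 1)).all()
```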
  • Step 103 Generate a first training sample library consisting of the real image sample set and the plurality of fake image sample sets, and automatically grade and label each first training sample in the first training sample library according to the plurality of preset image quality levels, to obtain a labeled first training sample library.
  • The automatic grading and labeling of each first training sample in the first training sample library includes: according to the iteration round corresponding to each fake image sample set, labeling the multiple fake image samples with the corresponding image quality level, where a higher iteration count corresponds to a higher image quality level; and labeling the real image samples included in the real image sample set as the highest image quality level.
  • the preset image quality levels can be divided into 6 types from high to low, including "level I", “level II”, ..., "level VI".
  • The fake image samples and real image samples can be stored in groups according to their quality by saving the fake image samples produced during intermediate stages of training. For example, at 500 iterations, the image quality level of the fake image sample set generated by the generative network is "level VI"; at 1000 iterations, the image quality level of the fake image sample set generated by the network is "level V". As the number of iterations increases, the fake image samples generated by the network become harder to distinguish from the collected real image samples; that is, the fake image sample set generated by the network has a higher image quality level and better quality.
  • multiple levels of pseudo-image sample sets with different qualities can be generated.
  • the multiple pseudo-image samples included in each pseudo-image sample set can be labeled with the corresponding image quality level according to the number of iteration rounds corresponding to that set (such as the above-mentioned 500 iterations, 1000 iterations, etc.), wherein a higher number of iterations corresponds to a higher image quality level; the real image samples contained in the real image sample set are labeled with the highest image quality level, "level I".
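This labeling rule can be illustrated with a toy function; the checkpoint rounds and their spacing below are assumptions taken from the example figures above (500, 1000, ...), not values fixed by the method:

```python
def level_for_round(n_round, checkpoints=(500, 1000, 1500, 2000, 2500)):
    """Map a GAN iteration round to a quality level for its generated samples.

    "VI" is the earliest (worst) checkpoint and "II" the latest (best);
    "I" is reserved for real image samples.
    """
    levels = ("VI", "V", "IV", "III", "II")
    for level, limit in zip(levels, checkpoints):
        if n_round <= limit:
            return level
    return "II"

first = level_for_round(500)    # earliest checkpoint
second = level_for_round(1000)  # one checkpoint later
late = level_for_round(3000)    # beyond the last checkpoint
```

With these assumed checkpoints, round 500 maps to "VI", round 1000 to "V", and anything past the final checkpoint to "II".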
  • the automatic grading and labeling of each first training sample in the first training sample library based on image quality in step 103 includes: calculating the Fréchet Inception Distance (FID) between each pseudo-image sample set and the real image sample set; according to the calculation results, labeling the multiple pseudo-image samples included in each pseudo-image sample set with the corresponding image quality level, wherein a smaller Fréchet distance corresponds to a higher image quality level; and labeling the multiple real image samples included in the real image sample set with the highest image quality level.
  • the preset image quality levels can be divided into 6 levels from high to low: "level I", "level II", ..., "level VI".
  • the corresponding pseudo-image samples for different iterations can be saved in folders.
  • folder F1 stores the pseudo-image samples corresponding to the 10th round of training
  • folder F2 stores the corresponding pseudo-image samples when the training reaches the 20th round.
  • the Fréchet distance between the data in the multiple folders and the real image sample set D1 can be calculated to measure the quality difference between the generated pseudo-image samples and the clear real image samples. According to the calculation results and the distribution of the Fréchet distances, the folders generated in the second step are grouped into 5 categories and arranged in order of increasing distance, yielding the real image sample set D1 with quality "level I", a pseudo-image sample set with quality "level II", ..., and a pseudo-image sample set with quality "level VI".
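The distance used for this grouping can be sketched as follows. For simplicity the sketch assumes diagonal covariances, so the matrix square root in the full Fréchet/FID formula reduces to an element-wise square root; the full FID also computes these statistics on Inception features rather than raw values:

```python
import numpy as np

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + sum(var1 + var2 - 2*sqrt(var1*var2))."""
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

# identical statistics give distance 0; a smaller distance means the generated
# set is closer to the real set, i.e. a higher quality level
d_same = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
d_far = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
```

Sorting the folders by this distance in ascending order then yields the "level II" through "level VI" groupings described above.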
  • parameters for evaluating the degree of similarity, such as cosine similarity or KL divergence (Kullback-Leibler divergence), may also be used instead of the above-mentioned Fréchet distance.
  • the cosine similarity between the data in the multiple folders and the real image sample set D1 can be calculated to measure the quality difference between the generated pseudo-image samples and the clear real image samples, and the sets can be sorted in descending order of similarity to obtain the real image sample set D1 with quality "level I", a pseudo-image sample set with quality "level II", ..., and a pseudo-image sample set with quality "level VI".
  • the evaluation of the degree of similarity may be performed based on the data in the multiple folders and part of the image information of the real image sample set D1, or based on the data in the multiple folders and all of the image information of the real image sample set D1, which is not specifically limited in this application.
  • the automatically labeled first training sample library D can be obtained, wherein the first training sample library D contains 6 subfolders, which can correspond to the real image sample set D1 with quality "level I", a pseudo-image sample set with quality "level II", ..., and a pseudo-image sample set with quality "level VI".
  • the automatic grading and labeling of each first training sample in the first training sample library based on image quality in step 103 includes: calculating the mean square error (MSE) value between each pseudo-image sample and a real image sample; labeling each pseudo-image sample with the corresponding image quality level according to its MSE value, wherein a lower MSE value corresponds to a higher image quality level; and labeling the multiple real image samples contained in the real image sample set with the highest image quality level.
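A minimal sketch of MSE-based labeling; the threshold values below are hypothetical, since the text only specifies that a lower MSE maps to a higher quality level:

```python
import numpy as np

def mse(a, b):
    # mean square error between a pseudo-image sample and a real image sample
    a, b = np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def level_from_mse(value, thresholds=(25.0, 100.0, 400.0, 1600.0)):
    # hypothetical thresholds: lower MSE -> higher quality level; "II" is the
    # best level for pseudo-images, since "I" is reserved for real samples
    levels = ("II", "III", "IV", "V")
    for level, t in zip(levels, thresholds):
        if value <= t:
            return level
    return "VI"

real = np.full((4, 4), 128.0)
fake = real + 3.0                  # small pixel error -> high quality level
err = mse(fake, real)              # 9.0 for this toy pair
label = level_from_mse(err)
```

Here a uniform pixel error of 3 gives an MSE of 9.0, which the assumed thresholds map to "level II", while a very large MSE falls through to "level VI".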
  • Step 104: Use the first training sample library to train a preset multi-classification network to obtain an image quality evaluation model.
  • the first training sample library consists of the real image sample set and the plurality of pseudo-image sample sets, and each first training sample carries a label used to indicate its image quality. For example, assume that the image quality is divided into 6 levels from high to low: a first training sample that is a real image sample is labeled "level I", while a first training sample that is a pseudo-image sample is labeled "level II", ..., "level VI" in order of image quality from good to bad. Therefore, the preset multi-classification network can be trained with the first training sample library until it converges, and the obtained image quality evaluation model can determine, from an input image, that its quality is one of "level I", "level II", ..., "level VI".
  • step 104 may specifically include: acquiring each labeled first training sample in the first training sample library, where the label is used to indicate the image quality level of the first training sample; performing row-direction filtering on each first training sample to obtain a first filtered image; performing column-direction filtering on each first training sample to obtain a second filtered image; splicing and merging each first training sample with the corresponding first filtered image and second filtered image to generate labeled second training samples; and obtaining the multiple second training samples corresponding to the multiple first training samples, and inputting them into a preset multi-classification network for iterative training to obtain the image quality evaluation model.
  • for each first training sample Img in the first training sample library, row-direction filtering and column-direction filtering are performed respectively; that is, the first training sample Img is convolved with a 1*N convolution kernel to obtain the row-direction filtered image Img1 (the first filtered image), and the first training sample Img is convolved with another N*1 convolution kernel to obtain the column-direction filtered image Img2 (the second filtered image).
  • the above-mentioned Img, Img1, and Img2 are merged into a picture of H*(3*W) (that is, the second training sample), as shown in Figure 5: in the second training sample, the first training sample Img is on the left, the first filtered image Img1 is in the middle, and the second filtered image Img2 is on the right.
  • a plurality of second training samples corresponding to the plurality of first training samples constitute a second training sample library.
  • the plurality of second training samples are input into a preset multi-classification network for iterative training, and an image quality evaluation model can be obtained.
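The construction of one second training sample can be sketched as follows; the 1*3 gradient-style kernel is an assumption, since the text does not fix N or the kernel values:

```python
import numpy as np

def filter_1d(img, kernel, axis):
    # same-size 1-D convolution along rows (axis=1, a 1*N kernel)
    # or along columns (axis=0, an N*1 kernel)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, img)

H, W = 4, 5
img = np.arange(H * W, dtype=np.float64).reshape(H, W)  # stand-in for Img
kernel = np.array([-1.0, 0.0, 1.0])                     # assumed 1*3 kernel
img1 = filter_1d(img, kernel, axis=1)  # row-direction filtered image Img1
img2 = filter_1d(img, kernel, axis=0)  # column-direction filtered image Img2
second = np.hstack([img, img1, img2])  # H x (3*W): Img | Img1 | Img2
```

The resulting H*(3*W) array is what the multi-classification network receives as one second training sample.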
  • the above-mentioned preset multi-classification network is a ResNet network
  • the preset multi-classification network uses a binary cross-entropy function as a loss function and uses a softmax function for binary classification.
  • the preset multi-classification network may also use a network other than ResNet.
  • an embodiment of the present invention also provides an image quality assessment method, which specifically includes: receiving an image to be evaluated; and performing image quality evaluation on the image to be evaluated by using the image quality evaluation model trained with the model training method of the above embodiment, so as to confirm that the image to be evaluated is of one of a plurality of preset image quality levels.
  • the image to be evaluated is a face image to be evaluated
  • the image quality evaluation model is used to perform quality evaluation on the face image
  • the method further includes: after receiving the image to be evaluated, using a face detection algorithm to determine the region of interest (ROI) in the face image to be evaluated, and cropping the face image to be evaluated according to the determined ROI; normalizing the size of the cropped face image to be evaluated according to the size of the first training samples; and using a key point detection algorithm and/or a pose estimation algorithm to determine whether the size-normalized face image to be evaluated is a frontal image; wherein, if the face image to be evaluated is not a frontal image, the evaluation is stopped, and if it is a frontal image, the image quality evaluation model is used to evaluate the image quality of the size-normalized image to be evaluated.
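These pre-checks can be sketched as a small pipeline; every helper here (`detect_roi`, `is_frontal`, `run_model`, the `None` "stop" sentinel) is a hypothetical stand-in for the face detection, key point detection / pose estimation, and evaluation model the text refers to:

```python
def evaluate(img, detect_roi, is_frontal, run_model):
    """Hypothetical pipeline: ROI crop -> frontal check -> quality model."""
    roi = detect_roi(img)                    # face detection algorithm
    if roi is None:
        return None                          # no face found: stop evaluation
    top, left, h, w = roi
    face = [row[left:left + w] for row in img[top:top + h]]
    # size normalization to the training-sample size is elided in this sketch
    if not is_frontal(face):
        return None                          # non-frontal image: stop evaluation
    return run_model(face)                   # one of the preset quality levels

# toy stubs demonstrating the control flow only
img = [[0] * 8 for _ in range(8)]
level = evaluate(img, lambda im: (0, 0, 4, 4), lambda f: True, lambda f: "level I")
skipped = evaluate(img, lambda im: (0, 0, 4, 4), lambda f: False, lambda f: "level I")
```

With the frontal check passing, the model is invoked; with it failing, the pipeline returns early without evaluating quality.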
  • using the image quality evaluation model to perform image quality evaluation on the image to be evaluated includes: performing row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; performing column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and inputting the combined image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, to determine that the image to be evaluated is of one of the plurality of preset image quality levels.
  • FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention.
  • the model training device includes:
  • the obtaining module 601 is configured to obtain a real image sample set, wherein the real image sample set includes a plurality of real image samples.
  • the generative adversarial network module 602 is used to iteratively train the pre-built generative adversarial network by using the real image sample set, and collect multiple pseudo-image sample sets respectively generated by the generative network in the generative adversarial network in multiple iteration rounds;
  • the automatic labeling module 603 is used to generate a first training sample library consisting of the real image sample set and the plurality of pseudo-image sample sets, and to automatically grade and label each first training sample in the first training sample library according to a plurality of preset image quality levels, to obtain a labeled first training sample library;
  • the model training module 604 is configured to use the first training sample library to train a preset multi-classification network to obtain an image quality evaluation model.
  • the automatic labeling module is further used to: label the real image samples included in the real image sample set with the highest image quality level; and, according to the number of iteration rounds corresponding to each pseudo-image sample set, label the multiple pseudo-image samples included in that set with the corresponding image quality level, wherein a higher number of iterations corresponds to a higher image quality level.
  • the automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the Fréchet distance between each pseudo-image sample set and the real image sample set and, according to the calculation results, label the multiple pseudo-image samples included in each pseudo-image sample set with the corresponding image quality level, wherein a smaller Fréchet distance corresponds to a higher image quality level.
  • the automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the mean square error (MSE) value between each pseudo-image sample and a real image sample, and label each pseudo-image sample with the corresponding image quality level according to the MSE value, wherein a lower MSE value corresponds to a higher image quality level.
  • the acquisition module is further configured to: collect multiple real images and perform the following preprocessing operations on them: determine a region of interest (ROI) in each real image by using an object detection algorithm, and crop each real image according to the determined ROI; and normalize the sizes of the multiple real images to obtain the real image sample set.
  • the real image sample is a face image
  • the object detection algorithm is a face detection algorithm
  • the acquiring module is further configured to: remove non-frontal pictures in the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm.
  • the model training module is further configured to: obtain each labeled first training sample in the first training sample library, where the label is used to indicate the image quality level of the first training sample; perform row-direction filtering on each first training sample to obtain a first filtered image; perform column-direction filtering on each first training sample to obtain a second filtered image; splice and merge each first training sample with the corresponding first filtered image and second filtered image to generate labeled second training samples; and obtain the multiple second training samples corresponding to the multiple first training samples, and input them into a preset multi-classification network for iterative training to obtain the image quality evaluation model.
  • the preset multi-classification network is a ResNet network
  • the preset multi-classification network uses a binary cross-entropy function as a loss function and uses a softmax function for binary classification.
  • the generative adversarial network module is further configured to: construct a generative adversarial network in advance, the generative adversarial network including a generative network and a discriminative network; wherein the generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and ReLU activation function after each of the plurality of convolutional layers, and is used to receive random noise and generate pseudo-image samples;
  • the discriminative network includes a plurality of convolutional layers and a LeakyReLU activation function after each of the plurality of convolutional layers
  • the loss function of the generative network employs a cross-entropy function.
  • model training apparatus in the embodiment of the present application can implement each process of the foregoing embodiment of the model training method, and achieve the same effects and functions, which will not be repeated here.
  • the embodiments of the present invention further provide an image quality evaluation apparatus for executing the image quality evaluation method provided by the above embodiments, which specifically includes: a receiving module for receiving an image to be evaluated; and an evaluation module for performing image quality evaluation on the image to be evaluated by using the image quality evaluation model trained with the method of the first aspect, to confirm that the image to be evaluated is of one of a plurality of preset image quality levels.
  • the image to be evaluated is a face image to be evaluated
  • the image quality evaluation model is used to perform quality evaluation on the face image
  • the evaluation module is further configured to: after receiving the image to be evaluated, use a face detection algorithm to determine the region of interest (ROI) in the face image to be evaluated, and crop the face image to be evaluated according to the determined ROI; normalize the size of the cropped face image to be evaluated according to the size of the first training samples; and use a key point detection algorithm and/or a pose estimation algorithm to determine whether the size-normalized face image to be evaluated is a frontal image; wherein, if the face image to be evaluated is not a frontal image, the evaluation is stopped, and if it is a frontal image, the image quality evaluation model is used to evaluate the image quality of the size-normalized image to be evaluated.
  • the evaluation module is further configured to: perform row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; perform column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and input the combined image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, to determine that the image to be evaluated is of one of the plurality of preset image quality levels.
  • FIG. 7 shows a model training apparatus according to an embodiment of the present application for executing the model training method shown in FIG. 1. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the model training method described in the above embodiments.
  • FIG. 8 shows an image quality evaluation apparatus according to an embodiment of the present application for executing the image quality evaluation method shown in the above embodiments. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the image quality evaluation method described in the above embodiments.
  • a non-volatile computer storage medium for the model training method and the image quality evaluation method is also provided, having computer-executable instructions stored thereon, the computer-executable instructions being configured to, when executed by a processor, perform the methods described in the above embodiments.
  • the apparatuses, devices, and computer-readable storage media provided in the embodiments of the present application correspond one-to-one to the methods; therefore, the apparatuses, devices, and computer-readable storage media also have beneficial technical effects similar to those of the corresponding methods. Since the beneficial technical effects of the methods have been described in detail above, they are not repeated here.
  • embodiments of the present invention may be provided as a method, system or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, and information storage may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Abstract

A model training method, and an image quality evaluation method and apparatus. The training method comprises: acquiring a real image sample set, wherein the real image sample set comprises a plurality of real image samples (101); performing iterative training on a pre-built generative adversarial network by using the real image sample set, and collecting a plurality of pseudo image sample sets that are respectively generated within a plurality of rounds of iteration by a generative network in the generative adversarial network (102); generating a first training sample library composed of the real image sample set and the plurality of pseudo image sample sets, and automatically categorizing and labeling each first training sample of the first training sample library according to a plurality of preset image quality levels to obtain the first training sample library (103); and training a preset multi-classification network by using the first training sample library so as to obtain an image quality evaluation model (104). By using the foregoing method, only a small number of clear real image samples need to be collected to generate a large number of pseudo image samples of different quality levels, and automatic labeling reduces manual costs while improving the quality of data labeling, so that the image quality evaluation model is trained at a lower cost.

Description

Model Training Method, Image Quality Evaluation Method and Apparatus
This application claims priority to Chinese patent application No. 202011578791.8, filed on December 28, 2020 and entitled "Model Training Method, Image Quality Evaluation Method and Apparatus", the disclosure of which is incorporated herein by reference.
Technical Field
The present invention belongs to the field of computers, and in particular relates to a model training method, an image quality evaluation method, and corresponding apparatuses.
Background Art
This section is intended to provide a background or context for the embodiments of the invention recited in the claims. The description herein is not admitted to be prior art merely by inclusion in this section.
With the widespread application of object recognition technologies such as face recognition, the requirements for object recognition accuracy are becoming ever higher. However, the quality of the collected object images directly affects recognition accuracy, and poor-quality object images lead to misrecognition or missed recognition. It is therefore very important to perform quality evaluation before object recognition.
Image quality assessment is mainly divided into full-reference, reduced-reference, and no-reference quality assessment. Face image quality assessment, for example, belongs to no-reference quality assessment, because differences in individual facial features (including but not limited to hairstyle, glasses, and makeup) cause large variations in image content. Among no-reference quality assessment methods, most current approaches still need subjective quality scores to train the quality evaluation model.
The training process of an existing image quality assessment model mainly includes: collecting image data; manually cleaning and annotating the collected data; detecting the region of interest with a detection model and expanding the boundary margin to retain a complete object region; and inputting the object region together with the manually annotated quality labels into a deep learning network for training.
Training an image quality assessment model requires collecting a large amount of image data and annotating each image with a quality score label, which is an enormous amount of work. Moreover, because annotators are individually subjective and images differ in the richness of their content, it is difficult to formulate a unified standard for the annotation work: different people observing the same image will, due to cognitive differences, assign it different quality levels. Data collection and annotation for quality assessment has therefore always been a difficult problem in face image quality assessment.
SUMMARY OF THE INVENTION
In view of the problems existing in the above prior art, a model training method, an image quality evaluation method, and corresponding apparatuses are proposed, with which the above problems can be solved.
The present invention provides the following solutions.
In a first aspect, a method for training an image quality evaluation model is provided, including: acquiring a real image sample set, wherein the real image sample set includes a plurality of real image samples; iteratively training a pre-built generative adversarial network with the real image sample set, and collecting a plurality of pseudo-image sample sets respectively generated by the generative network in the generative adversarial network over a plurality of iteration rounds; generating a first training sample library consisting of the real image sample set and the plurality of pseudo-image sample sets, and automatically grading and labeling each first training sample in the first training sample library according to a plurality of preset image quality levels to obtain a labeled first training sample library; and training a preset multi-classification network with the first training sample library to obtain an image quality evaluation model.
In some embodiments, automatically grading and labeling each first training sample in the first training sample library includes: labeling the real image samples included in the real image sample set with the highest image quality level; and labeling the multiple pseudo-image samples included in each pseudo-image sample set with the corresponding image quality level according to the number of iteration rounds corresponding to that pseudo-image sample set, wherein a higher number of iterations corresponds to a higher image quality level.
In some embodiments, automatically grading and labeling each first training sample in the first training sample library includes: labeling the multiple real image samples included in the real image sample set with the highest image quality level; and calculating the Fréchet distance between each pseudo-image sample set and the real image sample set and, according to the calculation results, labeling the multiple pseudo-image samples included in each pseudo-image sample set with the corresponding image quality level, wherein a smaller Fréchet distance corresponds to a higher image quality level.
In some embodiments, automatically grading and labeling each first training sample in the first training sample library includes: labeling the multiple real image samples included in the real image sample set with the highest image quality level; and calculating the mean square error (MSE) value between each pseudo-image sample and a real image sample, and labeling each pseudo-image sample with the corresponding image quality level according to the MSE value, wherein a lower MSE value corresponds to a higher image quality level.
In some embodiments, acquiring the real image sample set further includes: collecting a plurality of real images and performing the following preprocessing operations on them: determining a region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the sizes of the plurality of real images to obtain the real image sample set.
In some embodiments, the real image samples are face images, and the object detection algorithm is a face detection algorithm.
In some embodiments, after the real image sample set is acquired, the method further includes: removing non-frontal face pictures from the real image sample set using a keypoint detection algorithm and/or a pose estimation algorithm.
In some embodiments, performing classification training on the preset multi-classification network using the first training sample library to obtain the image quality evaluation model includes: acquiring each labeled first training sample in the first training sample library, the label indicating the image quality level of the first training sample; performing row-direction filtering on each first training sample to obtain a first filtered image; performing column-direction filtering on each first training sample to obtain a second filtered image; concatenating each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and acquiring the plurality of second training samples corresponding to the plurality of first training samples and inputting them into the preset multi-classification network for iterative training to obtain the image quality evaluation model.
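The row/column filtering and concatenation step can be sketched as follows. The patent does not fix the filter kernel, so a first-order difference (1-D gradient) is assumed here as one plausible choice for exposing blur and sharpness; likewise, which axis counts as the "row direction" is a convention choice. Images are assumed to carry a trailing channel axis so the three images can be stacked along it.

```python
import numpy as np

def directional_filter(img, axis):
    """First-order difference along one axis, as an illustrative stand-in
    for the row/column filtering (the patent does not specify the kernel)."""
    out = np.zeros_like(img, dtype=np.float64)
    d = np.diff(img.astype(np.float64), axis=axis)
    if axis == 0:
        out[1:, ...] = d
    else:
        out[:, 1:, ...] = d
    return out

def build_second_sample(img):
    """Stack the original image with its row- and column-filtered versions
    along the channel axis, forming the second training sample."""
    row_f = directional_filter(img, axis=0)   # first filtered image
    col_f = directional_filter(img, axis=1)   # second filtered image
    return np.concatenate([img.astype(np.float64), row_f, col_f], axis=-1)
```

A single-channel H*W image thus becomes an H*W*3 tensor; the sample's quality label is carried over unchanged to the second training sample.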
In some embodiments, the preset multi-classification network is a ResNet network, which uses a binary cross-entropy function as its loss function and performs classification using a softmax function.
In some embodiments, the method further includes: pre-constructing a generative adversarial network including a generative network and a discriminative network. The generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and a ReLU activation function after each of the convolutional layers, and is used to receive random noise and generate fake image samples. The discriminative network includes a plurality of convolutional layers, a LeakyReLU activation function layer and a pooling layer after each of the convolutional layers, and a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the convolutional layers, and is used to judge the authenticity of real image samples and fake image samples.
In some embodiments, the loss function of the generative network is a cross-entropy function.
In a second aspect, an image quality evaluation method is provided, including: receiving an image to be evaluated; and performing image quality evaluation on the image to be evaluated using an image quality evaluation model trained by the method of the first aspect, so as to determine that the image to be evaluated belongs to one of a plurality of preset image quality levels.
In some embodiments, the image to be evaluated is a face image to be evaluated, the image quality evaluation model is used to evaluate the quality of face images, and the method further includes: after receiving the image to be evaluated, determining a region of interest (ROI) in the face image to be evaluated using a face detection algorithm, and cropping the face image to be evaluated according to the determined ROI; normalizing the size of the cropped face image to be evaluated according to the size of the first training samples; and determining, using a keypoint detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal face image, where the evaluation is stopped if it is not, and the image quality evaluation model is used to evaluate the image quality of the size-normalized image if it is.
In some embodiments, performing image quality evaluation on the image to be evaluated using the image quality evaluation model includes: performing row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; performing column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and inputting the merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, so as to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
In a third aspect, a model training apparatus is provided, including: an acquisition module for acquiring a real image sample set, where the real image sample set includes a plurality of real image samples; a generative adversarial network module for iteratively training a pre-constructed generative adversarial network using the real image sample set and collecting a plurality of fake image sample sets respectively generated by the generative network of the generative adversarial network in a plurality of iteration rounds; an automatic labeling module for generating a first training sample library composed of the real image sample set and the plurality of fake image sample sets, and automatically grading and labeling each first training sample in the first training sample library according to a plurality of preset image quality levels to obtain a labeled first training sample library; and a model training module for training a preset multi-classification network using the first training sample library to obtain an image quality evaluation model.
In some embodiments, the automatic labeling module is further configured to: label the real image samples in the real image sample set with the highest image quality level; and label the plurality of fake image samples in each fake image sample set with a corresponding image quality level according to the iteration round number corresponding to that fake image sample set, where a higher iteration round number corresponds to a higher image quality level.
In some embodiments, the automatic labeling module is further configured to: label the plurality of real image samples in the real image sample set with the highest image quality level; and compute the Fréchet distance between each fake image sample set and the real image sample set, labeling the plurality of fake image samples in each fake image sample set with a corresponding image quality level according to the computation result, where a smaller Fréchet distance corresponds to a higher image quality level.
In some embodiments, the automatic labeling module is further configured to: label the plurality of real image samples in the real image sample set with the highest image quality level; and compute the mean square error (MSE) value between each fake image sample and a real image sample, labeling each fake image sample with a corresponding image quality level according to its MSE value, where a lower MSE value corresponds to a higher image quality level.
In some embodiments, the acquisition module is further configured to: collect a plurality of real images and perform the following preprocessing operations on them: determining a region of interest (ROI) in each real image using an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the sizes of the plurality of real images to obtain the real image sample set.
In some embodiments, the real image samples are face images, and the object detection algorithm is a face detection algorithm.
In some embodiments, after the real image sample set is acquired, the acquisition module is further configured to: remove non-frontal face pictures from the real image sample set using a keypoint detection algorithm and/or a pose estimation algorithm.
In some embodiments, the model training module is further configured to: acquire each labeled first training sample in the first training sample library, the label indicating the image quality level of the first training sample; perform row-direction filtering on each first training sample to obtain a first filtered image; perform column-direction filtering on each first training sample to obtain a second filtered image; concatenate each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and acquire the plurality of second training samples corresponding to the plurality of first training samples and input them into the preset multi-classification network for iterative training to obtain the image quality evaluation model.
In some embodiments, the preset multi-classification network is a ResNet network, which uses a binary cross-entropy function as its loss function and performs classification using a softmax function.
In some embodiments, the generative adversarial network module is further configured to: pre-construct a generative adversarial network including a generative network and a discriminative network. The generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and a ReLU activation function after each of the convolutional layers, and is used to receive random noise and generate fake image samples. The discriminative network includes a plurality of convolutional layers, a LeakyReLU activation function layer and a pooling layer after each of the convolutional layers, and a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the convolutional layers, and is used to judge the authenticity of real image samples and fake image samples.
In some embodiments, the loss function of the generative network is a cross-entropy function.
In a fourth aspect, an image quality evaluation apparatus is provided, including: a receiving module for receiving an image to be evaluated; and an evaluation module for performing image quality evaluation on the image to be evaluated using an image quality evaluation model trained by the method of the first aspect, so as to determine that the image to be evaluated belongs to one of a plurality of preset image quality levels.
In some embodiments, the image to be evaluated is a face image to be evaluated, the image quality evaluation model is used to evaluate the quality of face images, and the evaluation module is further configured to: after receiving the image to be evaluated, determine a region of interest (ROI) in the face image to be evaluated using a face detection algorithm and crop the face image to be evaluated according to the determined ROI; normalize the size of the cropped face image to be evaluated according to the size of the first training samples; and determine, using a keypoint detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal face image, where the evaluation is stopped if it is not, and the image quality evaluation model is used to evaluate the image quality of the size-normalized image if it is.
In some embodiments, the evaluation module is further configured to: perform row-direction filtering on the image to be evaluated to obtain a first filtered image to be evaluated; perform column-direction filtering on the image to be evaluated to obtain a second filtered image to be evaluated; and input the merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, so as to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
In a fifth aspect, a model training apparatus is provided, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of the first aspect.
In a sixth aspect, an image quality evaluation apparatus is provided, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of the second aspect.
In a seventh aspect, a computer-readable storage medium is provided, which stores a program that, when executed by a multi-core processor, causes the multi-core processor to perform the method of the first aspect and/or the second aspect.
The above technical solutions adopted in the embodiments of the present application can achieve at least the following beneficial effects: only a small number of clear real image samples need to be collected to generate a large number of fake image samples of different quality levels, and labeling is completed during the generation process, which avoids manual intervention, reduces labor costs, and improves the quality of data labeling.
It should be understood that the above description is only an overview of the technical solutions of the present invention, provided so that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification. Specific embodiments of the present invention are described below by way of example to make the above and other objects, features, and advantages of the present invention more apparent and comprehensible.
Description of Drawings
The advantages and benefits described herein, as well as other advantages and benefits, will become apparent to those of ordinary skill in the art upon reading the following detailed description of the exemplary embodiments. The drawings are for the purpose of illustrating exemplary embodiments only and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
FIG. 1 is a schematic flowchart of a model training method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a generative adversarial network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a generative network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a discriminative network according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of concatenating a first training sample with the corresponding first filtered image and second filtered image according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a model training apparatus according to another embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an image quality evaluation apparatus according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope will be fully conveyed to those skilled in the art.
In the present invention, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, actions, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, actions, components, parts, or combinations thereof exist.
It should also be noted that, in the absence of conflict, the embodiments of the present invention and the features in the embodiments may be combined with each other. The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Embodiments of the present invention provide a model training method, an image quality evaluation method, and corresponding apparatuses. The inventive concept of the model training method is introduced first.
Embodiments of the present invention provide a model training method for training an image quality evaluation model. Specifically, a real image sample set including a plurality of real image samples is first acquired, and a pre-constructed generative adversarial network is iteratively trained using the real image sample set, while the plurality of fake image sample sets respectively generated by the generative network in a plurality of iteration rounds are collected, so as to form a first training sample library composed of the real image sample set and the plurality of fake image sample sets. Since the generative network gradually generates fake image samples of higher quality over the iteration rounds, each first training sample in the first training sample library can be automatically graded and labeled according to a plurality of preset image quality levels to obtain a labeled first training sample library. The first training sample library can then be used to train a preset multi-classification network to obtain an image quality evaluation model, and finally the trained image quality evaluation model can be used to evaluate an image to be evaluated and determine that it belongs to one of the plurality of preset image quality levels. In this embodiment, only a small number of clear real image samples need to be collected to generate a large number of fake image samples of different quality levels, and labeling is completed during the generation process, which avoids manual intervention, reduces labor costs while improving the quality of data labeling, and thus enables the image quality evaluation model to be trained at a lower cost.
Those skilled in the art can understand that the described application scenario is only one example in which the embodiments of the present invention can be implemented, and the scope of application of the embodiments of the present invention is not limited in any way. Having introduced the basic principles of the present invention, various non-limiting embodiments of the present invention are described in detail below.
FIG. 1 is a schematic flowchart of a model training method 100 according to an embodiment of the present application, which is used to obtain a model for evaluating image quality. In this process, from a device perspective, the execution subject may be one or more electronic devices; from a program perspective, the execution subject may correspondingly be a program running on these electronic devices.
As shown in FIG. 1, the method 100 may include the following steps.
Step 101: acquire a real image sample set, where the real image sample set includes a plurality of real image samples.
In an embodiment, in order to obtain a real image sample set convenient for subsequent training, step 101 may further include: collecting a plurality of real images and performing the following preprocessing operations on them: determining a region of interest (ROI) in each real image using an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the sizes of the plurality of real images to obtain the real image sample set.
The real images may be image data of a specific kind of object, for example face images, animal images, or vehicle images. The object detection algorithm is used to detect the target object in a real image so as to obtain the region of interest (ROI).
In an embodiment, the real image samples are face images, and the object detection algorithm is a face detection algorithm.
In an embodiment, after the real image sample set is acquired, the method may further include: removing non-frontal face pictures from the real image sample set using a keypoint detection algorithm and/or a pose estimation algorithm. This prevents non-frontal face pictures from adversely affecting subsequent training.
For example, a visible-light camera may be used to collect a database A of clear face pictures. For each picture in database A, an open-source face detection algorithm is used to detect the face region and obtain the region of interest (ROI) of that picture, and the original picture is cropped accordingly to obtain a corresponding database B of clear face pictures. Each picture in database B is then size-normalized to obtain a set of face pictures of size H*W, for example with H=120 and W=160. Finally, keypoint detection and pose estimation algorithms may be used to remove non-frontal pictures such as profile or pitched faces. The remaining face picture data are stored, yielding a face picture database of quality level 1 (the highest) as the real image sample set D1.
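The crop-and-normalize part of this preprocessing can be sketched as follows. The face detector itself is external and not shown; the ROI is assumed to be given as a (top, left, height, width) box, and nearest-neighbor resizing stands in for whatever interpolation a production pipeline would use. All names here are illustrative, not from the patent.

```python
import numpy as np

def crop_roi(img, roi):
    """Crop an image to the detected ROI. roi = (top, left, height, width),
    as it might be returned by a face detector (detector not shown)."""
    top, left, h, w = roi
    return img[top:top + h, left:left + w]

def resize_nearest(img, out_h=120, out_w=160):
    """Nearest-neighbor size normalization to H*W (H=120, W=160 as in the
    example above); a real pipeline would use proper interpolation."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows][:, cols]

def preprocess(img, roi):
    """Crop to the face ROI, then normalize to the training sample size."""
    return resize_nearest(crop_roi(img, roi))
```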
Step 102: iteratively train a pre-constructed generative adversarial network using the real image sample set, and collect a plurality of fake image sample sets respectively generated by the generative network of the generative adversarial network in a plurality of iteration rounds.
Referring to FIG. 2, the process of iteratively training the pre-constructed generative adversarial network is shown. During training, the generative network and the discriminative network have opposite goals: the discriminative network tries to distinguish fake images from real images, while the generative network tries to produce images that look real enough to fool the discriminative network. Since the generative adversarial network consists of two networks with different objectives, each training iteration can be divided into two phases. In the first phase, the discriminative network is trained: a batch of real images is sampled from the real image sample set D1, the generative network receives random noise R and generates fake image samples R', and the real image samples and fake image samples together form a training batch, in which the labels of the fake image samples are set to 0 (fake) and the labels of the real image samples are set to 1 (true); the discriminative network is trained on this labeled batch using a binary cross-entropy loss. In this phase, backpropagation only optimizes the weights of the discriminative network. In the second phase, the generative network is trained: the generative network first generates another batch of fake image samples, and the discriminative network is again used to judge whether each image is a fake image sample or a real image sample, with all labels set to 1 (true) in this phase. In other words, the goal is for the discriminative network to wrongly judge the fake image samples produced by the generative network as real. Crucially, in this step, the weights of the discriminative network are frozen, so backpropagation only affects the weights of the generative network.
It can be understood that, through the above iterative training process of the generative adversarial network, the generative network never actually generates any real image; however, as the training iterations progress, the quality gap between the fake image samples generated by the generative network and the real image samples gradually narrows.
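The two training phases described above can be sketched in terms of their label handling and binary cross-entropy losses. The network forward passes are omitted; `d_real` and `d_fake` stand for the discriminative network's sigmoid outputs on a batch of real and fake samples (illustrative names, not from the patent).

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy on sigmoid probabilities."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(np.mean(-(target * np.log(pred)
                           + (1 - target) * np.log(1 - pred))))

def discriminator_phase_loss(d_real, d_fake):
    """Phase 1: real samples are labeled 1 (true) and fake samples 0 (fake);
    only the discriminative network's weights would be updated on this loss."""
    preds = np.concatenate([d_real, d_fake])
    labels = np.concatenate([np.ones_like(d_real), np.zeros_like(d_fake)])
    return bce(preds, labels)

def generator_phase_loss(d_fake):
    """Phase 2: all labels are set to 1 (true), pushing the generative
    network to make the discriminative network call its fakes real; the
    discriminative network's weights are frozen in this phase."""
    return bce(d_fake, np.ones_like(d_fake))
```

A discriminator that scores real samples near 1 and fakes near 0 achieves a low phase-1 loss, while the generator's phase-2 loss shrinks only as the discriminator's scores on its fakes approach 1.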
In an embodiment, step 102 further includes: pre-constructing the generative adversarial network, which includes a generative network and a discriminative network. The generative network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and a ReLU activation function after each of the convolutional layers, and is used to generate fake image samples from random noise. The discriminative network includes a plurality of convolutional layers, a LeakyReLU activation function layer and a pooling layer after each of the convolutional layers, and a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the convolutional layers, and is used to judge the authenticity of real image samples and fake image samples.
For example, referring to Figure 3, the input of the generator network is a 20-dimensional random noise vector. The first layer is a linear mapping that maps the input to four-dimensional data of shape 1*3*(H*2)*(W*2). The second layer is a convolution that convolves the first layer's output with a 50*3*3 kernel (stride 1, padding 1). The third layer convolves the second layer's output with a 25*3*3 kernel (stride 1, padding 1). The fourth layer convolves the third layer's output with a 16*3*3 kernel (stride 2, padding 1). The fifth layer convolves the fourth layer's output with a 16*3*3 kernel (stride 1, padding 1). The sixth layer convolves the fifth layer's output with a 16*3*3 kernel (stride 1, padding 1). The seventh layer convolves the sixth layer's output with an 8*3*3 kernel (stride 1, padding 1). The eighth layer convolves the seventh layer's output with a 3*3*3 kernel (stride 1, padding 1). A batch normalization (BatchNormalization) layer and a ReLU activation function layer are added after the output of each of the above layers. In one embodiment, the loss function of the generator network is a cross-entropy function; specifically, it is the cross-entropy between the discriminator network's predictions on the fake image samples and the real labels.
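The generator described above can be sketched as follows. This is an illustrative PyTorch implementation under stated assumptions (H = W = 16 is chosen arbitrarily, and PyTorch itself is not named in the patent), not the patent's authoritative code.

```python
import torch
import torch.nn as nn

H, W = 16, 16  # hypothetical target size; the patent leaves H and W open

def conv_bn_relu(c_in, c_out, stride=1):
    # 3x3 convolution with padding 1, followed by BatchNorm and ReLU
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(),
    )

class Generator(nn.Module):
    def __init__(self, z_dim=20):
        super().__init__()
        # linear mapping layer: noise -> 1*3*(2H)*(2W) four-dimensional data
        self.fc = nn.Linear(z_dim, 3 * (2 * H) * (2 * W))
        self.body = nn.Sequential(
            conv_bn_relu(3, 50),             # second layer: 50*3*3 kernel
            conv_bn_relu(50, 25),            # third layer: 25*3*3 kernel
            conv_bn_relu(25, 16, stride=2),  # fourth layer: stride 2 halves H and W
            conv_bn_relu(16, 16),            # fifth layer
            conv_bn_relu(16, 16),            # sixth layer
            conv_bn_relu(16, 8),             # seventh layer: 8*3*3 kernel
            conv_bn_relu(8, 3),              # final layer: back to 3 channels
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 3, 2 * H, 2 * W)
        return self.body(x)

g = Generator()
fake = g(torch.randn(4, 20))  # a batch of 4 noise vectors
print(fake.shape)  # torch.Size([4, 3, 16, 16])
```

The single stride-2 layer is what reduces the 2H*2W mapped noise back to the H*W image size.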
For example, referring to Figure 4, the inputs of the discriminator network are the real image sample set D1 and the fake image sample set R'. The labels of the real image sample set D1 are set to 1 (real), the labels of the fake image sample set R' are set to 0 (fake), and a single-target binary cross-entropy function is used as the loss function. The first layer of the discriminator is a convolution of the input 1*3*H*W image data with a 32*7*7 kernel (stride 1, padding 3); the convolution result is processed by a LeakyReLU activation function and then by 2*2 average pooling with stride 2. The second layer convolves the first layer's output with a 32*3*3 kernel (stride 1, padding 1), processed by LeakyReLU and then by 2*2 average pooling with stride 2. The third layer convolves the second layer's output with a 16*3*3 kernel (stride 1, padding 1), processed by LeakyReLU and then by 2*2 average pooling with stride 2. The fourth layer consists of two fully connected layers: the third layer's output is mapped to 1*1024 dimensions and processed by LeakyReLU, then mapped from 1*1024 to 1*1; a final sigmoid activation yields a probability between 0 and 1 for binary (real/fake) classification.
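A matching sketch of the discriminator, again in PyTorch under the assumption H = W = 16 (any size divisible by 8 works, since three stride-2 poolings divide each spatial dimension by 8):

```python
import torch
import torch.nn as nn

H, W = 16, 16  # hypothetical input size, assumed divisible by 8

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=7, stride=1, padding=3),   # first layer
            nn.LeakyReLU(0.2),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1),  # second layer
            nn.LeakyReLU(0.2),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),  # third layer
            nn.LeakyReLU(0.2),
            nn.AvgPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * (H // 8) * (W // 8), 1024),  # map to 1*1024
            nn.LeakyReLU(0.2),
            nn.Linear(1024, 1),  # map 1*1024 to 1*1
            nn.Sigmoid(),        # probability in (0, 1) for real/fake
        )

    def forward(self, x):
        return self.classifier(self.features(x))

d = Discriminator()
p = d(torch.rand(4, 3, H, W))
print(p.shape)  # torch.Size([4, 1])
```

In training, `p` would be compared against the labels 1 (real) and 0 (fake) with a binary cross-entropy loss such as `nn.BCELoss`.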
Step 103: Generate a first training sample library consisting of the real image sample set and a plurality of fake image sample sets, and automatically grade and label each first training sample in the first training sample library according to a plurality of preset image quality levels, so as to obtain a labeled first training sample library.
In one embodiment, automatically grading and labeling each first training sample in the first training sample library in step 103 includes: labeling the plurality of fake image samples contained in each fake image sample set with the corresponding image quality level according to the number of iteration rounds of that set, where a higher number of iterations corresponds to a higher image quality level; and labeling the real image samples contained in the real image sample set with the highest image quality level.
For example, the preset image quality levels may be divided into six levels from high to low: level I, level II, ..., level VI. By saving the fake image samples produced at intermediate stages of training, fake image samples and real image samples can be stored in tiers according to quality. For instance, at iteration 500 the fake image sample set generated by the generator network has image quality level VI, and at iteration 1000 the generated fake image sample set has level V. As the number of iterations increases, the fake image samples generated by the generator network become harder to distinguish from the collected real image samples; that is, the generated fake image sample sets have higher image quality levels and better quality, so multiple tiers of fake image sample sets of different quality can be produced here. In other words, the plurality of fake image samples contained in each fake image sample set can be labeled with the corresponding image quality level according to the number of iteration rounds of that set (e.g., the 500 or 1000 rounds above), where a higher number of iterations corresponds to a higher image quality level, and the real image samples contained in the real image sample set are labeled with the highest image quality level, level I.
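The round-based labeling rule above can be expressed as a small lookup. The concrete checkpoint iteration counts below are invented for illustration (only 500 and 1000 appear in the text), and level I is reserved for real samples:

```python
LEVELS = ["I", "II", "III", "IV", "V", "VI"]  # level I = best (real samples)

def label_for_checkpoint(iteration, checkpoints=(2500, 2000, 1500, 1000, 500)):
    """Map a GAN training iteration count to a quality level: later
    (higher-iteration) checkpoints produce better fakes, so they receive
    a better level. The checkpoint values here are hypothetical."""
    for level, threshold in zip(LEVELS[1:], checkpoints):
        if iteration >= threshold:
            return level
    return LEVELS[-1]

print(label_for_checkpoint(500))   # VI
print(label_for_checkpoint(1000))  # V
```

Every sample saved at a given checkpoint inherits that checkpoint's level, so no manual annotation is needed.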
In another embodiment, automatically grading and labeling each first training sample in the first training sample library based on image quality in step 103 includes: calculating the Fréchet distance (Fréchet Inception Distance) between each fake image sample set and the real image sample set; labeling the plurality of fake image samples contained in each fake image sample set with the corresponding image quality level according to the calculation result, where a smaller Fréchet distance corresponds to a higher image quality level; and labeling the plurality of real image samples contained in the real image sample set with the highest image quality level.
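A minimal NumPy sketch of the Fréchet distance on feature vectors follows. In practice FID is computed on Inception embeddings of the images; here random vectors stand in for those features, which is an assumption of this sketch:

```python
import numpy as np

def _sqrtm_psd(m):
    # matrix square root of a symmetric positive semi-definite matrix
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T

def frechet_distance(feat_a, feat_b):
    """||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^{1/2}) for Gaussian
    fits of two feature sets; Tr((S_a S_b)^{1/2}) is evaluated through the
    symmetric form sqrt(S_a) S_b sqrt(S_a)."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    s_a = np.cov(feat_a, rowvar=False)
    s_b = np.cov(feat_b, rowvar=False)
    a = _sqrtm_psd(s_a)
    covmean = _sqrtm_psd(a @ s_b @ a)
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(s_a + s_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 8))   # stand-in for real-image features
shifted = real + 1.0               # a clearly different distribution
print(frechet_distance(real, real) < 1e-6)    # True: identical sets
print(frechet_distance(real, shifted) > 1.0)  # True: larger distance = worse
```

Each checkpoint folder would be scored against D1 with this distance, and smaller distances would be assigned better quality levels.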
For example, the preset image quality levels may be divided into six levels from high to low: level I, level II, ..., level VI. The fake image samples produced at different iteration counts during training can be saved in separate folders; for instance, folder F1 stores the fake image samples produced at the 10th training round, folder F2 stores those produced at the 20th round, and so on. The Fréchet distance between the data in these folders and the real image sample set D1 can then be calculated to measure the quality gap between the generated fake image samples and the clear real image samples. Based on the calculation results and the distribution of the Fréchet distances, the generated folders are merged into five categories and arranged in order of increasing distance, yielding the real image sample set D1 of level I quality, a fake image sample set of level II quality, ..., and a fake image sample set of level VI quality.
Optionally, a parameter for evaluating the degree of similarity, such as cosine similarity or KL divergence (Kullback-Leibler divergence), may be used in place of the Fréchet distance above. For example, the cosine similarity between the data in the folders and the real image sample set D1 can be calculated to measure the quality gap between the generated fake image samples and the clear real image samples, and the sets can be arranged in order of decreasing similarity, yielding the real image sample set D1 of level I quality, a fake image sample set of level II quality, ..., and a fake image sample set of level VI quality.
Optionally, the above similarity evaluation may be performed based on the data in the folders and part of the image information of the real image sample set D1, or based on the data in the folders and all of the image information of the real image sample set D1; this application does not specifically limit this.
After the above steps are completed, the automatically labeled first training sample library D is obtained, where the first training sample library D contains six subfolders respectively corresponding to the real image sample set D1 of level I quality, the fake image sample set D2 of level II quality, the fake image sample set D3 of level III quality, the fake image sample set D4 of level IV quality, the face picture database D5 of level V quality, and the face picture database D6 of level VI quality.
In yet another embodiment, automatically grading and labeling each first training sample in the first training sample library based on image quality in step 103 includes: calculating the mean squared error (MSE) between each fake image sample and a real image sample; labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level; and labeling the plurality of real image samples contained in the real image sample set with the highest image quality level.
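A hedged sketch of the MSE-based labeling follows. The threshold values are invented for illustration, since the patent does not specify the MSE cut-offs between levels:

```python
import numpy as np

LEVELS = ["I", "II", "III", "IV", "V", "VI"]  # level I is reserved for real samples

def mse(a, b):
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def level_from_mse(value, thresholds=(0.01, 0.04, 0.09, 0.16)):
    """Lower MSE against the real reference image -> higher quality level.
    Fake samples receive levels II..VI; the thresholds are hypothetical."""
    for level, t in zip(LEVELS[1:], thresholds):
        if value <= t:
            return level
    return LEVELS[-1]

real = np.zeros((4, 4))
print(level_from_mse(mse(real, real + 0.05)))  # II  (MSE = 0.0025)
print(level_from_mse(mse(real, real + 0.50)))  # VI  (MSE = 0.25)
```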
Step 104: Train a preset multi-classification network using the first training sample library to obtain an image quality evaluation model.
The first training sample library consists of the real image sample set and a plurality of fake image sample sets, and each first training sample in it carries a label indicating its image quality. For example, assuming the image quality is divided into six levels from high to low, level I, level II, ..., level VI, a first training sample that is a real image sample is labeled level I, and the first training samples that are fake image samples are labeled level II, ..., level VI in order of image quality from good to bad. Therefore, the first training sample library can be used to train the preset multi-classification network until convergence, and the resulting image quality evaluation model can determine, from an input image, that its image quality is one of level I, level II, ..., level VI.
In one embodiment, step 104 may specifically include: acquiring each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; applying row-direction filtering to each first training sample to obtain a first filtered image; applying column-direction filtering to each first training sample to obtain a second filtered image; splicing each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and obtaining the plurality of second training samples corresponding to the plurality of first training samples and inputting them into the preset multi-classification network for iterative training to obtain the image quality evaluation model.
Referring to Figure 5, each labeled first training sample Img in the first training sample library is filtered in the row direction and in the column direction: the first training sample Img is convolved with a 1*N convolution kernel to obtain the row-direction filtered image Img1 (the first filtered image), and Img is convolved with another N*1 convolution kernel to obtain the column-direction filtered image Img2 (the second filtered image). Img, Img1, and Img2 are then merged into one H*(3*W) image (the second training sample). As shown in Figure 5, in the second training sample the first training sample Img is on the left, the first filtered image Img1 is in the middle, and the second filtered image Img2 is on the right. The plurality of second training samples corresponding to the plurality of first training samples constitute a second training sample library. Inputting these second training samples into the preset multi-classification network for iterative training yields the image quality evaluation model.
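The row/column filtering and H*(3*W) splicing can be sketched in NumPy. Mean kernels are an assumption of this sketch: the patent only fixes the 1*N and N*1 kernel shapes, not their weights, and a single-channel image stands in for the real input:

```python
import numpy as np

def build_second_sample(img, n=3):
    """Filter img with a 1*N kernel along rows and an N*1 kernel along
    columns, then concatenate [img | row-filtered | column-filtered]
    along the width into an H x (3W) array."""
    k = np.ones(n) / n  # illustrative mean kernel
    row_f = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    col_f = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, img)
    return np.concatenate([img, row_f, col_f], axis=1)

img = np.random.rand(8, 8)  # one single-channel training sample
merged = build_second_sample(img)
print(merged.shape)  # (8, 24), i.e. H x (3W)
```

The original sample is kept untouched in the left third, so the classifier sees both the raw pixels and the two filtered views.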
It can be understood that, with the image quality evaluation model trained in this embodiment, when the model is trained by deep learning after data labeling is completed, traditional digital-image preprocessing is applied to the model input for quality evaluation training, which adds feature information to the model input and improves the stability and generalization ability of the model.
In one embodiment, the preset multi-classification network is a ResNet network that uses a binary cross-entropy function as the loss function and performs classification using a softmax function. Optionally, the preset multi-classification network may also use a network other than ResNet.
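The softmax-plus-cross-entropy head mentioned above works as follows. This is a NumPy sketch, and the six-level logits are illustrative values, not outputs of an actual ResNet:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilize the exponent
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-probability of the true class under softmax(logits)."""
    p = softmax(logits)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))

# one sample, six quality levels I..VI mapped to classes 0..5
logits = np.array([[4.0, 1.0, 0.5, 0.2, 0.1, 0.0]])
probs = softmax(logits)
print(probs.argmax())  # 0, i.e. level I is predicted
```

The loss is low when the predicted distribution concentrates on the labeled level and high otherwise, which is what drives training of the classifier.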
Based on the same technical concept, an embodiment of the present invention further provides an image quality evaluation method, which is performed using the model obtained by the model training method of the above embodiments and specifically includes: receiving an image to be evaluated; and performing image quality evaluation on the image to be evaluated using the image quality evaluation model trained by the training method described in the above embodiments, so as to determine that the image to be evaluated belongs to one of a plurality of preset image quality levels.
In some implementations, the image to be evaluated is a face image to be evaluated and the image quality evaluation model is used to evaluate the quality of face images, and the method further includes: after receiving the image to be evaluated, determining the region of interest (ROI) in the face image to be evaluated using a face detection algorithm, and cropping the face image to be evaluated according to the determined ROI; normalizing the size of the cropped face image to be evaluated according to the size of the first training samples; and determining, using a keypoint detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal image, where if the face image to be evaluated is not a frontal image the evaluation is stopped, and if it is a frontal image the image quality evaluation model is used to evaluate the image quality of the size-normalized image.
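The evaluation flow above (detect ROI, crop, resize, frontal check, then score) can be sketched as a small orchestration function. The detector, frontal-pose check, and quality model are caller-supplied callables here because the patent does not name concrete algorithms, and nearest-neighbour resizing is an illustrative choice:

```python
import numpy as np

def evaluate_face_image(image, detect_roi, is_frontal, model, size=(16, 16)):
    """Returns the model's quality level, or None if no face is found or
    the face is not frontal (evaluation stops in that case). `size` stands
    in for the first-training-sample size."""
    roi = detect_roi(image)
    if roi is None:
        return None
    x, y, w, h = roi
    face = image[y:y + h, x:x + w]
    ys = np.linspace(0, face.shape[0] - 1, size[0]).astype(int)
    xs = np.linspace(0, face.shape[1] - 1, size[1]).astype(int)
    face = face[np.ix_(ys, xs)]  # size normalization
    if not is_frontal(face):
        return None  # stop evaluation for non-frontal faces
    return model(face)

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
level = evaluate_face_image(
    img,
    detect_roi=lambda im: (8, 8, 32, 32),  # hypothetical detector output
    is_frontal=lambda f: True,
    model=lambda f: "I",                   # hypothetical quality model
)
print(level)  # I
```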
In some implementations, performing image quality evaluation on the image to be evaluated using the image quality evaluation model includes: applying row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; applying column-direction filtering to the image to be evaluated to obtain a second filtered image to be evaluated; and inputting the merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, so as to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
Based on the same technical concept, an embodiment of the present invention further provides a model training apparatus for executing the image quality evaluation model training method provided in Figure 1 above. Figure 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention.
As shown in Figure 6, the model training apparatus includes:
an acquisition module 601, configured to acquire a real image sample set, where the real image sample set includes a plurality of real image samples;
a generative adversarial network module 602, configured to iteratively train a pre-constructed generative adversarial network using the real image sample set, and to collect a plurality of fake image sample sets respectively generated by the generator network of the generative adversarial network in a plurality of iteration rounds;
an automatic labeling module 603, configured to generate a first training sample library consisting of the real image sample set and the plurality of fake image sample sets, and to automatically grade and label each first training sample in the first training sample library according to a plurality of preset image quality levels, obtaining a labeled first training sample library; and
a model training module 604, configured to train a preset multi-classification network using the first training sample library to obtain an image quality evaluation model.
In some implementations, the automatic labeling module is further configured to: label the real image samples contained in the real image sample set with the highest image quality level; and label the plurality of fake image samples contained in each fake image sample set with the corresponding image quality level according to the number of iteration rounds of that set, where a higher number of iterations corresponds to a higher image quality level.
In some implementations, the automatic labeling module is further configured to: label the plurality of real image samples contained in the real image sample set with the highest image quality level; and calculate the Fréchet distance between each fake image sample set and the real image sample set, labeling the plurality of fake image samples contained in each fake image sample set with the corresponding image quality level according to the calculation result, where a smaller Fréchet distance corresponds to a higher image quality level.
In some implementations, the automatic labeling module is further configured to: label the plurality of real image samples contained in the real image sample set with the highest image quality level; and calculate the mean squared error (MSE) between each fake image sample and a real image sample, labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
In some implementations, the acquisition module is further configured to: collect a plurality of real images and perform the following preprocessing operations on them: determining the region of interest (ROI) in each real image using an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the sizes of the plurality of real images to obtain the real image sample set.
In some implementations, the real image samples are face images, and the object detection algorithm is a face detection algorithm.
In some implementations, after the real image sample set is acquired, the acquisition module is further configured to remove non-frontal face pictures from the real image sample set using a keypoint detection algorithm and/or a pose estimation algorithm.
In some implementations, the model training module is further configured to: acquire each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; apply row-direction filtering to each first training sample to obtain a first filtered image; apply column-direction filtering to each first training sample to obtain a second filtered image; splice each first training sample with its corresponding first filtered image and second filtered image to generate a labeled second training sample; and obtain the plurality of second training samples corresponding to the plurality of first training samples and input them into the preset multi-classification network for iterative training to obtain the image quality evaluation model.
In some implementations, the preset multi-classification network is a ResNet network that uses a binary cross-entropy function as the loss function and performs classification using a softmax function.
In some implementations, the generative adversarial network module is further configured to pre-construct a generative adversarial network comprising a generator network and a discriminator network, where the generator network includes a linear mapping layer, a plurality of convolutional layers, and a batch-normalization function and a ReLU activation function after each of the convolutional layers, and is configured to receive random noise and generate fake image samples; and the discriminator network includes a plurality of convolutional layers, each followed by a LeakyReLU activation function layer and a pooling layer, as well as a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the convolutional layers, and is configured to judge whether real image samples and fake image samples are genuine or fake.
In some implementations, the loss function of the generator network is a cross-entropy function.
It should be noted that the model training apparatus in the embodiments of the present application can implement each process of the foregoing model training method embodiments and achieve the same effects and functions, which will not be repeated here.
Based on the same technical concept, an embodiment of the present invention further provides an image quality evaluation apparatus for executing the image quality evaluation method provided by the above embodiments. It specifically includes: a receiving module, configured to receive an image to be evaluated; and an evaluation module, configured to perform image quality evaluation on the image to be evaluated using the image quality evaluation model trained by the method of the first aspect, so as to determine that the image to be evaluated belongs to one of a plurality of preset image quality levels.
In some implementations, the image to be evaluated is a face image to be evaluated, the image quality evaluation model is used to evaluate the quality of face images, and the evaluation module is further configured to: after receiving the image to be evaluated, determine the region of interest (ROI) in the face image to be evaluated using a face detection algorithm, and crop the face image to be evaluated according to the determined ROI; normalize the size of the cropped face image to be evaluated according to the size of the first training samples; and determine, using a keypoint detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal image, where if the face image to be evaluated is not a frontal image the evaluation is stopped, and if it is a frontal image the image quality evaluation model is used to evaluate the image quality of the size-normalized image.
In some implementations, the evaluation module is further configured to: apply row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; apply column-direction filtering to the image to be evaluated to obtain a second filtered image to be evaluated; and input the merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, so as to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
Figure 7 shows a model training apparatus according to an embodiment of the present application for executing the model training method shown in Figure 1. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the model training method described in the above embodiments.
Figure 8 shows an image quality evaluation apparatus according to an embodiment of the present application for executing the image quality evaluation method shown in the above embodiments. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the image quality evaluation method described in the above embodiments.
According to some embodiments of the present application, there is provided a non-volatile computer storage medium for the model training method and the image quality evaluation method, on which computer-executable instructions are stored, the computer-executable instructions being configured to, when run by a processor, execute the methods described in the above embodiments.
The embodiments of the present application are described in a progressive manner; for identical or similar parts the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus, device, and computer-readable storage medium embodiments are substantially similar to the method embodiments, their descriptions are simplified, and for the relevant parts reference may be made to the corresponding descriptions of the method embodiments.
The apparatuses, devices, and computer-readable storage media provided in the embodiments of the present application correspond one-to-one with the methods; therefore, they also have beneficial technical effects similar to those of the corresponding methods. Since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the apparatuses, devices, and computer-readable storage media are not repeated here.
本领域内的技术人员应明白,本发明的实施例可提供为方法、系统或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present invention is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中，使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品，该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上，使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理，从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
内存可能包括计算机可读介质中的非永久性存储器，随机存取存储器(RAM)和/或非易失性内存等形式，如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。The memory may include non-persistent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括，但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带，磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质，可用于存储可以被计算设备访问的信息。此外，尽管在附图中以特定顺序描述了本发明方法的操作，但是，这并非要求或者暗示必须按照该特定顺序来执行这些操作，或是必须执行全部所示的操作才能实现期望的结果。附加地或备选地，可以省略某些步骤，将多个步骤合并为一个步骤执行，和/或将一个步骤分解为多个步骤执行。Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. Furthermore, although the operations of the methods of the present invention are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
虽然已经参考若干具体实施方式描述了本发明的精神和原理，但是应该理解，本发明并不限于所公开的具体实施方式，对各方面的划分也不意味着这些方面中的特征不能组合以进行受益，这种划分仅是为了表述的方便。本发明旨在涵盖所附权利要求的精神和范围内所包括的各种修改和等同布置。While the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed, nor does the division into aspects imply that features in these aspects cannot be combined to advantage; this division is only for convenience of presentation. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (31)

  1. 一种模型训练方法,其特征在于,包括:A model training method, comprising:
    获取真实图像样本集,其中,所述真实图像样本集包括多个真实图像样本;obtaining a real image sample set, wherein the real image sample set includes a plurality of real image samples;
    利用所述真实图像样本集对预先构建的生成对抗网络进行迭代训练，收集所述生成对抗网络中的生成网络在多个迭代轮次中分别生成的多个伪图像样本集；iteratively training a pre-built generative adversarial network by using the real image sample set, and collecting a plurality of pseudo-image sample sets respectively generated, in a plurality of iteration rounds, by a generative network in the generative adversarial network;
    生成由所述真实图像样本集和所述多个伪图像样本集组成的第一训练样本库，根据多个预设的图像质量级别对所述第一训练样本库的每个第一训练样本进行自动分级标注，得到带标签的所述第一训练样本库；generating a first training sample library composed of the real image sample set and the plurality of pseudo-image sample sets, and automatically grading and labeling each first training sample in the first training sample library according to a plurality of preset image quality levels, to obtain the labeled first training sample library;
    利用所述第一训练样本库对预设多分类网络进行训练，获得图像质量评估模型。training a preset multi-classification network by using the first training sample library, to obtain an image quality evaluation model.
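By way of illustration only (the claim fixes the steps, not any data structures), the collect-then-label pipeline of claim 1 can be sketched as below. `generate(round_idx)` is a hypothetical stand-in for sampling the generator after a given round of adversarial training, and the mapping of rounds onto levels is one assumed scheme among many:

```python
def collect_fake_sample_sets(generate, rounds, samples_per_round):
    """Collect one pseudo-image sample set per training round.

    `generate(r)` stands in for sampling the generator after round r;
    early rounds are expected to yield cruder images.
    """
    return {r: [generate(r) for _ in range(samples_per_round)]
            for r in range(1, rounds + 1)}

def build_labeled_library(real_samples, fake_sets, num_levels):
    """Label real samples with the top level and spread each round's
    fake set over the lower levels (later rounds -> higher quality)."""
    rounds = sorted(fake_sets)
    library = [(img, num_levels - 1) for img in real_samples]  # highest level
    for i, r in enumerate(rounds):
        # spread rounds over levels 0 .. num_levels-2 (assumed binning)
        level = min(i * (num_levels - 1) // max(len(rounds), 1), num_levels - 2)
        library += [(img, level) for img in fake_sets[r]]
    return library
```

The labeled library can then be fed to any multi-class classifier as the "first training sample library" of the claim.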
  2. 根据权利要求1所述的方法,其特征在于,对所述第一训练样本库的每个第一训练样本进行自动分级标注,包括:The method according to claim 1, wherein the automatic grading and labeling of each first training sample in the first training sample library comprises:
    将所述真实图像样本集包含的所述真实图像样本标注为最高的图像质量级别;Marking the real image samples included in the real image sample set as the highest image quality level;
    根据每个所述伪图像样本集对应的迭代轮次数将每个所述伪图像样本集包含的多个伪图像样本标注为对应的图像质量级别，其中，更高的迭代次数对应于更高的图像质量级别。labeling the plurality of pseudo-image samples included in each pseudo-image sample set with a corresponding image quality level according to the number of the iteration round corresponding to that pseudo-image sample set, where a higher iteration round corresponds to a higher image quality level.
  3. 根据权利要求1或2所述的方法,其特征在于,对所述第一训练样本库的每个第一训练样本进行自动分级标注,包括:The method according to claim 1 or 2, wherein the automatic grading and labeling of each first training sample in the first training sample library comprises:
    将所述真实图像样本集包含的多个所述真实图像样本标注为最高的图像质量级别;Marking a plurality of the real image samples included in the real image sample set as the highest image quality level;
    计算每个所述伪图像样本集与所述真实图像样本集的弗雷歇距离，根据计算结果将每个所述伪图像样本集包含的多个伪图像样本标注为对应的图像质量级别，其中，更小的弗雷歇距离对应于更高的图像质量级别。calculating the Fréchet distance between each pseudo-image sample set and the real image sample set, and labeling the plurality of pseudo-image samples included in each pseudo-image sample set with a corresponding image quality level according to the calculation result, where a smaller Fréchet distance corresponds to a higher image quality level.
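As an illustrative sketch (not part of the claim), the Fréchet distance between two sample sets is conventionally computed between Gaussian fits of their feature vectors; the claim does not fix the feature extractor (standard FID uses Inception embeddings). The trace identity Tr √(C₁C₂) = Tr √(S₁C₂S₁) with S₁ = √C₁ keeps everything within symmetric eigendecompositions:

```python
import numpy as np

def _psd_sqrt(m):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)            # clamp tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussian fits of two (N, D) feature sets:
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 sqrt(C1 C2))."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    s1 = _psd_sqrt(c1)
    cross = np.trace(_psd_sqrt(s1 @ c2 @ s1))
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(c1) + np.trace(c2) - 2.0 * cross)
```

A set identical to the real set gives distance 0; shifting the mean while keeping the covariance adds the squared shift, which matches the "smaller distance means higher quality" ordering of the claim.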
  4. 根据权利要求1-3中任意一项所述的方法,其特征在于,对所述第一训练样本库的每个第一训练样本进行自动分级标注,包括:The method according to any one of claims 1-3, wherein the automatic grading and labeling of each first training sample in the first training sample library comprises:
    将所述真实图像样本集包含的多个所述真实图像样本标注为最高的图像质量级别;Marking a plurality of the real image samples included in the real image sample set as the highest image quality level;
    计算每个伪图像样本与所述真实图像样本的均方误差(MSE)值，根据所述均方误差(MSE)值将每个所述伪图像样本标注为对应的图像质量级别，其中，更低的均方误差(MSE)值对应于更高的图像质量级别。calculating a mean square error (MSE) value between each pseudo-image sample and the real image sample, and labeling each pseudo-image sample with a corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
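Illustratively, the per-sample MSE labeling of claim 4 reduces to a pixel-wise mean of squared differences followed by a threshold binning; the thresholds below are assumptions, since the claim leaves the binning open:

```python
import numpy as np

def mse(img_a, img_b):
    """Mean squared error between two equally-sized images."""
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def level_from_mse(value, thresholds):
    """Map an MSE value to a quality level (lower MSE -> higher level).

    `thresholds` is an ascending list of cut-offs; the scheme is
    illustrative only.
    """
    for i, t in enumerate(thresholds):
        if value <= t:
            return len(thresholds) - i
    return 0
```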
  5. 根据权利要求1-4中任意一项所述的方法,其特征在于,所述获取真实图像样本集还包括:The method according to any one of claims 1-4, wherein the acquiring a real image sample set further comprises:
    采集多个真实图像，对多个所述真实图像进行以下预处理操作：collecting a plurality of real images, and performing the following preprocessing operations on the plurality of real images:
    利用对象检测算法确定每个所述真实图像中的感兴趣区域(ROI)，并根据确定的所述感兴趣区域(ROI)对每个所述真实图像进行裁剪处理；以及，对多个所述真实图像进行尺寸归一化，以得到所述真实图像样本集。determining a region of interest (ROI) in each real image by using an object detection algorithm, and cropping each real image according to the determined ROI; and performing size normalization on the plurality of real images, to obtain the real image sample set.
  6. 根据权利要求5所述的方法,其特征在于,所述真实图像样本为人脸图像,所述对象检测算法为人脸检测算法。The method according to claim 5, wherein the real image sample is a face image, and the object detection algorithm is a face detection algorithm.
  7. 根据权利要求6所述的方法,其特征在于,所述获取真实图像样本集之后,所述方法还包括:The method according to claim 6, wherein after acquiring the real image sample set, the method further comprises:
    利用关键点检测算法和/或姿态估计算法去除所述真实图像样本集中的非正脸图片。removing non-frontal face pictures from the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm.
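A minimal sketch of the crop-and-normalize preprocessing of claims 5–7, assuming the detector has already produced a bounding box (the face/key-point detectors themselves are not shown). Nearest-neighbour resizing is a stand-in, as the patent does not fix the interpolation:

```python
import numpy as np

def crop_roi(image, box):
    """Crop an (H, W[, C]) array to box = (top, left, height, width),
    e.g. a face bounding box from an external detector."""
    top, left, h, w = box
    return image[top:top + h, left:left + w]

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour size normalisation via index sampling."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows][:, cols]

def preprocess(image, box, size=(112, 112)):
    """ROI crop followed by size normalisation (112x112 is assumed)."""
    return resize_nearest(crop_roi(image, box), *size)
```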
  8. 根据权利要求1-7中任意一项所述的方法,其特征在于,利用所述第一训练样本库对预设多分类网络进行分类训练,获得图像质量评估模型,包括:The method according to any one of claims 1-7, characterized in that, using the first training sample library to perform classification training on a preset multi-classification network to obtain an image quality evaluation model, comprising:
    获取所述第一训练样本库中的每个带标签的所述第一训练样本,所述标签用于指示所述第一训练样本的图像质量级别;acquiring each labeled first training sample in the first training sample library, where the label is used to indicate an image quality level of the first training sample;
    对每个所述第一训练样本进行行方向滤波处理,得到第一滤波图像;Perform row direction filtering processing on each of the first training samples to obtain a first filtered image;
    对每个所述第一训练样本进行列方向滤波处理,得到第二滤波图像;Perform column-direction filtering processing on each of the first training samples to obtain a second filtered image;
    将每个所述第一训练样本和对应的所述第一滤波图像和所述第二滤波图像进行拼接合并,分别生成带标签的第二训练样本;splicing and merging each of the first training samples and the corresponding first filtered images and the second filtered images to generate a labeled second training sample respectively;
    分别获取多个所述第一训练样本对应的多个所述第二训练样本，并将多个所述第二训练样本输入所述预设多分类网络进行迭代训练，以获取所述图像质量评估模型。obtaining a plurality of second training samples respectively corresponding to the plurality of first training samples, and inputting the plurality of second training samples into the preset multi-classification network for iterative training, to obtain the image quality evaluation model.
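The row/column filtering and channel-stacking of claim 8 can be sketched as follows. A simple horizontal-difference kernel is used for illustration only; the claim does not specify which row- or column-direction filter is applied:

```python
import numpy as np

def filter_rows(img, kernel=(1.0, -1.0)):
    """Filter every row with a 1-D kernel (edge-padded so the output
    keeps the input width)."""
    k = np.asarray(kernel, dtype=np.float64)
    pad = len(k) - 1
    padded = np.pad(np.asarray(img, dtype=np.float64),
                    ((0, 0), (0, pad)), mode="edge")
    return np.stack([np.convolve(row, k, mode="valid") for row in padded])

def filter_cols(img, kernel=(1.0, -1.0)):
    """The same filtering applied in the column direction."""
    return filter_rows(np.asarray(img).T, kernel).T

def make_second_sample(img):
    """Stack the original image with its row- and column-filtered
    versions into one multi-channel second training sample."""
    img = np.asarray(img, dtype=np.float64)
    return np.stack([img, filter_rows(img), filter_cols(img)], axis=-1)
```

Feeding such three-channel samples to the classifier gives it explicit horizontal and vertical detail maps alongside the raw pixels.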
  9. 根据权利要求1-8中任意一项所述的方法,其特征在于,The method according to any one of claims 1-8, wherein,
    所述预设多分类网络为ResNet网络,所述预设多分类网络使用二分类交叉熵函数作为损失函数且利用softmax函数进行二分类。The preset multi-classification network is a ResNet network, and the preset multi-classification network uses a binary cross-entropy function as a loss function and uses a softmax function for binary classification.
  10. 根据权利要求1-9中任意一项所述的方法,其特征在于,所述方法还包括:The method according to any one of claims 1-9, wherein the method further comprises:
    预先构建所述生成对抗网络,所述生成对抗网络包括所述生成网络和判别网络;其中,The generative adversarial network is pre-built, and the generative adversarial network includes the generative network and the discriminant network; wherein,
    所述生成网络包括线性映射层、多个卷积层以及位于所述多个卷积层的每个卷积层之后的批标准化函数和ReLU激活函数,所述生成网络用于接收随机噪音并生成所述伪图像样本;The generation network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and a ReLU activation function after each of the plurality of convolutional layers, the generation network is used to receive random noise and generate the pseudo-image sample;
    所述判别网络包括多个卷积层和位于所述多个卷积层的每个卷积层之后的LeakyRelu激活函数层和池化层，以及位于所述多个卷积层之后的全连接层、LeakyRelu激活函数层和sigmoid激活函数层，所述判别网络用于对所述真实图像样本和所述伪图像样本进行真伪判定。the discriminant network includes a plurality of convolutional layers, a LeakyReLU activation function layer and a pooling layer after each of the plurality of convolutional layers, and a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the plurality of convolutional layers; the discriminant network is configured to perform authenticity determination on the real image samples and the pseudo-image samples.
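To make the discriminator stack of claim 10 concrete, the sketch below traces feature-map sizes through an assumed sequence of conv → LeakyReLU → pool blocks. The input size, block count, and kernel/padding/pooling values are all assumptions; the claim fixes only the layer types, not their dimensions:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Standard convolution output-size formula."""
    return (size + 2 * pad - kernel) // stride + 1

def discriminator_feature_sizes(input_size=64, blocks=4, kernel=3, pad=1, pool=2):
    """Feature-map side lengths after each conv+pool block of an
    assumed discriminator; each block halves the spatial size."""
    sizes = [input_size]
    s = input_size
    for _ in range(blocks):
        s = conv_out(s, kernel, stride=1, pad=pad)  # size-preserving 3x3 conv
        s //= pool                                  # 2x2 pooling halves the map
        sizes.append(s)
    return sizes
```

The final feature map would then be flattened into the fully connected + LeakyReLU + sigmoid head that emits the real/fake probability.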
  11. 根据权利要求10所述的方法,其特征在于,所述方法还包括:The method of claim 10, wherein the method further comprises:
    所述生成网络的损失函数采用交叉熵函数。The loss function of the generating network adopts a cross entropy function.
  12. 一种图像质量评估方法,其特征在于,包括:An image quality assessment method, comprising:
    接收待评估图像;receive images to be evaluated;
    利用如权利要求1-11中任意一项所述的方法训练得到的所述图像质量评估模型对所述待评估图像进行图像质量评估，以确认所述待评估图像为多个预设的图像质量级别之一。performing image quality evaluation on the image to be evaluated by using the image quality evaluation model trained by the method according to any one of claims 1-11, to confirm that the image to be evaluated belongs to one of a plurality of preset image quality levels.
  13. 根据权利要求12所述的方法,其特征在于,所述待评估图像为待评估人脸图像,所述图像质量评估模型用于对人脸图像进行质量评估,所述方法还包括:The method according to claim 12, wherein the image to be evaluated is a face image to be evaluated, the image quality evaluation model is used to perform quality evaluation on the face image, and the method further comprises:
    接收所述待评估图像之后，利用人脸检测算法确定所述待评估人脸图像中的感兴趣区域(ROI)，并根据确定的所述感兴趣区域(ROI)对所述待评估人脸图像进行裁剪处理；after receiving the image to be evaluated, determining a region of interest (ROI) in the face image to be evaluated by using a face detection algorithm, and cropping the face image to be evaluated according to the determined ROI;
    根据所述第一训练样本的尺寸对裁剪处理后的所述待评估人脸图像进行尺寸归一化；performing size normalization on the cropped face image to be evaluated according to the size of the first training samples;
    利用关键点检测算法和/或姿态估计算法确定尺寸归一化之后的所述待评估人脸图像是否为正脸图像；determining, by using a key point detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal face image;
    其中，若所述待评估人脸图像不是正脸图像则停止所述评估，若所述待评估人脸图像是正脸图像则利用所述图像质量评估模型对尺寸归一化之后的所述待评估图像进行图像质量评估。where the evaluation is stopped if the face image to be evaluated is not a frontal face image, and if the face image to be evaluated is a frontal face image, the image quality evaluation model is used to perform image quality evaluation on the size-normalized image to be evaluated.
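The gating logic of claim 13 (detect, normalize, frontal check, then evaluate) can be sketched as one small function. All callables are injected stand-ins for the detectors and the quality model, which the claim leaves unspecified:

```python
def evaluate_face_image(image, detect_roi, normalize, is_frontal, model):
    """Claim-13-style pipeline: crop/normalize, check for a frontal
    face, and only then run the quality model."""
    roi = detect_roi(image)
    if roi is None:
        return None                    # no face detected: nothing to evaluate
    face = normalize(image, roi)       # crop + size normalization
    if not is_frontal(face):
        return None                    # non-frontal face: evaluation stops
    return model(face)                 # one of the preset quality levels
```

Returning `None` for the "evaluation stopped" branches is one possible convention; a caller could equally raise or report a rejection code.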
  14. 根据权利要求12或13所述的方法,其特征在于,利用所述图像质量评估模型对所述待评估图像进行图像质量评估,包括:The method according to claim 12 or 13, characterized in that, using the image quality evaluation model to perform image quality evaluation on the image to be evaluated, comprising:
    对所述待评估图像进行行方向滤波处理,得到第一滤波待评估图像;Perform row direction filtering processing on the image to be evaluated to obtain a first filtered image to be evaluated;
    对所述待评估图像进行列方向滤波处理,得到第二滤波待评估图像;Performing column-direction filtering processing on the image to be evaluated to obtain a second filtered image to be evaluated;
    将所述待评估图像、所述第一滤波待评估图像和所述第二滤波待评估图像的合并图像输入所述图像质量评估模型进行评估，以确定所述待评估图像为预设的多个所述图像质量级别之一。inputting a merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
  15. 一种模型训练装置,其特征在于,包括:A model training device, comprising:
    获取模块,用于获取真实图像样本集,其中,所述真实图像样本集包括多个真实图像样本;an acquisition module, configured to acquire a real image sample set, wherein the real image sample set includes a plurality of real image samples;
    生成对抗网络模块，用于利用所述真实图像样本集对预先构建的生成对抗网络进行迭代训练，收集所述生成对抗网络中的生成网络在多个迭代轮次中分别生成的多个伪图像样本集；a generative adversarial network module, configured to iteratively train a pre-built generative adversarial network by using the real image sample set, and collect a plurality of pseudo-image sample sets respectively generated, in a plurality of iteration rounds, by a generative network in the generative adversarial network;
    自动标注模块，用于生成由所述真实图像样本集和所述多个伪图像样本集组成的第一训练样本库，根据多个预设的图像质量级别对所述第一训练样本库的每个第一训练样本进行自动分级标注，得到带标签的所述第一训练样本库；an automatic labeling module, configured to generate a first training sample library composed of the real image sample set and the plurality of pseudo-image sample sets, and automatically grade and label each first training sample in the first training sample library according to a plurality of preset image quality levels, to obtain the labeled first training sample library;
    模型训练模块,用于利用所述第一训练样本库对预设多分类网络进行训练,获得图像质量评估模型。The model training module is used for training a preset multi-classification network by using the first training sample library to obtain an image quality evaluation model.
  16. 根据权利要求15所述的装置,其特征在于,所述自动标注模块还用于:The device according to claim 15, wherein the automatic labeling module is further configured to:
    将所述真实图像样本集包含的所述真实图像样本标注为最高的图像质量级别;Marking the real image samples included in the real image sample set as the highest image quality level;
    根据每个所述伪图像样本集对应的迭代轮次数将每个所述伪图像样本集包含的多个伪图像样本标注为对应的图像质量级别，其中，更高的迭代次数对应于更高的图像质量级别。labeling the plurality of pseudo-image samples included in each pseudo-image sample set with a corresponding image quality level according to the number of the iteration round corresponding to that pseudo-image sample set, where a higher iteration round corresponds to a higher image quality level.
  17. 根据权利要求15或16所述的装置,其特征在于,所述自动标注模块还用于:The device according to claim 15 or 16, wherein the automatic labeling module is further used for:
    将所述真实图像样本集包含的多个所述真实图像样本标注为最高的图像质量级别;Marking a plurality of the real image samples included in the real image sample set as the highest image quality level;
    计算每个所述伪图像样本集与所述真实图像样本集的弗雷歇距离，根据计算结果将每个所述伪图像样本集包含的多个伪图像样本标注为对应的图像质量级别，其中，更小的弗雷歇距离对应于更高的图像质量级别。calculating the Fréchet distance between each pseudo-image sample set and the real image sample set, and labeling the plurality of pseudo-image samples included in each pseudo-image sample set with a corresponding image quality level according to the calculation result, where a smaller Fréchet distance corresponds to a higher image quality level.
  18. 根据权利要求15-17中任意一项所述的装置,其特征在于,所述自动标注模块还用于:The device according to any one of claims 15-17, wherein the automatic labeling module is further configured to:
    将所述真实图像样本集包含的多个所述真实图像样本标注为最高的图像质量级别;Marking a plurality of the real image samples included in the real image sample set as the highest image quality level;
    计算每个伪图像样本与所述真实图像样本的均方误差(MSE)值，根据所述均方误差(MSE)值将每个所述伪图像样本标注为对应的图像质量级别，其中，更低的均方误差(MSE)值对应于更高的图像质量级别。calculating a mean square error (MSE) value between each pseudo-image sample and the real image sample, and labeling each pseudo-image sample with a corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
  19. 根据权利要求15-18中任意一项所述的装置,其特征在于,所述获取模块还用于:The device according to any one of claims 15-18, wherein the acquiring module is further configured to:
    采集多个真实图像，对多个所述真实图像进行以下预处理操作：collecting a plurality of real images, and performing the following preprocessing operations on the plurality of real images:
    利用对象检测算法确定每个所述真实图像中的感兴趣区域(ROI)，并根据确定的所述感兴趣区域(ROI)对每个所述真实图像进行裁剪处理；以及，对多个所述真实图像进行尺寸归一化，以得到所述真实图像样本集。determining a region of interest (ROI) in each real image by using an object detection algorithm, and cropping each real image according to the determined ROI; and performing size normalization on the plurality of real images, to obtain the real image sample set.
  20. 根据权利要求19所述的装置,其特征在于,所述真实图像样本为人脸图像,所述对象检测算法为人脸检测算法。The device according to claim 19, wherein the real image sample is a face image, and the object detection algorithm is a face detection algorithm.
  21. 根据权利要求20所述的装置,其特征在于,所述获取真实图像样本集之后,所述获取模块还用于:The device according to claim 20, wherein after the acquisition of the real image sample set, the acquisition module is further configured to:
    利用关键点检测算法和/或姿态估计算法去除所述真实图像样本集中的非正脸图片。removing non-frontal face pictures from the real image sample set by using a key point detection algorithm and/or a pose estimation algorithm.
  22. 根据权利要求15-21中任意一项所述的装置,其特征在于,所述模型训练模块还用于:The device according to any one of claims 15-21, wherein the model training module is further configured to:
    获取所述第一训练样本库中的每个带标签的所述第一训练样本,所述标签用于指示所述第一训练样本的图像质量级别;acquiring each labeled first training sample in the first training sample library, where the label is used to indicate an image quality level of the first training sample;
    对每个所述第一训练样本进行行方向滤波处理,得到第一滤波图像;Perform row direction filtering processing on each of the first training samples to obtain a first filtered image;
    对每个所述第一训练样本进行列方向滤波处理,得到第二滤波图像;Perform column-direction filtering processing on each of the first training samples to obtain a second filtered image;
    将每个所述第一训练样本和对应的所述第一滤波图像和所述第二滤波图像进行拼接合并,分别生成带标签的第二训练样本;splicing and merging each of the first training samples and the corresponding first filtered images and the second filtered images to generate a labeled second training sample respectively;
    分别获取多个所述第一训练样本对应的多个所述第二训练样本，并将多个所述第二训练样本输入所述预设多分类网络进行迭代训练，以获取所述图像质量评估模型。obtaining a plurality of second training samples respectively corresponding to the plurality of first training samples, and inputting the plurality of second training samples into the preset multi-classification network for iterative training, to obtain the image quality evaluation model.
  23. 根据权利要求15-22中任意一项所述的装置,其特征在于,The device according to any one of claims 15-22, characterized in that,
    所述预设多分类网络为ResNet网络,所述预设多分类网络使用二分类交叉熵函数作为损失函数且利用softmax函数进行二分类。The preset multi-classification network is a ResNet network, and the preset multi-classification network uses a binary cross-entropy function as a loss function and uses a softmax function for binary classification.
  24. 根据权利要求15-23中任意一项所述的装置,其特征在于,所述生成对抗网络模块还用于:The apparatus according to any one of claims 15-23, wherein the generative adversarial network module is further configured to:
    预先构建所述生成对抗网络,所述生成对抗网络包括所述生成网络和判别网络;其中,The generative adversarial network is pre-built, and the generative adversarial network includes the generative network and the discriminant network; wherein,
    所述生成网络包括线性映射层、多个卷积层以及位于所述多个卷积层的每个卷积层之后的批标准化函数和ReLU激活函数,所述生成网络用于接收随机噪音并生成所述伪图像样本;The generation network includes a linear mapping layer, a plurality of convolutional layers, and a batch normalization function and a ReLU activation function after each of the plurality of convolutional layers, the generation network is used to receive random noise and generate the pseudo-image sample;
    所述判别网络包括多个卷积层和位于所述多个卷积层的每个卷积层之后的LeakyRelu激活函数层和池化层，以及位于所述多个卷积层之后的全连接层、LeakyRelu激活函数层和sigmoid激活函数层，所述判别网络用于对所述真实图像样本和所述伪图像样本进行真伪判定。the discriminant network includes a plurality of convolutional layers, a LeakyReLU activation function layer and a pooling layer after each of the plurality of convolutional layers, and a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer after the plurality of convolutional layers; the discriminant network is configured to perform authenticity determination on the real image samples and the pseudo-image samples.
  25. 根据权利要求24所述的装置,其特征在于,所述生成网络的损失函数采用交叉熵函数。The apparatus according to claim 24, wherein the loss function of the generation network adopts a cross-entropy function.
  26. 一种图像质量评估装置,其特征在于,包括:A device for evaluating image quality, comprising:
    接收模块,用于接收待评估图像;a receiving module for receiving the image to be evaluated;
    评估模块，用于利用如权利要求1-11任意一项所述的方法训练得到的所述图像质量评估模型对所述待评估图像进行图像质量评估，以确认所述待评估图像为多个预设的图像质量级别之一。an evaluation module, configured to perform image quality evaluation on the image to be evaluated by using the image quality evaluation model trained by the method according to any one of claims 1-11, to confirm that the image to be evaluated belongs to one of a plurality of preset image quality levels.
  27. 根据权利要求26所述的装置,其特征在于,所述待评估图像为待评估人脸图像,所述图像质量评估模型用于对人脸图像进行质量评估,所述评估模块还用于:The device according to claim 26, wherein the image to be evaluated is a face image to be evaluated, the image quality evaluation model is used to perform quality evaluation on the face image, and the evaluation module is further used for:
    接收所述待评估图像之后，利用人脸检测算法确定所述待评估人脸图像中的感兴趣区域(ROI)，并根据确定的所述感兴趣区域(ROI)对所述待评估人脸图像进行裁剪处理；after receiving the image to be evaluated, determining a region of interest (ROI) in the face image to be evaluated by using a face detection algorithm, and cropping the face image to be evaluated according to the determined ROI;
    根据所述第一训练样本的尺寸对裁剪处理后的所述待评估人脸图像进行尺寸归一化；performing size normalization on the cropped face image to be evaluated according to the size of the first training samples;
    利用关键点检测算法和/或姿态估计算法确定尺寸归一化之后的所述待评估人脸图像是否为正脸图像；determining, by using a key point detection algorithm and/or a pose estimation algorithm, whether the size-normalized face image to be evaluated is a frontal face image;
    其中，若所述待评估人脸图像不是正脸图像则停止所述评估，若所述待评估人脸图像是正脸图像则利用所述图像质量评估模型对尺寸归一化之后的所述待评估图像进行图像质量评估。where the evaluation is stopped if the face image to be evaluated is not a frontal face image, and if the face image to be evaluated is a frontal face image, the image quality evaluation model is used to perform image quality evaluation on the size-normalized image to be evaluated.
  28. 根据权利要求26或27所述的装置,其特征在于,所述评估模块还用于:The device according to claim 26 or 27, wherein the evaluation module is further used for:
    对所述待评估图像进行行方向滤波处理,得到第一滤波待评估图像;Perform row direction filtering processing on the image to be evaluated to obtain a first filtered image to be evaluated;
    对所述待评估图像进行列方向滤波处理,得到第二滤波待评估图像;Performing column-direction filtering processing on the image to be evaluated to obtain a second filtered image to be evaluated;
    将所述待评估图像、所述第一滤波待评估图像和所述第二滤波待评估图像的合并图像输入所述图像质量评估模型进行评估，以确定所述待评估图像为预设的多个所述图像质量级别之一。inputting a merged image of the image to be evaluated, the first filtered image to be evaluated, and the second filtered image to be evaluated into the image quality evaluation model for evaluation, to determine that the image to be evaluated belongs to one of the plurality of preset image quality levels.
  29. 一种模型训练装置,其特征在于,包括:A model training device, comprising:
    至少一个处理器；以及，与至少一个处理器通信连接的存储器；其中，存储器存储有可被至少一个处理器执行的指令，指令被至少一个处理器执行，以使至少一个处理器能够执行：如权利要求1-11任一项所述的方法。at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method according to any one of claims 1-11.
  30. 一种图像质量评估装置，其特征在于，包括：An image quality evaluation apparatus, comprising:
    至少一个处理器；以及，与至少一个处理器通信连接的存储器；其中，存储器存储有可被至少一个处理器执行的指令，指令被至少一个处理器执行，以使至少一个处理器能够执行：如权利要求12-14任一项所述的方法。at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method according to any one of claims 12-14.
  31. 一种计算机可读存储介质，所述计算机可读存储介质存储有程序，当所述程序被多核处理器执行时，使得所述多核处理器执行如权利要求1-11中任一项所述的方法，或执行如权利要求12-14中任一项所述的方法。A computer-readable storage medium storing a program which, when executed by a multi-core processor, causes the multi-core processor to execute the method according to any one of claims 1-11, or to execute the method according to any one of claims 12-14.
PCT/CN2021/116766 2020-12-28 2021-09-06 Model training method, and image quality evaluation method and apparatus WO2022142445A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011578791.8 2020-12-28
CN202011578791.8A CN112700408B (en) 2020-12-28 2020-12-28 Model training method, image quality evaluation method and device

Publications (1)

Publication Number Publication Date
WO2022142445A1 true WO2022142445A1 (en) 2022-07-07

Family

ID=75512748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/116766 WO2022142445A1 (en) 2020-12-28 2021-09-06 Model training method, and image quality evaluation method and apparatus

Country Status (2)

Country Link
CN (1) CN112700408B (en)
WO (1) WO2022142445A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661619A (en) * 2022-11-03 2023-01-31 北京安德医智科技有限公司 Network model training method, ultrasonic image quality evaluation method, device and electronic equipment
CN116630465A (en) * 2023-07-24 2023-08-22 海信集团控股股份有限公司 Model training and image generating method and device
CN116958122A (en) * 2023-08-24 2023-10-27 北京东远润兴科技有限公司 SAR image evaluation method, SAR image evaluation device, SAR image evaluation equipment and readable storage medium
CN117218485A (en) * 2023-09-05 2023-12-12 安徽省第二测绘院 Deep learning model-based multi-source remote sensing image interpretation sample library creation method

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN112700408B (en) * 2020-12-28 2023-09-08 中国银联股份有限公司 Model training method, image quality evaluation method and device
CN113569627B (en) * 2021-06-11 2024-06-14 北京旷视科技有限公司 Human body posture prediction model training method, human body posture prediction method and device
CN113780101A (en) * 2021-08-20 2021-12-10 京东鲲鹏(江苏)科技有限公司 Obstacle avoidance model training method and device, electronic equipment and storage medium
CN114970670A (en) * 2022-04-12 2022-08-30 阿里巴巴(中国)有限公司 Model fairness assessment method and device
CN115620079A (en) * 2022-09-19 2023-01-17 虹软科技股份有限公司 Sample label obtaining method and lens failure detection model training method

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109102029A (en) * 2018-08-23 2018-12-28 重庆科技学院 Information, which maximizes, generates confrontation network model synthesis face sample quality appraisal procedure
CN111027439A (en) * 2019-12-03 2020-04-17 西北工业大学 SAR target recognition method for generating countermeasure network based on auxiliary classification
WO2020118584A1 (en) * 2018-12-12 2020-06-18 Microsoft Technology Licensing, Llc Automatically generating training data sets for object recognition
CN111476294A (en) * 2020-04-07 2020-07-31 南昌航空大学 Zero-shot image recognition method and system based on generative adversarial network
CN112700408A (en) * 2020-12-28 2021-04-23 中国银联股份有限公司 Model training method, image quality evaluation method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10885383B2 (en) * 2018-05-16 2021-01-05 Nec Corporation Unsupervised cross-domain distance metric adaptation with feature transfer network
CN108874763A (en) * 2018-06-08 2018-11-23 深圳勇艺达机器人有限公司 Crowdsourcing-based corpus data annotation method and system
EP3611699A1 (en) * 2018-08-14 2020-02-19 Siemens Healthcare GmbH Image segmentation using deep learning techniques
CN109829894B (en) * 2019-01-09 2022-04-26 平安科技(深圳)有限公司 Segmentation model training method, OCT image segmentation method, device, equipment and medium
CN110110745A (en) * 2019-03-29 2019-08-09 上海海事大学 Semi-supervised automatic X-ray image annotation based on generative adversarial network
CN110634108B (en) * 2019-08-30 2023-01-20 北京工业大学 Enhancement method for live-streaming video with composite degradation based on meta-cycle-consistency adversarial network
CN111814875B (en) * 2020-07-08 2023-08-01 西安电子科技大学 Ship sample augmentation method for infrared images based on style generative adversarial network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661619A (en) * 2022-11-03 2023-01-31 北京安德医智科技有限公司 Network model training method, ultrasonic image quality evaluation method, device and electronic equipment
CN116630465A (en) * 2023-07-24 2023-08-22 海信集团控股股份有限公司 Model training and image generating method and device
CN116630465B (en) * 2023-07-24 2023-10-24 海信集团控股股份有限公司 Model training and image generating method and device
CN116958122A (en) * 2023-08-24 2023-10-27 北京东远润兴科技有限公司 SAR image evaluation method, SAR image evaluation device, SAR image evaluation equipment and readable storage medium
CN117218485A (en) * 2023-09-05 2023-12-12 安徽省第二测绘院 Deep learning model-based multi-source remote sensing image interpretation sample library creation method

Also Published As

Publication number Publication date
CN112700408A (en) 2021-04-23
CN112700408B (en) 2023-09-08

Similar Documents

Publication Publication Date Title
WO2022142445A1 (en) Model training method, and image quality evaluation method and apparatus
CN112381098A (en) Semi-supervised learning method and system based on self-learning in target segmentation field
CN112215822B (en) Face image quality evaluation method based on lightweight regression network
CN111581405A (en) Cross-modal generalized zero-shot retrieval method based on dual-learning generative adversarial network
CN109271958B (en) Face age identification method and device
CN113887661B (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN112949408B (en) Real-time identification method and system for target fish passing through fish channel
CN113191969A (en) Unsupervised image deraining method based on attention generative adversarial network
Dong Optimal Visual Representation Engineering and Learning for Computer Vision
CN109086794B (en) Driving behavior pattern recognition method based on T-LDA topic model
CN113361646A (en) Generalized zero sample image identification method and model based on semantic information retention
Wu et al. Audio-visual kinship verification in the wild
CN111127400A (en) Method and device for detecting breast lesions
CN110442736B (en) Semantic enhancer spatial cross-media retrieval method based on secondary discriminant analysis
CN113095158A (en) Handwriting generation method and device based on generative adversarial network
CN115147632A (en) Image category automatic labeling method and device based on density peak value clustering algorithm
CN114187467B (en) Method and device for classifying benign and malignant lung nodules based on CNN model
Jobin et al. Plant identification based on fractal refinement technique (FRT)
TW201828156A (en) Image identification method, metric learning method, and image source identification method and device that effectively handle the problem of asymmetric object image identification, providing better robustness and higher accuracy
Zeng et al. Semantic invariant multi-view clustering with fully incomplete information
Zhang et al. Part-Aware Correlation Networks for Few-shot Learning
CN112597997A (en) Region-of-interest determining method, image content identifying method and device
CN111967383A (en) Age estimation method, and training method and device of age estimation model
Sameer et al. Source camera identification model: Classifier learning, role of learning curves and their interpretation
CN116343294A (en) Pedestrian re-identification method suitable for domain generalization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913192

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21913192

Country of ref document: EP

Kind code of ref document: A1