WO2022142445A1 - Model training method and image quality evaluation method and apparatus - Google Patents

Model training method and image quality evaluation method and apparatus

Info

Publication number
WO2022142445A1
WO2022142445A1 (PCT/CN2021/116766)
Authority
WO
WIPO (PCT)
Prior art keywords
image
evaluated
training
real
image quality
Prior art date
Application number
PCT/CN2021/116766
Other languages
English (en)
Chinese (zh)
Inventor
于文海
郭伟
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2022142445A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30168 - Image quality inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face

Definitions

  • The invention belongs to the field of computers and in particular relates to a model training method, an image quality evaluation method, and an apparatus.
  • Image quality assessment is mainly divided into full-reference, reduced-reference (semi-reference), and no-reference quality assessment.
  • Face image quality assessment is subject to individual differences in facial features, including but not limited to hairstyle, glasses, and makeup, which lead to large variations in image content; it is therefore a no-reference quality assessment task.
  • Among no-reference quality assessment methods, most current approaches still need subjective quality scores to train the quality assessment model.
  • The training process of an existing image quality assessment model mainly includes collecting image data, manually cleaning and labeling the collected data, detecting the region of interest with a detection model and expanding the boundary margin to retain an object region with complete content, and then feeding the object regions and the human-annotated quality labels into a deep learning network for training.
  • Training an image quality assessment model therefore requires collecting a large amount of image data and labeling each image with a quality score, which is a huge workload.
  • Moreover, because of the individual subjectivity of the annotators and the differences in the richness of the content of the images themselves, it is difficult to formulate a unified standard for the labeling work.
  • When different people observe the same image, differences in cognition lead to differences in the quality level they assign to it. Data collection and annotation for quality assessment has therefore always been a difficult problem in face image quality assessment.
  • To solve the above problems, the present invention provides the following solutions.
  • In a first aspect, a method for training an image quality assessment model includes: acquiring a real image sample set, where the real image sample set includes multiple real image samples; iteratively training a pre-built generative adversarial network using the real image sample set, and collecting the multiple fake image sample sets generated by the generative network of the generative adversarial network in multiple iteration rounds; generating a first training sample library consisting of the real image sample set and the multiple fake image sample sets, and automatically grading and labeling each first training sample in the first training sample library according to multiple preset image quality levels to obtain a labeled first training sample library; and training a preset multi-classification network with the first training sample library to obtain the image quality assessment model.
  • In some embodiments, automatically grading and labeling each first training sample in the first training sample library includes: labeling the real image samples included in the real image sample set with the highest image quality level; and, according to the number of iteration rounds corresponding to each fake image sample set, labeling the multiple fake image samples included in each set with the corresponding image quality level, where a higher iteration round corresponds to a higher image quality level.
  • In other embodiments, automatically grading and labeling each first training sample includes: labeling the multiple real image samples included in the real image sample set with the highest image quality level; and calculating the Fréchet distance between each fake image sample set and the real image sample set, and labeling the multiple fake image samples included in each fake image sample set with the corresponding image quality level according to the calculation result, where a smaller Fréchet distance corresponds to a higher image quality level.
  • In further embodiments, automatically grading and labeling each first training sample includes: labeling the multiple real image samples included in the real image sample set with the highest image quality level; and calculating the mean square error (MSE) between each fake image sample and the real image sample, and labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
  • Acquiring the real image sample set further includes: acquiring multiple real images and performing the following preprocessing operations on them: determining the region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • In some embodiments, the real image sample is a face image and the object detection algorithm is a face detection algorithm.
  • The method further includes: removing non-frontal face pictures from the real image sample set using a key point detection algorithm and/or a pose estimation algorithm.
  • Using the first training sample library to train the preset multi-classification network to obtain the image quality assessment model includes: acquiring each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; applying row-direction filtering to each first training sample to obtain a first filtered image; applying column-direction filtering to each first training sample to obtain a second filtered image; splicing each first training sample with its corresponding first and second filtered images to generate a labeled second training sample; and obtaining the multiple second training samples corresponding to the multiple first training samples and feeding them into the preset multi-classification network for iterative training to obtain the image quality evaluation model.
  • In some embodiments, the preset multi-classification network is a ResNet network.
  • The preset multi-classification network uses a binary cross-entropy function as its loss function and a softmax function for classification.
  • The method further includes pre-constructing the generative adversarial network, which includes a generative network and a discriminative network.
  • The generative network includes a linear mapping layer, multiple convolutional layers, and a batch normalization layer and ReLU activation function layer after each of the convolutional layers; the generative network is used to receive random noise and generate fake image samples.
  • The discriminative network includes multiple convolutional layers with a LeakyReLU activation function layer and pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; the discriminative network is used to judge the authenticity of real image samples and fake image samples.
  • The loss function of the generative network adopts a cross-entropy function.
  • In a second aspect, an image quality evaluation method includes: receiving an image to be evaluated; and performing image quality evaluation on the image to be evaluated with an image quality evaluation model trained by the method of the first aspect, to confirm that the image to be evaluated belongs to one of multiple preset image quality levels.
  • In some embodiments, the image to be evaluated is a face image to be evaluated, and the image quality evaluation model performs quality evaluation on the face image.
  • The method further includes: after receiving the image to be evaluated, determining the region of interest (ROI) in the face image to be evaluated with a face detection algorithm and cropping the face image according to the determined ROI; normalizing the size of the cropped face image to the size of the first training samples; and determining with a key point detection algorithm and/or pose estimation algorithm whether the size-normalized face image is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model evaluates the size-normalized image.
  • Using the image quality evaluation model to assess the image to be evaluated includes: applying row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; applying column-direction filtering to obtain a second filtered image to be evaluated; and inputting the spliced combination of the image to be evaluated, the first filtered image, and the second filtered image into the image quality evaluation model to determine which of the multiple preset image quality levels the image belongs to.
  • In another aspect, a model training device includes: an acquisition module for acquiring a real image sample set, where the real image sample set includes multiple real image samples; a generative adversarial network module for iteratively training a pre-built generative adversarial network with the real image sample set and collecting the multiple fake image sample sets generated by the generative network in multiple iteration rounds; an automatic labeling module for generating a first training sample library consisting of the real image sample set and the multiple fake image sample sets and automatically grading and labeling each first training sample according to multiple preset image quality levels to obtain a labeled first training sample library; and a model training module for training a preset multi-classification network with the first training sample library to obtain an image quality evaluation model.
  • The automatic labeling module is further used to: label the real image samples included in the real image sample set with the highest image quality level; and, according to the number of iteration rounds corresponding to each fake image sample set, label the multiple fake image samples included in each set with the corresponding image quality level, where a higher number of iterations corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the Fréchet distance between each fake image sample set and the real image sample set, labeling the multiple fake image samples included in each fake image sample set with the corresponding image quality level according to the calculation results, where a smaller Fréchet distance corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the mean square error (MSE) between each fake image sample and the real image sample, labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
  • The acquisition module is further configured to: collect multiple real images and perform the following preprocessing operations on them: determining the region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • In some embodiments, the real image sample is a face image and the object detection algorithm is a face detection algorithm.
  • The acquiring module is further configured to remove non-frontal face pictures from the real image sample set using a key point detection algorithm and/or a pose estimation algorithm.
  • The model training module is further configured to: obtain each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; apply row-direction filtering to each first training sample to obtain a first filtered image; apply column-direction filtering to each first training sample to obtain a second filtered image; splice each first training sample with its corresponding first and second filtered images to generate a labeled second training sample; and obtain the multiple second training samples corresponding to the multiple first training samples and feed them into the preset multi-classification network for iterative training to obtain the image quality assessment model.
  • In some embodiments, the preset multi-classification network is a ResNet network.
  • The preset multi-classification network uses a binary cross-entropy function as its loss function and a softmax function for classification.
  • The generative adversarial network module is further configured to construct the generative adversarial network in advance, the network including a generative network and a discriminative network.
  • The generative network includes a linear mapping layer, multiple convolutional layers, and a batch normalization layer and ReLU activation function layer after each convolutional layer; it receives random noise and generates fake image samples.
  • The discriminative network includes multiple convolutional layers with a LeakyReLU activation function layer and pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; it judges the authenticity of real and fake image samples.
  • The loss function of the generative network employs a cross-entropy function.
  • In another aspect, an image quality evaluation device includes: a receiving module for receiving the image to be evaluated; and an evaluation module for performing image quality evaluation on the image to be evaluated with the trained image quality evaluation model, to confirm that the image to be evaluated belongs to one of multiple preset image quality levels.
  • In some embodiments, the image to be evaluated is a face image to be evaluated, and the image quality evaluation model performs quality evaluation on the face image.
  • The evaluation module is further configured to: after receiving the image to be evaluated, determine the region of interest (ROI) in the face image to be evaluated with a face detection algorithm and crop the face image according to the determined ROI; normalize the size of the cropped face image to the size of the first training samples; and determine with a key point detection algorithm and/or pose estimation algorithm whether the size-normalized face image is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model evaluates the size-normalized image.
  • The evaluation module is further configured to: apply row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; apply column-direction filtering to obtain a second filtered image to be evaluated; and input the spliced combination of the image to be evaluated, the first filtered image, and the second filtered image into the image quality evaluation model to determine which of the multiple preset image quality levels the image belongs to.
  • In another aspect, a model training apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method of the first aspect.
  • In another aspect, an image quality assessment apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method of the second aspect.
  • In another aspect, a computer-readable storage medium stores a program which, when executed by a multi-core processor, causes the multi-core processor to execute the method of the first aspect and/or the second aspect.
  • FIG. 1 is a schematic flowchart of a model training method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a generative adversarial network according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a generation network according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a discrimination network according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of splicing a first training sample with the corresponding first filtered image and second filtered image according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a model training apparatus according to another embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an image quality assessment apparatus according to an embodiment of the present invention.
  • Embodiments of the present invention provide a model training method, an image quality assessment method, and an apparatus.
  • The inventive concept of the model training method is first introduced.
  • Embodiments of the present invention provide a model training method for training an image quality assessment model. Specifically, a real image sample set including multiple real image samples is first obtained; a pre-built generative adversarial network is then iteratively trained with the real image sample set, and the multiple fake image sample sets generated by the generative network in multiple iteration rounds are collected to form, together with the real image sample set, a first training sample library.
  • Each first training sample in the first training sample library can then be automatically graded and labeled according to multiple preset image quality levels to obtain a labeled first training sample library; the labeled library is used to train a preset multi-classification network to obtain an image quality evaluation model; and the trained model is finally used to evaluate the quality of an image to be evaluated.
  • This scheme only needs to collect a small number of clear real image samples to generate a large number of fake image samples of different quality levels, and completes the labeling during the generation process, which avoids manual intervention, reduces labor cost, and improves the quality of data labeling, so that the image quality assessment model is trained at lower cost.
  • FIG. 1 is a schematic flowchart of a model training method 100 according to an embodiment of the present application; the trained model is used to evaluate the quality of images.
  • From a device perspective, the execution subject may be one or more electronic devices; from a program perspective, the execution subject may be a program mounted on these electronic devices.
  • The method 100 may include:
  • Step 101: Obtain a real image sample set, where the real image sample set includes multiple real image samples.
  • In an implementation, step 101 may further include: collecting multiple real images and performing the following preprocessing operations on them: determining the region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • The real image may be image data of a specific object, such as a face image, an animal image, or a vehicle image.
  • Object detection algorithms are used to detect target objects from real images to obtain regions of interest (ROIs).
  • In this embodiment, the real image sample is a face image and the object detection algorithm is a face detection algorithm.
  • The method may further include removing non-frontal face pictures from the real image sample set using a key point detection algorithm and/or a pose estimation algorithm. This prevents non-frontal face pictures from adversely affecting subsequent training.
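  • As an illustrative sketch of this preprocessing (the detector and target size are assumptions, since the text above names neither a specific face detection algorithm nor a normalization resolution; OpenCV's bundled Haar cascade and 128*128 are used here as stand-ins):

```python
import cv2
import numpy as np
from typing import Optional

TARGET_SIZE = (128, 128)  # assumed normalization size

# OpenCV ships a pretrained frontal-face Haar cascade; it stands in for
# the unspecified "face detection algorithm" of this embodiment.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess(image_bgr: np.ndarray) -> Optional[np.ndarray]:
    """Detect the face ROI, crop to it, and normalize the size."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no ROI found; skip this image
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep largest face
    roi = image_bgr[y:y + h, x:x + w]                   # crop to the ROI
    return cv2.resize(roi, TARGET_SIZE)                 # size normalization
```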
  • Step 102: Iteratively train the pre-built generative adversarial network using the real image sample set, and collect the multiple fake image sample sets generated by the generative network of the generative adversarial network in multiple iteration rounds.
  • As shown in FIG. 2, the pre-built generative adversarial network is trained iteratively.
  • The generative and discriminative networks have opposite goals: the discriminative network tries to distinguish fake images from real images, while the generative network tries to produce images that look real enough to fool the discriminative network.
  • Each training iteration can be divided into two stages. In the first stage, the discriminative network is trained: a batch of real images is sampled from the real image sample set D1, and the generative network receives random noise R and generates fake image samples R'; the real images and the fake image samples R' form a training batch, where the labels of the fake image samples are set to 0 (fake) and the labels of the real image samples are set to 1 (real), and the discriminative network is trained on this labeled batch with a binary cross-entropy loss. Backpropagation in this stage only optimizes the weights of the discriminative network.
  • In the second stage, the generative network is trained: the generative network first generates another batch of fake image samples, and the discriminative network again judges whether each image is fake or real; in this stage all labels are set to 1 (real). In other words, the discriminative network is expected to wrongly judge the fake image samples produced by the generative network as real.
  • In this stage the weights of the discriminative network are fixed, so backpropagation only affects the weights of the generative network.
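  • A minimal PyTorch sketch of this two-stage loop follows; the Generator and Discriminator classes are sketched after the architecture descriptions below, and the data loader, noise length, learning rates, and 500-iteration checkpoint interval are assumptions:

```python
import torch
import torch.nn as nn

H = W = 64                                   # assumed sample resolution
NOISE_LEN = 3 * (2 * H) * (2 * W)            # noise length 3*(H*2)*(W*2)

g, d = Generator(), Discriminator()          # defined in the sketches below
opt_g = torch.optim.Adam(g.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(d.parameters(), lr=2e-4)
bce = nn.BCELoss()                           # binary cross-entropy loss

fake_sets = {}                               # iteration round -> fake batch
for it, real in enumerate(real_loader, start=1):   # real_loader over D1 (assumed)
    b = real.size(0)
    ones, zeros = torch.ones(b, 1), torch.zeros(b, 1)
    # Stage 1: train the discriminative network only.
    fake = g(torch.randn(b, NOISE_LEN)).detach()    # freeze the generator
    loss_d = bce(d(real), ones) + bce(d(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Stage 2: train the generative network; all labels are set to 1 (real).
    fake = g(torch.randn(b, NOISE_LEN))
    loss_g = bce(d(fake), ones)              # only the generator is updated
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    if it % 500 == 0:                        # collect fakes per round
        fake_sets[it] = fake.detach()
```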
  • Before step 102, the method further includes pre-constructing the generative adversarial network, which includes a generative network and a discriminative network. The generative network includes a linear mapping layer, multiple convolutional layers, and a batch normalization layer and ReLU activation function layer after each convolutional layer.
  • The generative network is used to generate fake image samples from random noise.
  • The discriminative network includes multiple convolutional layers with a LeakyReLU activation function layer and a pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; the discriminative network is used to judge the authenticity of real image samples and fake image samples.
  • As shown in FIG. 3, the input of the generative network is random noise of length 3*(H*2)*(W*2). The first layer is a linear mapping that maps the input to 1*3*(H*2)*(W*2) four-dimensional data.
  • The second layer is a convolution operation that convolves the output of the first layer with a 50*3*3 kernel, with stride 1 and padding 1.
  • The third layer is a convolution operation that convolves the output of the second layer with a 25*3*3 kernel, with stride 1 and padding 1.
  • The fourth layer is a convolution operation that convolves the output of the third layer with a 16*3*3 kernel, with stride 2 and padding 1.
  • The fifth layer is a convolution operation that convolves the output of the fourth layer with a 16*3*3 kernel, with stride 1 and padding 1.
  • The sixth layer is a convolution operation that convolves the output of the fifth layer with an 8*3*3 kernel, with stride 1 and padding 1.
  • The seventh layer is a convolution operation that convolves the output of the sixth layer with a 3*3*3 kernel, with stride 1 and padding 1.
  • A BatchNormalization layer and a ReLU activation function layer are added after the output of each of the above layers.
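  • A PyTorch sketch of this generative network follows (H = W = 64 is an assumed resolution; the noise is taken as a one-dimensional vector of the stated length, and a BatchNorm plus ReLU pair follows every layer as described):

```python
import torch.nn as nn

H = W = 64  # assumed output resolution

def gen_block(cin, cout, stride=1):
    # 3x3 convolution with padding 1, followed by BatchNorm + ReLU.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(cout),
        nn.ReLU())

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # First layer: linear mapping of the noise to 1*3*(2H)*(2W) data.
        self.fc = nn.Linear(3 * 2 * H * 2 * W, 3 * 2 * H * 2 * W)
        self.convs = nn.Sequential(
            gen_block(3, 50),                # second layer: 50 kernels
            gen_block(50, 25),               # third layer: 25 kernels
            gen_block(25, 16, stride=2),     # fourth layer: stride 2 halves
            gen_block(16, 16),               # fifth layer   2H x 2W to H x W
            gen_block(16, 8),                # sixth layer: 8 kernels
            gen_block(8, 3))                 # seventh layer: 3-channel image

    def forward(self, z):                    # z: (B, 3*2H*2W) random noise
        x = self.fc(z).view(-1, 3, 2 * H, 2 * W)
        return self.convs(x)                 # output: (B, 3, H, W)
```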
  • The loss function of the generative network adopts a cross-entropy function. Specifically, the loss is computed as the cross-entropy between the discriminative network's predictions on the fake image samples and the real labels.
  • As shown in FIG. 4, the input of the discriminative network is the real image sample set D1 and the fake image sample set R'; the label of the real image sample set D1 is set to 1 (real), and the label of the fake image sample set R' is set to 0 (fake).
  • In the first layer, the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The second layer is a convolution operation that convolves the output of the first layer with a 32*3*3 kernel, with stride 1 and padding 1; the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The third layer is a convolution operation that convolves the output of the second layer with a 16*3*3 kernel, with stride 1 and padding 1; the convolution result is processed by the LeakyReLU activation function, followed by 2*2 average pooling with stride 2.
  • The fourth layer consists of two fully connected layers: the output of the third layer is mapped to 1*1024 dimensions and processed by the LeakyReLU activation function, the 1*1024 dimensions are then mapped to 1*1 dimension, and finally a sigmoid activation function yields a probability between 0 and 1, so as to perform the two-class (real/fake) discrimination.
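  • A matching PyTorch sketch of the discriminative network (the channel count of the first convolution is not given above, so 64 is assumed; with H = W = 64 the flattened feature is 16*8*8 = 1024, consistent with the 1*1024 mapping):

```python
import torch.nn as nn

H = W = 64  # assumed input resolution

def disc_block(cin, cout):
    # Convolution (3x3, stride 1, padding 1) -> LeakyReLU -> 2x2 avg pool.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=3, stride=1, padding=1),
        nn.LeakyReLU(0.2),
        nn.AvgPool2d(2, stride=2))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            disc_block(3, 64),               # first layer (64 is assumed)
            disc_block(64, 32),              # second layer: 32 kernels
            disc_block(32, 16))              # third layer: 16 kernels
        self.head = nn.Sequential(
            nn.Flatten(),                    # 16 x (H/8) x (W/8) features
            nn.Linear(16 * (H // 8) * (W // 8), 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, 1),
            nn.Sigmoid())                    # probability between 0 and 1

    def forward(self, x):
        return self.head(self.features(x))
```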
  • Step 103: Generate a first training sample library consisting of the real image sample set and the multiple fake image sample sets, and automatically grade and label each first training sample in the library according to multiple preset image quality levels, to obtain a labeled first training sample library.
  • In an implementation, the automatic grading and labeling of each first training sample includes: according to the number of iteration rounds corresponding to each fake image sample set, labeling the multiple fake image samples with the corresponding image quality level, where a higher iteration count corresponds to a higher image quality level; and labeling the real image samples included in the real image sample set with the highest image quality level.
  • For example, the preset image quality levels can be divided into six grades from high to low: "level I", "level II", ..., "level VI".
  • By saving the fake image samples produced during intermediate training, the fake and real image samples can be stored in groups according to their quality. For example, at 500 iterations the image quality level of the fake image sample set generated by the generative network is "level VI"; at 1000 iterations it is "level V". As the number of iterations increases, the fake image samples generated by the network become harder to distinguish from the collected real image samples; that is, the fake image sample sets generated later have higher image quality levels and better quality.
  • In this way, multiple fake image sample sets of different quality levels can be generated.
  • The multiple fake image samples included in each fake image sample set can thus be labeled with the corresponding image quality level according to the iteration round of each set (such as the above-mentioned 500 or 1000 iterations), where a higher number of iterations corresponds to a higher image quality level, and the real image samples contained in the real image sample set are labeled with the highest image quality level, "level I".
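  • A small sketch of this round-based labeling (the 500-iterations-per-level mapping follows the example above; `fake_sets` and `real_samples` are assumed to come from the training loop sketched earlier):

```python
# Fake levels from low to high quality; real samples are always "level I".
FAKE_LEVELS = ["VI", "V", "IV", "III", "II"]

def level_for_round(iteration: int) -> str:
    # 500 iterations -> "VI", 1000 -> "V", ..., capped at "II".
    idx = iteration // 500 - 1
    return FAKE_LEVELS[min(max(idx, 0), len(FAKE_LEVELS) - 1)]

labeled = [(img, level_for_round(it))
           for it, batch in fake_sets.items() for img in batch]
labeled += [(img, "I") for img in real_samples]
```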
  • In another implementation, the automatic grading and labeling in step 103 includes: calculating the Fréchet Inception Distance (FID) between each fake image sample set and the real image sample set; labeling the multiple fake image samples included in each fake image sample set with the corresponding image quality level according to the calculation results, where a smaller Fréchet distance corresponds to a higher image quality level; and labeling the multiple real image samples included in the real image sample set with the highest image quality level.
  • For example, the preset image quality levels can be divided into six grades from high to low: "level I", "level II", ..., "level VI".
  • During training, the fake image samples corresponding to different iteration rounds can be saved in separate folders.
  • For example, folder F1 stores the fake image samples from the 10th training round,
  • and folder F2 stores the fake image samples from the 20th training round.
  • The Fréchet distance between the data in each folder and the real image sample set D1 can then be calculated to measure the quality gap between the generated fake image samples and the clear real image samples. According to the distribution of the Fréchet distances, the folders are grouped into five categories and sorted by distance from small to large, yielding the real image sample set D1 with quality "level I", a fake image sample set with quality "level II", ..., and a fake image sample set with quality "level VI".
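  • A NumPy/SciPy sketch of the distance computation (the standard closed-form Fréchet distance between Gaussians fitted to feature vectors; a feature extractor, e.g. an Inception network for FID, is assumed to have already produced the arrays):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to (n, dim) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real      # discard numerical imaginary noise
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a + cov_b - 2.0 * covmean))
```

  • Sorting the folders by ascending distance and assigning "level II" through "level VI" then yields the grouping described above.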
  • Parameters that evaluate the degree of similarity, such as cosine similarity or KL divergence (Kullback-Leibler divergence), may also be used instead of the Fréchet distance.
  • For example, the cosine similarity between the data in each folder and the real image sample set D1 can be calculated to measure the quality gap between the generated fake image samples and the clear real image samples; sorting by similarity from large to small yields the real image sample set D1 with quality "level I", a fake image sample set with quality "level II", ..., and a fake image sample set with quality "level VI".
  • The similarity evaluation may be performed based on partial image information of the data in the folders and the real image sample set D1, or based on all of the image information, which is not specifically limited in this application.
  • In this way, the automatically labeled first training sample library D is obtained; the library D contains six subfolders, corresponding to the real image samples D1 with quality "level I" and the fake image sample sets with qualities "level II" through "level VI".
  • In yet another implementation, the automatic grading and labeling in step 103 includes: calculating the mean square error (MSE) between each fake image sample and the real image sample; labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level; and labeling the multiple real image samples contained in the real image sample set with the highest image quality level.
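  • A short sketch of the MSE variant (which real sample each fake is compared against, and the level thresholds, are not specified above, so a single reference image and assumed boundaries are used):

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

THRESHOLDS = (100, 300, 600, 1000)            # assumed level boundaries
FAKE_LEVELS = ("II", "III", "IV", "V", "VI")  # lower MSE -> higher level

def level_for_fake(fake: np.ndarray, reference: np.ndarray) -> str:
    value = mse(fake, reference)
    for threshold, level in zip(THRESHOLDS, FAKE_LEVELS):
        if value < threshold:
            return level
    return FAKE_LEVELS[-1]                    # largest errors -> "VI"
```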
  • Step 104: Use the first training sample library to train a preset multi-classification network to obtain an image quality evaluation model.
  • As described above, the first training sample library consists of the real image sample set and the multiple fake image sample sets, and each first training sample carries a label indicating its image quality. For example, assume the image quality is divided into six levels from high to low: a first training sample that is a real image sample is labeled "level I", and the first training samples that are fake image samples are labeled "level II", ..., "level VI" in order of image quality from good to bad. The preset multi-classification network can therefore be trained with the first training sample library until it converges, and the resulting image quality evaluation model can classify an input image as one of "level I", "level II", ..., "level VI".
  • In an implementation, step 104 may specifically include: acquiring each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; applying row-direction filtering to each first training sample to obtain a first filtered image; applying column-direction filtering to each first training sample to obtain a second filtered image; splicing each first training sample with its corresponding first and second filtered images to generate a labeled second training sample; and obtaining the multiple second training samples corresponding to the multiple first training samples and feeding them into the preset multi-classification network for iterative training to obtain the image quality assessment model.
  • Specifically, for each first training sample Img in the first training sample library, row-direction filtering and column-direction filtering are performed separately: the first training sample Img is convolved with a 1*N convolution kernel to obtain the row-direction filtered image Img1 (the first filtered image), and Img is convolved with another N*1 convolution kernel to obtain the column-direction filtered image Img2 (the second filtered image).
  • Then Img, Img1, and Img2 are merged into one H*(3*W) picture (the second training sample); as shown in FIG. 5, in the second training sample the first training sample Img is on the left, the first filtered image Img1 is in the middle, and the second filtered image Img2 is on the right.
  • The multiple second training samples corresponding to the multiple first training samples constitute a second training sample library.
  • The multiple second training samples are input into the preset multi-classification network for iterative training, and the image quality evaluation model is obtained.
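  • A sketch of the filtering and splicing step (the filter type is not specified beyond its 1*N and N*1 shapes, so a mean filter with N = 3 is assumed):

```python
import cv2
import numpy as np

N = 3
ROW_KERNEL = np.ones((1, N), np.float32) / N  # 1 x N row-direction kernel
COL_KERNEL = np.ones((N, 1), np.float32) / N  # N x 1 column-direction kernel

def make_second_sample(img: np.ndarray) -> np.ndarray:
    """Splice Img | Img1 | Img2 side by side into an H x (3W) image."""
    img1 = cv2.filter2D(img, -1, ROW_KERNEL)  # row-direction filtering
    img2 = cv2.filter2D(img, -1, COL_KERNEL)  # column-direction filtering
    return np.hstack([img, img1, img2])
```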
  • In this embodiment, the above-mentioned preset multi-classification network is a ResNet network.
  • The preset multi-classification network uses a binary cross-entropy function as its loss function and a softmax function for classification.
  • In other embodiments, the preset multi-classification network may also be a network other than ResNet.
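  • A training sketch for this stage (torchvision's ResNet-18 stands in for "a ResNet network"; following the description, softmax outputs are paired with a binary cross-entropy loss over one-hot level labels, and the data loader and hyperparameters are assumptions):

```python
import torch
import torch.nn as nn
import torchvision

NUM_LEVELS = 6  # "level I" .. "level VI"
model = torchvision.models.resnet18(num_classes=NUM_LEVELS)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
bce = nn.BCELoss()

for imgs, levels in train_loader:     # levels: integers 0 (I) .. 5 (VI)
    probs = torch.softmax(model(imgs), dim=1)
    onehot = nn.functional.one_hot(levels, NUM_LEVELS).float()
    loss = bce(probs, onehot)         # binary cross-entropy per class
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```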
  • Based on the same inventive concept, an embodiment of the present invention also provides an image quality assessment method that uses the model trained by the model training method of the above embodiment. The method specifically includes: receiving an image to be assessed; and performing image quality evaluation on the image to be assessed with the image quality evaluation model trained by the training method described in the above embodiment, to confirm that the image belongs to one of multiple preset image quality levels.
  • In some embodiments, the image to be evaluated is a face image to be evaluated, and the image quality evaluation model performs quality evaluation on the face image.
  • The method further includes: after receiving the image to be evaluated, determining the region of interest (ROI) in the face image to be evaluated with a face detection algorithm and cropping the face image according to the determined ROI; normalizing the size of the cropped face image to the size of the first training samples; and determining with a key point detection algorithm and/or pose estimation algorithm whether the size-normalized face image is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model evaluates the size-normalized image.
  • Using the image quality evaluation model to assess the image to be evaluated includes: applying row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; applying column-direction filtering to obtain a second filtered image to be evaluated; and inputting the spliced combination of the image to be evaluated, the first filtered image, and the second filtered image into the image quality evaluation model to determine which of the multiple preset image quality levels the image belongs to.
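  • An end-to-end sketch of this evaluation for one image, reusing the `make_second_sample` helper from the training section (the level names and the division by 255 are assumptions):

```python
import numpy as np
import torch

LEVEL_NAMES = ("I", "II", "III", "IV", "V", "VI")

def assess(img_bgr: np.ndarray, model: torch.nn.Module) -> str:
    sample = make_second_sample(img_bgr)            # H x 3W spliced image
    x = (torch.from_numpy(sample).permute(2, 0, 1)  # HWC -> CHW
         .float().unsqueeze(0) / 255.0)
    model.eval()
    with torch.no_grad():
        level = model(x).argmax(dim=1).item()       # most likely level
    return LEVEL_NAMES[level]
```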
  • FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention.
  • Specifically, the model training device includes:
  • an acquisition module 601 configured to acquire a real image sample set, where the real image sample set includes multiple real image samples;
  • a generative adversarial network module 602 configured to iteratively train the pre-built generative adversarial network with the real image sample set and to collect the multiple fake image sample sets generated by the generative network in multiple iteration rounds;
  • an automatic labeling module 603 configured to generate a first training sample library consisting of the real image sample set and the multiple fake image sample sets, and to automatically grade and label each first training sample in the library according to multiple preset image quality levels, obtaining a labeled first training sample library; and
  • a model training module 604 configured to train a preset multi-classification network with the first training sample library to obtain an image quality evaluation model.
  • The automatic labeling module is further used to: label the real image samples included in the real image sample set with the highest image quality level; and, according to the number of iteration rounds corresponding to each fake image sample set, label the multiple fake image samples included in each set with the corresponding image quality level, where a higher number of iterations corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the Fréchet distance between each fake image sample set and the real image sample set, labeling the multiple fake image samples included in each fake image sample set with the corresponding image quality level according to the calculation results, where a smaller Fréchet distance corresponds to a higher image quality level.
  • The automatic labeling module is further configured to: label the multiple real image samples included in the real image sample set with the highest image quality level; and calculate the mean square error (MSE) between each fake image sample and the real image sample, labeling each fake image sample with the corresponding image quality level according to the MSE value, where a lower MSE value corresponds to a higher image quality level.
  • The acquisition module is further configured to: collect multiple real images and perform the following preprocessing operations on them: determining the region of interest (ROI) in each real image with an object detection algorithm and cropping each real image according to the determined ROI; and normalizing the size of the multiple real images to obtain the real image sample set.
  • In this embodiment, the real image sample is a face image and the object detection algorithm is a face detection algorithm.
  • The acquiring module is further configured to remove non-frontal face pictures from the real image sample set using a key point detection algorithm and/or a pose estimation algorithm.
  • The model training module is further configured to: obtain each labeled first training sample in the first training sample library, where the label indicates the image quality level of the first training sample; apply row-direction filtering to each first training sample to obtain a first filtered image; apply column-direction filtering to each first training sample to obtain a second filtered image; splice each first training sample with its corresponding first and second filtered images to generate a labeled second training sample; and obtain the multiple second training samples corresponding to the multiple first training samples and feed them into the preset multi-classification network for iterative training to obtain the image quality assessment model.
  • In some embodiments, the preset multi-classification network is a ResNet network.
  • The preset multi-classification network uses a binary cross-entropy function as its loss function and a softmax function for classification.
  • The generative adversarial network module is further configured to construct the generative adversarial network in advance, the network including a generative network and a discriminative network.
  • The generative network includes a linear mapping layer, multiple convolutional layers, and a batch normalization layer and ReLU activation function layer after each convolutional layer; it receives random noise and generates fake image samples.
  • The discriminative network includes multiple convolutional layers with a LeakyReLU activation function layer and pooling layer after each convolutional layer, followed by a fully connected layer, a LeakyReLU activation function layer, and a sigmoid activation function layer; it judges the authenticity of real and fake image samples.
  • The loss function of the generative network employs a cross-entropy function.
  • The model training apparatus in the embodiment of the present application can implement each process of the foregoing model training method embodiment and achieve the same effects and functions, which will not be repeated here.
  • The embodiments of the present invention further provide an image quality evaluation apparatus for executing the image quality evaluation method provided by the above embodiments. Specifically, it includes: a receiving module for receiving an image to be evaluated; and an evaluation module for performing image quality evaluation on the image to be evaluated with the image quality evaluation model trained by the method of the first aspect, to confirm that the image to be evaluated belongs to one of multiple preset image quality levels.
  • In some embodiments, the image to be evaluated is a face image to be evaluated, and the image quality evaluation model performs quality evaluation on the face image.
  • The evaluation module is further configured to: after receiving the image to be evaluated, determine the region of interest (ROI) in the face image to be evaluated with a face detection algorithm and crop the face image according to the determined ROI; normalize the size of the cropped face image to the size of the first training samples; and determine with a key point detection algorithm and/or pose estimation algorithm whether the size-normalized face image is a frontal image. If the face image to be evaluated is not a frontal image, the evaluation is stopped; if it is a frontal image, the image quality evaluation model evaluates the size-normalized image.
  • The evaluation module is further configured to: apply row-direction filtering to the image to be evaluated to obtain a first filtered image to be evaluated; apply column-direction filtering to obtain a second filtered image to be evaluated; and input the spliced combination of the image to be evaluated, the first filtered image, and the second filtered image into the image quality evaluation model to determine which of the multiple preset image quality levels the image belongs to.
  • FIG. 7 shows a model training apparatus according to an embodiment of the present application for executing the model training method shown in FIG. 1. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the model training method described in the above embodiments.
  • FIG. 8 shows an image quality evaluation apparatus according to an embodiment of the present application for executing the image quality evaluation method of the above embodiments. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the image quality assessment method described in the above embodiments.
  • An embodiment of the present application further provides a non-volatile computer storage medium for the above model training method and image quality assessment method, having computer-executable instructions stored thereon which, when executed by a processor, perform the methods described in the above embodiments.
  • The apparatuses, devices, and computer-readable storage media provided in the embodiments of the present application correspond one-to-one with the methods; therefore, they also have beneficial technical effects similar to those of the corresponding methods. Since the beneficial technical effects of the methods have been described in detail above, they are not repeated here.
  • embodiments of the present invention may be provided as a method, system or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • A computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media; information storage may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A model training method and an image quality evaluation method and apparatus are provided. The training method includes: acquiring a real image sample set, the real image sample set comprising a plurality of real image samples (101); iteratively training a pre-built generative adversarial network using the real image sample set, and collecting a plurality of fake image sample sets respectively generated within a plurality of iteration rounds by the generative network in the generative adversarial network (102); generating a first training sample library composed of the real image sample set and the plurality of fake image sample sets, and automatically grading and labeling each first training sample of the first training sample library according to a plurality of preset image quality levels to obtain the labeled first training sample library (103); and training a preset multi-classification network using the first training sample library to obtain an image quality evaluation model (104). Using the above method, only a small number of clear real image samples needs to be collected to generate a large number of fake image samples of different quality levels, and the automatic labeling reduces manual costs while improving the quality of data labeling, so that the image quality evaluation model is trained at a lower cost.
PCT/CN2021/116766 2020-12-28 2021-09-06 Model training method and image quality evaluation method and apparatus WO2022142445A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011578791.8 2020-12-28
CN202011578791.8A CN112700408B (zh) 2020-12-28 2020-12-28 模型训练方法、图像质量评估方法及装置

Publications (1)

Publication Number Publication Date
WO2022142445A1 true WO2022142445A1 (fr) 2022-07-07

Family

ID=75512748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/116766 WO2022142445A1 (fr) 2020-12-28 2021-09-06 Model training method and image quality evaluation method and apparatus

Country Status (2)

Country Link
CN (1) CN112700408B (fr)
WO (1) WO2022142445A1 (fr)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112700408B (zh) * 2020-12-28 2023-09-08 中国银联股份有限公司 Model training method, image quality evaluation method and apparatus
CN113569627B (zh) * 2021-06-11 2024-06-14 北京旷视科技有限公司 Human pose prediction model training method, and human pose prediction method and apparatus
CN113780101A (zh) * 2021-08-20 2021-12-10 京东鲲鹏(江苏)科技有限公司 Obstacle avoidance model training method and apparatus, electronic device, and storage medium
CN114970670A (zh) * 2022-04-12 2022-08-30 阿里巴巴(中国)有限公司 Model fairness evaluation method and apparatus
CN115620079A (zh) * 2022-09-19 2023-01-17 虹软科技股份有限公司 Sample label acquisition method and lens failure detection model training method


Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10885383B2 (en) * 2018-05-16 2021-01-05 Nec Corporation Unsupervised cross-domain distance metric adaptation with feature transfer network
CN108874763A (zh) * 2018-06-08 2018-11-23 深圳勇艺达机器人有限公司 Crowd-intelligence-based corpus data labeling method and system
EP3611699A1 (fr) * 2018-08-14 2020-02-19 Siemens Healthcare GmbH Segmentation d'image par techniques d'apprentissage profond
CN109829894B (zh) * 2019-01-09 2022-04-26 平安科技(深圳)有限公司 Segmentation model training method, OCT image segmentation method, apparatus, device, and medium
CN110110745A (zh) * 2019-03-29 2019-08-09 上海海事大学 Semi-supervised automatic labeling of X-ray images based on a generative adversarial network
CN110634108B (zh) * 2019-08-30 2023-01-20 北京工业大学 Live-streaming video enhancement method for compound degradation based on a meta-cycle-consistency adversarial network
CN111814875B (zh) * 2020-07-08 2023-08-01 西安电子科技大学 Ship sample augmentation method for infrared images based on a style generative adversarial network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109102029A (zh) * 2018-08-23 2018-12-28 重庆科技学院 Quality assessment method for face samples synthesized by an information-maximizing generative adversarial network model
WO2020118584A1 (fr) * 2018-12-12 2020-06-18 Microsoft Technology Licensing, Llc Automatic generation of training data sets for object recognition
CN111027439A (zh) * 2019-12-03 2020-04-17 西北工业大学 SAR target recognition method based on an auxiliary-classifier generative adversarial network
CN111476294A (zh) * 2020-04-07 2020-07-31 南昌航空大学 Zero-shot image recognition method and system based on a generative adversarial network
CN112700408A (zh) * 2020-12-28 2021-04-23 中国银联股份有限公司 Model training method, image quality evaluation method and apparatus

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661619A (zh) * 2022-11-03 2023-01-31 北京安德医智科技有限公司 Network model training and ultrasound image quality evaluation method and apparatus, and electronic device
CN116630465A (zh) * 2023-07-24 2023-08-22 海信集团控股股份有限公司 Model training and image generation method and device
CN116630465B (zh) * 2023-07-24 2023-10-24 海信集团控股股份有限公司 Model training and image generation method and device
CN116958122A (zh) * 2023-08-24 2023-10-27 北京东远润兴科技有限公司 SAR image evaluation method, apparatus, device, and readable storage medium
CN117218485A (zh) * 2023-09-05 2023-12-12 安徽省第二测绘院 Method for creating a multi-source remote sensing image interpretation sample library based on a deep learning model

Also Published As

Publication number Publication date
CN112700408A (zh) 2021-04-23
CN112700408B (zh) 2023-09-08

Similar Documents

Publication Publication Date Title
WO2022142445A1 (fr) Procédé d'entraînement de modèle et procédé et appareil d'évaluation de qualité d'image
CN112381098A (zh) 基于目标分割领域自学习的半监督学习方法和系统
CN112215822B (zh) 一种基于轻量级回归网络的人脸图像质量评估方法
CN111581405A (zh) 基于对偶学习生成对抗网络的跨模态泛化零样本检索方法
CN109271958B (zh) 人脸年龄识别方法及装置
CN113887661B (zh) 一种基于表示学习重构残差分析的图像集分类方法及系统
CN112949408B (zh) 一种过鱼通道目标鱼类实时识别方法和系统
CN113191969A (zh) 一种基于注意力对抗生成网络的无监督图像除雨方法
Dong Optimal Visual Representation Engineering and Learning for Computer Vision
CN109086794B (zh) 一种基于t-lda主题模型的驾驶行为模式识方法
CN113361646A (zh) 基于语义信息保留的广义零样本图像识别方法及模型
Wu et al. Audio-visual kinship verification in the wild
CN111127400A (zh) 一种乳腺病变检测方法和装置
CN110442736B (zh) 一种基于二次判别分析的语义增强子空间跨媒体检索方法
CN113095158A (zh) 一种基于对抗生成网络的笔迹生成方法及装置
CN115147632A (zh) 基于密度峰值聚类算法的图像类别自动标注方法及装置
CN114187467B (zh) 基于cnn模型的肺结节良恶性分类方法及装置
Jobin et al. Plant identification based on fractal refinement technique (FRT)
TW201828156A (zh) 圖像識別方法、度量學習方法、圖像來源識別方法及裝置
Zeng et al. Semantic invariant multi-view clustering with fully incomplete information
Zhang et al. Part-Aware Correlation Networks for Few-shot Learning
CN112597997A (zh) 感兴趣区域确定方法、图像内容识别方法及装置
CN111967383A (zh) 年龄估计方法、年龄估计模型的训练方法和装置
Sameer et al. Source camera identification model: Classifier learning, role of learning curves and their interpretation
CN116343294A (zh) 一种适用于领域泛化的行人重识别方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913192

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21913192

Country of ref document: EP

Kind code of ref document: A1