WO2023284465A1 - Image detection method, apparatus, computer-readable storage medium, and computer device - Google Patents

Image detection method, apparatus, computer-readable storage medium, and computer device

Info

Publication number
WO2023284465A1
WO2023284465A1 · PCT/CN2022/098383
Authority
WO
WIPO (PCT)
Prior art keywords
sample image
neural network
image
sample
loss parameter
Prior art date
Application number
PCT/CN2022/098383
Other languages
English (en)
French (fr)
Inventor
张博深
王亚彪
汪铖杰
李季檩
黄飞跃
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Publication of WO2023284465A1 publication Critical patent/WO2023284465A1/zh
Priority to US18/302,265 priority Critical patent/US20230259739A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/043Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application relates to the field of image processing technology, and in particular to an image detection method, apparatus, computer-readable storage medium, and computer device.
  • Convolutional Neural Networks (CNN) are a class of Feed-Forward Neural Networks (FFNN) that include convolution calculations and have a deep structure; they are among the representative algorithms of Deep Learning (DL).
  • A convolutional neural network has the ability of representation learning and can classify input information according to its hierarchical structure, so it is also called a "Shift-Invariant Artificial Neural Network" (SIANN).
  • In recent years, convolutional neural network technologies have developed rapidly and are widely applied. For example, in the scenario of image blur detection, building an image detection model with a convolutional neural network can improve the efficiency of blur detection.
  • However, the labels of the training sample images used in the model training stage are simple binary labels, and inaccurate binary labels degrade the performance of the trained model, which in turn lowers the accuracy of image detection.
  • Embodiments of the present application provide an image detection method, device, computer-readable storage medium, and computer equipment to improve the training effect of an image detection model and improve the accuracy of image detection.
  • Step A: for each sample image in a first group of multiple sample images,
  • the sample image is respectively input into at least two neural network models to obtain a blur probability value set of the sample image, the blur probability value set including the blur probability value output by each of the at least two neural network models, and a loss parameter of the sample image is calculated according to the blur probability value set and preset label information of the sample image;
  • Step B: selecting a target sample image from the multiple sample images according to the distribution of the loss parameters of the multiple sample images, and updating the at least two neural network models based on the target sample image to obtain at least two updated neural network models;
  • steps A and B are performed in sequence on the at least two updated neural network models using at least two second groups of multiple sample images until the at least two neural network models converge, so as to obtain at least two trained neural network models;
  • at least one of the at least two trained neural network models is provided to perform blur detection on an image to be detected to obtain a blur detection result.
  • an input unit, configured to input, for each sample image in a first group of multiple sample images, the sample image respectively into at least two neural network models to obtain a blur probability value set of the sample image, the blur probability value set including the blur probability value output by each of the at least two neural network models;
  • a calculation unit, configured to calculate a loss parameter of the sample image according to the blur probability value set and preset label information of the sample image;
  • a selection unit, configured to select a target sample image from the multiple sample images according to the distribution of the loss parameters of the multiple sample images, and to update the at least two neural network models based on the target sample image to obtain at least two updated neural network models;
  • a training unit, configured to iteratively train the at least two updated neural network models in sequence using at least two second groups of multiple sample images by means of the input unit, the calculation unit, and the selection unit, until the at least two neural network models converge, so as to obtain at least two trained neural network models;
  • a providing unit, configured to provide at least one of the at least two trained neural network models to perform blur detection on an image to be detected to obtain a blur detection result.
  • a computer-readable storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to execute the image detection method according to each embodiment of the present application.
  • a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the computer program, the image detection method of each embodiment of the present application is implemented.
  • a computer program product or computer program according to an embodiment of the present application includes computer instructions, and the computer instructions are stored in a storage medium.
  • the processor of the computer device reads the computer instructions from the storage medium, and the processor executes the computer instructions, so that the computer device executes the image detection method of each embodiment of the present application.
  • the image detection scheme provided by the embodiments of the present application uses multi-model collaboration to screen out noisy samples from the training samples, which improves the effect of model training and thereby improves the accuracy of image detection.
  • Fig. 1 is a schematic diagram of an image detection model training scene in the present application;
  • Fig. 2 is a schematic flow chart of the image detection method provided by the present application;
  • Fig. 3 is another schematic flow chart of the image detection method provided by the present application;
  • Fig. 4 is a schematic diagram of the calculation framework of the sample image loss parameter provided by the present application;
  • Fig. 5 is a schematic structural diagram of an image detection device provided by the present application;
  • Fig. 6 is a schematic structural diagram of a terminal provided by the present application;
  • Fig. 7 is a schematic structural diagram of a server provided by the present application.
  • Embodiments of the present invention provide an image detection method, device, computer-readable storage medium, and computer equipment.
  • the image detection method can be used in an image detection device.
  • the image detection device can be integrated in a computer device, and the computer device can be a terminal or a server.
  • the terminal may be a mobile phone, a tablet computer, a notebook computer, a smart TV, a wearable smart device, a personal computer (PC, Personal Computer) and the like.
  • the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, and big data and artificial intelligence platforms.
  • multiple servers can form a blockchain, and the server is a node on the blockchain.
  • Figure 1 is a schematic diagram of the image detection model training scene provided by this application.
  • Computer device A can execute the methods of the various embodiments of the present application.
  • the method of the embodiment of the present application may include:
  • Step A: for each sample image in a first group of multiple sample images,
  • the sample image is respectively input into at least two neural network models to obtain a blur probability value set of the sample image, the blur probability value set including the blur probability value output by each of the at least two neural network models, and a loss parameter of the sample image is calculated according to the blur probability value set and preset label information of the sample image;
  • Step B: selecting a target sample image from the multiple sample images according to the distribution of the loss parameters of the multiple sample images, and updating the at least two neural network models based on the target sample image to obtain at least two updated neural network models;
  • steps A and B are performed in sequence on the at least two updated neural network models using at least two second groups of multiple sample images until the at least two neural network models converge, so as to obtain at least two trained neural network models;
  • at least one of the at least two trained neural network models is provided to perform blur detection on an image to be detected to obtain a blur detection result.
  • the first group of multiple sample images and a second group of multiple sample images may share some identical images, or may be two completely different groups of images.
  • After computer device A obtains the training sample data, it extracts multiple sample images and the blur label value corresponding to each sample image from the training sample data. Each extracted sample image is then input into at least two neural network models for detection, yielding the blur probability value set output for each sample image under the at least two neural network models. According to the blur probability value set of each sample image and the label information corresponding to each sample image, the loss parameter corresponding to each sample image is calculated. Target sample images are determined according to the loss parameters, and the at least two neural network models are updated based on these target sample images to obtain at least two updated neural network models. The process then returns to inputting multiple sample images into the at least two updated neural network models, obtaining the blur probability value sets output under the updated models and the corresponding updated target images, and iterative training is performed until the parameters of the at least two neural network models converge, so as to obtain at least two trained neural network models. In this way, the training effect of the neural network model used for image detection in this application is improved.
  • the image detection device may be integrated into a computer device.
  • the computer device may be a terminal or a server, and the terminal may be a mobile phone, a tablet computer, a notebook computer, a smart TV, a wearable smart device, a personal computer (PC, Personal Computer) and the like.
  • As shown in Figure 2, a schematic flow chart of the image detection method provided by the present application, the method includes:
  • Step 101: acquiring training sample data.
  • Image blurring refers to a blurred phenomenon in an image that makes its content difficult to distinguish. The phenomenon is similar to the abnormal display when a computer screen becomes garbled, and is therefore also referred to as a blurred screen.
  • Machine Learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its application pervades all fields of artificial intelligence.
  • Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching and learning.
  • Machine learning technology is used to detect blurred screens in images, and convolutional neural network models can be used for detection.
  • the labeled training images can be input into a convolutional neural network to train it; the image to be recognized is then input into the trained convolutional neural network model for feature extraction and, after classification by the fully connected layer, the detection result is obtained.
  • the label information of the image is binary label information marked manually; that is, the label of the image is one of two values, blurred or non-blurred.
  • However, image blurring is not a simple binary classification. Many blurred images are only slightly or partially blurred. Simply judging an image as blurred or not blurred is highly subjective, which makes manual labeling inaccurate, degrades the detection performance of the trained neural network model, and leads to inaccurate image detection results.
  • this application proposes an image detection method.
  • the image detection method provided by this application will be further introduced in detail below.
  • sample data can be stored on the blockchain.
  • the sample data includes multiple sample images and label information corresponding to each sample image.
  • the label information corresponding to the sample image is a binary label of the sample, that is, whether the sample image is blurred or not.
  • the label information of the sample image contains some noise, that is, some labels are not accurate enough.
  • each sample image is input to at least two neural network models to obtain a set of fuzzy probability values output by each sample image under at least two neural network models.
  • multiple neural network models are used for collaborative training.
  • a plurality here means at least two, for example two, three, or more neural network models.
  • the neural network model may also be a convolutional neural network model of any structure.
  • the at least two neural network models can be untrained neural network models, or artificial neural network models that have undergone certain pre-training.
  • a plurality of sample images included in the sample data are input to at least two neural network models one by one to perform blur detection.
  • the blur detection here is to detect the blur probability of the image, or to detect the blurred screen probability of the image.
  • the corresponding output result is the blur probability value of the image, where the blur probability value of the image is the probability value that the image is a blurry image.
  • For a given sample image, each neural network model outputs a blur probability value, so at least two blur probability values corresponding to that sample image are obtained.
  • These at least two blur probability values constitute the blur probability value set corresponding to the sample image.
  • Inputting the remaining sample images into the at least two neural network models in the same way yields the blur probability value set corresponding to each sample image.
  • Step 103: calculating the loss parameter corresponding to each sample image according to the blur probability value set of each sample image and the label information corresponding to each sample image.
  • That is, the loss parameter corresponding to each sample image is calculated from its blur probability value set and its label information.
  • The loss parameter here evaluates the difference between the label value of a sample image and the output result of the model. As the model is continuously updated during training, the loss parameter corresponding to the sample image gradually decreases; that is, the output of the model keeps approaching the label value of the sample image. Since the embodiments of the present application use multi-model collaborative training, the loss parameter here evaluates the difference between the combined output results of the multiple models and the label value of the sample image.
  • the loss parameter corresponding to each sample image is calculated according to the blur probability value set of each sample image and the label information corresponding to each sample image, including:
  • the loss parameter corresponding to a sample image can be determined from the cross entropy between the probability value sequence composed of the elements of the sample image's blur probability value set and the label sequence composed of the sample image's label.
  • the label sequence is a numerical sequence in which the label value of the sample image is repeated once for each of the at least two neural network models. For example, when the number of neural network models used for collaborative training is 5 and the label value of the target sample image is 1, the label sequence is {1, 1, 1, 1, 1}.
  • Cross entropy (Cross Entropy, CE) is an important concept in information theory, which is mainly used to measure the difference information between two probability distributions.
  • Cross-entropy can be used as a loss function in neural networks to measure the similarity between the model's predicted distribution and the real distribution of samples.
  • One advantage of cross entropy as a loss function is that it can avoid the problem of low learning speed of the mean square error loss function during gradient descent, thereby improving the efficiency of model training.
  • In this way, multiple cross entropies corresponding to the target sample image are obtained. These cross entropies are summed to obtain a first sub-loss parameter corresponding to the target sample image, and the first sub-loss parameter may be determined as the loss parameter of the target sample image. The loss parameter corresponding to every other sample image can be determined similarly.
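  • As a concrete illustration of this first sub-loss, the following minimal Python sketch sums the binary cross entropy between each model's blur probability and the sample's repeated label value; the function name `first_sub_loss` and the clamping constant `eps` are illustrative assumptions, not taken from the patent:

```python
import math

def first_sub_loss(probs, label, eps=1e-12):
    """Sum of cross entropies between each model's blur probability
    and the sample image's binary label (1 = blurred, 0 = non-blurred)."""
    total = 0.0
    for p in probs:
        # Clamp to avoid log(0) for saturated model outputs.
        p = min(max(p, eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total
```

With five collaborating models and label value 1, the label sequence {1, 1, 1, 1, 1} corresponds to calling `first_sub_loss` with five probability values and `label=1`.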
  • the image detection method provided in the embodiment of the present application further includes:
  • the first sub-loss parameter corresponding to each sample image and the second sub-loss parameter are weighted and summed to obtain the loss parameter corresponding to each sample image.
  • The second sub-loss parameter is based on the relative entropy between the blur probability values output for the same sample image by different models.
  • Relative entropy is also known as KL divergence (Kullback-Leibler Divergence) or information divergence (Information Divergence).
  • When the number of neural network models used for collaborative training is 2, the number of relative entropies corresponding to a sample image is 1; when the number of models is 3, the number of relative entropies is 3;
  • in general, for n neural network models, the number of relative entropies corresponding to a sample image is n*(n-1)/2.
  • the above-mentioned first sub-loss parameter and the second sub-loss parameter are weighted and summed to obtain the loss parameter corresponding to the sample image, and then the loss parameter corresponding to each sample image can be further determined.
  • adding the relative entropy between the output values of the same sample image under different neural network models to the loss parameter of the sample image makes the outputs of the different neural network models approach each other continuously during training, thereby improving the accuracy of model training.
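  • A sketch of this pairwise term, assuming each model's blur probability is treated as a Bernoulli distribution over {blurred, non-blurred}; the function names are hypothetical:

```python
import math
from itertools import combinations

def kl_bernoulli(p, q, eps=1e-12):
    """Relative entropy (KL divergence) between two Bernoulli
    distributions with parameters p and q."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def second_sub_loss(probs):
    """Sum of relative entropies over every pair of model outputs for
    the same sample image: n models contribute n*(n-1)/2 terms."""
    return sum(kl_bernoulli(p, q) for p, q in combinations(probs, 2))
```

When all models agree the term vanishes, so this sub-loss only pushes disagreeing models toward each other during training.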
  • the method also includes:
  • a weighted summation is performed on the first sub-loss parameter, the second sub-loss parameter and the third sub-loss parameter corresponding to each sample image to obtain a loss parameter corresponding to each sample image.
  • The label information of the multiple sample images may be determined first, and the probability distribution information of the label information in the sample data is then obtained from it. For example, when there are 10 sample images, of which 5 have label 1 and 5 have label 0, the probability distribution of the label information in the sample data is [0.5, 0.5]. A corresponding feature vector can be generated from this probability distribution information for computing cross entropy. The cross entropy between the probability distribution and the blur probability value set corresponding to each sample image is then calculated, and the resulting cross entropies are summed to obtain a third sub-loss parameter corresponding to each sample image. Finally, the first sub-loss parameter, the second sub-loss parameter, and the third sub-loss parameter may be weighted and summed to obtain the loss parameter corresponding to each sample image.
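  • A hedged sketch of the third sub-loss under one plausible reading of the above: the cross entropy between the empirical label distribution of the sample data and each model's predicted distribution [1-p, p], summed over models (the function names are assumptions):

```python
import math

def label_distribution(labels):
    """Empirical probability distribution of binary labels in the
    sample data, as [P(label=0), P(label=1)]."""
    pos = sum(labels) / len(labels)
    return [1 - pos, pos]

def third_sub_loss(probs, prior, eps=1e-12):
    """Sum over models of the cross entropy between the label prior
    and the model's predicted distribution [1-p, p] for one sample."""
    total = 0.0
    for p in probs:
        dist = [min(max(1 - p, eps), 1 - eps), min(max(p, eps), 1 - eps)]
        total += -sum(pr * math.log(d) for pr, d in zip(prior, dist))
    return total
```

The overall loss would then be a weighted sum of the three sub-losses, with weights left unspecified by the patent.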
  • any one or any combination of the first sub-loss parameter, the second sub-loss parameter and the third sub-loss parameter may be used.
  • the first sub-loss parameter, the second sub-loss parameter or the third sub-loss parameter may be used independently as the loss parameter corresponding to the sample image.
  • weighted summation may be performed on the first sub-loss parameter and the third sub-loss parameter to obtain the loss parameter corresponding to the sample image.
  • Step 104: selecting a target sample image from the multiple sample images according to the distribution of the loss parameter corresponding to each sample image, and updating the at least two neural network models based on the target sample image to obtain at least two updated neural network models.
  • a certain number of target sample images with smaller loss parameter values are selected from the sample images according to the distribution of the loss parameters. These target sample images are then used to train the at least two neural network models, and the models obtained from this training replace the initial models, yielding at least two updated neural network models.
  • For a sample image, the smaller the loss parameter value, the closer the output value obtained by model detection is to the label of the sample image, and the more accurate the label value is considered to be.
  • the target sample image is selected from multiple sample images according to the distribution of loss parameters corresponding to each sample image, including:
  • a certain number of target sample images are determined among the multiple sample images, the at least two neural network models are trained and updated based on these target sample images, and the at least two updated neural network models are used to detect each sample image again to obtain a new blur probability value set for each sample image. Based on the new blur probability value sets and the label values, a new loss parameter value is calculated for each sample image; the target sample images are re-determined based on the new loss parameters, and the at least two updated neural network models are retrained and updated based on the new target sample images. In this way, multiple iterations are performed on the at least two neural network models.
  • The number of target sample images determined in each iteration is related to the number of training iterations: in each cycle of iterative training, the number of target sample images differs, and the more iterations have been performed, the fewer sample images are retained. In this way, training samples with inaccurate label values are gradually eliminated during continuous iterative training. Therefore, each time the target sample images are determined, the current number of training iterations of the at least two neural network models is obtained first (for example, if the models are being trained for the fifth time, the iteration number is 5), and the target number of target sample images to retain is then calculated from the iteration number.
  • The sample images are sorted by loss parameter value in ascending order, and the target number of sample images with the smallest loss parameter values are selected as the target sample images.
  • calculating the target number of target sample images according to the training times of iterative training includes:
  • a preset screening rate may be obtained first.
  • the screening rate is a ratio that controls the number of target sample images selected from multiple sample images.
  • the number of target sample images may be the product of the number of sample images and the preset screening rate. After the preset screening rate is obtained, the ratio of the number of target sample images selected in this iteration to the total number of sample images is calculated from the preset screening rate and the iteration number, and the target number of target sample images is then obtained from this ratio and the total number of sample images. In this way, setting the preset screening rate controls the number of target sample images, ensuring both that enough sample images with inaccurate label values are screened out and that enough sample images remain to train the model.
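  • One plausible schedule, in the spirit of co-teaching-style sample selection, is sketched below; the linear decay of the keep ratio is an assumption, since the patent only states that more iterations retain fewer samples:

```python
def target_count(num_samples, screening_rate, iteration, total_iterations):
    """Number of target sample images to keep at a given iteration:
    the keep ratio decays linearly from 1.0 to (1 - screening_rate)."""
    keep_ratio = 1.0 - screening_rate * min(iteration / total_iterations, 1.0)
    return int(num_samples * keep_ratio)

def select_targets(losses, count):
    """Indices of the `count` sample images with the smallest loss
    parameters, i.e. the samples whose labels are trusted most."""
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[:count]
```

For example, with 100 samples and a screening rate of 0.2, all 100 samples are kept at iteration 0 and 80 are kept once the schedule has fully decayed.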
  • Step 105: returning to inputting the multiple sample images respectively into the at least two updated neural network models, obtaining the blur probability value set output for each sample image under the at least two updated neural network models and the corresponding updated target images, and performing iterative training until the at least two neural network models converge, so as to obtain at least two trained neural network models.
  • another group of multiple sample images may be acquired and input into the updated neural network model for iteration.
  • the other group of multiple sample images are sample images that have not been used to train at least two neural network models.
  • Suppose the training set has a total of 800 sample images and 8 sample images are used in each training iteration; 8 previously unused sample images can then be selected from the training set for each iteration. In this way, after 100 iterations, every sample image in the training set has been used once, which is called an epoch. In some embodiments, multiple epochs are trained.
  • Steps 102 to 104 form one cycle in the iterative training of the model: the at least two neural networks perform blur detection on multiple sample images and output the blur probability value set corresponding to each sample image; the loss parameter corresponding to each sample image is calculated from its blur probability value set and its label value; the target sample images are determined based on these loss parameters; and the target sample images are used to train and update the at least two neural network models.
  • The at least two updated neural network models are then substituted into step 102 for the next cycle: the multiple sample images are respectively input into the updated models to obtain new blur probability value sets, new loss parameters are calculated from the blur probability value sets and label values, new target sample images are determined from the loss parameters and the iteration number, and the updated models are retrained and updated with the new target sample images. The at least two neural network models are iteratively trained in this way until their model parameters converge, so as to obtain at least two trained neural network models.
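  • The cycle of steps 102 to 104 can be sketched as follows; `update_fn` is a hypothetical stand-in for a real gradient update, the models are plain callables returning a blur probability, and only the first sub-loss is used for brevity:

```python
import math

def sample_loss(prob_set, label, eps=1e-12):
    """Loss parameter of one sample: summed cross entropy of each
    model's blur probability against the sample's binary label."""
    total = 0.0
    for p in prob_set:
        p = min(max(p, eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total

def train_iteration(models, batch, labels, keep_count, update_fn):
    """One cycle of steps 102-104 for a batch of sample images."""
    # Step 102: blur probability set per sample, one value per model.
    prob_sets = [[m(x) for m in models] for x in batch]
    # Step 103: loss parameter per sample image.
    losses = [sample_loss(ps, y) for ps, y in zip(prob_sets, labels)]
    # Step 104: keep the lowest-loss samples and update every model on them.
    kept = sorted(range(len(batch)), key=lambda i: losses[i])[:keep_count]
    sub_batch = [batch[i] for i in kept]
    sub_labels = [labels[i] for i in kept]
    return [update_fn(m, sub_batch, sub_labels) for m in models]
```

Running this function repeatedly over fresh batches, with the keep count shrinking as described in step 104, is the iterative training loop; convergence checking is omitted.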
  • Step 106: use the at least two trained neural network models to perform blur detection on the image to be detected, obtaining a blur detection result.
  • After the at least two neural network models are trained to obtain at least two trained neural network models, the trained models are used to perform blur detection on the image to be detected, obtaining a blur detection result.
  • Using the at least two trained neural network models to perform blur detection on the image to be detected to obtain a blur detection result includes:
  • The binary blur detection result can be further determined from the blur probability value obtained by performing blur detection on the image to be detected with the at least two trained neural network models; that is, the blur probability value determines whether the image to be detected is a blurred image or a non-blurred image.
  • At least two trained neural network models are used to perform blur detection on the image to be detected to obtain a blur detection result, including:
  • A. Obtain the prediction accuracy rates of at least two neural network models after training, and obtain at least two prediction accuracy rates;
  • That is, the prediction accuracy of each of the at least two trained neural network models is acquired, and the neural network model with the highest prediction accuracy is determined as the target neural network model.
  • The target neural network model is then used to perform blur detection on the image to be detected, the blur probability value output by the target neural network model is obtained, and this blur probability value is determined to be the result of the blur detection of the image to be detected.
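A minimal sketch of this selection step, using toy threshold classifiers in place of the trained neural network models; all names and data below are hypothetical:

```python
def pick_target_model(models, val_set, accuracy_fn):
    """Return the trained model whose validation accuracy is highest."""
    return max(models, key=lambda m: accuracy_fn(m, val_set))

# toy stand-ins: each "model" classifies a scalar feature against a threshold
val_set = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

def make_model(threshold):
    return lambda x: int(x >= threshold)

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# the 0.5-threshold model scores 100% on val_set, so it becomes the target model
best = pick_target_model([make_model(0.5), make_model(0.95)], val_set, accuracy)
```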
  • In summary, the image detection method acquires training sample data, which includes a plurality of sample images and label information corresponding to each sample image; inputs each sample image into at least two neural network models to obtain the set of blur probability values output for each sample image under the at least two neural network models; calculates the loss parameter of each sample image from its set of blur probability values and its label information; selects target sample images from the multiple sample images according to the distribution of the loss parameters, and updates the at least two neural network models based on the target sample images to obtain at least two updated neural network models; then returns to inputting the multiple sample images into the updated at least two neural network models, repeating these steps iteratively until the at least two neural network models converge, to obtain at least two trained neural network models; and finally uses the trained models to perform blur detection on the image to be detected to obtain a blur detection result. By using multi-model collaboration to screen out noise samples from the training samples, the effect of model training is improved, thereby improving the accuracy of image detection.
  • The embodiment of the present application will further describe the image detection method provided in the present application in detail from the perspective of a computer device, where the computer device can be a terminal or a server.
  • As shown in Figure 3, another schematic flow chart of the image detection method provided by the present application, the method includes:
  • step 201 the computer device acquires training sample data including a plurality of sample images and a label of each sample image.
  • The label corresponding to each sample image in the sample data for training the image detection model is a manually annotated label, where the label of a sample image can be a binary blur label.
  • While clearly blurred or clearly non-blurred images can be labeled accurately with a simple binary tag, images also exhibit intermediate blur states such as slight blur or partial blur.
  • Here, image blurring refers to a situation where part or all of the content of an image cannot be recognized because the image is blurred. Therefore, using a simple binary label to mark the blur state of a sample image makes the label information of the sample image inaccurate.
  • this application proposes an image detection method.
  • the image detection method provided in this application will be further described in detail below.
  • In the embodiment of the present application, the detection model is still trained using sample images with binary blur labels. Therefore, the training sample data is acquired first.
  • The training sample data includes multiple sample images and the binary blur label corresponding to each sample image.
  • The binary blur label of a sample image indicates whether the sample image is a blurred image: when the sample image is a blurred image, its binary label is 1; when the sample image is not a blurred image, its binary label is 0.
  • Step 202: the computer device inputs the plurality of sample images into two neural network models for blur detection, and obtains the two blur probability values output for each sample image by the two neural network models.
  • In the embodiment of the present application, a multi-model collaborative training method may be used to train the models for image blur detection. Different neural network models have different decision boundaries, because the parameters of a neural network model are randomly initialized each time training starts. Therefore, different models have different abilities to exclude noise samples (that is, samples with inaccurate labels), so the collaborative training of multiple models can inherit the advantages of each model and let the models complement each other, improving the ability to screen out noise samples.
  • The multiple models can be two neural network models, three neural network models, or a greater number of neural network models. In the embodiment of the present application, the collaborative training of two neural network models is taken as an example for detailed description.
  • That is, the multiple sample images are respectively input into the two neural network models, and the blur probability value output for each sample image by each of the two neural network models is obtained. The two models can be recorded as the first neural network model and the second neural network model, respectively; the blur probability value output by the first neural network model can be recorded as p1, and the blur probability value output by the second neural network model can be recorded as p2.
  • Step 203: the computer device calculates the cross-entropy between the two blur probability values and the sample label to obtain the first sub-loss parameter.
  • That is, the cross entropy corresponding to each sample image is calculated from the sample image's blur probability values and its sample label. For a binary label, this takes the standard form Lc1 = -[y·log(p1) + (1-y)·log(1-p1)] and Lc2 = -[y·log(p2) + (1-y)·log(1-p2)], where:
  • Lc1 is the cross entropy corresponding to the first neural network model;
  • y is the label value corresponding to the sample image, that is, 0 or 1;
  • p1 is the blur probability value obtained by the first neural network model performing blur detection on the sample image;
  • Lc2 is the cross entropy corresponding to the second neural network model;
  • p2 is the blur probability value obtained by the second neural network model performing blur detection on the sample image;
  • Lc = Lc1 + Lc2 is the obtained first sub-loss parameter, which may also be called the classification loss.
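Under the standard binary cross-entropy form, the first sub-loss parameter can be computed as follows; the epsilon guard is an implementation detail added here for numerical safety, not part of the patent:

```python
import math

def classification_loss(p1, p2, y, eps=1e-9):
    """Lc = Lc1 + Lc2: summed binary cross-entropies of the two models' outputs."""
    def bce(p):
        return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return bce(p1) + bce(p2)

# a blurred image (y = 1) predicted confidently by both models gives a small loss
lc = classification_loss(0.9, 0.8, 1)
```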
  • Step 204: the computer device calculates the relative entropy between the two blur probability values to obtain the second sub-loss parameter.
  • The relative entropy can also be called the KL divergence; here, the divergence between the two blur probability values is calculated, and the resulting term may also be called the cross regularization loss.
  • The purpose of calculating the cross regularization loss is to constrain the similarity between the probability distributions output by the two models: as model training proceeds, the probability values output by the two models for the same sample image should move closer together.
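For scalar blur probabilities, each model output defines a Bernoulli distribution, and the cross regularization loss can be computed as a KL divergence between the two. The symmetric form below is one reasonable reading of the description, not a formula quoted from the patent:

```python
import math

def bernoulli_kl(p, q, eps=1e-9):
    """KL(Ber(p) || Ber(q)) between two scalar blur probabilities."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def cross_regularization_loss(p1, p2):
    """Symmetric KL that pulls the two models' outputs toward each other."""
    return bernoulli_kl(p1, p2) + bernoulli_kl(p2, p1)

# identical outputs give zero loss; diverging outputs give a positive loss
```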
  • Step 205: the computer device calculates the relative entropy between the two blur probability values and the sample image label distribution to obtain the third sub-loss parameter.
  • Lp1 is the relative entropy corresponding to the first neural network model;
  • Lp2 is the relative entropy corresponding to the second neural network model;
  • Lp = Lp1 + Lp2 is the third sub-loss parameter, also called the prior loss.
  • The purpose of adding the prior loss is that, as model training progresses, the distribution of the probability values output by the two models should keep approaching the distribution of the manually annotated label values.
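One way to realize a prior loss consistent with this description is to compare each model's batch-mean predicted distribution with the distribution of the manual labels. The exact form used by the patent is not reproduced in the text, so the following is an assumption-laden sketch:

```python
import math

def prior_loss(probs, label_prior, eps=1e-9):
    """KL divergence from the batch-mean predicted Bernoulli to the label prior.

    `probs` holds one model's blur probabilities for a batch; `label_prior` is
    the fraction of positive (blurred) labels in the training data. Computed
    once per model, the per-model terms are then summed into the third sub-loss.
    """
    p = min(max(sum(probs) / len(probs), eps), 1 - eps)  # mean predicted probability
    q = min(max(label_prior, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
```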
  • As shown in FIG. 4, a schematic diagram of the calculation framework of the sample image loss parameter provided by the embodiment of the present application:
  • The sample image 10 is detected by the first neural network model 21, which outputs a first blur probability value p1, and by the second neural network model 22, which outputs a second blur probability value p2.
  • The first classification loss and the first prior loss are calculated from the first blur probability value p1, and the second classification loss and the second prior loss are calculated from the second blur probability value p2.
  • The cross regularization loss is calculated from the first blur probability value p1 and the second blur probability value p2. Finally, the first classification loss, the first prior loss, the second classification loss, the second prior loss, and the cross regularization loss are weighted and summed to obtain the loss parameter corresponding to the sample image.
  • Step 206 the computer device calculates a loss parameter corresponding to each sample image according to the first sub-loss parameter, the second sub-loss parameter and the third sub-loss parameter.
  • That is, the loss parameter is the weighted sum L = Lc + α·Lkl + β·Lp, where Lc, Lkl and Lp are the first, second and third sub-loss parameters, α is the weight coefficient controlling the cross regularization loss, and β is the weight coefficient controlling the prior loss.
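With illustrative symbols for the two weight coefficients (the patent does not fix their values, so `alpha` and `beta` below are example assumptions), the weighted sum reduces to one line:

```python
def total_loss(lc, lkl, lp, alpha=0.1, beta=0.1):
    """L = Lc + alpha * Lkl + beta * Lp, with assumed example weights."""
    return lc + alpha * lkl + beta * lp
```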
  • Step 207 the computer device determines the target sample image according to the loss parameters of each sample image.
  • After the loss parameter of each sample image is obtained, the sample images need to be screened according to their loss parameters, so as to eliminate samples with large noise (that is, insufficiently accurate label values).
  • Generally, the larger a sample's loss parameter value, the greater its noise. Therefore, some sample images with larger loss parameters are eliminated, and the target sample images with smaller loss parameter values are used to train the models.
  • In some embodiments, the proportion of the target sample images can be calculated by a schedule of the form R(t) = 1 - τ·min(t/Tk, 1), where:
  • R(t) is the proportion of the target sample images among the multiple sample images;
  • t is the number of iterations of the current training;
  • Tk is a hyperparameter used to control the corresponding screening rate at the current number of training iterations t;
  • τ is a preset filter rate.
  • When t is small at the initial stage of iterative training, the value of R(t) is large: more sample images are used to train the two neural network models, and the screening ratio of noise samples is relatively small.
  • As training proceeds, R(t) gradually decreases; that is, the number of target samples is gradually reduced and the screening ratio of noise samples becomes larger, so that most of the noisy sample images are eventually eliminated.
  • After the proportion R(t) of the target sample images is determined, the R(t) fraction of sample images with the smallest loss parameters is selected from the multiple sample images as the target sample images.
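One common schedule with exactly the behavior described — a large R(t) early in training that gradually decreases toward a floor set by a preset filter rate — is R(t) = 1 − τ·min(t/Tk, 1). The sketch below uses it with illustrative hyperparameter values:

```python
def keep_ratio(t, tau=0.3, t_k=10):
    """R(t) = 1 - tau * min(t / T_k, 1): close to 1 early, falling to 1 - tau."""
    return 1.0 - tau * min(t / t_k, 1.0)

def select_target_samples(losses, t, tau=0.3, t_k=10):
    """Indices of the R(t) fraction of sample images with the smallest losses."""
    n_keep = int(keep_ratio(t, tau, t_k) * len(losses))
    return sorted(range(len(losses)), key=lambda i: losses[i])[:n_keep]

# at t = 0 everything is kept; after T_k iterations only the 1 - tau cleanest remain
```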
  • Step 208 the computer device uses the target sample image to train the two neural network models, and uses the trained two neural network models to update the two neural network models.
  • That is, after the target sample images are determined, the target sample images and their corresponding label values are used to train the two neural network models, and the model parameters of the two neural network models are updated to obtain the two updated neural network models. The updated two neural network models are then used for further training and updating.
  • Step 209 the computer device judges whether the number of iterative training reaches the preset number.
  • That is, the computer device judges whether the preset number of training iterations has been reached. If not, the process returns to step 202: the updated two neural network models are used to perform blur detection on each sample image again to obtain new blur probability values; new loss parameters are calculated for each sample image from the new blur probability values; new target sample images are re-determined; and the updated two neural network models are trained and updated again using the new target sample images.
  • Step 210 the computer device determines that the updated two neural network models are the two trained neural network models.
  • That is, when the preset number of iterations is reached, the two neural network models finally obtained are the final trained neural network models.
  • Step 211: the computer device uses the two trained neural network models to perform blur detection on the image to be detected, and obtains a blur detection result.
  • After the two trained neural network models are obtained, they may be used to perform blur detection on the image to be detected. Specifically, the target neural network model with the better detection performance may be determined from the two trained neural network models and used to detect the image to be detected. The detection performance of the two trained neural network models can be verified using images that have been annotated with accurate labels.
  • That is, the target neural network model is used to perform blur detection on the image to be detected and output the blur probability value of the image to be detected; the binary blur result of the image to be detected is then further determined from the blur probability value, that is, whether or not it is a blurred image.
  • In some embodiments, the binary blur result of the image to be detected may be determined by comparing the blur probability value output by the detection with a preset probability value. For example, when the blur probability output by the target neural network model for the image to be detected is 0.9 and the preset blur probability value is 0.95, the image to be detected is determined not to be a blurred image.
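The comparison step amounts to a one-line threshold check; the 0.95 threshold below is the example value from the text:

```python
def is_blurred(blur_probability, threshold=0.95):
    """Binary blur decision: blurred only when the probability reaches the threshold."""
    return blur_probability >= threshold

# a model output of 0.9 falls below a 0.95 threshold
```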
  • In summary, the image detection method acquires training sample data, which includes a plurality of sample images and label information corresponding to each sample image; inputs each sample image into at least two neural network models to obtain the set of blur probability values output for each sample image under the at least two neural network models; calculates the loss parameter of each sample image from its set of blur probability values and its label information; selects target sample images from the multiple sample images according to the distribution of the loss parameters, and updates the at least two neural network models based on the target sample images to obtain at least two updated neural network models; then returns to inputting the multiple sample images into the updated at least two neural network models, repeating these steps iteratively until the at least two neural network models converge, to obtain at least two trained neural network models; and finally uses the trained models to perform blur detection on the image to be detected to obtain a blur detection result. By using multi-model collaboration to screen out noise samples from the training samples, the effect of model training is improved, and the accuracy of image detection is further improved.
  • an embodiment of the present invention further provides an image detection device, and the image detection device may be integrated in a terminal.
  • the image detection device may include an acquisition unit 301, an input unit 302, a calculation unit 303, a selection unit 304, a training unit 305, and a detection unit 306, as follows:
  • An acquisition unit 301 configured to acquire training sample data, where the training sample data includes a plurality of sample images and label information corresponding to each sample image;
  • the input unit 302 is configured to input each sample image into at least two neural network models, and obtain the set of blur probability values output for each sample image under the at least two neural network models;
  • a calculation unit 303 configured to calculate a loss parameter corresponding to each sample image according to the blur probability value set of each sample image and the label information corresponding to each sample image;
  • the selection unit 304 is configured to select target sample images from the plurality of sample images according to the distribution of the loss parameters corresponding to each sample image, and update the at least two neural network models based on the target sample images to obtain at least two updated neural network models;
  • the training unit 305 is configured to return to inputting the plurality of sample images into the updated at least two neural network models, obtain the set of blur probability values output for each sample image under the updated models together with the corresponding updated target images, and perform iterative training until the at least two neural network models converge, obtaining at least two trained neural network models;
  • the detection unit 306 is configured to perform blur detection on the image to be detected by using at least two trained neural network models to obtain a blur detection result.
  • In some embodiments, the calculation unit includes:
  • the first summation subunit, used to sum the calculated first cross-entropies to obtain the first sub-loss parameter corresponding to each sample image;
  • the determination subunit is configured to determine the loss parameter corresponding to each sample image according to the first sub-loss parameter corresponding to each sample image.
  • the image detection device provided by the present application also includes:
  • the second calculation subunit, used to calculate the relative entropy between every two blur probability values in the blur probability value set corresponding to each sample image;
  • the second summation subunit, used to sum the relative entropies to obtain the second sub-loss parameter corresponding to each sample image;
  • a weighted summation is performed on the first sub-loss parameter corresponding to each sample image and the second sub-loss parameter to obtain a loss parameter corresponding to each sample image.
  • the image detection device provided by the present application also includes:
  • the first acquisition subunit, configured to acquire probability distribution information of the label information in the sample data, and generate a corresponding feature vector based on the probability distribution information;
  • the third calculation subunit, used to calculate the second cross entropy between the feature vector and the blur probability value set corresponding to each sample image;
  • the third summation subunit is used to sum the calculated second cross-entropy to obtain the third sub-loss parameter corresponding to each sample image;
  • a weighted summation is performed on the first sub-loss parameter, the second sub-loss parameter and the third sub-loss parameter corresponding to each sample image to obtain a loss parameter corresponding to each sample image.
  • In some embodiments, the selection unit includes:
  • the second obtaining subunit is used to obtain the number of training times for iterative training of at least two neural network models
  • the fourth calculation subunit is used to calculate the target number of target sample images according to the training times of iterative training
  • the selection subunit is used to select a target number of sample images in order of loss parameters from small to large to obtain target sample images.
  • the fourth computing subunit includes:
  • the obtaining module is used to obtain a preset screening rate, and the screening rate is used to control the screening of multiple sample images;
  • the first calculation module is used to calculate the proportion of the target sample image in multiple sample images according to the screening rate and the number of training iterations;
  • the second calculation module is used to calculate the target number of target sample images according to the proportion and the number of multiple sample images.
  • the detection unit includes:
  • the first input subunit, used to input the image to be detected into the at least two trained neural network models for blur detection to obtain at least two blur probability values;
  • the fifth calculation subunit is used to calculate the average value of at least two blur probability values to obtain the blur probability corresponding to the image to be detected.
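The averaging performed by this subunit is simply the mean of the per-model outputs, e.g.:

```python
def ensemble_blur_probability(probabilities):
    """Average the blur probability values output by the trained models."""
    return sum(probabilities) / len(probabilities)
```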
  • the detection unit includes:
  • the third obtaining subunit is used to obtain the prediction accuracy rates of at least two neural network models after training, and obtain at least two prediction accuracy rates;
  • the sorting subunit is used to sort at least two prediction accuracy rates from high to low, and determine the neural network model with the highest prediction accuracy rate as the target neural network model;
  • the detection subunit is used to input the image to be detected to the target neural network model for blur detection, and obtain the blur probability corresponding to the image to be detected.
  • each of the above units may be implemented as an independent entity, or may be combined arbitrarily as the same or several entities.
  • the specific implementation of each of the above units may refer to the previous method embodiments, and will not be repeated here.
  • To sum up, in this embodiment, the acquisition unit 301 obtains the training sample data, which includes a plurality of sample images and the label information corresponding to each sample image; the input unit 302 inputs each sample image into at least two neural network models to obtain the set of blur probability values output for each sample image under the at least two neural network models; the calculation unit 303 calculates the loss parameter corresponding to each sample image from its set of blur probability values and its label information; the selection unit 304 selects target sample images from the multiple sample images according to the distribution of the loss parameters, and updates the at least two neural network models based on the target sample images to obtain at least two updated neural network models; the training unit 305 returns to inputting the multiple sample images into the updated models and performs iterative training until the at least two neural network models converge, obtaining at least two trained neural network models; and the detection unit 306 uses the trained models to perform blur detection on the image to be detected to obtain a blur detection result. By using multi-model collaboration to screen out noise samples from the training samples, the effect of model training is improved, and the accuracy of image detection is further improved.
  • The embodiment of the present application also provides a computer device, which may be a terminal. As shown in Figure 6, the terminal may include a radio frequency (RF, Radio Frequency) circuit 401, a memory 402 including one or more computer-readable storage media, an input component 403, a display unit 404, a sensor 405, an audio circuit 406, a wireless fidelity (WiFi, Wireless Fidelity) module 407, a processor 408 including one or more processing cores, a power supply 409, and other components.
  • The terminal structure shown in FIG. 6 does not constitute a limitation on the terminal, which may include more or fewer components than shown in the figure, or combine some components, or use a different arrangement of components.
  • the memory 402 can be used to store software programs and modules, and the processor 408 executes various functional applications and information interaction by running the software programs and modules stored in the memory 402 .
  • Specifically, the processor 408 in the terminal loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the application programs stored in the memory 402, so as to realize various functions:
  • Acquire training sample data, where the training sample data includes a plurality of sample images and label information corresponding to each sample image; input each sample image into at least two neural network models to obtain the set of blur probability values output for each sample image under the at least two neural network models; calculate the loss parameter corresponding to each sample image from its set of blur probability values and its label information; select target sample images from the multiple sample images according to the distribution of the loss parameters, and update the at least two neural network models based on the target sample images to obtain at least two updated neural network models; return to inputting the multiple sample images into the updated at least two neural network models, obtain the set of blur probability values output for each sample image under the updated models and the corresponding updated target images, and perform iterative training until the at least two neural network models converge, obtaining at least two trained neural network models; and use the trained models to perform blur detection on the image to be detected to obtain a blur detection result.
  • the embodiment of the present application also provides a computer device, which may be a server, as shown in FIG. 7 , which is a schematic structural diagram of the computer device provided in the present application. Specifically:
  • the computer device may include a processing unit 501 of one or more processing cores, a storage unit 502 of one or more storage media, a power module 503, an input module 504 and other components.
  • The structure shown in FIG. 7 does not constitute a limitation on the computer device, which may include more or fewer components than shown in the figure, or combine some components, or use a different arrangement of components. Specifically:
  • The processing unit 501 is the control center of the computer device; it uses various interfaces and lines to connect the various parts of the entire computer device, and performs the various functions of the computer device and processes data by running or executing the software programs and/or modules stored in the storage unit 502 and calling the data stored in the storage unit 502, so as to monitor the computer device as a whole.
  • Optionally, the processing unit 501 may include one or more processing cores. Preferably, the processing unit 501 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interfaces, application programs, and so on, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may also not be integrated into the processing unit 501.
  • the storage unit 502 can be used to store software programs and modules, and the processing unit 501 executes various functional applications and data processing by running the software programs and modules stored in the storage unit 502 .
  • the storage unit 502 can mainly include a program storage area and a data storage area, wherein the program storage area can store an operating system, at least one application program required by a function (such as a sound playback function, an image playback function, and web page access, etc.); The area may store data and the like created according to use of the computer device.
  • The storage unit 502 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the storage unit 502 may further include a memory controller to provide the processing unit 501 with access to the storage unit 502 .
  • the computer device also includes a power supply module 503 for supplying power to various components.
  • the power supply module 503 can be logically connected to the processing unit 501 through the power management system, so as to realize functions such as managing charge, discharge, and power consumption through the power management system.
  • The power module 503 may also include one or more DC or AC power supplies, a recharging system, a power failure detection circuit, a power converter or inverter, power status indicators, and any other such components.
  • the computer device can also include an input module 504, which can be used to receive input numbers or character information, and generate keyboard, mouse, joystick, optical or trackball signal input related to user settings and function control.
  • the computer device may also include a display unit, etc., which will not be repeated here.
  • In this embodiment, the processing unit 501 in the computer device loads the executable files corresponding to the processes of one or more application programs into the storage unit 502 according to the following instructions, and runs the application programs stored in the storage unit 502, thereby realizing various functions, as follows:
  • Acquire training sample data, where the training sample data includes a plurality of sample images and label information corresponding to each sample image; input each sample image into at least two neural network models to obtain the set of blur probability values output for each sample image under the at least two neural network models; calculate the loss parameter corresponding to each sample image from its set of blur probability values and its label information; select target sample images from the multiple sample images according to the distribution of the loss parameters, and update the at least two neural network models based on the target sample images to obtain at least two updated neural network models; return to inputting the multiple sample images into the updated at least two neural network models, obtain the set of blur probability values output for each sample image under the updated models and the corresponding updated target images, and perform iterative training until the at least two neural network models converge, obtaining at least two trained neural network models; and use the trained models to perform blur detection on the image to be detected to obtain a blur detection result.
  • an embodiment of the present invention provides a computer-readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any one of the methods provided in the embodiments of the present invention.
  • the instructions can perform the following steps:
  • acquire training sample data, the training sample data including a plurality of sample images and label information corresponding to each sample image; input each sample image into at least two neural network models to obtain a set of blur probability values output for each sample image by the at least two neural network models; calculate a loss parameter corresponding to each sample image according to the set of blur probability values of each sample image and the label information corresponding to each sample image; select target sample images from the plurality of sample images according to the distribution of the loss parameters corresponding to the sample images, and update the at least two neural network models based on the target sample images to obtain at least two updated neural network models; return to the operation of inputting the plurality of sample images into the updated at least two neural network models to obtain the set of blur probability values output for each sample image by the updated at least two neural network models and the corresponding updated target images, and perform iterative training until the at least two neural network models converge to obtain at least two trained neural network models; and perform blur detection on an image to be detected by using the at least two trained neural network models to obtain a blur detection result.
  • the computer-readable storage medium may include: a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disk, and the like.
  • a computer program product or computer program includes computer instructions, and the computer instructions are stored in a storage medium.
  • the processor of the computer device reads the computer instruction from the storage medium, and the processor executes the computer instruction, so that the computer device executes the method provided in the various optional implementation manners in FIG. 2 or FIG. 3 above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Analysis (AREA)
  • Automation & Control Theory (AREA)
  • Computational Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

一种图像检测方法、装置、计算机可读存储介质及计算机设备。所述方法针对第一组多个样本图像中的每一样本图像,将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合;根据模糊概率值集合与预设标签信息计算得到所述样本图像的损失参数;根据损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新;依次利用至少两个第二组多个样本图像执行上述步骤,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;提供训练后的至少两个神经网络模型对待检测图像进行检测,得到检测结果。

Description

图像检测方法、装置、计算机可读存储介质及计算机设备
本申请要求于2021年07月16日提交中国专利局、申请号为202110804450.6、发明名称为“图像检测方法、装置、计算机可读存储介质及计算机设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本发明涉及图像处理技术领域,具体涉及一种图像检测方法、装置、计算机可读存储介质及计算机设备。
发明背景
卷积神经网络(Convolutional Neural Networks,CNN)是一类包含卷积计算且具有深度结构的前馈神经网络(Feed Forward Neural Networks,FFNN),是深度学习(Deep Learning,DL)的代表算法之一。卷积神经网络具有表征学习(Representation Learning,RL)能力,能够按其阶层结构对输入信息进行平移不变分类,因此也被称为“平移不变人工神经网络(Shift-Invariant Artificial Neural Networks,SIANN)”。
近年来,卷积神经网络相关技术发展迅速且应用也非常广泛。例如在对图像进行花屏检测的场景中,使用卷积神经网络构建图像检测模型,可以提高对图像进行花屏检测的效率。
然而,目前使用卷积神经网络构建的图像检测模型,在模型训练阶段所使用的训练样本图像的标签为简单的二值标签,二值标签不准确会导致影响训练得到的模型的性能,进而导致对图像检测的准确性不高。
发明内容
本申请实施例提供一种图像检测方法、装置、计算机可读存储介质及计算机设备,改善图像检测模型的训练效果,提升图像检测的准确性。
本申请实施例的一种图像检测方法包括:
步骤A:针对第一组多个样本图像中的每一样本图像,
将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合,所述模糊概率值集合包括所述至少两个神经网络模型中每个神经网络模型输出的模糊概率值;
根据所述模糊概率值集合与所述样本图像的预设标签信息计算得到所述样本图像的损失参数;
步骤B:根据所述多个样本图像的损失参数的分布从所述多个样本图像中选取目标样本图像,并基于所述目标样本图像对所述至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
对更新后的所述至少两个神经网络模型利用至少两个第二组多个样本图像依次执行上述步骤A和步骤B,直至所述至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果。
本申请实施例的一种图像检测装置包括:
输入单元,用于针对第一组多个样本图像中的每一样本图像,将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合,所述模糊概率值集合包括所述至少两个神经网络模型中每个神经网络模型输出的模糊概率值;
计算单元,用于根据所述模糊概率值集合与所述样本图像的预设标签信息计算得到所述样本图像的损失参数;
选取单元,用于根据所述多个样本图像的损失参数的分布从所述多个样本图像中选取目标样本图像,并基于所述目标样本图像对所述至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
训练单元,用于对更新后的至少两个神经网络模型依次利用至少两个第二组多个样本图像使用所述输入单元、所述计算单元和所述选取单元进行迭代训练,直至所述至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
提供单元,用于提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果。
本申请实施例的一种计算机可读存储介质,存储有多条指令,所述指令适于处理器进行加载,以执行本申请各实施例的图像检测方法。
本申请实施例的一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可以在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现本申请各实施例的图像检测方法。
本申请实施例的一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机指令,所述计算机指令存储在存储介质中。计算机设备的处理器从存储介质读取所述计算机指令,处理器执行所述计算机指令,使得所述计算机设备执行本申请各实施例的图像检测方法。
本申请实施例提供的图像检测方案,通过采用多模型协同对训练样本中的噪声样本进行筛选,改善了模型训练的效果,进而提高了图像检测的准确率。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请中图像检测模型训练的场景示意图;
图2是本申请提供的图像检测方法的流程示意图;
图3是本申请提供的图像检测方法的另一流程示意图;
图4是本申请提供的样本图像损失参数计算框架示意图;
图5是本申请提供的图像检测装置的结构示意图;
图6是本申请提供的终端的结构示意图;
图7是本申请提供的服务器的结构示意图。
实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述。显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
本发明实施例提供一种图像检测方法、装置、计算机可读存储介质及计算机设备。其中,该图像检测方法可以使用于图像检测装置中。该图像检测装置可以集成在计算机设备中,该计算机设备可以是终端也可以是服务器。其中,终端可以为手机、平板电脑、笔记本电脑、智能电视、穿戴式智能设备、个人计算机(PC,Personal Computer)等设备。服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、网络加速服务(Content Delivery Network,CDN)、以及大数据和人工智能平台等基础云计算服务的云服务器。其中,多个服务器可以组成一区块链,而服务器是区块链上的节点。
请参阅图1,为本申请提供图像检测模型的训练场景示意图。计算机设备A可以执行本申请各实施例的方法。本申请实施例的方法可以包括:
步骤A:针对第一组多个样本图像中的每一样本图像,
将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合,所述模糊概率值集合包括所述至少两个神经网络模型中每个神经网络模型输出的模糊概率值;
根据所述模糊概率值集合与所述样本图像的预设标签信息计算得到所述样本图像的损失参数;
步骤B:根据所述多个样本图像的损失参数的分布从所述多个样本图像中选取目标样本图像,并基于所述目标样本图像对所述至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
对更新后的所述至少两个神经网络模型利用至少两个第二组多个样本图像依次执行上述步骤A和步骤B,直至所述至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果。
其中,第一组多个样本图像和第二组多个样本图像可以有部分相同的图像,也可以是两组完全不同的图像。
例如,计算机设备A在获取到训练样本数据后,从训练样本数据中提取出多张样本图像以及每个样本图像对应的模糊标签值。然后,将提取出的每一样本图像输 入到至少两个神经网络模型中进行检测,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据损失参数确定目标样本图像,并基于这些目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入至更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直到至少两个神经网络模型参数收敛,得到训练后的至少两个神经网络模型。以此,完成本申请中用于图像检测的神经网络模型的训练。在对模型进行训练后,可以将需要进行模糊检测的待检测图像输入至训练后的至少两个神经网络模型,得到对待检测图像的图像检测结果。
需要说明的是,图1所示的图像检测模型训练的场景示意图仅仅是一个示例,本申请实施例描述的图像检测模型训练场景是为了更加清楚地说明本申请的技术方案,并不构成对于本申请提供的技术方案的限定。本领域普通技术人员可知,随着图像检测模型训练的演变和新业务场景的出现,本申请提供的技术方案对于类似的技术问题,同样适用。
基于上述实施场景以下分别进行详细说明。
本申请实施例将从图像检测装置的角度进行描述,该图像检测装置可以集成在计算机设备中。其中,计算机设备可以是终端也可以为服务器,该终端可以为手机、平板电脑、笔记本电脑、智能电视、穿戴式智能设备、个人计算机(PC,Personal Computer)等设备。如图2所示,为本申请提供的图像检测方法的流程示意图,该方法包括:
步骤101,获取训练样本数据。
其中,在对图像质量或者视频质量进行评价的场景中,往往会采用图像或者视频中每帧图像是否存在花屏现象来评价图像或者视频的质量。图像花屏,是指图像中存在模糊的现象使得图像内容难以辨别,该现象类似于电脑屏幕出现花屏时显示图像的异常,以此称为图像花屏。
在相关技术中,一般由人眼对图像进行人为判断图像是否为花屏图像,但人眼判断的效率非常低下,为此人们提出采用机器学习技术对图像进行花屏检测的方法。机器学习(Machine Learning,ML)是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、示教学习等技术。
采用机器学习技术对图像进行花屏检测，可以采用卷积神经网络模型进行检测。具体地，可以将带有标签的训练图像输入至一个卷积神经网络中对该卷积神经网络进行训练，然后将待识别的图像输入至训练后的卷积神经网络模型中进行特征提取，随后经全连接层进行分类，得到检测结果。其中，图像的标签信息为人工进行标注的二值标签信息；即图像的标签信息是花屏或者不是花屏这两种标签中的任意一种。然而花屏图像不是简单的二分类，很多花屏图像只是轻微花屏或者局部花屏，如果简单地将图像确定为花屏或者不花屏会具有很多的主观性，如此便会使得人工标注的标签不够准确，进而便会导致训练得到的神经网络模型的检测性能下降，导致图像检测结果不够准确。
为了解决上述人工标注标签不准确导致训练得到的模型的图像检测结果不准确的技术问题,本申请提出一种图像检测方法,下面对本申请提供的图像检测方法进行进一步的详细介绍。
同样地,在本申请实施例中,仍然需要使用到样本数据对检测模型进行训练,因此需要先对样本数据进行获取。其中,样本数据可以存储在区块链上。其中,样本数据包括多张样本图像以及每个样本图像对应的标签信息。其中,样本图像对应的标签信息为样本的二值标签,即该样本图像为花屏或者不花屏。如前所述,该样本图像的二值标签由于是人工进行标注的,鉴于人工标注的主观性,使得样本图像的标签信息中含有部分噪声,亦即存在部分标签不够准确。
步骤102,将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合。
其中,在本申请实施例中,采用多个神经网络模型进行协同训练。其中,此处多个为至少两个,具体可以为两个、三个或者更多个神经网络模型。此处神经网络模型也可以为任意结构的卷积神经网络模型。至少两个神经网络模型可以为未经训练的神经网络模型,也可以是经过一定预训练的人工神经网络模型。
将样本数据中包含的多个样本图像逐一输入到至少两个神经网络模型进行模糊检测。其中,此处模糊检测为对图像的模糊概率进行检测,或者为对图像的花屏概率进行检测。那么对应输出的结果即为图像的模糊概率值,此处图像的模糊概率值即为图像为花屏图像的概率值。可以理解的是,对于任意一个目标样本图像,输入到至少两个神经网络模型中,都会得到每个神经网络模型输出的模糊概率值,得到该目标样本图像对应的至少两个模糊概率值,该至少两个模糊概率值构成该目标样本图像对应的模糊概率值集合。同样地,对于其他样本图像,输入到至少两个神经网络模型中,也可以得到至少两个神经网络模型输出的模糊概率值集合,进而得到每个样本图像对应的模糊概率值集合。
步骤103,根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数。
其中,在计算得到每个样本图像的模糊概率值集合后,根据该模糊概率值集合与每个样本图像对应的标签信息计算每个样本图像对应的损失参数。其中,此处的损失参数为评价样本图像的标签值与模型输出结果之间的差异的参数,随着训练过程中模型不断地更新,样本图像对应的损失参数会逐渐减小,即模型的输出结果会不断向样本图像的标签值靠近。由于在本申请实施例中,采用多模型协同进行训练, 因此此处的损失参数为评价多个模型输出结果的综合结果与样本图像的标签值之间的差异的参数。具体地,该损失参数可以为样本图像的标签值与每个模型的输出值之间的多个差值的总和。例如,当样本图像的标签值为1时,即该样本图像是模糊图像,用于协同训练的神经网络模型的数量为2,两个神经网络模型对该样本图像进行检测得到的模糊概率值分别为0.95和0.98,那么该损失参数可以为(1-0.95)+(1-0.98)=0.07。
在一些实施例中,根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数,包括:
1、计算每个样本图像对应的模糊概率值集合中每个模糊概率值与对应的标签信息之间的第一交叉熵；
2、将计算得到的第一交叉熵进行求和,得到每个样本图像对应的第一子损失参数;
3、根据每个样本图像对应的第一子损失参数确定每个样本图像对应的损失参数。
其中,在本申请实施例中,可以根据每个样本图像对应的模糊概率值集合中的元素组成的概率值序列与样本图像的标签组成的标签序列的交叉熵来确定样本图像对应的损失参数。其中,样本图像的标签组成的标签序列为多个样本图像的标签值组成的数值序列,此处数值序列的数值数量为至少两个神经网络模型的数量。例如,当用于进行协同训练的神经网络模型的数量为5,目标样本图像的标签值为1,那么标签序列便为{1,1,1,1,1}。
其中,交叉熵(Cross Entropy,CE)是信息论中的一个重要概念,主要用于度量两个概率分布间的差异性信息。交叉熵可以在神经网络中作为损失函数,用于衡量模型预测分布与样本真实分布之间的相似性。交叉熵作为损失函数的一个好处是可以在梯度下降时避免均方误差损失函数学习速度低的问题,从而提高模型训练效率。
在计算得到任意一个目标样本图像对应的多个模糊概率值与对应的标签信息之间的交叉熵之后，得到目标样本图像对应的多个交叉熵。然后将目标样本图像对应的多个交叉熵进行求和，得到目标样本图像对应的第一子损失参数，确定该第一子损失参数为目标样本图像的损失参数。然后，进一步地可以根据上述方法同样地确定每一样本图像对应的损失参数。
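上述第一子损失参数（各模型输出的模糊概率值与样本标签之间的交叉熵之和）可以用如下Python代码示意。该代码仅为便于理解的草图，函数名 first_sub_loss 及数值稳定项 eps 均为本文示意时的假设，并非本申请给出的实现：

```python
import math

def first_sub_loss(probs, label, eps=1e-7):
    """第一子损失参数：各神经网络模型输出的模糊概率值
    与样本标签之间的二值交叉熵之和。"""
    loss = 0.0
    for p in probs:  # probs: 模糊概率值集合
        p = min(max(p, eps), 1.0 - eps)  # 截断，防止 log(0)
        loss += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return loss

# 示例：标签为1（花屏），两个模型输出的模糊概率值为0.95和0.98
lc = first_sub_loss([0.95, 0.98], 1)  # lc ≈ 0.07
```

其中 0.95、0.98 对应前文示例中两个模型的输出，交叉熵求和的结果与示例中按差值近似得到的0.07相近。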
在一些实施例中,本申请实施例提供的图像检测方法还包括:
A、计算每个样本图像对应的模糊概率值集合中每两个模糊概率值之间的相对熵；
B、将相对熵进行求和,得到每个样本图像对应的第二子损失参数;
C、根据每个样本图像对应的第一子损失参数确定每个样本图像对应的损失参数,包括:
对每个样本图像对应的第一子损失参数与第二子损失参数进行加权求和,得到 每个样本图像对应的损失参数。
其中,在本申请实施例中,可以进一步计算同一样本图像在不同模型下输出的模糊概率值之间的相对熵。其中,相对熵(Relative Entropy,RE)又被称为KL散度(Kullback-Leibler Divergence)或者信息散度(Information Divergence,ID),是两个概率分布间差异的非对称性度量。当用于协同训练的神经网络模型的数量为2时,则样本图像对应的相对熵为一个;当用于协同训练的神经网络模型的数量为3时,则样本图像对应的相对熵为3个;当用于协同训练的神经网络模型的数量为n时,则样本图像对应的相对熵为n*(n-1)/2个。在计算得到样本图像对应的所有相对熵之后,对这些相对熵的值进行求和,得到样本图像对应的第二子损失参数。进一步地,将上述第一子损失参数和第二子损失参数进行加权求和,得到样本图像对应的损失参数,然后可以进一步确定每个样本图像对应的损失参数。将同一样本图像在不同神经网络模型中输出值的相对熵添加至样本图像的损失参数中,使得在模型训练时不同神经网络模型的输出不断接近,进而提高模型训练的准确性。
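将每个模糊概率值视为一个伯努利分布，上述两两相对熵求和的过程可以示意如下（示意性草图，函数名均为本文的假设）：

```python
import math
from itertools import combinations

def bernoulli_kl(p, q, eps=1e-7):
    """两个模糊概率值（各视为一个伯努利分布）之间的相对熵（KL散度）。"""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def second_sub_loss(probs):
    """第二子损失参数：模糊概率值集合中每两个模糊概率值
    之间的相对熵之和，n个模型共 n*(n-1)/2 个相对熵。"""
    return sum(bernoulli_kl(p, q) for p, q in combinations(probs, 2))
```

当两个模型的输出完全一致时该项为0，输出差异越大该项越大，与文中"使不同神经网络模型的输出不断接近"的约束目的一致。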
在一些实施例中,方法还包括:
a、获取样本数据中标签信息的概率分布信息,并基于概率分布信息生成对应的特征向量;
b、计算特征向量与每个样本图像对应的模糊概率值集合之间的第二交叉熵；
c、对计算得到的第二交叉熵进行求和,得到每个样本图像对应的第三子损失参数;
d、对每个样本图像对应的第一子损失参数与第二子损失参数进行加权求和,得到每个样本图像对应的损失参数,包括:
对每个样本图像对应的第一子损失参数、第二子损失参数以及第三子损失参数进行加权求和,得到每个样本图像对应的损失参数。
其中,在本申请实施例中,可以先对多个样本图像的标签信息进行确定,然后根据该多个样本图像的标签信息获取样本数据中标签信息的概率分布信息。例如,当样本图像的数量为10,其中标签为1的样本数量为5,标签为0的样本数量为5,那么可以确定样本数据中标签信息的概率分布为[0.5,0.5]。进一步地,可以根据该概率分布信息生成其对应的特征向量以便进行交叉熵的计算。进一步地,可以计算该概率分布情况与每个样本图像对应的模糊概率值集合之间的交叉熵,然后再对得到的交叉熵进行求和,得到每个样本图像对应的第三子损失参数。进一步地,可以将上述第一子损失参数、第二子损失参数以及第三子损失参数进行加权求和,得到每个样本图像对应的损失参数。
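第三子损失参数（先验损失）的计算可以按如下方式示意。这里将每个模型的输出概率p展开为[p, 1-p]后与标签分布做交叉熵，该展开方式是本文为示意所作的一种假设性解读，函数名亦为假设：

```python
import math

def label_prior(labels):
    """根据多个样本图像的二值标签统计标签分布 Pprior=[花屏占比, 非花屏占比]。"""
    pos = sum(labels) / len(labels)
    return [pos, 1.0 - pos]

def prior_loss(probs, prior, eps=1e-7):
    """第三子损失参数（先验损失）：将每个模型输出的概率分布 [p, 1-p]
    与标签分布 prior 做交叉熵后求和。"""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1 - eps)
        total += -(prior[0] * math.log(p) + prior[1] * math.log(1 - p))
    return total
```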
以上仅为一些实施方式的例子。在另一些实施例中,可以利用第一子损失参数、第二子损失参数以及第三子损失参数的任一个或任意组合。例如,可以单独使用第一子损失参数、第二子损失参数或第三子损失参数作为样本图像对应的损失参数。又例如,可以对第一子损失参数和第三子损失参数进行加权求和,得到样本图像对应的损失参数。
步骤104,根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型。
其中，在计算得到每个样本图像对应的损失参数后，根据样本图像对应的损失参数的分布情况从样本图像中选取出一定数量个损失参数值较小的目标样本图像。然后，采用该一定数量个目标样本图像对至少两个神经网络模型进行训练，并采用训练得到的至少两个神经网络模型对初始的至少两个神经网络模型进行更新，得到更新后的至少两个神经网络模型。其中，损失参数值越小的样本图像，经过模型检测得到的输出值与样本图像的标签越接近，其标签值的准确性越高。而损失参数值越大的样本图像，其标签值的准确性便越低。因此可以将部分损失参数值较大的样本图像从样本图像中剔除，使得剩余的样本图像的标签值准确性更高，从而使得训练得到的模型的检测准确度更高。
其中,在一些实施例中,根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,包括:
1、获取对至少两个神经网络模型进行迭代训练的训练次数;
2、根据迭代训练的训练次数计算目标样本图像的目标数量;
3、按照损失参数由小至大的顺序选取目标数量个样本图像,得到目标样本图像。
其中,在多个样本图像中确定了一定数量的目标样本图像,并基于该目标样本图像对至少两个神经网络模型进行训练以及更新之后,采用更新后的至少两个神经网络模型对每个样本图像进行再次检测,得到每个样本图像对应的模糊概率值集合;再基于新的模糊概率值集合和每个样本图像的标签值计算每个样本图像新的损失参数值,并基于新的损失参数值重新确定目标样本图像以及基于新的目标样本图像对更新后的至少两个神经网络模型进行再次训练以及更新,如此对至少两个神经网络模型进行多次迭代训练。
在本申请实施例中,对至少两个神经网络模型进行迭代训练的每次迭代过程中确定的目标样本图像的数量跟模型训练的迭代次数相关。即在对模型进行迭代训练的每个循环中,目标样本图像的数量都是不同的。迭代训练次数越多,所采用的样本图像的数量可以越少,如此使得在不断迭代训练的过程中,逐渐将标签值不够准确的训练样本进行剔除。因此,在每次对目标样本图像进行确定时,可以先获取当前对至少两个神经网络模型进行迭代训练的训练次数。例如对至少两个神经网络模型进行第5次训练,那么便确定迭代训练次数为5。然后根据该训练次数计算需要保留的目标样本图像的目标数量。最后,再基于每个样本图像的损失参数由小至大的顺序选取目标数量个样本图像,得到目标样本图像。即确定了多个样本图像中目标数量个损失参数值较小的样本图像为目标样本图像。
在一些实施例中,根据迭代训练的训练次数计算目标样本图像的目标数量,包括:
2.1、获取预设的筛选率,筛选率用于控制对多个样本图像进行筛选;
2.2、根据筛选率与迭代训练的训练次数计算目标样本图像在多个样本图像中的占比;
2.3、根据占比以及多个样本图像的数量计算得到目标样本图像的目标数量。
其中,在本申请实施例中,对目标样本图像的数量进行计算的过程,可以先获取到一个预先设置的筛选率。其中,筛选率为对从多个样本图像中选取目标样本图像的数量进行控制的比例。根据该预设的筛选率,在模型训练的后期,目标样本图像的数量可以为多个样本图像的数量与预设筛选率的乘积。因此,在获取到预设筛选率后,可以根据该预设筛选率与迭代训练的次数计算出本次迭代训练中所选取的目标样本图像的数量在多个样本图像的数量中的占比。然后可以进一步根据该占比和多个样本图像的数量计算得到目标样本图像的目标数量。以此,可以通过设置预设筛选率对目标样本图像的数量进行控制,从而保证既能筛选掉足够多的标签值不够准确的样本图像,又能保证具有足够的样本图像对模型进行训练。
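根据占比从多个样本图像中按损失参数由小至大选取目标样本图像的过程，可以示意如下（函数名为本文的假设）：

```python
def select_target_samples(losses, ratio):
    """根据目标样本图像在多个样本图像中的占比 ratio，
    按损失参数由小至大的顺序选取目标样本图像，返回其索引。"""
    target_num = int(len(losses) * ratio)  # 目标样本图像的目标数量
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[:target_num]
```

例如四个样本的损失为 [0.3, 0.1, 0.2, 0.9] 且占比为0.5时，保留损失最小的两个样本（索引1和2），损失最大的样本被视为噪声样本剔除。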
步骤105,返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型。
一些实施例中,在步骤105中,可以获取另一组多个样本图像,并将其输入更新后的神经网络模型中进行迭代。其中,该另一组多个样本图像是尚未用于对至少两个神经网络模型进行训练的样本图像。例如,训练集共有800张样本图像,假设每次迭代训练使用8张样本图像,每次迭代训练时,可以从训练集中选择未使用过的8张样本图像。这样,如果迭代训练100次,就会将训练集中的所有样本图像使用一遍,这叫做一个轮次(epoch)。一些实施例中,会训练多个轮次。
其中,上述步骤102至步骤104为对模型进行迭代训练中的一个循环过程。即采用至少两个神经网络对多个样本图像进行模糊检测,输出每个样本图像对应的模糊概率值集合,基于每个样本图像的模糊概率值集合与每个样本图像的标签值计算每个样本图像对应的损失参数,然后基于每个样本图像的损失参数确定目标样本图像,进一步采用目标样本图像对至少两个神经网络模型进行训练以及更新这些步骤,是对至少两个神经网络模型进行迭代训练的一个循环过程。
在得到更新后的至少两个神经网络模型之后,还需将更新后的至少两个神经网络模型代入到步骤102中进行下一个循环的处理。即将多个样本图像分别输入到更新后的至少两个神经网络模型中,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合。然后再基于该模糊概率值集合与每个样本图像的标签值再次计算每个样本图像对应的新的损失参数。再进一步基于每个样本图像的损失参数以及迭代训练的次数确定新的目标样本图像,再采用新的目标样本图像对更新后的至少两个神经网络模型进行再次训练和更新。如此,对至少两个神经网络模型进行迭代训练,直至该至少两个神经网络模型的模型参数收敛,得到训练后的至少两个神经网络模型。
步骤106,采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。
其中,在对至少两个神经网络模型进行训练,得到训练后的至少两个神经网络模型之后,采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。
其中,在一些实施例中,采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果,包括:
1、将待检测图像输入至训练后的至少两个神经网络模型进行模糊检测,得到至少两个模糊概率值;
2、计算至少两个模糊概率值的平均值,得到待检测图像对应的模糊概率。
其中,在本申请实施例中,在对至少两个神经网络模型进行迭代训练得到训练后的至少两个神经网络模型后,将待检测图像输入至训练后的至少两个神经网络模型进行模糊检测,得到每个训练后的神经网络模型对待检测图像进行模糊检测得到的模糊概率值,即得到了至少两个模糊概率值。然后,对至少两个模糊概率值进行求平均计算,得到最终的模糊概率,该模糊概率便是训练后的至少两个神经网络模型对待检测图像进行模糊检测得到的检测结果。在一些实施例中,还可以进一步根据训练后的至少两个神经网络模型对待检测图像进行模糊检测得到的模糊概率值进一步确定模糊检测的二值结果,即根据模糊概率值确定待检测图像是模糊图像还是非模糊图像。
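对至少两个模糊概率值求平均的推理过程可以示意如下。其中 model(image) 表示模型对图像输出模糊概率值，是本文为示意所假设的接口：

```python
def ensemble_blur_probability(models, image):
    """将待检测图像输入训练后的至少两个神经网络模型，
    取各模型输出的模糊概率值的平均值作为模糊检测结果。"""
    probs = [model(image) for model in models]
    return sum(probs) / len(probs)
```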
在一些实施例中,采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果,包括:
A、获取训练后的至少两个神经网络模型的预测准确率,得到至少两个预测准确率;
B、将至少两个预测准确率按照由高至低的顺序进行排序,并确定预测准确率最高的神经网络模型为目标神经网络模型;
C、将待检测图像输入至目标神经网络模型进行模糊检测,得到待检测图像对应的模糊概率。
其中,在本申请实施例中,在对至少两个神经网络模型进行训练得到训练后的至少两个神经网络模型之后,可以无需使用训练后的所有神经网络模型对待检测图像进行图像检测。而是对训练后的至少两个神经网络模型中每个神经网络模型的模型预测准确率进行获取,然后将预测准确率最高的神经网络模型确定为目标神经网络模型。最后采用目标神经网络模型对待检测图像进行模糊检测,得到目标神经网络输出的模糊概率值,确定该目标神经网络输出的模糊概率值为对待检测图像的模糊检测的检测结果。
根据上述描述可知,本申请实施例提供的图像检测方法,通过获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络 模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。以此,通过采用多模型协同对训练样本中的噪声样本进行筛选,改善了模型训练的效果,进而提高了图像检测的准确率。
相应地,本申请实施例将从计算机设备的角度进一步对本申请提供的图像检测方法进行详细的描述,其中计算机设备可以为终端也可以为服务器。如图3所示,为本申请提供的图像检测方法的另一流程示意图,该方法包括:
步骤201,计算机设备获取包含多张样本图像以及每张样本图像的标签的训练样本数据。
其中,如前述实施例所述,对图像检测模型进行训练的样本数据中样本图像对应的标签为人工标注的标签,此处样本图像的标签可以为样本图像的花屏二值标签,由于图像花屏现象并非简单的花屏或者不花屏这样的二值标签可以准确标注的,图像的花屏还存在着轻微花屏或者局部花屏这样的处于中间态的状态。其中,在本申请实施例中,图像花屏是指图像模糊导致无法对图像的部分或全部内容进行识别的情况。因此采用简单的二值标签对样本图像的花屏状态进行标注,会使得样本图像的标签信息不够准确。为解决上述采用简单二值标签标注样本图像的花屏状态导致样本图像的标签信息不够准确,进而导致训练得到的图像检测模型的检测结果不够准确的技术问题,本申请提出一种图像检测方法。下面对本申请提供的图像检测方法进行进一步的详细描述。
在本申请实施例中,仍然采用具有花屏二值标签的样本图像对检测模型进行训练,因此,首先对训练样本数据进行获取,训练样本数据包括多个样本图像以及每个样本图像对应的花屏二值标签。其中,样本图像的花屏二值标签便是样本图像是花屏图像或者不是花屏图像,当样本图像是花屏图像时,样本图像的二值标签便为1;当样本图像不是花屏图像时,样本图像的二值标签便为0。
步骤202,计算机设备将多个样本图像分别输入到两个神经网络模型中进行花屏检测,得到每个样本图像在两个神经网络模型中输出的两个花屏概率值。
其中,在本申请实施例中,可以采用多模型协同训练方法对图像花屏检测的模型进行训练。由于不同的神经网络模型具有不同的决策边界,具体为每次开始训练的时候,神经网络模型的参数是随机进行初始化的。因此不同模型具有不同的排除噪声样本(即标签不准确的样本)的能力,那么多个模型协同训练可以很好地继承各个模型的优势,进行优势互补,从而提升对噪声样本的筛选能力。具体地,多模 型可以为两个神经网络模型、三个神经网络模型或者更多数量的神经网络模型。在本申请实施例中,以采用两个神经网络模型进行协同训练为例进行详细描述。
在获取到多个样本图像以及每个样本图像的花屏二值标签之后,将该多个样本图像分别输入到两个神经网络模型中,得到每个样本图像在两个神经网络模型中输出的两个花屏概率值。其中,两个神经网络模型可以分别记为第一神经网络模型和第二神经网络模型,第一神经网络模型输出的花屏概率值可以记为p1,第二神经网络模型输出的花屏概率值可以记为p2。
步骤203,计算机设备计算两个花屏概率值和样本标签之间的交叉熵,得到第一子损失参数。
其中,在确定了每一样本图像在两个神经网络模型下输出的花屏概率值之后,采用每个样本图像的花屏概率值与样本标签计算每个样本图像对应的交叉熵,具体计算公式如下:
Lc1=-[y*logp1+(1-y)*log(1-p1)]
Lc2=-[y*logp2+(1-y)*log(1-p2)]
其中,Lc1为第一神经网络模型对应的交叉熵;y为样本图像对应的标签值,即0或1;p1为第一神经网络对样本图像进行花屏检测得到的花屏概率值。Lc2为第二神经网络模型对应的交叉熵;p2为第二神经网络对样本图像进行花屏检测得到的花屏概率值。
然后,对计算得到的两个交叉熵进行求和,得到第一子损失参数,计算公式如下:
Lc=Lc1+Lc2
其中,Lc为求得的第一子损失参数,或者也可以称为分类损失。
步骤204,计算机设备计算两个花屏概率值之间的相对熵,得到第二子损失参数。
其中，如前所述，相对熵又可以称之为KL散度。将两个花屏概率值视为伯努利分布，求两个花屏概率值之间的相对熵，得到第二子损失参数：
Lreg=KL(p1||p2)=p1*log(p1/p2)+(1-p1)*log((1-p1)/(1-p2))
其中，Lreg为第二子损失参数，或者可以称为交叉正则损失。计算交叉正则损失的目的在于约束两个模型输出的花屏概率值之间的概率分布相似度，希望随着模型训练的进行，对于相同样本图像在两个模型下输出的概率值可以更加靠近。
此处由于只以两个神经网络模型为例进行说明,因此相对熵只有一个,若以多个神经网络模型进行协同训练,则需两两计算神经网络模型输出的花屏概率值之间的相对熵,并将求得的多个相对熵进行求和确定第二子损失参数。具体地,例如还有第三神经网络模型,样本图像经第三神经网络模型进行花屏检测输出的花屏概率值为p3,那么就还需要计算p1与p3之间的相对熵以及p2与p3之间的相对熵,再将三个相对熵求和得到第二子损失参数。
步骤205，计算机设备计算两个花屏概率值与样本图像标签分布之间的交叉熵，得到第三子损失参数。
其中，样本图像标签分布为多个样本图像的标签值的分布情况。具体地，例如样本图像总计有100张，其中40张的标签值为1，60张的标签值为0，那么可以确定这100张样本图像中花屏与正常图像之间的比例为4:6，那么可以得到样本图像的标签分布Pprior=[0.4,0.6]。然后计算两个花屏概率值与样本图像标签分布之间的交叉熵，具体计算公式如下：
Lp1=-Pprior*logp1
Lp2=-Pprior*logp2
其中，Lp1为第一神经网络模型对应的交叉熵，Lp2为第二神经网络模型对应的交叉熵。
那么进一步可以计算得到第三子损失参数，其计算公式如下：
Lp=Lp1+Lp2
其中，Lp为第三子损失参数，或者称为先验损失。添加先验损失的目的是希望随着对模型训练的进行，两个模型的输出概率值分布能够不断接近人工标签值的分布。
如图4所示,为本申请实施例提供的样本图像损失参数计算框架示意图。样本图像10经第一神经网络模型21检测输出第一花屏概率值p1,样本图像10经第二神经网络模型22检测输出第二花屏概率值p2。然后,基于第一花屏概率值p1计算得到第一分类损失和第一先验损失,基于第二花屏概率值p2计算得到第二分类损失和第二先验损失,基于第一花屏概率值p1与第二花屏概率值p2计算得到交叉正则损失,最后再对第一分类损失、第一先验损失、第二分类损失、第二先验损失以及交叉正则损失进行加权求和,求得样本图像对应的损失参数。
步骤206,计算机设备根据第一子损失参数、第二子损失参数以及第三子损失参数计算每个样本图像对应的损失参数。
其中,在计算得到每个样本图像对应的分类损失、交叉正则损失以及先验损失之后,可以对三者进行加权求和,得到每个样本图像对应的损失参数。具体计算公式如下:
L=Lc+αLreg+βLp
其中,α为控制交叉正则损失的权重系数,β为控制先验损失的权重系数。然后,采用上述损失参数作为模型的端到端训练损失参数,指导模型的训练过程。
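上述端到端训练损失的加权求和可以示意如下，其中 alpha、beta 的默认取值仅为示意性假设，本申请未给出具体数值：

```python
def total_loss(lc, l_reg, l_p, alpha=0.1, beta=0.1):
    """端到端训练损失：L = Lc + α*Lreg + β*Lp。
    alpha 控制交叉正则损失的权重，beta 控制先验损失的权重，
    默认值仅为示意性假设。"""
    return lc + alpha * l_reg + beta * l_p
```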
步骤207,计算机设备根据每个样本图像的损失参数确定目标样本图像。
其中,在计算得到每个样本图像对应的损失参数后,需要根据样本图像的损失参数对样本图像进行筛选,以剔除噪声较大(标签值不够准确)的样本。一般情况下,样本输出的损失参数值越大,则样本噪声越大,因此需要剔除掉部分损失参数较大的样本图像,采用损失参数值较小的目标样本图像对模型进行训练。
其中,目标样本图像的占比可以用如下公式进行计算:
R(t)=1-τ*min(t/Tk,1)
其中，R(t)为目标样本图像在多个样本图像中的占比，t为当前训练的迭代次数，Tk为超参数，用于控制当前训练迭代次数t下对应的筛选率，τ为一个预设的筛选率。
根据R(t)的计算公式可知,当迭代训练初期,t较小时,R(t)值较大,会采用更多的样本图像对两个神经网络模型进行训练,对噪声样本的筛选比例较小。当迭代训练进入到后期,当t逐渐变大时,R(t)逐渐变小,即目标样本的数量也逐渐减少,对噪声样本的筛选比例变大,从而会剔除掉大部分的噪声样本图像。
在计算得到目标图像在多个样本图像中的占比R(t)之后,根据该占比从多个样本图像中选择损失参数最小的R(t)占比的样本图像作为目标样本图像。
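按照上述描述的行为（迭代初期保留绝大部分样本，后期剔除约τ比例的噪声样本），R(t) 的一种常见实现方式（Co-teaching 等协同训练方法中常用的调度形式，此处作为本文的示意性假设）如下：

```python
def keep_ratio(t, tau, t_k):
    """目标样本图像占比 R(t) = 1 - tau * min(t / t_k, 1)：
    迭代初期 R(t) 接近 1，保留大部分样本对模型进行训练；
    t 增大后 R(t) 降至 1 - tau，剔除约 tau 比例的噪声样本。"""
    return 1.0 - tau * min(t / t_k, 1.0)
```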
步骤208,计算机设备采用目标样本图像对两个神经网络模型进行训练,并采用训练后的两个神经网络模型对两个神经网络模型进行更新。
其中,在确定了用于训练的目标样本图像后,采用目标样本图像以及其对应的标签值对两个神经网络模型进行训练,实现对两个神经网络模型的模型参数进行更新,得到更新后的两个神经网络模型。然后再采用更新后的两个神经网络模型进行进一步的训练和更新。
步骤209,计算机设备判断迭代训练次数是否达到预设次数。
其中,在每次对两个神经网络模型进行更新之后,计算机设备都需要对迭代训练次数进行判断,以确定是否达到预设迭代训练次数。若未达到则返回步骤202,采用更新后的两个神经网络模型重新对每个样本图像进行花屏检测,得到新的花屏概率值,再进一步根据新的花屏概率值计算每个样本图像的新的损失参数,然后再重新确定新的目标样本图像,采用新的目标样本图像对更新后的两个神经网络模型进行再一次的训练和更新。
步骤210,计算机设备确定更新得到的两个神经网络模型为训练后的两个神经网络模型。
若迭代训练次数达到了预设次数,则确定最终得到的两个神经网络模型为最终的训练后的神经网络模型。
步骤211,计算机设备采用训练后的两个神经网络模型对待检测图像进行花屏检测,得到花屏检测结果。
其中,在确定了训练后的两个神经网络模型后,可以采用训练后的两个神经网络模型对待检测图像进行花屏检测。具体地,可以从两个训练后的神经网络模型中确定检测结果更好的目标神经网络模型对待检测图像进行检测。两个训练后的神经网络模型的检测效果可以采用已经标注准确标签的图像对其检测效果进行验证。
采用目标神经网络模型对待检测图像进行花屏检测，输出待检测图像的花屏概率值，然后进一步根据该花屏概率值确定待检测图像的花屏二值结果，即是花屏图像还是不是花屏图像。具体地，可以根据检测输出的花屏概率值与预设概率阈值的比对结果确定待检测图像的花屏二值结果。例如当目标神经网络模型对待检测图像进行花屏检测输出的花屏概率值为0.9，而预设的花屏概率阈值为0.95，由于0.9小于0.95，则确定该待检测图像不是花屏图像。
根据上述描述可知,本申请实施例提供的图像检测方法,通过获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。以此,通过采用多模型协同对训练样本中的噪声样本进行筛选,提高了模型训练的效果,进而进一步提高了图像检测的准确率。
为了更好地实施以上方法,本发明实施例还提供一种图像检测装置,该图像检测装置可以集成在终端中。
例如,如图5所示,为本申请实施例提供的图像检测装置的结构示意图,该图像检测装置可以包括获取单元301、输入单元302、计算单元303、选取单元304、训练单元305以及检测单元306,如下:
获取单元301,用于获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;
输入单元302,用于将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;
计算单元303,用于根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;
选取单元304,用于根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
训练单元305,用于返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
检测单元306,用于采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。
在一些实施例中,计算单元,包括:
第一计算子单元，用于计算每个样本图像对应的模糊概率值集合中每个模糊概率值与对应的标签信息之间的第一交叉熵；
第二求和子单元，用于将计算得到的第一交叉熵进行求和，得到每个样本图像对应的第一子损失参数；
确定子单元,用于根据每个样本图像对应的第一子损失参数确定每个样本图像对应的损失参数。
在一些实施例中,本申请提供的图像检测装置还包括:
第二计算子单元,用于计算每个样本图像对应的模糊概率值集合中每两个模糊概率值之间的相对熵:
第二求和子单元,用于将相对熵进行求和,得到每个样本图像对应的第二子损失参数;
确定子单元,还用于:
对每个样本图像对应的第一子损失参数与第二子损失参数进行加权求和,得到每个样本图像对应的损失参数。
在一些实施例中,本申请提供的图像检测装置还包括:
第一获取子单元,用于获取样本数据中标签信息的概率分布信息,并基于概率分布信息生成对应的特征向量;
第三计算子单元,用于计算特征向量与每个样本图像对应的模糊概率值集合之间的第二交叉熵:
第三求和子单元,用于对计算得到的第二交叉熵进行求和,得到每个样本图像对应的第三子损失参数;
确定子单元,还用于:
对每个样本图像对应的第一子损失参数、第二子损失参数以及第三子损失参数进行加权求和,得到每个样本图像对应的损失参数。
在一些实施例中，所述选取单元，包括：
第二获取子单元,用于获取对至少两个神经网络模型进行迭代训练的训练次数;
第四计算子单元,用于根据迭代训练的训练次数计算目标样本图像的目标数量;
选取子单元,用于按照损失参数由小至大的顺序选取目标数量个样本图像,得到目标样本图像。
在一些实施例中,第四计算子单元,包括:
获取模块,用于获取预设的筛选率,筛选率用于控制对多个样本图像进行筛选;
第一计算模块,用于根据筛选率与迭代训练的训练次数计算目标样本图像在多个样本图像中的占比;
第二计算模块,用于根据占比以及多个样本图像的数量计算得到目标样本图像的目标数量。
在一些实施例中,检测单元,包括:
第一输入子单元,用于将待检测图像输入至训练后的至少两个神经网络模型进行模糊检测,得到至少两个模糊概率值;
第五计算子单元,用于计算至少两个模糊概率值的平均值,得到待检测图像对应的模糊概率。
在一些实施例中,检测单元,包括:
第三获取子单元,用于获取训练后的至少两个神经网络模型的预测准确率,得到至少两个预测准确率;
排序子单元,用于将至少两个预测准确率按照由高至低的顺序进行排序,并确定预测准确率最高的神经网络模型为目标神经网络模型;
检测子单元,用于将待检测图像输入至目标神经网络模型进行模糊检测,得到待检测图像对应的模糊概率。
具体实施时,以上各个单元可以作为独立的实体来实现,也可以进行任意组合,作为同一或若干个实体来实现,以上各个单元的具体实施可参见前面的方法实施例,在此不再赘述。
根据上述描述可知,本申请实施例提供的图像检测方法,通过获取单元301获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;输入单元302将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;计算单元303根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;选取单元304根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;训练单元305返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;检测单元306采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。以此,通过采用多模型协同对训练样本中的噪声样本进行筛选,提高了模型训练的效果,进而进一步提高了图像检测的准确率。
本申请实施例还提供一种计算机设备,该计算机设备可以为终端,如图6所示,该终端可以包括射频(RF,Radio Frequency)电路401、包括有一个或一个以上计算机可读存储介质的存储器402、输入组件403、显示单元404、传感器405、音频电路406、无线保真(WiFi,Wireless Fidelity)模块407、包括有一个或者一个以上处理核心的处理器408、以及电源409等部件。本领域技术人员可以理解,图6中示出的终端结构并不构成对终端的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
存储器402可用于存储软件程序以及模块,处理器408通过运行存储在存储器402的软件程序以及模块,从而执行各种功能应用以及信息互动。
在本实施例中,终端中的处理器408会按照如下的指令,将一个或一个以上的应用程序的进程对应的可执行文件加载到存储器402中,并由处理器408来运行存储在存储器402中的应用程序,从而实现各种功能:
获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的 标签信息;将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。
应当说明的是,本申请实施例提供的计算机设备与上文实施例中的方法属于同一构思,以上各个操作的具体实施可参见前面的实施例,在此不作赘述。
本申请实施例还提供一种计算机设备,该计算机设备可以为服务器,如图7所示,为本申请提供的计算机设备的结构示意图。具体来讲:
该计算机设备可以包括一个或者一个以上处理核心的处理单元501、一个或一个以上存储介质的存储单元502、电源模块503和输入模块504等部件。本领域技术人员可以理解,图7中示出的计算机设备结构并不构成对计算机设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。其中:
处理单元501是该计算机设备的控制中心,利用各种接口和线路连接整个计算机设备的各个部分,通过运行或执行存储在存储单元502内的软件程序和/或模块,以及调用存储在存储单元502内的数据,执行计算机设备的各种功能和处理数据,从而对计算机设备进行整体监控。可选的,处理单元501可包括一个或多个处理核心;优选的,处理单元501可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理单元501中。
存储单元502可用于存储软件程序以及模块,处理单元501通过运行存储在存储单元502的软件程序以及模块,从而执行各种功能应用以及数据处理。存储单元502可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能以及网页访问等)等;存储数据区可存储根据计算机设备的使用所创建的数据等。此外,存储单元502可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。相应地,存储单元502还可以包括存储器控制器,以提供处理单元501对存储单元502的访问。
计算机设备还包括给各个部件供电的电源模块503,优选的,电源模块503可以通过电源管理系统与处理单元501逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。电源模块503还可以包括一个或一个以上的直流或交流电源、再充电系统、电源故障检测电路、电源转换器或者逆变器、电源状态 指示器等任意组件。
该计算机设备还可包括输入模块504,该输入模块504可用于接收输入的数字或字符信息,以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入。
尽管未示出,计算机设备还可以包括显示单元等,在此不再赘述。具体在本实施例中,计算机设备中的处理单元501会按照如下的指令,将一个或一个以上的应用程序的进程对应的可执行文件加载到存储单元502中,并由处理单元501来运行存储在存储单元502中的应用程序,从而实现各种功能,如下:
获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;采用训练后的至少两个神经网络模型对待检测图像进行模糊检测,得到模糊检测结果。
应当说明的是,本申请实施例提供的计算机设备与上文实施例中的方法属于同一构思,以上各个操作的具体实施可参见前面的实施例,在此不作赘述。
本领域普通技术人员可以理解,上述实施例的各种方法中的全部或部分步骤可以通过指令来完成,或通过指令控制相关的硬件来完成,该指令可以存储于一计算机可读存储介质中,并由处理器进行加载和执行。
为此,本发明实施例提供一种计算机可读存储介质,其中存储有多条指令,该指令能够被处理器进行加载,以执行本发明实施例所提供的任一种方法中的步骤。例如,该指令可以执行如下步骤:
获取训练样本数据,训练样本数据包括多个样本图像以及每个样本图像对应的标签信息;将每一样本图像分别输入到至少两个神经网络模型,得到每一样本图像在至少两个神经网络模型下输出的模糊概率值集合;根据每个样本图像的模糊概率值集合与每个样本图像对应的标签信息计算得到每个样本图像对应的损失参数;根据每个样本图像对应的损失参数的分布从多个样本图像中选取目标样本图像,并基于目标样本图像对至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;返回执行将多个样本图像分别输入到更新后的至少两个神经网络模型,得到每一样本图像在更新后的至少两个神经网络模型下输出的模糊概率值集合和对应更新后的目标图像并进行迭代训练,直至至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;采用训练后的至少两个神经网络模型对待检测图像进行 模糊检测,得到模糊检测结果。以上各个操作的具体实施可参见前面的实施例,在此不再赘述。
其中,该计算机可读存储介质可以包括:只读存储器(ROM,Read Only Memory)、随机存取记忆体(RAM,Random Access Memory)、磁盘或光盘等。
由于该计算机可读存储介质中所存储的指令,可以执行本发明实施例所提供的任一种方法中的步骤,因此,可以实现本发明实施例所提供的任一种方法所能实现的有益效果,详见前面的实施例,在此不再赘述。
其中,根据本申请的一个方面,提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在存储介质中。计算机设备的处理器从存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述图2或图3的各种可选实现方式中提供的方法。
以上对本发明实施例所提供的一种图像检测方法、装置、计算机可读存储介质及计算机设备进行了详细介绍,本文中应用了具体个例对本发明的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本发明的方法及其核心思想;同时,对于本领域的技术人员,依据本发明的思想,在具体实施方式及应用范围上均会有改变之处,综上,本说明书内容不应理解为对本发明的限制。

Claims (20)

  1. 一种图像检测方法,由计算机设备执行,所述方法包括:
    步骤a:针对第一组多个样本图像中的每一样本图像,
    将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合,所述模糊概率值集合包括所述至少两个神经网络模型中每个神经网络模型输出的模糊概率值;
    根据所述模糊概率值集合与所述样本图像的预设标签信息计算得到所述样本图像的损失参数;
    步骤b:根据所述多个样本图像的损失参数的分布从所述多个样本图像中选取目标样本图像,并基于所述目标样本图像对所述至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
    对更新后的所述至少两个神经网络模型利用至少两个第二组多个样本图像依次执行上述步骤a和步骤b,直至所述至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
    提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果。
  2. 根据权利要求1所述的方法,所述根据样本图像的模糊概率值集合与预设的所述样本图像对应的标签信息计算得到每个样本图像的损失参数,包括:
    计算所述样本图像的模糊概率值集合中每个模糊概率值与所述预设标签信息之间的第一交叉熵:
    将计算得到的第一交叉熵进行求和,得到所述样本图像的第一子损失参数;
    根据所述样本图像的第一子损失参数确定所述样本图像对应的损失参数。
  3. 根据权利要求2所述的方法,进一步包括:
    计算所述样本图像的模糊概率值集合中每两个模糊概率值之间的相对熵:
    将所述相对熵进行求和,得到所述样本图像对应的第二子损失参数;
    所述根据所述样本图像的第一子损失参数确定所述样本图像的损失参数,包括:
    对所述样本图像的第一子损失参数与第二子损失参数进行加权求和,得到所述样本图像对应的损失参数。
  4. 根据权利要求2所述的方法,进一步包括:
    获取所述多个样本图像的预设标签信息的概率分布信息,并基于所述概率分布信息生成对应的特征向量;
    计算所述特征向量与所述样本图像对应的模糊概率值集合之间的第二交叉熵:
    对计算得到的第二交叉熵进行求和,得到所述样本图像的第三子损失参数;
    所述根据所述样本图像的第一子损失参数确定所述样本图像的损失参数,包括:
    对所述样本图像对应的第一子损失参数、第三子损失参数进行加权求和,得到所述样本图像对应的损失参数。
  5. 根据权利要求3所述的方法,进一步包括:
    获取所述多个样本图像的预设标签信息的概率分布信息,并基于所述概率分布信息生成对应的特征向量;
    计算所述特征向量与所述样本图像对应的模糊概率值集合之间的第二交叉熵:
    对计算得到的第二交叉熵进行求和,得到所述样本图像的第三子损失参数;
    所述对所述样本图像对应的第一子损失参数与第二子损失参数进行加权求和,得到所述样本图像对应的损失参数,包括:
    对所述样本图像对应的第一子损失参数、第二子损失参数以及第三子损失参数进行加权求和,得到所述样本图像对应的损失参数。
  6. 根据权利要求1所述的方法,所述根据所述多个样本图像对应的损失参数的分布从所述多个样本图像中选取目标样本图像,包括:
    获取对所述至少两个神经网络模型进行迭代训练的训练次数;
    根据所述迭代训练的训练次数计算目标样本图像的目标数量;
    按照损失参数由小至大的顺序选取所述目标数量个样本图像,得到目标样本图像。
  7. 根据权利要求6所述的方法,所述根据所述迭代训练的训练次数计算目标样本图像的目标数量,包括:
    获取预设的筛选率,所述筛选率用于控制对所述多个样本图像进行筛选;
    根据所述筛选率与所述迭代训练的训练次数计算目标样本图像在所述多个样本图像中的占比;
    根据所述占比以及所述多个样本图像的数量计算得到目标样本图像的目标数量。
  8. 根据权利要求1所述的方法,所述提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果,包括:
    提供所述训练后的至少两个神经网络模型用于对待检测图像分别进行模糊检测并得到至少两个模糊概率值,将所述至少两个模糊概率值的平均值作为所述待检测图像对应的模糊概率。
  9. 根据权利要求1所述的方法,提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果,包括:
    获取所述训练后的至少两个神经网络模型的预测准确率,得到至少两个预测准确率;
    将所述至少两个预测准确率按照由高至低的顺序进行排序,提供所述至少两个神经网络模型中预测准确率最高的神经网络模型用于对待检测图像进行模糊检测以得到所述待检测图像对应的模糊概率。
  10. 一种图像检测装置,所述装置包括:
    输入单元,用于针对第一组多个样本图像中的每一样本图像,将所述样本图像分别输入到至少两个神经网络模型,得到所述样本图像的模糊概率值集合,所述模糊概率值集合包括所述至少两个神经网络模型中每个神经网络模型输出的模糊概率 值;
    计算单元,用于根据所述模糊概率值集合与所述样本图像的预设标签信息计算得到所述样本图像的损失参数;
    选取单元,用于根据所述多个样本图像的损失参数的分布从所述多个样本图像中选取目标样本图像,并基于所述目标样本图像对所述至少两个神经网络模型进行更新,得到更新后的至少两个神经网络模型;
    训练单元,用于对更新后的至少两个神经网络模型依次利用至少两个第二组多个样本图像使用所述输入单元、所述计算单元和所述选取单元进行迭代训练,直至所述至少两个神经网络模型收敛,得到训练后的至少两个神经网络模型;
    提供单元,用于提供所述训练后的至少两个神经网络模型中的至少一个神经网络模型用于对待检测图像进行模糊检测以得到模糊检测结果。
  11. 根据权利要求10所述的装置,所述计算单元,包括:
    第一计算子单元,用于计算所述样本图像的模糊概率值集合中每个模糊概率值与所述预设标签信息之间的第一交叉熵:
    第二求和子单元,用于将计算得到的第一交叉熵进行求和,得到所述样本图像对应的第一子损失参数;
    确定子单元,用于根据所述样本图像的第一子损失参数确定所述样本图像对应的损失参数。
  12. 根据权利要求11所述的装置,所述装置还包括:
    第二计算子单元,用于计算所述样本图像的模糊概率值集合中每两个模糊概率值之间的相对熵:
    第二求和子单元,用于将所述相对熵进行求和,得到所述样本图像对应的第二子损失参数;
    所述确定子单元,还用于:
    对所述样本图像对应的第一子损失参数与第二子损失参数进行加权求和,得到所述样本图像对应的损失参数。
  13. 根据权利要求11所述的装置,所述装置还包括:
    第一获取子单元,用于获取所述多个样本图像的预设标签信息的概率分布信息,并基于所述概率分布信息生成对应的特征向量;
    第三计算子单元,用于计算所述特征向量与所述样本图像对应的模糊概率值集合之间的第二交叉熵:
    第三求和子单元,用于对计算得到的第二交叉熵进行求和,得到所述样本图像对应的第三子损失参数;
    所述确定子单元,还用于:
    对所述每个样本图像对应的第一子损失参数、第三子损失参数进行加权求和,得到每个样本图像对应的损失参数。
  14. 根据权利要求11所述的装置,所述装置还包括:
    第一获取子单元,用于获取所述多个样本图像的预设标签信息的概率分布信息,并基于所述概率分布信息生成对应的特征向量;
    第三计算子单元,用于计算所述特征向量与所述样本图像对应的模糊概率值集合之间的第二交叉熵:
    第三求和子单元,用于对计算得到的第二交叉熵进行求和,得到所述样本图像对应的第三子损失参数;
    所述确定子单元,还用于:
    对所述样本图像对应的第一子损失参数、第二子损失参数以及第三子损失参数进行加权求和,得到所述样本图像对应的损失参数。
  15. 根据权利要求10所述的装置，所述选取单元，包括：
    第二获取子单元,用于获取对所述至少两个神经网络模型进行迭代训练的训练次数;
    第四计算子单元,用于根据所述迭代训练的训练次数计算目标样本图像的目标数量;
    选取子单元,用于按照损失参数由小至大的顺序选取所述目标数量个样本图像,得到目标样本图像。
  16. 根据权利要求15所述的装置,所述第四计算子单元,包括:
    获取模块,用于获取预设的筛选率,所述筛选率用于控制对所述多个样本图像进行筛选;
    第一计算模块,用于根据所述筛选率与所述迭代训练的训练次数计算目标样本图像在所述多个样本图像中的占比;
    第二计算模块,用于根据所述占比以及所述多个样本图像的数量计算得到目标样本图像的目标数量。
  17. 根据权利要求10所述的装置,所述检测单元,包括:
    第三获取子单元,用于获取所述训练后的至少两个神经网络模型的预测准确率,得到至少两个预测准确率;
    排序子单元,用于将所述至少两个预测准确率按照由高至低的顺序进行排序,并确定预测准确率最高的神经网络模型为目标神经网络模型;
    提供子单元,用于提供所述目标神经网络模型用于对待检测图像进行模糊检测以得到所述待检测图像对应的模糊概率。
  18. 一种计算机可读存储介质,所述计算机可读存储介质存储有多条指令,所述指令适于处理器进行加载,以执行权利要求1至9中任一项所述的图像检测方法的步骤。
  19. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可以在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现权利要求1至9中任一项所述的图像检测方法的步骤。
  20. 一种计算机程序,所述计算机程序包括计算机指令,所述计算机指令存储在存储介质中,计算机设备的处理器从所述存储介质读取所述计算机指令,所述处理器执行所述计算机指令,使得所述计算机设备执行权利要求1至9中任一项所述的图像检测方法的步骤。
PCT/CN2022/098383 2021-07-16 2022-06-13 图像检测方法、装置、计算机可读存储介质及计算机设备 WO2023284465A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/302,265 US20230259739A1 (en) 2021-07-16 2023-04-18 Image detection method and apparatus, computer-readable storage medium, and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110804450.6 2021-07-16
CN202110804450.6A CN113284142B (zh) 2021-07-16 2021-07-16 图像检测方法、装置、计算机可读存储介质及计算机设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/302,265 Continuation US20230259739A1 (en) 2021-07-16 2023-04-18 Image detection method and apparatus, computer-readable storage medium, and computer device

Publications (1)

Publication Number Publication Date
WO2023284465A1 true WO2023284465A1 (zh) 2023-01-19

Family

ID=77286657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098383 WO2023284465A1 (zh) 2021-07-16 2022-06-13 图像检测方法、装置、计算机可读存储介质及计算机设备

Country Status (3)

Country Link
US (1) US20230259739A1 (zh)
CN (1) CN113284142B (zh)
WO (1) WO2023284465A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342571A (zh) * 2023-03-27 2023-06-27 中吉创新技术(深圳)有限公司 通风系统控制箱的状态检测方法、装置及存储介质
CN117218515A (zh) * 2023-09-19 2023-12-12 人民网股份有限公司 一种目标检测方法、装置、计算设备和存储介质

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113284142B (zh) * 2021-07-16 2021-10-29 腾讯科技(深圳)有限公司 图像检测方法、装置、计算机可读存储介质及计算机设备
CN115100739B (zh) * 2022-06-09 2023-03-28 厦门国际银行股份有限公司 人机行为检测方法、系统、终端设备及存储介质
CN115409159A (zh) * 2022-09-21 2022-11-29 北京京东方技术开发有限公司 对象操作方法、装置、计算机设备以及计算机存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180239949A1 (en) * 2015-02-23 2018-08-23 Cellanyx Diagnostics, Llc Cell imaging and analysis to differentiate clinically relevant sub-populations of cells
CN110490306A (zh) * 2019-08-22 2019-11-22 北京迈格威科技有限公司 Neural network training and object recognition method, apparatus, and electronic device
CN112307860A (zh) * 2019-10-10 2021-02-02 北京沃东天骏信息技术有限公司 Image recognition model training method and apparatus, and image recognition method and apparatus
CN113284142A (zh) * 2021-07-16 2021-08-20 腾讯科技(深圳)有限公司 Image detection method and apparatus, computer-readable storage medium, and computer device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485192B (zh) * 2015-09-02 2019-12-06 Training method and apparatus for a neural network for image recognition
CN107463953B (zh) * 2017-07-21 2019-11-19 上海媒智科技有限公司 Quality-embedding-based image classification method and system under noisy labels
CN110348428B (zh) * 2017-11-01 2023-03-24 腾讯科技(深圳)有限公司 Fundus image classification method and apparatus, and computer-readable storage medium
CN109697460B (zh) * 2018-12-05 2021-06-29 华中科技大学 Object detection model training method and target object detection method
CN110070184A (zh) * 2019-03-25 2019-07-30 北京理工大学 Data sampling method fusing sample loss and optimization speed constraints
CN110909815B (zh) * 2019-11-29 2022-08-12 深圳市商汤科技有限公司 Neural network training and image processing method, apparatus, and electronic device
CN111950647A (zh) * 2020-08-20 2020-11-17 连尚(新昌)网络科技有限公司 Classification model training method and device
CN112906730B (zh) * 2020-08-27 2023-11-28 腾讯科技(深圳)有限公司 Information processing method and apparatus, and computer-readable storage medium
CN112149717B (zh) * 2020-09-03 2022-12-02 清华大学 Confidence-weighting-based graph neural network training method and apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342571A (zh) * 2023-03-27 2023-06-27 中吉创新技术(深圳)有限公司 State detection method and apparatus for ventilation system control box, and storage medium
CN116342571B (zh) * 2023-03-27 2023-12-22 中吉创新技术(深圳)有限公司 State detection method and apparatus for ventilation system control box, and storage medium
CN117218515A (zh) * 2023-09-19 2023-12-12 人民网股份有限公司 Target detection method and apparatus, computing device, and storage medium
CN117218515B (zh) * 2023-09-19 2024-05-03 人民网股份有限公司 Target detection method and apparatus, computing device, and storage medium

Also Published As

Publication number Publication date
CN113284142B (zh) 2021-10-29
US20230259739A1 (en) 2023-08-17
CN113284142A (zh) 2021-08-20

Similar Documents

Publication Publication Date Title
WO2023284465A1 (zh) Image detection method and apparatus, computer-readable storage medium, and computer device
Oh et al. Crowd counting with decomposed uncertainty
Fu et al. Fast crowd density estimation with convolutional neural networks
US10535141B2 (en) Differentiable jaccard loss approximation for training an artificial neural network
CN109754078A (zh) Method for optimizing a neural network
CN112116090B (zh) Neural network architecture search method and apparatus, computer device, and storage medium
CN111506773B (zh) Video deduplication method based on an unsupervised deep Siamese network
EP4177792A1 (en) Ai model updating method and apparatus, computing device and storage medium
CN110660478A (zh) Transfer-learning-based cancer image prediction and discrimination method and system
Zhu et al. Portal nodes screening for large scale social networks
CN109034218B (zh) Model training method, apparatus, device, and storage medium
CN115187772A (zh) Training of a target detection network, and target detection method, apparatus, and device
Bui et al. Structured sparsity of convolutional neural networks via nonconvex sparse group regularization
CN112420125A (zh) Molecular property prediction method and apparatus, smart device, and terminal
Sultana et al. Unsupervised adversarial learning for dynamic background modeling
Szemenyei et al. Real-time scene understanding using deep neural networks for RoboCup SPL
CN111144567A (zh) Training method and apparatus for a neural network model
CN114972222A (zh) Cell information statistics method, apparatus, device, and computer-readable storage medium
US20210319269A1 (en) Apparatus for determining a classifier for identifying objects in an image, an apparatus for identifying objects in an image and corresponding methods
CN113609337A (zh) Pre-training method, training method, apparatus, device, and medium for a graph neural network
CN114882315B (zh) Sample generation method, model training method, apparatus, device, and medium
CN116468479A (zh) Method for determining page quality evaluation dimensions, and page quality evaluation method and apparatus
CN110377741A (zh) Text classification method, smart terminal, and computer-readable storage medium
CN115019342A (zh) Endangered animal target detection method based on class-relation reasoning
CN114693997A (zh) Transfer-learning-based image caption generation method, apparatus, device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22841105

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE