CN113962312A - Artificial intelligence-based ambiguity detection method, system, equipment and medium - Google Patents

Artificial intelligence-based ambiguity detection method, system, equipment and medium

Info

Publication number
CN113962312A
Authority
CN
China
Prior art keywords
ambiguity
model
picture
detected
vgg19
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111255446.5A
Other languages
Chinese (zh)
Inventor
蒋翠平 (Jiang Cuiping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202111255446.5A priority Critical patent/CN113962312A/en
Publication of CN113962312A publication Critical patent/CN113962312A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to artificial intelligence and provides an artificial-intelligence-based ambiguity detection method, system, device and medium, wherein the method comprises the following steps: inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range; inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected; and when the ambiguity value is outside the ambiguity threshold range, raising an alarm for the picture to be detected. According to the invention, the trained VGG19 model can derive the ambiguity threshold range directly from pictures; the ambiguity detection model then measures the ambiguity of an image, the detected value is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range. This replaces the original manual classification and inspection and thus improves the efficiency of picture ambiguity recognition.

Description

Ambiguity detection method, system, device and medium based on artificial intelligence
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, a system, a device, and a medium for ambiguity detection based on artificial intelligence.
Background
At present, in the imported frozen product supervision system, the data declared by a cargo owner is submitted to the back office for approval, and a large amount of it is returned because pictures are unclear; clear pictures must then be uploaded again before resubmission. The pictures include a customs declaration form, a cargo manifest, quarantine inspection certificates, disinfection certificates, nucleic acid test certificates and the like; at most 9 pictures of each type can be uploaded, for at most 45 pictures in a whole declaration. If one or more of the 45 pictures is unclear, they must be checked manually one by one, so the approval stage consumes a great deal of time. Because approval staff are limited and time is short, approval is slow while the cargo owner is in a hurry: frozen food must be pushed to market quickly, and delayed sale increases cost. The traditional mode of manually checking many pictures is thus time-consuming and labor-intensive, slows approval, delays frozen food's time to market, and easily causes dissatisfaction among cargo owners.
Disclosure of Invention
The application provides a method, a system, a device and a medium for ambiguity detection based on artificial intelligence, to solve the problems that the traditional manual checking of many pictures is time-consuming, labor-intensive and slow in approval.
In order to solve the technical problem, the application adopts one technical scheme: an artificial-intelligence-based ambiguity detection method is provided, comprising the following steps: inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and when the ambiguity value is outside the ambiguity threshold range, raising an alarm for the picture to be detected.
As a further improvement of the present application, before inputting the historical picture data into the trained VGG19 model and outputting the ambiguity threshold range, the method further comprises:
building a VGG19 network framework and defining the loss function, initial learning rate and number of iterations of the VGG19 network framework;
converting the format of the training sample pictures and the corresponding data set, and inputting the format-converted data set into the VGG19 network framework for training;
and iteratively training the VGG19 network framework based on its loss function, initial learning rate and number of iterations to obtain a trained VGG19 model.
As a further improvement of the present application, before inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected, the method further comprises:
adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of an ambiguity detection model, the branch convolution layers being used to fuse the multi-level feature maps in the base network;
and inputting a plurality of labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model.
As a further improvement of the present application, inputting a plurality of the labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model comprises:
obtaining a loss value of a network structure of the ambiguity detection model by using the weighted cross entropy loss as a main loss function and the Ring loss as an auxiliary loss function;
optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model;
and setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
As a further improvement of the application, before adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of an ambiguity detection model (the branch convolution layers being used to fuse the multi-level feature maps in the base network), the method further comprises: generating general low-order filter models;
selecting the general low-order filter models required to form a target filter model;
and combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model.
As a further improvement of the present application, after the selected general low-order filter models are combined in series by connecting their output signals in series, the method further comprises:
and verifying the formed target filter model.
As a further improvement of the present application, raising an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range comprises: judging whether the ambiguity value is within the ambiguity threshold range;
if so, confirming that the picture to be detected is clear;
if not, confirming that the picture to be detected is blurred, and outputting alarm information.
In order to solve the above technical problem, another technical solution adopted by the present application is: an artificial-intelligence-based ambiguity detection system is provided, comprising: a construction module, used for inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
a detection module, used for inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and a judgment module, used for raising an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range.
In order to solve the above technical problem, the present application adopts another technical solution: a computer device is provided, comprising a processor and a memory coupled to the processor, the memory storing program instructions which, when executed by the processor, cause the processor to perform the steps of any one of the artificial-intelligence-based ambiguity detection methods above.
In order to solve the above technical problem, the present application adopts another technical solution that: there is provided a storage medium storing a program file capable of implementing any one of the above-described artificial intelligence-based ambiguity detection methods.
The beneficial effects of this application are as follows. In the ambiguity detection method based on artificial intelligence, a sample image of a frozen product scene is first captured by a camera or other device to obtain a data set; the built VGG19 network framework is trained with the data set to obtain a VGG19 model and the ambiguity threshold range; a frozen product image captured in real time is then input into the trained VGG19 model, and the image's ambiguity value is detected by the ambiguity detection model. According to the invention, the trained VGG19 model can derive the ambiguity threshold range directly from pictures; the ambiguity detection model measures the ambiguity of an image, the detected value is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range. This replaces the original manual classification and inspection and thus improves the efficiency of picture ambiguity recognition.
Drawings
FIG. 1 is a schematic flow chart of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of a system of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and "third" in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any indication of the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. All directional indications (such as up, down, left, right, front, and rear … …) in the embodiments of the present application are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indication is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The embodiments of the present application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Fig. 1 is a schematic flow chart of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention. It should be noted that the method of the present application is not limited to the flow sequence shown in fig. 1 if the results are substantially the same. As shown in fig. 1, the method includes:
and S1, inputting the historical picture data into a trained VGG19 model, and outputting to obtain an ambiguity threshold range.
Specifically, the historical pictures include target pictures with an explicit scene and interference pictures without an explicit scene; the preset classification labels comprise a plurality of scene labels corresponding to the scenes in the target pictures and a non-scene label corresponding to the interference pictures. For example, when the explicit scene is a frozen product scene, the scene labels may be a customs declaration form, a cargo manifest, a quarantine inspection certificate, a disinfection certificate, a nucleic acid test certificate and the like, and the non-scene label is "other". The frozen product scene data and the "other" picture data may be collected separately in a variety of ways, including data gathered with crawler technology, related data accumulated by the platform, and manually supplemented annotated data. The historical picture data is input into the trained VGG19 model, which outputs the ambiguity threshold range.
Further, before step S1, the method further includes:
and S101, building a VGG19 network framework and defining a loss function, an initial learning rate and iteration times of the VGG19 network framework.
It should be noted that VGG19 contains 19 weight layers: 16 convolutional layers and 3 fully-connected layers.
VGG was proposed by the Visual Geometry Group at Oxford. The network was related work for ILSVRC 2014; its main contribution was to demonstrate that increasing the depth of a network can, to some extent, affect its final performance. VGG has two common configurations, VGG16 and VGG19, which differ only in network depth. In VGG, stacked 3 × 3 convolution kernels replace larger ones: three 3 × 3 kernels replace a 7 × 7 kernel and two 3 × 3 kernels replace a 5 × 5 kernel. With the receptive field kept the same, this increases network depth and improves the effect of the neural network to some extent. For example, a layer-by-layer stack of three 3 × 3 convolution kernels with stride 1 has a receptive field of size 7 (in essence, three successive 3 × 3 convolutions correspond to one 7 × 7 convolution), with a total parameter count of 3 × (9 × C²) = 27C², whereas directly using a 7 × 7 kernel requires 49C² parameters, where C is the number of input and output channels. Obviously 27C² is less than 49C², i.e. the parameters are reduced; moreover, 3 × 3 convolution kernels help preserve image properties better.
Specifically, a VGG19 model is built with 16 convolutional layers and 3 fully-connected layers, specifically: an input layer; two 64-channel conv3 convolutional layers (conv1_1, conv1_2); a pool_1 pooling layer; two 128-channel conv3 convolutional layers (conv2_1, conv2_2); a pool_2 pooling layer; four 256-channel conv3 convolutional layers (conv3_1 to conv3_4); a pool_3 pooling layer; four 512-channel conv3 convolutional layers (conv4_1 to conv4_4); a pool_4 pooling layer; four 512-channel conv3 convolutional layers (conv5_1 to conv5_4); a pool_5 pooling layer; and fully-connected layers FC6, FC7 and FC8. Here conv denotes a convolutional layer, FC denotes a fully-connected layer, conv3 denotes a convolutional layer using 3 × 3 filters, conv3-64 denotes a depth of 64, and maxpool denotes max pooling.
Before training, the loss function, initial learning rate and number of iterations of the VGG19 model are defined. New weight coefficients are computed through the loss function and the weights are updated, which completes one training iteration. The network repeats this process over all images for a fixed number of iterations, updating the weights whenever the computed loss value decreases, until the preset number of iterations is reached, yielding the VGG model and its weights.
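As an illustration, a minimal sketch of building such a framework and defining its training configuration follows, assuming TensorFlow/Keras (consistent with the h5 storage format mentioned below); the class count, head sizes and the learning rate of 0.01 are illustrative assumptions, not values fixed by this embodiment.

import tensorflow as tf

NUM_CLASSES = 6   # assumption: N scene labels plus the "other" label
EPOCHS = 20       # the number of training iterations defined before training

# VGG19 body with ImageNet pre-trained weights, plus a new classifier head.
base = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
x = tf.keras.layers.Flatten()(base.output)
x = tf.keras.layers.Dense(4096, activation="relu")(x)
x = tf.keras.layers.Dense(4096, activation="relu")(x)
out = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(base.input, out)

# The loss function and initial learning rate, defined before training.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="categorical_crossentropy", metrics=["accuracy"])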
And S102, converting the format of the training sample picture and the corresponding data set, and inputting the data set after format conversion into the VGG19 network framework for training.
In particular, VGG19 requires pre-processing of the pictures: converting RGB to BGR, resizing to 224 × 224 × 3, and subtracting from each pixel the mean value computed on ImageNet. Training then starts from a model pre-trained on ImageNet, with batch_size set to 4; 20 epochs are trained, and the model information is stored in h5 format.
An Epoch is one complete pass of the full data set forward and backward through the neural network (that is, every training sample has been propagated forward and backward once); one Epoch is therefore the process of training on all training samples once. However, when the number of samples in an Epoch is too large (for the computer), the data must be divided into several smaller blocks, i.e. several Batches, for training. A Batch is one such division of the whole training set, and Batch_Size is the number of samples in each batch.
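The pre-processing and training regime described above can be sketched as follows; tf.keras.applications.vgg19.preprocess_input performs exactly the RGB-to-BGR conversion and ImageNet mean subtraction, and the fit call mirrors the batch_size of 4 and 20 epochs. The variable and file names are hypothetical.

import numpy as np
import tensorflow as tf

def load_pictures(paths):
    # Resize to 224 x 224 x 3, convert RGB to BGR and subtract the
    # ImageNet channel means (what vgg19.preprocess_input performs).
    imgs = [tf.keras.preprocessing.image.load_img(p, target_size=(224, 224))
            for p in paths]
    x = np.stack([tf.keras.preprocessing.image.img_to_array(i) for i in imgs])
    return tf.keras.applications.vgg19.preprocess_input(x)

# With hypothetical arrays train_x (pre-processed pictures) and train_y
# (one-hot labels), training and storage would look like:
#   model.fit(train_x, train_y, batch_size=4, epochs=20)
#   model.save("vgg19_blur_threshold.h5")   # model information in h5 format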
And S103, iteratively training the VGG19 network framework based on its loss function, initial learning rate and number of iterations to obtain a trained VGG19 model.
Specifically, the loss function measures the degree of inconsistency between the model's predicted values and the true values; it is a non-negative real-valued function, and the smaller its value, the higher the accuracy of the VGG19 model. The weights are updated according to the predefined loss function, and an optimized VGG19 model is generated accordingly.
Iterative training is a model training mode in deep learning used to optimize a model. In this step it is realized as follows: first, the target loss function of the VGG19 model is constructed and an optimization algorithm, such as SGD (stochastic gradient descent), is used for cyclic training. In each training cycle, all training sample images are read in sequence, the current loss of the VGG19 model is computed, and the gradient descent direction is determined by the optimization algorithm; the target loss function gradually decreases and reaches a stable state, thereby optimizing each parameter of the constructed network model.
Loss-function convergence means the loss is close to 0, for example below 0.1. Likewise, when the value output by the VGG19 model for a given sample (positive or negative) is close to 0.5, VGG19 is considered unable to distinguish positive samples from negative samples, i.e. its output has converged; training is then stopped, and the model parameters from the last training round are taken as the parameters of the VGG19 model, giving the optimized VGG19 model.
And step S2, inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected.
Specifically, the pictures to be detected are stored in the city's imported frozen product supervision system, where the data declared by the frozen product cargo owner is submitted to the back office for approval; the pictures to be detected include a customs declaration form, a cargo manifest, quarantine inspection certificates, disinfection certificates, nucleic acid test certificates and the like. A first live image of the frozen product scene is captured by a camera or other device. For example, all cameras are on a local area network and can be accessed through a DSS platform; the DSS has a screenshot function, and the captured live frozen-product images are stored in bmp format.
Using a multi-task training method, fully-connected output layers are added to the improved model according to the number of classes required for the model's original data set and for the data set to be trained: when the model needs to classify into a given number of classes, a fully-connected layer consisting of that number of neurons is added. The improved model is the trunk, and the fully-connected output layers added for the original data set and the data set to be trained form two training branches. The model is trained alternately on the two data sets: the original data set with a cross-entropy loss function, and the data set to be trained with the similarity-perception loss function. Back-propagation iterations update the weights of the preceding layers according to the forward-propagation loss value; training stops when the model's loss tends to converge, and the added output layers are removed to obtain the ambiguity detection model. The picture to be detected is then input into the trained ambiguity detection model to obtain its ambiguity value.
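A minimal sketch of this two-branch, multi-task structure, assuming PyTorch (the framework named in step S20 below); the trunk, feature dimension, class counts and the similarity-perception loss are assumed to be supplied by the caller.

import torch
import torch.nn as nn

class TwoBranchModel(nn.Module):
    # Trunk network plus two added fully-connected output branches: one for
    # the model's original data set, one for the data set to be trained.
    def __init__(self, trunk, feat_dim, n_orig, n_new):
        super().__init__()
        self.trunk = trunk
        self.head_orig = nn.Linear(feat_dim, n_orig)
        self.head_new = nn.Linear(feat_dim, n_new)

    def forward(self, x, branch="new"):
        f = self.trunk(x)
        return self.head_orig(f) if branch == "orig" else self.head_new(f)

# Alternating training (sketch): cross-entropy loss on the original data
# set, the similarity-perception loss on the data set to be trained; once
# the loss converges, the added heads are discarded and the trunk is kept
# as the ambiguity detection model.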
Further, before step S2, the method further includes:
step S20, adopting a convolutional neural network as a basic network, and adding a branch convolutional layer in the convolutional neural network layer to construct a network structure of a ambiguity detection model; the branch convolution layer is used for fusing the multi-level feature maps in the basic network.
In particular, the network structure may be defined using the open-source PyTorch deep learning framework (a machine learning library).
The branch convolution layers can transform the size and channels of the feature maps, so that the multi-level feature maps in the base network can be fused. In the field of image recognition, convolutional neural networks are now widely used for image classification and recognition, with relatively mature network structures and training algorithms; existing research shows that, given training samples of high quality and sufficient quantity, a convolutional neural network achieves a high recognition rate in traditional image recognition. Moreover, the convolutional neural network offers better biological plausibility than conventional artificial neural networks, which has made it a research hotspot in recent years. Its discrete pulses have sparsity characteristics, which can greatly reduce the amount of network computation, bringing high performance, low power consumption and relief from overfitting. The convolutional neural network can therefore guarantee the image recognition rate while offering low power consumption and low latency, enabling high-speed extraction and accurate recognition of time-varying information features.
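One plausible realization of such branch convolution layers is sketched below, assuming PyTorch; the channel counts and the choice of 1 x 1 convolutions plus bilinear interpolation are illustrative assumptions rather than this embodiment's fixed design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchFusion(nn.Module):
    # 1 x 1 convolutions transform each feature map's channels, and
    # interpolation transforms its size, so that the multi-level feature
    # maps from the base network can be fused by concatenation.
    def __init__(self, in_channels=(128, 256, 512), out_channels=128):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])

    def forward(self, feats):
        target = feats[0].shape[-2:]   # spatial size to fuse at
        mapped = [F.interpolate(conv(f), size=target, mode="bilinear",
                                align_corners=False)
                  for conv, f in zip(self.branches, feats)]
        return torch.cat(mapped, dim=1)   # fused multi-level features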
Step S21, inputting a plurality of labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model.
Specifically, the picture recognition model may be deployed as a service interface, and a plurality of labeled pictures are input into its network structure for training to generate the ambiguity detection model. A convolutional neural network differs from a general neural network in that it contains a feature extractor composed of convolutional layers and sub-sampling layers. In a convolutional layer, each neuron is connected only to some of its neighboring neurons. A CNN convolutional layer usually contains several feature maps, each composed of neurons arranged in a rectangle; neurons in the same feature map share weights, and the shared weights are the convolution kernel. A convolution kernel is generally initialized as a matrix of small random values and learns reasonable weights during network training. Sharing weights (convolution kernels) directly reduces the connections between network layers and, at the same time, the risk of overfitting. Sub-sampling, also called pooling, usually takes two forms, mean sub-sampling and max sub-sampling, and can be viewed as a special convolution process. Convolution and sub-sampling greatly simplify the model's complexity and reduce its parameters.
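For illustration only, a tiny PyTorch feature extractor showing the convolution and sub-sampling (pooling) structure just described; the layer sizes are arbitrary.

import torch
import torch.nn as nn

extractor = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # shared-weight convolution kernels
    nn.ReLU(),
    nn.MaxPool2d(2),                             # max sub-sampling (pooling)
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AvgPool2d(2),                             # mean sub-sampling (pooling)
)
features = extractor(torch.randn(1, 3, 224, 224))
print(features.shape)   # torch.Size([1, 32, 56, 56])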
Further, step S21 further includes:
and step S211, using the weighted cross entropy loss as a main loss function and Ring loss as an auxiliary loss function to obtain a loss value of the network structure of the ambiguity detection model.
Specifically, the model output is Y = {y1, y2, ..., yN+1} and the class weights are W = {w1, w2, ..., wN+1}, whose values are based on the sample-count ratios in the training set. The weighted cross-entropy loss over the N scene labels and "other" is then expressed as loss_ce:

loss_ce = -w_label · log( exp(y_label) / Σi exp(yi) ),  i = 1, ..., N+1

wherein label denotes the index of the picture's true category label, an integer in [1, N+1]; w_label ∈ W is the weight corresponding to the picture's true category label; and y_label ∈ Y is the model output value corresponding to the picture's true category label.

The target modulus length is R, initialized with the mean of the feature-vector norms after the first iteration. Over a batch of m samples with feature vectors F(xi), the Ring loss is expressed as loss_rl:

loss_rl = (1 / 2m) · Σi ( ||F(xi)||2 − R )²,  i = 1, ..., m

The final loss loss_total is the weighted sum of the two loss functions:

loss_total = loss_ce + λ · loss_rl

wherein λ is a weight factor with a value of 0.01.
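Under these definitions, the combined loss can be sketched in PyTorch as follows; treating R as a learnable parameter initialized to 1 (rather than to the mean feature norm after the first iteration) is a simplifying assumption of this sketch.

import torch
import torch.nn as nn

class WeightedCEPlusRing(nn.Module):
    # loss_total = loss_ce + lambda * loss_rl with lambda = 0.01.
    def __init__(self, class_weights, lam=0.01):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(weight=class_weights)  # weighted main loss
        self.R = nn.Parameter(torch.ones(1))                 # target modulus length
        self.lam = lam

    def forward(self, logits, features, labels):
        loss_ce = self.ce(logits, labels)
        norms = features.norm(p=2, dim=1)              # per-sample ||F(x)||2
        loss_rl = ((norms - self.R) ** 2).mean() / 2   # auxiliary Ring loss
        return loss_ce + self.lam * loss_rl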
And S212, optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model.
Specifically, the back-propagation of the loss uses a momentum-based stochastic gradient descent method, with momentum factor momentum = 0.9.
And step S213, setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
Specifically, transfer learning is performed from an open-source model trained on the public scene-classification data set Places365; the pre-trained weights of the base network, except for the fully-connected layer and the feature-fusion branch, are loaded. The weight parameters of the added branch convolution layers and the fully-connected layer of the base network are trained with an initial learning rate of 0.01; the pre-trained weight parameters in conv2_x, conv3_x, conv4_x and conv5_x are fine-tuned, with initial learning rates of 0.001 for conv2_x and conv3_x and 0.002 for conv4_x and conv5_x; the parameters of all other layers are frozen and not updated. During training, the learning rate is halved every 5 iterations.
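The layered learning rates and the halving schedule can be expressed with PyTorch parameter groups and a StepLR scheduler; the module attribute names below (branch, fc, conv2_x, ...) are hypothetical handles onto the network described above.

import torch

def build_optimizer(model):
    groups = [
        {"params": model.branch.parameters(),  "lr": 0.01},
        {"params": model.fc.parameters(),      "lr": 0.01},
        {"params": model.conv2_x.parameters(), "lr": 0.001},
        {"params": model.conv3_x.parameters(), "lr": 0.001},
        {"params": model.conv4_x.parameters(), "lr": 0.002},
        {"params": model.conv5_x.parameters(), "lr": 0.002},
    ]   # frozen layers are simply omitted from the optimizer
    opt = torch.optim.SGD(groups, lr=0.01, momentum=0.9)  # momentum SGD, as above
    # halve every group's learning rate every 5 iterations (epochs)
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=5, gamma=0.5)
    return opt, sched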
Testing is then performed with data from the picture library, and the recognition results are checked to evaluate the precision and recall of the model. For misclassified cases, corresponding positive and negative samples are added to the training set, atypical samples unhelpful to model training are removed, the cross-entropy weights W are updated, and the model is retrained. Multiple rounds of data iteration are repeated until the model's accuracy meets production requirements, and training then stops.
Further, before step S20, the method further includes:
and step S200, generating a general low-order filter model.
Specifically, a high-order filter's expression has a relatively high order and becomes very complex; the internal structural characteristics of the filter are unclear, so designers find it hard to grasp the filter's structural characteristics during design.
Step S201, selecting a general low-order filter model required by the target filter model.
Specifically, general low-order filter models are generated first; the generated models may be only those that will constitute the target filter model, or may be general low-order filter models of all filters. In the latter case, the general low-order filter models that constitute the target filter model are selected from those generated, and the selected models are then combined in series by connecting their output signals in series, finally forming the target filter model. In the former case, all of the generated general low-order filter models are selected and then combined in series in the same way to form the target filter model.
And S202, combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model.
Specifically, each general low-order filter model is named freely and corresponds one-to-one with a filter model. In addition, a status flag list is established, with one entry per general low-order filter model. Then, according to the target filter model, the status flags of the general low-order filter models that constitute the target filter model are set to 1 in the list, and the others are set to 0. Composing the filter model from several general low-order filter models connected in series, rather than designing the filter globally as in the prior art, makes the filter model more flexible to design, and designers can adjust its structure and parameters more conveniently.
Further, after step S202, the method further includes: verifying the formed target filter model.
Specifically, during verification a signal source model is selected from a signal generation model library, and a noise model is superimposed on it. The signal generation model library stores a number of signal source models, and the selected one simulates the input signal of the target filter model. The parameters of the signal source model can be set as required, and packaging the signal source models behind a uniform interface makes it easy to add and replace new signal source models later. The signal source model, the noise model and the filter model then form a filter analysis model. Running the filter analysis model produces result signal data, which is evaluated by a quantitative evaluation method to judge whether the target filter model meets the design requirements.
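A small sketch of this filter-model workflow, under the assumption that each general low-order filter model is a second-order section (SOS) produced with SciPy; series connection of output signals then amounts to stacking sections, and verification runs a noisy source signal through the target model.

import numpy as np
from scipy import signal

low1 = signal.butter(2, 0.20, btype="low", output="sos")  # generic low-order model
low2 = signal.butter(2, 0.35, btype="low", output="sos")  # generic low-order model
target = np.vstack([low1, low2])                          # series combination

# Verification: a signal source model with a superimposed noise model is
# fed through the target filter model, and the result is evaluated
# quantitatively (here a crude check that ignores phase delay).
t = np.linspace(0, 1, 2000, endpoint=False)
source = np.sin(2 * np.pi * 10 * t)                # signal source model
noisy = source + 0.5 * np.random.randn(t.size)     # superimposed noise model
filtered = signal.sosfilt(target, noisy)
print("residual power before/after filtering:",
      np.var(noisy - source), np.var(filtered - source))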
And step S3, when the ambiguity value is outside the ambiguity threshold range, raising an alarm for the picture to be detected.
Specifically, the VGG19 model obtained through training can derive the ambiguity threshold range directly from pictures; the ambiguity value of an image is detected by the ambiguity detection model, the detected value is compared with the ambiguity threshold range, and when the value is outside the range, an alarm is raised for the picture to be detected.
Further, step S3 includes:
and step S31, judging whether the ambiguity value is within the ambiguity threshold range.
And step S32, if yes, confirming that the picture to be detected is clear.
And step S33, if not, confirming that the picture to be detected is blurred, and outputting alarm information.
Specifically, a sample image of the frozen product scene is first captured by a camera or other device, such as the product pictures in the imported frozen product supervision system, whose types include a customs declaration form, a cargo manifest, quarantine inspection certificates, disinfection certificates, nucleic acid test certificates and the like. The live image is then input into the trained ambiguity detection model, and the detected ambiguity value of the image is compared with the ambiguity threshold range: if the ambiguity value is within the threshold range, the picture to be detected is confirmed to be clear;
and if the ambiguity value is not within the threshold range, the picture to be detected is confirmed to be blurred, and alarm information is output.
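Steps S31 to S33 reduce to a simple comparison; a plain Python sketch follows, with hypothetical threshold values.

def check_picture(ambiguity_value, lower, upper):
    # Within the ambiguity threshold range the picture is confirmed clear;
    # otherwise it is confirmed blurred and alarm information is output.
    if lower <= ambiguity_value <= upper:
        return "clear"
    print("ALARM: ambiguity value %.2f outside threshold range [%.2f, %.2f]"
          % (ambiguity_value, lower, upper))
    return "blurred"

# e.g. check_picture(0.87, 0.10, 0.60) prints an alarm and returns "blurred"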
According to this picture ambiguity testing method, a sample image of a frozen product scene is first captured by a camera or other device to obtain a data set; the built VGG19 network framework is trained with the data set to obtain a VGG19 model and the ambiguity threshold range; a frozen product image captured in real time is then input into the trained VGG19 model, and the image's ambiguity value is detected by the ambiguity detection model. The trained VGG19 model can derive the ambiguity threshold range directly from pictures; the ambiguity detection model measures the ambiguity of an image, the detected value is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range. This replaces the original manual classification and inspection and thus improves the efficiency of picture ambiguity recognition.
Fig. 2 is a functional module schematic diagram of an artificial intelligence-based ambiguity detection system according to an embodiment of the present application. As shown in fig. 2, the system 2 for detecting ambiguity based on artificial intelligence comprises a building module 21, a detecting module 22, and a judging module 23.
The construction module 21 is used for inputting the historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
the detection module 22 is used for inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected;
and the judgment module 23 is configured to raise an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range.
Optionally, before inputting the historical picture data into the trained VGG19 model and outputting the ambiguity threshold range, the method further includes:
building a VGG19 network framework and defining the loss function, initial learning rate and number of iterations of the VGG19 network framework;
converting the format of the training sample pictures and the corresponding data set, and inputting the format-converted data set into the VGG19 network framework for training;
and iteratively training the VGG19 network framework based on its loss function, initial learning rate and number of iterations to obtain a trained VGG19 model.
Optionally, before the image to be detected is input to the trained ambiguity detection model to obtain the ambiguity value of the image to be detected, the method further includes:
adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of an ambiguity detection model, the branch convolution layers being used to fuse the multi-level feature maps in the base network;
and inputting a plurality of labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model.
Optionally, inputting a plurality of the tagged pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model, including:
obtaining a loss value of a network structure of the ambiguity detection model by using the weighted cross entropy loss as a main loss function and the Ring loss as an auxiliary loss function;
optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model;
and setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
Optionally, before adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of an ambiguity detection model (the branch convolution layers being used to fuse the multi-level feature maps in the base network), the method further includes: generating general low-order filter models;
selecting the general low-order filter models required to form a target filter model;
and combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model.
Optionally, after combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model, the method further includes: verifying the formed target filter model.
Optionally, raising an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range includes: judging whether the ambiguity value is within the ambiguity threshold range;
if so, confirming that the picture to be detected is clear;
if not, confirming that the picture to be detected is blurred, and outputting alarm information.
For other details of the technical solutions implemented by the modules of the ambiguity detection system in the above embodiments, reference may be made to the description of the ambiguity detection method in the above embodiments, which is not repeated here.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and the same or similar parts may be referred to across embodiments. The system embodiment, being basically similar to the method embodiment, is described simply; for relevant points, refer to the corresponding parts of the method embodiment's description.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 3, the computer device 30 includes a processor 31 and a memory 32 coupled to the processor 31.
The memory 32 stores program instructions that, when executed by the processor 31, cause the processor 31 to perform the steps of the artificial intelligence based ambiguity detection method in the above-described embodiments.
The processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip having signal processing capabilities. The processor 31 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application. The storage medium of the embodiment of the present application stores a program file 41 capable of implementing all the methods described above, where the program file 41 may be stored in the storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a mobile hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices, such as a computer, a server, a mobile phone, and a tablet. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The above embodiments are merely examples and are not intended to limit the scope of the present disclosure, and all modifications, equivalents, and flow charts using the contents of the specification and drawings of the present disclosure or those directly or indirectly applied to other related technical fields are intended to be included in the scope of the present disclosure.

Claims (10)

1. An artificial-intelligence-based ambiguity detection method, characterized by comprising: inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range; inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected; and when the ambiguity value is outside the ambiguity threshold range, raising an alarm for the picture to be detected.

2. The artificial-intelligence-based ambiguity detection method according to claim 1, characterized in that, before inputting the historical picture data into the trained VGG19 model and outputting the ambiguity threshold range, the method further comprises: building a VGG19 network framework and defining the loss function, initial learning rate and number of iterations of the VGG19 network framework; converting the format of the training sample pictures and the corresponding data set, and inputting the format-converted data set into the VGG19 network framework for training; and iteratively training the VGG19 network framework based on its loss function, initial learning rate and number of iterations to obtain a trained VGG19 model.

3. The artificial-intelligence-based ambiguity detection method according to claim 1, characterized in that, before inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected, the method further comprises: adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of the ambiguity detection model, the branch convolution layers being used to fuse the multi-level feature maps in the base network; and inputting a plurality of labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model.

4. The artificial-intelligence-based ambiguity detection method according to claim 3, characterized in that inputting a plurality of the labeled pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model comprises: using the weighted cross-entropy loss as the main loss function and Ring loss as the auxiliary loss function to obtain the loss value of the network structure of the ambiguity detection model; optimizing the parameters of the network structure of the ambiguity detection model with a momentum stochastic gradient descent algorithm based on that loss value, to obtain the optimized parameters of the model; and setting the learning rate and adjusting the optimized parameters of the model by a transfer learning method.

5. The artificial-intelligence-based ambiguity detection method according to claim 3, characterized in that, before adopting a convolutional neural network as the base network and adding branch convolution layers in the convolutional neural network to construct the network structure of the ambiguity detection model, the branch convolution layers being used to fuse the multi-level feature maps in the base network, the method further comprises: generating general low-order filter models; selecting the general low-order filter models required to form a target filter model; and combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model.

6. The artificial-intelligence-based ambiguity detection method according to claim 5, characterized in that, after combining the selected general low-order filter models in series by connecting their output signals in series to form the target filter model, the method further comprises: verifying the formed target filter model.

7. The artificial-intelligence-based ambiguity detection method according to claim 1, characterized in that raising an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range comprises: judging whether the ambiguity value is within the ambiguity threshold range; if so, confirming that the picture to be detected is clear; and if not, confirming that the picture to be detected is blurred and outputting alarm information.

8. An artificial-intelligence-based ambiguity detection system, characterized by comprising: a construction module, which inputs historical picture data into a trained VGG19 model and outputs an ambiguity threshold range; a detection module, which inputs a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected; and a judgment module, which raises an alarm for the picture to be detected when the ambiguity value is outside the ambiguity threshold range.

9. A computer device, characterized in that the device comprises a processor and a memory coupled to the processor, the memory storing program instructions which, when executed by the processor, cause the processor to perform the steps of the artificial-intelligence-based ambiguity detection method according to any one of claims 1-7.

10. A storage medium, characterized in that it stores a program file capable of implementing the artificial-intelligence-based ambiguity detection method according to any one of claims 1-7.
CN202111255446.5A 2021-10-27 2021-10-27 Artificial intelligence-based ambiguity detection method, system, equipment and medium Pending CN113962312A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111255446.5A CN113962312A (en) 2021-10-27 2021-10-27 Artificial intelligence-based ambiguity detection method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111255446.5A CN113962312A (en) 2021-10-27 2021-10-27 Artificial intelligence-based ambiguity detection method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN113962312A true CN113962312A (en) 2022-01-21

Family

ID=79467404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111255446.5A Pending CN113962312A (en) 2021-10-27 2021-10-27 Artificial intelligence-based ambiguity detection method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN113962312A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114359854A (en) * 2022-03-21 2022-04-15 上海闪马智能科技有限公司 Object identification method and device, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678809A (en) * 2013-12-16 2014-03-26 北京经纬恒润科技有限公司 Designing method of filter model
WO2020211003A1 (en) * 2019-04-17 2020-10-22 深圳市欢太科技有限公司 Image processing method, computer readable storage medium, and computer device
CN112954315A (en) * 2021-02-25 2021-06-11 深圳市中西视通科技有限公司 Image focusing measurement method and system for security camera
CN113065443A (en) * 2021-03-25 2021-07-02 携程计算机技术(上海)有限公司 Training method, recognition method, system, device and medium of image recognition model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678809A (en) * 2013-12-16 2014-03-26 北京经纬恒润科技有限公司 Designing method of filter model
WO2020211003A1 (en) * 2019-04-17 2020-10-22 深圳市欢太科技有限公司 Image processing method, computer readable storage medium, and computer device
CN112954315A (en) * 2021-02-25 2021-06-11 深圳市中西视通科技有限公司 Image focusing measurement method and system for security camera
CN113065443A (en) * 2021-03-25 2021-07-02 携程计算机技术(上海)有限公司 Training method, recognition method, system, device and medium of image recognition model

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114359854A (en) * 2022-03-21 2022-04-15 上海闪马智能科技有限公司 Object identification method and device, storage medium and electronic device

Similar Documents

Publication Publication Date Title
CN112445823B (en) Neural network structure search method, image processing method and device
CN112434462B (en) Method and equipment for obtaining model
CN111860588B (en) Training method for graphic neural network and related equipment
CN109886066B (en) Rapid target detection method based on multi-scale and multi-layer feature fusion
CN109033107B (en) Image retrieval method and apparatus, computer device, and storage medium
US11443514B2 (en) Recognizing minutes-long activities in videos
CN113705769A (en) Neural network training method and device
CN114388064A (en) Multi-modal information fusion method, system, terminal and storage medium for protein characterization learning
CN112529005B (en) Target detection method based on semantic feature consistency supervision pyramid network
CN112862828B (en) A semantic segmentation method, model training method and device
CN111797992A (en) A machine learning optimization method and device
CN113592060A (en) Neural network optimization method and device
CN113554653B (en) Semantic segmentation method based on mutual information calibration point cloud data long tail distribution
CN113987236B (en) Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network
US20200272812A1 (en) Human body part segmentation with real and synthetic images
Terziyan et al. Causality-aware convolutional neural networks for advanced image classification and generation
CN114037056A (en) Method and device for generating neural network, computer equipment and storage medium
Jiang et al. TAB: Temporal Accumulated Batch normalization in spiking neural networks
US20240273338A1 (en) Neural network construction method and apparatus
CN104036242B (en) The object identification method of Boltzmann machine is limited based on Centering Trick convolution
US11868878B1 (en) Executing sublayers of a fully-connected layer
Li et al. Multi-view convolutional vision transformer for 3D object recognition
CN115018039A (en) Neural network distillation method, target detection method and device
Zhang et al. A novel CapsNet neural network based on MobileNetV2 structure for robot image classification
CN113962312A (en) Artificial intelligence-based ambiguity detection method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination