CN113962312A - Ambiguity detection method, system, device and medium based on artificial intelligence
- Publication number
- CN113962312A (application number CN202111255446.5A)
- Authority
- CN
- China
- Prior art keywords
- ambiguity
- model
- picture
- detected
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06F—ELECTRIC DIGITAL DATA PROCESSING › G06F18/00—Pattern recognition › G06F18/20—Analysing › G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation › G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/04—Architecture, e.g. interconnection topology › G06N3/045—Combinations of networks
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The application relates to artificial intelligence, and provides a method, a system, a device and a medium for ambiguity detection based on artificial intelligence, wherein the method comprises the following steps: inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range; inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected; and when the ambiguity value is outside the ambiguity threshold range, raising an alarm for the picture to be detected. The trained VGG19 model can derive the ambiguity threshold range directly from pictures; the ambiguity detection model then detects the ambiguity of each image, the detected value is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range. This replaces the original manual classification and inspection, thereby improving the efficiency of picture ambiguity recognition.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, a system, a device, and a medium for ambiguity detection based on artificial intelligence.
Background
At present, in the imported frozen product supervision system, data declared by a shipper is submitted to the background for approval. A large amount of this data is returned because the pictures are unclear, and clear pictures must be uploaded and resubmitted. The pictures include a customs declaration form, a goods list, quarantine inspection certificates, disinfection certificates, nucleic acid detection certificates and the like; at most 9 pictures of each type may be uploaded, so a whole declaration may contain up to 45 pictures. If one or more of those 45 pictures is unclear, they must be checked manually one by one, which consumes a great deal of approval time. Because approval staff are limited and time is short, approval is slow while shippers are under pressure to push frozen food to market quickly, and the delay in selling increases cost. The traditional mode of manually checking many pictures is therefore time-consuming and labor-intensive, slows approval, delays the time at which frozen food reaches the market, and easily causes dissatisfaction among cargo owners.
Disclosure of Invention
The application provides a method, a system, a device and a medium for ambiguity detection based on artificial intelligence, aimed at solving the problems that the traditional manual checking of many pictures is time-consuming and labor-intensive and that approval is slow.
In order to solve the technical problem, the application adopts the following technical scheme: an ambiguity detection method based on artificial intelligence is provided, comprising the following steps: inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and when the ambiguity value is out of the ambiguity threshold range, alarming the picture to be detected.
As a further improvement of the present application, before the historical picture data is input into the trained VGG19 model and the ambiguity threshold range is output, the method further includes:
building a VGG19 network framework and defining a loss function, an initial learning rate and iteration times of the VGG19 network framework;
carrying out format conversion on the training sample pictures and the corresponding data sets, and inputting the data sets after format conversion into the VGG19 network framework for training;
and carrying out iterative training on the VGG19 network framework based on the loss function, the initial learning rate and the iteration times of the VGG19 network framework to obtain a trained VGG19 model.
As a further improvement of the present application, before the picture to be detected is input into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected, the method further includes:
adopting a convolutional neural network as a basic network, and adding a branch convolutional layer to the convolutional neural network to construct the network structure of an ambiguity detection model; the branch convolutional layer is used for fusing the multi-level feature maps in the basic network;
inputting a plurality of label pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model.
As a further improvement of the present application, inputting a plurality of the labeled pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model, includes:
obtaining a loss value of a network structure of the ambiguity detection model by using the weighted cross entropy loss as a main loss function and the Ring loss as an auxiliary loss function;
optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model;
and setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
As a further improvement of the application, before a convolutional neural network is adopted as the basic network and a branch convolutional layer is added to the convolutional neural network to construct the network structure of the ambiguity detection model, the branch convolutional layer being used for fusing the multi-level feature maps in the basic network, the method further includes: generating general low-order filter models;
selecting the general low-order filter models required by a target filter model;
and combining the selected general low-order filter models in series, connecting their output signals in sequence, to form the target filter model.
As a further improvement of the present application, after the selected general low-order filter models are combined in series by connecting their output signals to form the target filter model, the method further includes:
and verifying the formed target filter model.
As a further improvement of the present application, alerting the picture to be detected when the ambiguity value is outside the ambiguity threshold range includes: judging whether the ambiguity value is within the ambiguity threshold range;
if so, confirming that the picture to be detected is clear;
if not, confirming that the picture to be detected is fuzzy, and outputting alarm information.
In order to solve the above technical problem, another technical solution adopted by the present application is: an artificial intelligence based ambiguity detection system is provided, comprising: a construction module, used for inputting the historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
the detection module is used for inputting the picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and the judging module is used for alarming the picture to be detected when the ambiguity value is out of the ambiguity threshold range.
In order to solve the above technical problem, the present application adopts another technical solution that: there is provided a computer device comprising a processor, a memory coupled to the processor, the memory having stored therein program instructions that, when executed by the processor, cause the processor to perform the steps of the artificial intelligence based ambiguity detection method of any one of the above.
In order to solve the above technical problem, the present application adopts another technical solution that: there is provided a storage medium storing a program file capable of implementing any one of the above-described artificial intelligence-based ambiguity detection methods.
The beneficial effects of this application are as follows. In the artificial intelligence based ambiguity detection method, a sample image of a frozen product scene is first captured by a camera or other equipment to obtain a data set; the built VGG19 network framework is trained with the data set to obtain the VGG19 model and the ambiguity threshold range; the frozen product image captured in real time is then input into the trained VGG19 model, and the ambiguity value of the image is detected by the ambiguity detection model. The trained VGG19 model can derive the ambiguity threshold range directly from pictures; the detected ambiguity value of each image is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range. This replaces the original manual classification and inspection, thereby improving the efficiency of picture ambiguity recognition.
Drawings
FIG. 1 is a schematic flow chart of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of a system of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and "third" in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any indication of the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. All directional indications (such as up, down, left, right, front, and rear … …) in the embodiments of the present application are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indication is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) refers to the theories, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Fig. 1 is a schematic flow chart of an artificial intelligence-based ambiguity detection method according to an embodiment of the present invention. It should be noted that the method of the present application is not limited to the flow sequence shown in fig. 1 if the results are substantially the same. As shown in fig. 1, the method includes:
and S1, inputting the historical picture data into a trained VGG19 model, and outputting to obtain an ambiguity threshold range.
Specifically, the historical pictures include target pictures with an explicit scene and interference pictures without an explicit scene. The preset classification labels comprise several scene labels corresponding to scenes in the target pictures and a non-scene label corresponding to the interference pictures; for example, when the explicit scene is a frozen product scene, the scene labels may be a declaration form, a manifest, a quarantine inspection certificate, a disinfection certificate, a nucleic acid detection certificate and the like, and the non-scene label is "other". The frozen product scene data and the "other" picture data may be collected separately in several ways, including data collected with crawler technology, related data accumulated by the platform in the past, and manually supplemented annotated data. The historical picture data is input into the trained VGG19 model, which outputs the ambiguity threshold range.
Further, before step S1, the method further includes:
and S101, building a VGG19 network framework and defining a loss function, an initial learning rate and iteration times of the VGG19 network framework.
It should be noted that VGG19 includes 19 weight layers: 16 convolutional layers and 3 fully connected layers.
VGG was proposed by the Visual Geometry Group at Oxford. The network was an entry in ILSVRC 2014, and its main contribution was demonstrating that increasing the depth of a network can, to some extent, improve its final performance. VGG has two common configurations, VGG16 and VGG19, which are not substantially different except for network depth. VGG replaces large convolution kernels with stacks of 3 × 3 kernels: three 3 × 3 kernels in place of a 7 × 7 kernel and two 3 × 3 kernels in place of a 5 × 5 kernel. The main purpose is to increase the depth of the network, and thereby its effectiveness to some extent, while keeping the same receptive field. For example, a stack of three 3 × 3 convolutions with stride 1 has the same receptive field as a single 7 × 7 convolution (in essence, three successive 3 × 3 convolutions correspond to one 7 × 7 convolution). Its total parameter count is 3 × (9 × C²) = 27C², whereas using a 7 × 7 kernel directly costs 49C² parameters, where C is the number of input and output channels. Since 27C² < 49C², the parameters are reduced; moreover, the 3 × 3 kernels help preserve image properties better.
Specifically, a VGG19 model is established with 16 convolutional layers and 3 fully connected layers: an input layer; two 64-channel conv3 convolutional layers followed by pooling layer pool_1; two 128-channel conv3 convolutional layers followed by pool_2; four 256-channel conv3 convolutional layers followed by pool_3; four 512-channel conv3 convolutional layers followed by pool_4; four further 512-channel conv3 convolutional layers followed by pool_5; and three fully connected layers fc6, fc7 and fc8. Here conv denotes a convolutional layer, FC denotes a fully connected layer, conv3 denotes a convolutional layer using 3 × 3 filters, conv3-64 denotes a depth of 64, and maxpool denotes maximum pooling.
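The stacking just listed matches the torchvision reference implementation of VGG19, so it can be sketched directly in PyTorch (the deep learning framework named later in this description). A minimal sketch, assuming torchvision 0.13 or later for the weights argument; the class count of 6 (5 scene labels plus "other") is a hypothetical example:

```python
# A minimal sketch of the VGG19 topology listed above: 16 convolutional
# layers, 5 max-pooling layers and 3 fully connected layers. The class
# count (5 scene labels + "other" = 6) is hypothetical.
import torch
from torchvision import models

vgg19 = models.vgg19(weights=None)               # same conv3-64 ... conv3-512 stacking
vgg19.classifier[6] = torch.nn.Linear(4096, 6)   # swap the 1000-way ImageNet head

x = torch.randn(1, 3, 224, 224)                  # one preprocessed 224 x 224 picture
print(vgg19(x).shape)                            # torch.Size([1, 6])
```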
Before training, the loss function, initial learning rate and number of iterations of the VGG19 model are defined. In each training iteration, new weight coefficients are calculated through the loss function and the weights are updated. The network repeats this process, completing a fixed number of iterations over all images and updating the weights whenever the calculated loss value decreases, until the preset number of iterations is reached, yielding the VGG model and its weights.
And S102, converting the format of the training sample picture and the corresponding data set, and inputting the data set after format conversion into the VGG19 network framework for training.
In particular, VGG19 requires pre-processing of the pictures: converting RGB to BGR, resizing to 224 × 224 × 3, and subtracting from each pixel the average value computed on ImageNet. Training then starts from a model pre-trained on ImageNet, with batch_size chosen as 4; 20 epochs are trained, and the model information is stored in h5 format.
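A sketch of that preprocessing in Python follows; the per-channel means are the values commonly used with ImageNet-trained VGG weights and are an assumption here, since the description does not list them.

```python
# A sketch of the VGG19 picture preprocessing described above.
import numpy as np
from PIL import Image

IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess(path):
    img = Image.open(path).convert("RGB").resize((224, 224))
    arr = np.asarray(img, dtype=np.float32)   # shape (224, 224, 3)
    arr = arr[:, :, ::-1]                     # convert RGB to BGR
    arr = arr - IMAGENET_MEAN_BGR             # subtract the ImageNet mean per channel
    return arr                                # ready for the network input layer
```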
An Epoch is the process in which the complete data set passes through the neural network once and returns once, i.e. all training samples have been propagated forward and backward in the network one time. When the number of samples in an Epoch (all training samples) is too large for the computer, it is divided into several smaller blocks, i.e. several Batches, for training. A Batch is one of these portions of the training samples, and Batch_Size is the number of samples in each batch.
And S103, carrying out iterative training on the VGG19 network framework based on the loss function, the initial learning rate and the iteration times of the VGG19 network framework to obtain a trained VGG19 model.
Specifically, the loss function measures the degree of inconsistency between the model's predicted values and the true values; it is a non-negative real-valued function, and the smaller its value, the higher the accuracy of the VGG19 model. The loss is computed according to the predefined loss function, and the optimized VGG19 model is generated by minimizing it.
Iterative training is a model training mode in deep learning used to optimize a model. In this step it is realized as follows: a target loss function of the VGG19 model is constructed and cyclic training is performed with an optimization algorithm, such as SGD (stochastic gradient descent); in each training cycle all training sample images are read in turn, the current loss of the VGG19 model is calculated, and the gradient descent direction is determined by the optimization algorithm, gradually reducing the target loss until it reaches a stable state and thereby optimizing every parameter in the constructed network model.
Loss function convergence means that the loss is close to 0 (for example, less than 0.1). Equivalently, when the VGG19 model's output for a given sample (positive or negative) stays close to 0.5, the model can no longer distinguish positive from negative samples, i.e. its output has converged; training is then stopped, and the model parameters from the last training round are used as the parameters of the VGG19 model to obtain the optimized VGG19 model.
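Putting steps S101 to S103 together, a condensed training-loop sketch might look as follows; the 0.1 convergence threshold mirrors the example above, and the loader and learning rate are illustrative assumptions.

```python
# A condensed sketch of the iterative training in steps S101-S103:
# define the loss, learning rate and iteration count, then iterate until
# the preset count is reached or the loss is considered converged.
import torch

def train_vgg19(model, loader, epochs=20, lr=0.01, converge_at=0.1):
    criterion = torch.nn.CrossEntropyLoss()               # predefined loss function
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):                           # preset iteration count
        running = 0.0
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()                              # update the weight coefficients
            running += loss.item()
        if running / max(len(loader), 1) < converge_at:   # loss close to 0: converged
            break
    return model
```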
And step S2, inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected.
Specifically, the pictures to be detected are stored in the imported frozen product supervision system; the data declared by the frozen product owner is submitted to the background for approval, and the pictures to be detected include a customs declaration form, a goods list, quarantine inspection certificates, disinfection certificates, nucleic acid detection certificates and the like. A first live image of the frozen product scene is taken by a camera or other device. For example, all cameras are attached to a local area network and can therefore be accessed through a DSS platform; the DSS has a screenshot function, and the captured frozen product field images are stored in bmp format.
Using a multi-task training method, a fully connected output layer is added to the improved model according to the number of classes in the model's original data set and in the data set to be trained; when the model must distinguish a certain number of classes, a fully connected layer composed of that number of neurons is added. The improved model serves as the trunk, and the fully connected output layers added for the original data set and the data set to be trained form two training branches. The model is trained alternately on the two data sets: the original data set is trained with a cross-entropy loss function, and the data set to be trained is trained with the similarity-aware loss function; back-propagation iterations update the weights of the preceding layers according to the magnitude of the forward-propagation loss, and training stops when the loss value tends to converge. The added output layers are then removed to obtain the ambiguity detection model, and the picture to be detected is input into the trained ambiguity detection model to obtain its ambiguity value.
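The alternating two-branch training just described can be sketched as follows in PyTorch; the backbone handle, feature dimension, head sizes, and the plain cross entropy standing in for the similarity-aware loss are all assumptions, since the description does not define that loss precisely.

```python
# A sketch of the multi-task setup above: a shared trunk (the improved
# model) with one fully connected output head per data set, trained in
# alternation; the added heads are removed after convergence.
import torch
import torch.nn as nn

class TwoHeadModel(nn.Module):
    def __init__(self, backbone, feat_dim, n_orig, n_new):
        super().__init__()
        self.backbone = backbone                      # improved model as the trunk
        self.head_orig = nn.Linear(feat_dim, n_orig)  # branch for the original data set
        self.head_new = nn.Linear(feat_dim, n_new)    # branch for the data set to be trained

    def forward(self, x, branch):
        feats = self.backbone(x)
        return self.head_orig(feats) if branch == "orig" else self.head_new(feats)

def alternating_step(model, optimizer, batch, branch, criterion):
    # one optimisation step on one branch; call alternately with the
    # original data set (cross entropy) and the data set to be trained
    # (the similarity-aware loss, approximated here by the criterion)
    x, y = batch
    optimizer.zero_grad()
    loss = criterion(model(x, branch), y)
    loss.backward()                  # back-propagate to update the shared trunk
    optimizer.step()
    return loss.item()
```

Once the loss converges, the added output heads are discarded and the trunk is retained as the ambiguity detection model, as described above.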
Further, before step S2, the method further includes:
step S20, adopting a convolutional neural network as a basic network, and adding a branch convolutional layer in the convolutional neural network layer to construct a network structure of a ambiguity detection model; the branch convolution layer is used for fusing the multi-level feature maps in the basic network.
In particular, the network structure may be defined using an open source PyTorch (a machine learning library) deep learning framework.
The branch convolutional layer can transform the size and channels of feature maps, so that the multi-level feature maps in the basic network can be fused. In the field of image recognition, convolutional neural networks are now widely used for image classification and recognition, with relatively mature network structures and training algorithms; existing research shows that, provided the training samples are of high quality and sufficient quantity, a convolutional neural network achieves a high recognition rate in traditional image recognition. The convolutional neural network also offers better biological plausibility than the conventional artificial neural network, which has made it one of the research hotspots of recent years. Its discrete, sparse activations can greatly reduce the amount of network computation, bringing advantages such as high performance, low power consumption and reduced overfitting. It can therefore maintain the image recognition rate while exploiting low power consumption and low latency, realizing high-speed extraction and accurate recognition of time-varying information features.
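One plausible concrete form of such a branch convolutional layer is sketched below in PyTorch: 1 × 1 branch convolutions unify the channel counts of intermediate feature maps, and bilinear upsampling matches their sizes so the multi-level maps can be fused. The specific fusion recipe and the stage channel counts are assumptions.

```python
# A sketch of the branch convolutional layer in step S20: transform the
# size and channels of each stage's feature map, then fuse by concatenation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchFusion(nn.Module):
    def __init__(self, stage_channels, out_channels=64):
        super().__init__()
        # one 1x1 branch convolution per backbone stage
        self.branches = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in stage_channels
        )

    def forward(self, feature_maps):
        target = feature_maps[0].shape[-2:]   # fuse at the highest resolution
        fused = [
            F.interpolate(branch(f), size=target, mode="bilinear", align_corners=False)
            for branch, f in zip(self.branches, feature_maps)
        ]
        return torch.cat(fused, dim=1)        # fused multi-level feature map

# hypothetical stage channels of a CNN backbone
fusion = BranchFusion(stage_channels=[64, 128, 256])
feats = [torch.randn(1, 64, 56, 56),
         torch.randn(1, 128, 28, 28),
         torch.randn(1, 256, 14, 14)]
print(fusion(feats).shape)                    # torch.Size([1, 192, 56, 56])
```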
Step S21, inputting a plurality of label pictures into the network structure of the picture recognition model for training to generate the ambiguity detection model.
Specifically, a picture recognition model may be deployed as a service interface, and a plurality of labeled pictures are input into the network structure of the picture recognition model for training to generate the ambiguity detection model. A convolutional neural network differs from a general neural network in that it contains a feature extractor composed of convolutional layers and sub-sampling layers. In a convolutional layer of a CNN, each neuron is connected only to some of its neighboring neurons, and a convolutional layer usually contains several feature planes (feature maps), each composed of neurons arranged in a rectangle; neurons in the same feature plane share weights, and the shared weights form a convolution kernel. The convolution kernel is generally initialized as a matrix of small random values and learns reasonable weights during network training. Sharing weights (convolution kernels) directly reduces the connections between the network layers and at the same time reduces the risk of over-fitting. Sub-sampling, also called pooling, usually takes the two forms of mean sub-sampling and max sub-sampling and can be viewed as a special convolution process. Convolution and sub-sampling greatly simplify the model's complexity and reduce the number of model parameters.
Further, step S21 further includes:
and step S211, using the weighted cross entropy loss as a main loss function and Ring loss as an auxiliary loss function to obtain a loss value of the network structure of the ambiguity detection model.
Specifically, let the model output be Y = {y1, y2, ..., yN+1} and the weights be W = {w1, w2, ..., wN+1}, the weight values being set according to the ratio of sample counts in the training set. The weighted cross-entropy loss over the N scene labels and "other" is then expressed as

$$loss_{ce} = -w_{label}\,\log\!\left(\frac{e^{y_{label}}}{\sum_{i=1}^{N+1} e^{y_i}}\right)$$

wherein label denotes the sequence number of the picture's real category label and is an integer in [1, N+1]; w_label ∈ W is the weight corresponding to the picture's real category label; and y_label ∈ Y is the model output value corresponding to the picture's real category label.

Let the target modular length be R, initialized with the mean of the feature-vector modular lengths after the first iteration, and let F(x) denote the feature vector of an input sample. Ring loss is then expressed as

$$loss_{rl} = \frac{1}{2}\left(\lVert F(x)\rVert_2 - R\right)^2$$

The final loss is a weighted sum of the two loss functions:

$$loss_{total} = loss_{ce} + \lambda\, loss_{rl}$$

wherein λ is a weight factor taking the value 0.01.
And S212, optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model.
Specifically, the back-propagation of the loss adopts momentum-based stochastic gradient descent, with momentum factor momentum = 0.9.
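In code, the combined objective of step S211 and this momentum optimizer can be sketched as follows; the softmax inside F.cross_entropy matches the reconstructed formula above, and the 1/2 factor in the Ring loss term follows the original Ring loss formulation, which is an assumption here.

```python
# A sketch of the combined loss (weighted cross entropy + Ring loss,
# lambda = 0.01) and the momentum SGD used to back-propagate it.
import torch
import torch.nn.functional as F

LAMBDA = 0.01

def total_loss(logits, features, labels, class_weights, R):
    # main loss: weighted cross entropy over the N scene labels and "other"
    loss_ce = F.cross_entropy(logits, labels, weight=class_weights)
    # auxiliary Ring loss: pull each feature vector's L2 norm toward R
    norms = features.norm(p=2, dim=1)
    loss_rl = 0.5 * ((norms - R) ** 2).mean()
    return loss_ce + LAMBDA * loss_rl

# R is a learnable scalar initialised after the first iteration, e.g.
# R = torch.nn.Parameter(features.norm(p=2, dim=1).mean().detach())

# momentum-based stochastic gradient descent, momentum = 0.9:
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```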
And step S213, setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
Specifically, transfer learning starts from an open-source model trained on Places365, a public scene-classification dataset, loading the pre-trained weights of the basic network except for the fully connected layer and the feature-fusion branch. The weight parameters in the added branch convolutional layer and in the fully connected layer of the basic network are trained with an initial learning rate of 0.01; the pre-trained weight parameters in conv2_x, conv3_x, conv4_x and conv5_x are fine-tuned, with initial learning rates of 0.001 for conv2_x and conv3_x and 0.002 for conv4_x and conv5_x; the parameters in the other layers are frozen and not updated. During training, each learning rate is halved every 5 iterations.
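A sketch of this layer-wise schedule with PyTorch parameter groups follows; the attribute names (branch, fc, conv2_x, ..., conv5_x) are assumed handles onto the network described above, and StepLR implements the halving every 5 iterations.

```python
# A sketch of the transfer learning schedule in step S213: per-layer
# initial learning rates, frozen remaining layers, halving every 5 epochs.
import torch

def build_optimizer(model):
    groups = [
        {"params": model.branch.parameters(),  "lr": 0.01},   # added branch conv layers
        {"params": model.fc.parameters(),      "lr": 0.01},   # added fully connected layer
        {"params": model.conv2_x.parameters(), "lr": 0.001},  # fine-tuned
        {"params": model.conv3_x.parameters(), "lr": 0.001},  # fine-tuned
        {"params": model.conv4_x.parameters(), "lr": 0.002},  # fine-tuned
        {"params": model.conv5_x.parameters(), "lr": 0.002},  # fine-tuned
    ]
    optimizer = torch.optim.SGD(groups, lr=0.01, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
    return optimizer, scheduler

# parameters of all other layers are frozen before training:
# for p in frozen_module.parameters():
#     p.requires_grad = False
```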
The model is then tested with data from the picture library, and the recognition results are checked to evaluate its precision and recall. For the misclassified cases, corresponding positive and negative samples are added to the training set, atypical samples that hinder model training are removed, the cross-entropy weights W are updated, and the model is retrained. Multiple rounds of data iteration are repeated until the accuracy of the model meets the production requirement, and training then stops.
Further, before step S20, the method further includes:
and step S200, generating a general low-order filter model.
Specifically, a high-order filter has a transfer function of relatively high order, so its expression becomes very complex and the internal structural characteristics of the filter are unclear; designers therefore find it hard to grasp the filter's structural characteristics when designing it.
Step S201, selecting a general low-order filter model required by the target filter model.
Specifically, the general low-order filter models are generated first. The generated models may comprise only the general low-order filter models that make up the target filter model, or the general low-order filter models of all filters. In the latter case, the general low-order filter models forming the target filter model are selected from all those generated, and the selected models are then combined in series, connecting their output signals in sequence, to finally form the target filter model. If the generated models are exactly those forming the target filter model, all of them are selected and combined in series in the same way.
And S202, combining the selected general low-order filter models in series, connecting their output signals in sequence, to form the target filter model.
Specifically, each general low-order filter model is named freely and corresponds one-to-one to a filter model. In addition, a status flag list is established with one entry per general low-order filter model; according to the target filter model, the status flags of the general low-order filter models that make up the target filter model are set to 1 in this list, and the others to 0. Composing the filter model from several general low-order filter models connected in series, rather than designing the filter globally as in the prior art, makes the design of the filter model more flexible: the designer can adjust the structure and parameters of the filter model more conveniently.
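In conventional DSP terms, this series combination corresponds to cascading low-order (for example second-order) sections into a higher-order filter, which SciPy can emit directly; the order and cutoff below are illustrative assumptions.

```python
# A sketch of a target filter built as a series combination of generic
# low-order sections: a 6th-order low-pass as three second-order sections.
from scipy import signal

sos = signal.butter(6, 0.2, btype="low", output="sos")
print(sos.shape)   # (3, 6): three low-order sections combined in series
```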
Further, after step S202, the method further includes: verifying the formed target filter model.
Specifically, during verification a signal source model is selected from a signal generation model library and a noise model is superimposed on it. The signal generation model library stores several signal source models, and the selected model simulates the signal input to the target filter model. The parameters of the signal source model can be set as required, and the signal source models are packaged behind a uniform interface, which facilitates the subsequent addition and replacement of new signal source models. The signal source model, the noise model and the filter model are then combined into a filter analysis model. Running the filter analysis model produces result signal data, which is evaluated with a quantitative evaluation method to judge whether the target filter model meets the design requirements.
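The verification flow just described (signal source model, superimposed noise model, filter analysis model, quantitative evaluation) can be sketched as follows; all signal parameters are illustrative assumptions.

```python
# A sketch of verifying the formed target filter model: generate a source
# signal, superimpose noise, run the filter, and evaluate quantitatively.
import numpy as np
from scipy import signal

fs = 1000.0
t = np.arange(0, 1.0, 1.0 / fs)
source = np.sin(2 * np.pi * 50 * t)              # signal source model (50 Hz sine)
noisy = source + 0.5 * np.random.randn(t.size)   # superimposed noise model

sos = signal.butter(6, 100, btype="low", fs=fs, output="sos")
filtered = signal.sosfilt(sos, noisy)            # run the filter analysis model

# quantitative evaluation: residual error against the clean source,
# skipping the initial filter transient
rmse = np.sqrt(np.mean((filtered[200:] - source[200:]) ** 2))
print(f"RMSE after transient: {rmse:.3f}")
```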
And step S3, when the ambiguity value is out of the ambiguity threshold range, alarming the picture to be detected.
Specifically, the trained VGG19 model can derive the ambiguity threshold range directly from the pictures. The ambiguity value of the image is detected by the ambiguity detection model and compared with the ambiguity threshold range, and when the detected value is outside the range, an alarm is raised for the picture to be detected.
Further, step S3 includes:
and step S31, judging whether the ambiguity value is within the ambiguity threshold range.
And step S32, if yes, confirming that the picture to be detected is clear.
And step S33, if not, confirming that the picture to be detected is fuzzy, and outputting alarm information.
Specifically, a sample image of the frozen product scene is first taken by a camera or other device, such as the product pictures in the imported frozen product supervision system, whose types include a customs declaration form, a goods list, a quarantine inspection certificate, a disinfection certificate, a nucleic acid detection certificate and the like. The field image is then input into the trained ambiguity detection model and the detected ambiguity value of the image is compared with the ambiguity threshold; if the ambiguity value is within the ambiguity threshold range, the picture to be detected is confirmed to be clear;
and if the ambiguity value is not within the ambiguity threshold range, confirming that the picture to be detected is ambiguous, and outputting alarm information.
According to this picture ambiguity detection method, a sample image of a frozen product scene is first obtained through a camera or other equipment to build a data set; the built VGG19 network framework is trained with the data set to obtain the VGG19 model and the ambiguity threshold range; and the frozen product image captured in real time is input into the trained VGG19 model, with the ambiguity value of the image detected by the ambiguity detection model. The trained VGG19 model derives the ambiguity threshold range directly from the pictures; the detected ambiguity value is compared with the threshold range, and an alarm is raised for the picture to be detected whenever the value falls outside that range, replacing the original manual classification and inspection and thereby improving the efficiency of picture ambiguity recognition.
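The decision logic of steps S31 to S33 reduces to a simple range check. A minimal sketch, assuming the ambiguity value and the threshold range have already been produced by the two models:

```python
# A sketch of steps S31-S33: confirm the picture is clear when the
# ambiguity value lies inside the threshold range, otherwise alarm.
def check_picture(ambiguity_value, lower, upper):
    if lower <= ambiguity_value <= upper:
        return "clear"                          # picture to be detected is clear
    return "ALARM: picture is blurred"          # output alarm information

print(check_picture(0.42, lower=0.1, upper=0.6))  # clear
print(check_picture(0.85, lower=0.1, upper=0.6))  # ALARM: picture is blurred
```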
Fig. 2 is a functional module schematic diagram of an artificial intelligence-based ambiguity detection system according to an embodiment of the present application. As shown in fig. 2, the system 2 for detecting ambiguity based on artificial intelligence comprises a building module 21, a detecting module 22, and a judging module 23.
The construction module 21 is used for inputting the historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
the detection module 22 is used for inputting the picture to be detected into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected;
and the judging module 23 is configured to alarm the picture to be detected when the ambiguity value is outside the ambiguity threshold range.
Optionally, before the historical picture data is input into the trained VGG19 model and the ambiguity threshold range is output, the method further includes:
building a VGG19 network framework and defining a loss function, an initial learning rate and iteration times of the VGG19 network framework;
carrying out format conversion on the training sample pictures and the corresponding data sets, and inputting the data sets after format conversion into the VGG19 network framework for training;
and carrying out iterative training on the VGG19 network framework based on the loss function, the initial learning rate and the iteration times of the VGG19 network framework to obtain a trained VGG19 model.
Optionally, before the image to be detected is input to the trained ambiguity detection model to obtain the ambiguity value of the image to be detected, the method further includes:
adopting a convolutional neural network as a basic network, and adding a branch convolutional layer to the convolutional neural network to construct the network structure of an ambiguity detection model; the branch convolutional layer is used for fusing the multi-level feature maps in the basic network;
inputting a plurality of label pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model.
Optionally, inputting a plurality of the tagged pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model, including:
obtaining a loss value of a network structure of the ambiguity detection model by using the weighted cross entropy loss as a main loss function and the Ring loss as an auxiliary loss function;
optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model;
and setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
Optionally, before the convolutional neural network is adopted as the basic network and the branch convolutional layer is added to construct the network structure of the ambiguity detection model, the branch convolutional layer being used for fusing the multi-level feature maps in the basic network, the method further includes: generating general low-order filter models;
selecting the general low-order filter models required by a target filter model;
and combining the selected general low-order filter models in series, connecting their output signals in sequence, to form the target filter model.
Optionally, after the selected general low-order filter models are combined in series by connecting their output signals to form the target filter model, the method further includes: verifying the formed target filter model.
Optionally, alerting the picture to be detected when the ambiguity value is outside the ambiguity threshold range includes: judging whether the ambiguity value is within the ambiguity threshold range;
if so, confirming that the picture to be detected is clear;
if not, confirming that the picture to be detected is fuzzy, and outputting alarm information.
For other details of the technical solutions implemented by the modules of the ambiguity detection system in the above embodiments, reference may be made to the description of the ambiguity detection method in the foregoing embodiments; details are not repeated here.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. The system embodiment is described briefly because it is basically similar to the method embodiment; for the relevant points, refer to the partial description of the method embodiment.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 3, the computer device 30 includes a processor 31 and a memory 32 coupled to the processor 31.
The memory 32 stores program instructions that, when executed by the processor 31, cause the processor 31 to perform the steps of the artificial intelligence based ambiguity detection method in the above-described embodiments.
The processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip having signal processing capabilities. The processor 31 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application. The storage medium of the embodiment of the present application stores a program file 41 capable of implementing all the methods described above, where the program file 41 may be stored in the storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a mobile hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices, such as a computer, a server, a mobile phone, and a tablet. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The above embodiments are merely examples and are not intended to limit the scope of the present disclosure, and all modifications, equivalents, and flow charts using the contents of the specification and drawings of the present disclosure or those directly or indirectly applied to other related technical fields are intended to be included in the scope of the present disclosure.
Claims (10)
1. An artificial intelligence-based ambiguity detection method is characterized by comprising the following steps:
inputting historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
inputting a picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and when the ambiguity value is out of the ambiguity threshold range, alarming the picture to be detected.
2. The artificial intelligence based ambiguity detection method of claim 1, wherein before the historical picture data is input into the trained VGG19 model and the ambiguity threshold range is output, the method further comprises:
building a VGG19 network framework and defining a loss function, an initial learning rate and iteration times of the VGG19 network framework;
carrying out format conversion on the training sample pictures and the corresponding data sets, and inputting the data sets after format conversion into the VGG19 network framework for training;
and carrying out iterative training on the VGG19 network framework based on the loss function, the initial learning rate and the iteration times of the VGG19 network framework to obtain a trained VGG19 model.
3. The artificial intelligence based ambiguity detection method of claim 1, wherein before the picture to be detected is input into the trained ambiguity detection model to obtain the ambiguity value of the picture to be detected, the method further comprises:
adopting a convolutional neural network as a basic network, and adding a branch convolutional layer to the convolutional neural network to construct the network structure of an ambiguity detection model; the branch convolutional layer is used for fusing the multi-level feature maps in the basic network;
inputting a plurality of label pictures into a network structure of the picture recognition model for training to generate the ambiguity detection model.
4. The artificial intelligence based ambiguity detection method of claim 3, wherein said inputting a plurality of said tagged pictures into a network structure of said picture recognition model for training to generate said ambiguity detection model comprises:
obtaining a loss value of a network structure of the ambiguity detection model by using the weighted cross entropy loss as a main loss function and the Ring loss as an auxiliary loss function;
optimizing parameters of the network structure of the ambiguity detection model by adopting a momentum random gradient descent algorithm based on the loss value of the network structure of the ambiguity detection model to obtain optimized parameters of the model;
and setting the learning rate by adopting a transfer learning method, and adjusting the optimization parameters of the model.
5. The artificial intelligence based ambiguity detection method of claim 3, wherein before the convolutional neural network is adopted as the basic network and the branch convolutional layer is added to the convolutional neural network to construct the network structure of the ambiguity detection model, the branch convolutional layer being used for fusing the multi-level feature maps in the basic network, the method further comprises:
generating general low-order filter models;
selecting the general low-order filter models required by a target filter model;
and combining the selected general low-order filter models in series, connecting their output signals in sequence, to form the target filter model.
6. The artificial intelligence based ambiguity detection method of claim 5, wherein after the selected general low-order filter models are combined in series by connecting their output signals to form the target filter model, the method further comprises:
and verifying the formed target filter model.
7. The artificial intelligence based ambiguity detection method of claim 1, wherein said alerting said picture to be detected when said ambiguity value is outside said ambiguity threshold range comprises:
judging whether the ambiguity value is within the ambiguity threshold range;
if so, confirming that the picture to be detected is clear;
if not, confirming that the picture to be detected is fuzzy, and outputting alarm information.
8. An artificial intelligence based ambiguity detection system, comprising:
the construction module is used for inputting the historical picture data into a trained VGG19 model and outputting an ambiguity threshold range;
the detection module is used for inputting the picture to be detected into a trained ambiguity detection model to obtain an ambiguity value of the picture to be detected;
and the judging module is used for alarming the picture to be detected when the ambiguity value is out of the ambiguity threshold range.
9. A computer device, characterized in that the computer device comprises a processor and a memory coupled to the processor, the memory storing program instructions which, when executed by the processor, cause the processor to perform the steps of the artificial intelligence based ambiguity detection method according to any one of claims 1-7.
10. A storage medium storing a program file capable of implementing the artificial intelligence based ambiguity detection method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202111255446.5A | 2021-10-27 | 2021-10-27 | Ambiguity detection method, system, device and medium based on artificial intelligence
Publications (1)
Publication Number | Publication Date |
---|---|
CN113962312A (en) | 2022-01-21
Family
ID=79467404
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111255446.5A | Ambiguity detection method, system, device and medium based on artificial intelligence (pending) | 2021-10-27 | 2021-10-27
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113962312A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114359854A (en) * | 2022-03-21 | 2022-04-15 | 上海闪马智能科技有限公司 | Object identification method and device, storage medium and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |