CN115240037A - Model training method, image processing method, device and storage medium - Google Patents

Model training method, image processing method, device and storage medium

Info

Publication number
CN115240037A
Authority
CN
China
Prior art keywords
image
model
label
matrix
probability distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211161244.9A
Other languages
Chinese (zh)
Inventor
孟海秀
温书远
陈录城
孙琦
王艳纳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haier Digital Technology Qingdao Co Ltd
Cosmoplat Industrial Intelligent Research Institute Qingdao Co Ltd
Haier Cosmo IoT Technology Co Ltd
Original Assignee
Haier Digital Technology Qingdao Co Ltd
Cosmoplat Industrial Intelligent Research Institute Qingdao Co Ltd
Haier Cosmo IoT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haier Digital Technology Qingdao Co Ltd, Cosmoplat Industrial Intelligent Research Institute Qingdao Co Ltd, Haier Cosmo IoT Technology Co Ltd filed Critical Haier Digital Technology Qingdao Co Ltd
Priority to CN202211161244.9A
Publication of CN115240037A
Priority to PCT/CN2023/098759 (published as WO2024060684A1)
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a model training method, an image processing method, a device, and a storage medium, which relate to the field of image processing. The model training method comprises the following steps: preprocessing the images in a sample data set to obtain multi-label word vectors of the images and a multi-label adjacency matrix of the images; clustering the multi-label word vectors and the multi-label adjacency matrix with a graph wavelet neural network model to obtain a classification model; extracting features from the image to be processed and outputting a feature matrix of the image to be processed; training the classification model based on the feature matrix to obtain a multi-label probability distribution model of the image to be processed; and converging the probability distribution model with a loss function to obtain an image annotation model, where the image annotation model is used to annotate a target image and obtain its labels. The purpose of improving the precision of image annotation is thereby achieved.

Description

Model training method, image processing method, device and storage medium
Technical Field
The application belongs to the field of image processing, and particularly relates to a model training method, an image processing method, a device, and a storage medium.
Background
With the development of computer vision technology, image annotation plays a crucial role in computer vision. The goal of image annotation is to determine the labels relevant to a specific task.
In the related art, image annotation is usually performed based on a graph convolutional neural network (GCNN). However, this method is computationally expensive, lacks the locality property, and cannot mine the correlation and co-occurrence between the labels of an image, so the related art suffers from low image annotation accuracy.
Disclosure of Invention
In order to solve the above problem, namely the low precision of image annotation in the current technology, the present application provides a model training method, an image processing method, a device, and a storage medium.
In a first aspect, the present application provides a model training method, including: preprocessing the images in a sample data set to obtain multi-label word vectors of the images and a multi-label adjacency matrix of the images; clustering the multi-label word vectors and the multi-label adjacency matrix with a graph wavelet neural network model to obtain a classification model; extracting features from the image to be processed and outputting a feature matrix of the image to be processed; training the classification model based on the feature matrix to obtain a multi-label probability distribution model of the image to be processed; and converging the probability distribution model with a loss function to obtain an image annotation model, where the image annotation model is used to annotate a target image and obtain its labels.
In a preferred technical solution of the above model training method, the graph wavelet neural network model includes a two-layer graph wavelet neural network, and clustering the multi-label word vectors and the multi-label adjacency matrix with the graph wavelet neural network model to obtain a classification model includes: clustering the multi-label word vectors and the multi-label adjacency matrix with the first-layer graph wavelet neural network of the two-layer graph wavelet neural network to obtain output vectors; and clustering the output vectors with the second-layer graph wavelet neural network of the two-layer graph wavelet neural network to obtain the classification model.
In a preferred technical solution of the above model training method, the first-layer graph wavelet neural network uses the nonlinear activation function silu, and the second-layer graph wavelet neural network uses the nonlinear activation function softmax.
In a preferred technical solution of the above model training method, preprocessing the images in the sample data set to obtain the multi-label adjacency matrix of the images includes: determining a first parameter and a second parameter of the multi-label adjacency matrix according to a first label and a second label of the images, where the first parameter represents the number of times the first label and the second label appear together in the sample data set, and the second parameter represents the number of times the first label appears in the sample data set; determining a conditional probability matrix of the images according to the first parameter and the second parameter; binarizing the conditional probability matrix to obtain a binarized adjacency matrix; and re-weighting the binarized adjacency matrix to obtain the multi-label adjacency matrix.
In a preferred technical solution of the above model training method, training the classification model based on the feature matrix to obtain the multi-label probability distribution model of the image to be processed includes: performing matrix multiplication of the feature matrix and the classification model to obtain the probability distribution model.
In a preferred technical solution of the above model training method, converging the probability distribution model with a loss function to obtain the image annotation model includes: determining network hyper-parameters of the probability distribution model; and converging the probability distribution model according to the loss function and the network hyper-parameters to obtain the image annotation model.
In a preferred technical solution of the above model training method, converging the probability distribution model with a loss function to obtain the image annotation model includes: monitoring the value of the loss function and the precision value of the probability distribution model; and if the value of the loss function is smaller than a first threshold or the precision value of the probability distribution model is larger than a second threshold, outputting the probability distribution model as the image annotation model.
In a second aspect, the present application provides an image processing method, including: acquiring an image to be processed; and inputting the image to be processed into an image annotation model for annotation processing to obtain a label of the image to be processed, wherein the image annotation model is obtained by training through the model training method of the first aspect.
In a third aspect, the present application provides a model training apparatus, comprising: a preprocessing module, used for preprocessing the images in the sample data set to obtain multi-label word vectors of the images and a multi-label adjacency matrix of the images; a clustering module, used for clustering the multi-label word vectors and the multi-label adjacency matrix with a graph wavelet neural network model to obtain a classification model; an extraction module, used for extracting features from the image to be processed and outputting a feature matrix of the image to be processed; a training module, used for training the classification model based on the feature matrix to obtain a multi-label probability distribution model of the image to be processed; and a convergence module, used for converging the probability distribution model with a loss function to obtain an image annotation model, where the image annotation model is used to annotate a target image and obtain its labels.
In a fourth aspect, the present application provides an image processing apparatus comprising: the acquisition module is used for acquiring an image to be processed; and the annotation module is used for inputting the image to be processed into the image annotation model for annotation processing to obtain the label of the image to be processed, wherein the image annotation model is obtained by training through the model training method of the first aspect.
In a fifth aspect, the present application provides an electronic device, comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer execution instructions; the processor executes computer-executable instructions stored by the memory to implement the model training method of the first aspect or the image processing method of the second aspect.
In a sixth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions for implementing the model training method of the first aspect or the image processing method of the second aspect when executed by a processor.
In a seventh aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the model training method of the first aspect or the image processing method of the second aspect.
According to the model training method, the image processing method, the device, and the storage medium provided by the application, a graph wavelet neural network is adopted; by exploiting the locality of graph wavelets and the correlation among the multiple labels of an image, the co-occurrence characteristics among the multiple labels are fully captured, thereby improving the precision of annotating multi-label images.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obviously obtain other drawings from them without inventive effort.
Fig. 1 is a schematic structural diagram of an image processing system according to an embodiment of the present application;
FIG. 2 is a flow chart of a model training method provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a model training process provided in an embodiment of the present application;
FIG. 4 is a flowchart of an image processing method provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic view of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application; obviously, the described embodiments are only some embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The terms referred to in the present application will be explained first.
Adjacency matrix: a two-dimensional array that stores the relationships (edges or arcs) between all vertices in a graph.
Graph wavelet neural network: a wavelet neural network (WNN) is an artificial neural network proposed on the basis of breakthroughs in wavelet analysis research; it is a novel layered, multi-resolution artificial neural network model constructed on wavelet analysis theory and the wavelet transform. A graph wavelet neural network (GWNN) is a wavelet neural network used for analyzing graph-structured data, applied here to image analysis.
Image annotation: the process of using a machine learning method to add, to an image, textual feature information that reflects the image's visual content.
The related art mentioned in the background art has at least the following technical problems:
with the development of computer vision technology, image annotation plays a crucial role in computer vision, and the goal of image annotation is to determine the labels relevant to a specific task. For large-scale data sets, automatic image annotation in the related art is usually based on a graph convolutional neural network (GCNN). However, this method requires the eigendecomposition of the graph correlation matrix in the graph Fourier transform, which is computationally expensive; the multi-label correlation matrix is sparse, but GCNN-based automatic annotation of multi-label images cannot exploit this sparsity; in addition, the graph convolutional neural network lacks the locality property and cannot mine the correlation and co-occurrence characteristics among multiple labels.
In order to solve these problems, the application provides a model training method and an image processing method. A graph wavelet neural network is adopted, and the localization property of graph wavelets, together with the correlation among the multiple labels of an image, is used to fully capture the co-occurrence characteristics among the multiple labels, so that the trained image annotation model improves the precision of annotating multi-label images; the computational cost is also reduced, thereby improving the efficiency of annotating multi-label images.
In a possible implementation, the model training method and the image processing method provided by this embodiment may be applied in an application scenario. Fig. 1 is a schematic structural diagram of an image processing system according to an embodiment of the present application. As shown in FIG. 1, in this scenario, the image processing system may include a data acquisition device 101, a database 102, a training device 103, an execution device 104, a data storage system 105, and a user device 106, wherein the execution device 104 includes a target model/rule 107 and an I/O interface 108.
The data acquisition device 101 may be configured to obtain a multi-label adjacency matrix of a preprocessed image and a multi-label word vector of the preprocessed image of the sample data set, and store the multi-label adjacency matrix and the multi-label word vector in the database 102.
The training device 103 may perform the model training method in the embodiment of the present application, so as to train the target model/rule 107 for acquiring the image label. The target models/rules 107 derived by the training device 103 may be applied in different systems or devices.
The execution device 104 is configured with an I/O interface 108, and can perform data interaction with the user device 106, and a user can input a target image to be subjected to tag labeling to the I/O interface 108 through the user device 106; the object model/rules 107 in the execution device 104 may process the input object image to obtain a label for the object image; the I/O interface 108 returns the label of the target image to the user device 106 for presentation to the user by the user device 106.
The execution device 104 may call data, code, etc. in the data storage system 105, or may store data, instructions, etc. in the data storage system 105.
Based on the above scenario, in one case, the user may manually input the target image to the I/O interface 108 through the user device 106, for example, operating in an interface provided by the I/O interface 108; in another case, the user device 106 may automatically enter the target image into the I/O interface 108 and retrieve the tag for the target image returned by the I/O interface 108. It should be noted that, if the user device 106 automatically inputs data into the I/O interface 108 and obtains a result returned by the I/O interface 108, the user device 106 needs to obtain authorization of the user, and the user may set a permission for response in the user device 106.
In the above scenario, the user device 106 may also serve as a data acquisition end to store the received target image and the tag of the target image in the database 102 for use as a sample.
It should be noted that the structure of the image processing system shown in fig. 1 is only a schematic diagram, and the positional relationship between the devices, modules, and the like shown in the diagram does not constitute any limitation, for example, in fig. 1, the data storage system 105 is an external memory with respect to the execution device 104, and in other cases, the data storage system 105 may be disposed in the execution device 104; the database 102 is an external memory with respect to the training device 103, in other cases the database 102 may also be placed in the training device 103.
With reference to the above scenario, the following describes in detail the technical solutions of the model training method and the image processing method provided in the present application through several specific embodiments.
The embodiment of the application provides a model training method. Fig. 2 is a flowchart of a model training method provided in an embodiment of the present application, which may be executed by the training apparatus in fig. 1, as shown in fig. 2, where the model training method includes the following steps:
s201: and preprocessing the image in the sample data set to obtain a multi-label word vector of the image and a multi-label adjacency matrix of the image.
In this step, the images in the sample data set may be preprocessed with a pre-written Python script, and the GloVe construction method may be adopted to construct the multi-label word vectors (word2vec) of the images.
Optionally, the multi-label adjacency matrix may be determined from the labels and the interconnections between them.
Optionally, the sample data set may include sample image data, text data, and label data.
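As an illustrative reference, this preprocessing step can be sketched in Python as follows. This is a minimal sketch assuming a plain-text GloVe embedding file and a small hypothetical label set; the file name and the labels are placeholders rather than values from the application.

```python
# Minimal preprocessing sketch: build one word vector per label from GloVe.
import numpy as np

def load_glove(path):
    """Parse a plain-text GloVe file into a {word: vector} dictionary."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

labels = ["person", "dog", "bicycle"]        # hypothetical label set
glove = load_glove("glove.6B.300d.txt")      # assumed embedding file
# Multi-label word-vector matrix X: one 300-d row per label.
X = np.stack([glove[w] for w in labels])     # shape (C, 300)
```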
S202: and clustering the multi-label word vectors and the multi-label adjacency matrixes by adopting a graph wavelet neural network model to obtain a classification model.
In this step, a multi-label classification module may be constructed with the graph wavelet neural network to obtain a graph wavelet neural network model, and the multi-label word vectors and the multi-label adjacency matrix are clustered by this model to obtain a classification model. The classification model is the classifier.
Alternatively, the multi-label classification module may be composed of a two-layer graph wavelet neural network.
S203: and performing feature extraction on the image to be processed, and outputting a feature matrix of the image to be processed.
In this step, the image to be processed may be a training-set image during training, or an image input by a user during actual operation. A ResNet-101 network may be used to build the feature extraction module, and the feature extraction module performs feature extraction on the image to be processed to obtain its feature matrix.
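The feature extraction module may look roughly as follows; this is a minimal PyTorch sketch assuming an ImageNet-pretrained ResNet-101 with its classification head removed, followed by global max pooling (see Fig. 3). The 448x448 input size is an assumption for illustration.

```python
# Feature-extractor sketch: ResNet-101 backbone + global max pooling.
import torch
import torchvision.models as models

backbone = models.resnet101(pretrained=True)  # ImageNet-pretrained, as in the text
# Drop the final average-pool and fully connected layers, keep the conv stages.
extractor = torch.nn.Sequential(*list(backbone.children())[:-2])
extractor.eval()

img = torch.randn(1, 3, 448, 448)             # dummy preprocessed image
with torch.no_grad():
    fmap = extractor(img)                     # (1, 2048, h, w) feature map
    # Global max pooling collapses the spatial dims into one 2048-d vector.
    feat = torch.nn.functional.adaptive_max_pool2d(fmap, 1).flatten(1)  # (1, 2048)
```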
S204: and training the classification model based on the characteristic matrix to obtain a multi-label probability distribution model of the image to be processed.
In this step, the probability distribution model is the multi-label probability distribution result of the image to be processed. After the feature matrix and the classification model are obtained, the classification model can be trained with the feature matrix to obtain the probability distribution over the multiple labels of the image to be processed.
Optionally, the annotation result for each of the multiple labels of the image to be processed may be 1 or 0.
S205: and adopting a loss function to carry out convergence processing on the probability distribution model to obtain an image annotation model.
In this step, the image annotation model is used to perform annotation processing on the target image to obtain a label of the target image.
Optionally, for the multi-label classification problem, the loss function may adopt the binary cross-entropy loss. After the probability distribution model is obtained, its model parameters may be converged with the binary cross-entropy loss function, so that when the resulting image annotation model annotates a target image, the annotation accuracy is improved.
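As a concrete reference, a minimal sketch of this loss in PyTorch, with illustrative tensor shapes (the 80-label count is an assumption):

```python
# Multi-label binary cross-entropy on raw logits.
import torch

criterion = torch.nn.BCEWithLogitsLoss()        # numerically stable BCE variant
logits = torch.randn(4, 80)                     # (batch, num_labels), illustrative
targets = torch.randint(0, 2, (4, 80)).float()  # multi-hot ground-truth labels
loss = criterion(logits, targets)
loss.backward()
```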
According to the model training method provided by the embodiment of the application, a graph wavelet neural network is adopted; by exploiting the locality of graph wavelets and the correlation among the multiple labels of an image, the co-occurrence characteristics among the multiple labels are fully captured, thereby improving the accuracy of annotating multi-label images.
In a possible implementation manner, the graph wavelet neural network model is a two-layer graph wavelet neural network, and clustering the multi-label word vectors and the multi-label adjacency matrix with the graph wavelet neural network model to obtain a classification model includes: clustering the multi-label word vectors and the multi-label adjacency matrix with the first-layer graph wavelet neural network to obtain output vectors; and clustering the output vectors with the activation function of the second-layer graph wavelet neural network to obtain the classification model.
In this scheme, a directed graph G = (V, E, W) between the multiple labels may first be established, where V represents the nodes of the directed graph G, E represents its edges, and W represents the weights of the edges E. In the directed graph G, each node V may represent a label, and the directed graph G may be represented by the multi-label word vectors. The labels are connected with each other to obtain the multi-label adjacency matrix, and a two-layer graph wavelet neural network is used for clustering to obtain the classification model.
Optionally, in general, the larger the convolution kernel, the larger the receptive field, the more image information can be seen, and the better the obtained global features. Preferably, the graph wavelet neural network model in the present application may specifically include a two-layer graph wavelet neural network, and each layer may have 32 graph convolution kernels; optionally, to enlarge the receptive field, each layer may instead have 80 graph convolution kernels.
In the above scheme, the graph wavelet convolution of two graph signals $x$ and $y$ may be defined as

$$x *_G y = \psi_s \left( \left( \psi_s^{-1} x \right) \odot \left( \psi_s^{-1} y \right) \right)$$

where the wavelet basis can be expressed as

$$\psi_s = U G_s U^{\top}$$

$U$ may be used to represent the matrix of Laplacian eigenvectors, $G_s = \mathrm{diag}\left(g(s\lambda_1), \ldots, g(s\lambda_n)\right)$ is the scaling matrix built from a heat-kernel filter $g$ at scale $s$, and $\odot$ may be used to represent the element-wise (matrix dot) product. The graph wavelet transform can be expressed as

$$\hat{x} = \psi_s^{-1} x$$

and the inverse graph wavelet transform may be expressed as

$$x = \psi_s \hat{x}$$

where $y$ and $x$ are the graph signals participating in the operation.

A graph wavelet neural network layer $H$ can then be obtained, which may be formulated as follows:

$$H = h\left( \psi_s F \psi_s^{-1} X W \right)$$

where $\psi_s$ may be used to represent the wavelet basis, $\psi_s^{-1}$ may be used to represent the inverse graph wavelet transform, $F$ is the diagonal filter matrix learned in the wavelet domain, and $h$ may be used to represent a nonlinear activation function.

After further simplification, a graph wavelet neural network model comprising a two-layer graph wavelet neural network, namely the classification model $Z$, can be obtained; its formula may be as follows:

$$Z = \mathrm{softmax}\left( \hat{A} \,\mathrm{silu}\left( \hat{A} X W^{(1)} \right) W^{(2)} \right)$$

where $\hat{A}$ may be used to represent the multi-label adjacency matrix, $X$ may be used to represent the label representation matrix, and $W^{(1)}$ and $W^{(2)}$ may be used to represent the parameter matrices of the model to be trained, which can be initialized randomly.
In this scheme, the graph wavelet neural network algorithm replaces the graph convolutional neural network, which reduces the amount of computation; by constructing a directed graph, the correlation among multiple labels can be fully utilized, and the localization property of graph wavelets fully captures the co-occurrence characteristics among the multiple labels, giving the multiple labels good interpretability. The image annotation model obtained by subsequent training can therefore annotate target images more efficiently.
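To make the layer structure concrete, the following is a minimal PyTorch sketch of the two-layer model described above. It assumes the wavelet bases psi and psi_inv have been precomputed from the eigendecomposition of the label graph's Laplacian; the layer widths, the diagonal-filter parameterization, and the 0.01 initialization scale are illustrative assumptions rather than values from the application.

```python
# Two-layer graph wavelet classifier sketch (silu in layer 1, softmax in layer 2).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GWNNLayer(nn.Module):
    """One graph wavelet layer: act(psi @ diag(f) @ psi_inv @ X @ W)."""
    def __init__(self, n_nodes, in_dim, out_dim):
        super().__init__()
        self.f = nn.Parameter(torch.ones(n_nodes))   # diagonal filter in the wavelet domain
        self.W = nn.Parameter(torch.randn(in_dim, out_dim) * 0.01)  # random init, as in the text

    def forward(self, psi, psi_inv, X, act):
        return act(psi @ torch.diag(self.f) @ psi_inv @ (X @ self.W))

class TwoLayerGWNN(nn.Module):
    def __init__(self, n_labels, in_dim=300, hid_dim=1024, out_dim=2048):
        super().__init__()
        self.layer1 = GWNNLayer(n_labels, in_dim, hid_dim)
        self.layer2 = GWNNLayer(n_labels, hid_dim, out_dim)

    def forward(self, psi, psi_inv, X):
        H = self.layer1(psi, psi_inv, X, F.silu)                       # first layer: silu
        Z = self.layer2(psi, psi_inv, H,
                        lambda t: torch.softmax(t, dim=1))             # second layer: softmax
        return Z                                                       # (n_labels, out_dim) classifier
```

For a quick smoke test, passing psi = psi_inv = torch.eye(n_labels) degenerates each layer to plain feature propagation.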
In one possible embodiment, the first-layer graph wavelet neural network uses the nonlinear activation function silu, and the second-layer graph wavelet neural network uses the nonlinear activation function softmax.
In this scheme, the first-layer graph wavelet neural network may use the nonlinear activation function silu or the nonlinear activation function relu. Relu is the most common activation function for most deep learning tasks; it requires very little computation and is fast. Silu is more nonlinear than relu and better suited to the multi-label classification task, and both silu and its first derivative are smooth, so it converges more easily than relu. Between silu and relu, the more applicable activation function can be selected according to the actual situation, which improves the efficiency and accuracy of model training.
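For reference, the two candidate activations side by side; silu(x) = x * sigmoid(x) is smooth, as is its first derivative, while relu is piecewise linear:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-3.0, 3.0, steps=7)
print(F.relu(x))   # zero for negative inputs, identity for positive
print(F.silu(x))   # equivalent to x * torch.sigmoid(x)
```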
In one possible embodiment, preprocessing the images in the sample data set to obtain the multi-label adjacency matrix includes: determining a first parameter and a second parameter of the multi-label adjacency matrix according to a first label and a second label of the images, where the first parameter represents the number of times the first label and the second label appear together in the sample data set, and the second parameter represents the number of times the first label appears in the sample data set; determining a conditional probability matrix of the images according to the first parameter and the second parameter; binarizing the conditional probability matrix to obtain a binarized adjacency matrix; and re-weighting the binarized adjacency matrix to obtain the multi-label adjacency matrix.
In this scheme, when constructing the multi-label adjacency matrix, the labels contained in the images of the sample data set may first be counted. The number of times a first label and a second label appear together in one image is then computed and taken as the first parameter, and the number of times the first label appears in the sample data set is computed and taken as the second parameter, so that the conditional probability matrix is determined from the first and second parameters; finally, the multi-label adjacency matrix is obtained. The formulas may be as follows:
Denote the first label by $i$ and the second label by $j$, record the first parameter as $M_{ij}$, and record the second parameter as $N_i$. The conditional probability matrix $P$ can be expressed as

$$P_{ij} = \frac{M_{ij}}{N_i}$$

In the above scheme, a threshold $\tau$ may be determined first, and then the conditional probability matrix is binarized to obtain the binarized adjacency matrix $A$, which can be expressed as

$$A_{ij} = \begin{cases} 1, & P_{ij} \geq \tau \\ 0, & P_{ij} < \tau \end{cases}$$

In the above scheme, because the binarized adjacency matrix obtained from the conditional probability matrix exhibits an over-smoothing phenomenon, it may be further re-weighted to obtain the final multi-label adjacency matrix $\hat{A}$, which can be expressed as

$$\hat{A}_{ij} = \begin{cases} \dfrac{\lambda \, A_{ij}}{\sum_{k=1,\, k \neq i}^{C} A_{ik}}, & i \neq j \\ 1 - \lambda, & i = j \end{cases}$$

where $\lambda$ may be used to represent a manually set hyper-parameter: when $\lambda$ is close to 1, the features of the label corresponding to the node itself can be ignored; when $\lambda$ is close to 0, the information of the labels corresponding to the neighboring nodes can be ignored; $C$ may be used to indicate the number of multi-label categories.
Optionally, by constructing the multi-label adjacency matrix, the sparsity of the multi-label adjacency matrix can be exploited, so that the image annotation model obtained by subsequent training annotates target images more efficiently.
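The whole pipeline above (co-occurrence counts, conditional probabilities, binarization at $\tau$, re-weighting with $\lambda$) can be sketched as follows; the count matrices and the values of tau and lam are illustrative assumptions, not values from the application.

```python
# Adjacency-matrix construction sketch: counts -> P -> binarize -> re-weight.
import numpy as np

def build_adjacency(M, N, tau=0.4, lam=0.25):
    """M[i, j]: times labels i and j co-occur; N[i]: times label i occurs."""
    P = M / N[:, None]                          # conditional probability P_ij = M_ij / N_i
    A = (P >= tau).astype(np.float32)           # binarized adjacency matrix
    off_diag_sums = A.sum(axis=1) - np.diag(A)  # sum over k != i of A_ik
    A_hat = lam * A / np.maximum(off_diag_sums, 1e-12)[:, None]
    np.fill_diagonal(A_hat, 1.0 - lam)          # diagonal fixed to 1 - lam
    return A_hat

# Example with three labels and hypothetical counts:
M = np.array([[30, 12, 3], [12, 25, 8], [3, 8, 20]], dtype=np.float32)
N = np.array([30, 25, 20], dtype=np.float32)
A_hat = build_adjacency(M, N)
```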
In a possible implementation manner, training the classification model based on the feature matrix to obtain the multi-label probability distribution model of the image to be processed includes: performing matrix multiplication of the feature matrix and the classification model to obtain the probability distribution model.
In this scheme, the feature extraction module may extract features from the image to be processed input by the user to obtain its feature vector, namely the feature matrix; the feature matrix is then matrix-multiplied with the classification model to obtain the probability distribution of the multi-label annotation, represented by the probability distribution model.
In the above scheme, Fig. 3 is a schematic diagram of a model training process provided in an embodiment of the present application. In Fig. 3, after the image to be processed input by the user passes through feature extraction by a convolutional neural network (the ResNet-101 network), image features are obtained; after global max pooling of the image features, the feature matrix is obtained. The multi-label word vectors represented by the directed graph form the multi-label adjacency matrix, which passes through the first-layer graph wavelet neural network to produce an output vector whose label distribution differs from that of the multi-label adjacency matrix (the height of a label's box in the figure represents its probability). The second-layer graph wavelet neural network clusters this output vector to produce another output vector, which is converted into the classification model; the classification model represents the probabilities of the multiple labels under different label categories and feature dimensions. After matrix multiplication of the feature matrix and the classification model, the multi-label probability distribution of the image to be processed is obtained, represented by the probability distribution model. Finally, after convergence of the loss function, the image annotation model is obtained; the loss function may be a multi-label loss function.
In the above scheme, the feature extraction module may adopt a network model pre-trained on the ImageNet data set, which accelerates the subsequent convergence of the probability distribution model under the loss function. Training the probability distribution model determines the probability distribution of label annotation for the image to be processed, so the labels in the target image can be annotated accurately.
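A minimal sketch of this matrix multiplication, with dimensions following the ResNet sketch earlier (2048-dimensional features; the 80-label count is an illustrative assumption):

```python
# Multiply pooled image features by the classifier matrix: one score per label.
import torch

feat = torch.randn(8, 2048)     # (batch, feature_dim) from the extractor
Z = torch.randn(80, 2048)       # (num_labels, feature_dim) classifier from the GWNN
logits = feat @ Z.t()           # (batch, num_labels) multi-label scores
probs = torch.sigmoid(logits)   # per-label probabilities
```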
In a possible implementation manner, converging the probability distribution model with the loss function to obtain the image annotation model includes: determining network hyper-parameters of the probability distribution model; and converging the probability distribution model according to the loss function and the network hyper-parameters to obtain the image annotation model.
In this scheme, the loss function may adopt the binary cross-entropy loss function. When the loss function is used to converge the probability distribution model, the network hyper-parameters of the probability distribution model may be determined first; these may include the learning rate lr = 0.01, momentum = 0.9, and weight decay = 5e-4. The optimizer may adopt SGD (stochastic gradient descent) for gradient back-propagation and model training, with the number of iterations epoch = 100, batch size = 32, and so on, which speeds up the training of the image annotation model.
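Wired into PyTorch, the stated configuration may look as follows; the Linear module is only a placeholder standing in for the full extractor-plus-classifier network.

```python
# Optimizer setup with the hyper-parameters stated above.
import torch

model = torch.nn.Linear(2048, 80)   # placeholder for extractor + GWNN classifier
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.01, momentum=0.9, weight_decay=5e-4)
criterion = torch.nn.BCEWithLogitsLoss()
EPOCHS, BATCH_SIZE = 100, 32
```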
In a possible implementation manner, converging the probability distribution model with the loss function to obtain the image annotation model includes: monitoring the value of the loss function and the precision value of the probability distribution model; and if the value of the loss function is smaller than a first threshold or the precision value of the probability distribution model is larger than a second threshold, outputting the probability distribution model as the image annotation model.
In this scheme, while the loss function converges the probability distribution model into the image annotation model, the value of the loss function and the precision value of the probability distribution model may be monitored in real time, so that the optimal probability distribution model can be obtained; the optimal probability distribution model is the required image annotation model, which improves the accuracy of annotating the target image. The first threshold and the second threshold may be preset values: for example, the first threshold may be 0.03 and the second threshold 99. When the value of the loss function falls below 0.03, or the precision value of the probability distribution model exceeds 99, the convergence process stops, and the probability distribution model at that point may be output as the final image annotation model.
In the above scheme, the training platform for model training may be configured as follows: a Ubuntu 16.04 system with 4 Nvidia Tesla V100 graphics cards, using an Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz as the processor; the model framework may be based on an environment of python = 3.6, pytorch = 1.8.0, cuda = 10.2, and cudnn = 7.6.5. Optionally, multiple graphics processing units (GPUs) may be employed to accelerate the convergence of the probability distribution model.
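A hedged sketch of the stopping rule described above, continuing from the optimizer sketch; train_one_epoch and evaluate_precision are hypothetical helpers standing in for the actual training and evaluation loops.

```python
# Stop when the loss falls below the first threshold or precision
# exceeds the second threshold, then save the annotation model.
LOSS_THRESHOLD = 0.03       # example first threshold from the text
PRECISION_THRESHOLD = 99.0  # example second threshold from the text

for epoch in range(EPOCHS):
    loss_value = train_one_epoch(model, optimizer, criterion)  # hypothetical helper
    precision = evaluate_precision(model)                      # hypothetical helper
    if loss_value < LOSS_THRESHOLD or precision > PRECISION_THRESHOLD:
        torch.save(model.state_dict(), "image_annotation_model.pt")
        break
```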
The embodiment of the application also provides an image processing method. Fig. 4 is a flowchart of an image processing method provided in an embodiment of the present application, which may be executed by the execution device in fig. 1, as shown in fig. 4, where the image processing method includes the following steps:
s401: and acquiring an image to be processed.
S402: and inputting the image to be processed into the image annotation model for annotation processing to obtain the label of the image to be processed.
In this step, the image annotation model is obtained by training with the model training method described above.
According to the image processing method provided by the embodiment of the application, after the image annotation model is obtained through the model training method of the above embodiments, inputting the image to be processed into the image annotation model quickly and accurately yields the labels of the image to be processed, improving both the efficiency and the accuracy of automatic label annotation of the image to be processed.
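A minimal inference sketch under these steps: preprocess an image, run the trained annotation model, and keep the labels whose probability clears a cut-off. The transform values and the 0.5 threshold are assumptions for illustration.

```python
# Inference sketch: image in, list of label names out.
import torch
from PIL import Image
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def annotate(model, image_path, label_names, threshold=0.5):
    img = transform(Image.open(image_path).convert("RGB")).unsqueeze(0)
    model.eval()
    with torch.no_grad():
        probs = torch.sigmoid(model(img)).squeeze(0)   # per-label probabilities
    return [name for name, p in zip(label_names, probs) if p.item() > threshold]
```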
In general, the technical scheme provided by the application improves both the efficiency of annotating the image to be processed and the accuracy of that annotation.
The application also provides a model training device. Fig. 5 is a schematic structural diagram of a model training apparatus provided in an embodiment of the present application, and as shown in fig. 5, the model training apparatus 500 may include:
an obtaining module 501, configured to pre-process an image in the sample data set to obtain a multi-label word vector of the image and a multi-label adjacency matrix of the image;
a first processing module 502, configured to perform clustering processing on the multi-label word vector and the multi-label adjacency matrix by using a graph wavelet neural network model to obtain a classification model;
the extraction module 503 is configured to perform feature extraction on the image to be processed, and output a feature matrix of the image to be processed;
a training module 504, configured to train the classification model based on the feature matrix to obtain a multi-label probability distribution model of the image to be processed;
and the second processing module 505 is configured to perform convergence processing on the probability distribution model by using a loss function to obtain an image annotation model, where the image annotation model is configured to perform annotation processing on the target image to obtain a tag of the target image.
Optionally, the graph wavelet neural network model includes a two-layer graph wavelet neural network, and when clustering the multi-label word vectors and the multi-label adjacency matrix with the graph wavelet neural network model to obtain the classification model, the first processing module 502 is specifically configured to: cluster the multi-label word vectors and the multi-label adjacency matrix with the first-layer graph wavelet neural network to obtain output vectors; and cluster the output vectors with the second-layer graph wavelet neural network to obtain the classification model.
Optionally, the first-layer graph wavelet neural network uses the nonlinear activation function silu, and the second-layer graph wavelet neural network uses the nonlinear activation function softmax.
Optionally, when preprocessing the images in the sample data set to obtain the multi-label adjacency matrix of the images, the obtaining module 501 is specifically configured to: determine a first parameter and a second parameter of the multi-label adjacency matrix according to a first label and a second label of the images, where the first parameter represents the number of times the first label and the second label appear together in the sample data set, and the second parameter represents the number of times the first label appears in the sample data set; determine a conditional probability matrix of the images according to the first parameter and the second parameter; binarize the conditional probability matrix to obtain a binarized adjacency matrix; and re-weight the binarized adjacency matrix to obtain the multi-label adjacency matrix.
Optionally, when training the classification model based on the feature matrix to obtain the multi-label probability distribution model of the image to be processed, the training module 504 is specifically configured to: perform matrix multiplication of the feature matrix and the classification model to obtain the probability distribution model.
Optionally, when converging the probability distribution model with the loss function to obtain the image annotation model, the second processing module 505 is specifically configured to: determine the network hyper-parameters of the probability distribution model; and converge the probability distribution model according to the loss function and the network hyper-parameters to obtain the image annotation model.
Optionally, when converging the probability distribution model with the loss function to obtain the image annotation model, the second processing module 505 is further specifically configured to: monitor the value of the loss function and the precision value of the probability distribution model; and if the value of the loss function is smaller than the first threshold or the precision value of the probability distribution model is larger than the second threshold, output the probability distribution model as the image annotation model.
The model training apparatus is used to execute the technical solutions provided by the model training method embodiments; its implementation principle and technical effects are similar to those of the method embodiments and are not repeated here.
The application also provides an image processing device. Fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application, and as shown in fig. 6, the image processing apparatus 600 may include:
an obtaining module 601, configured to obtain an image to be processed;
the labeling module 602 is configured to input the image to be processed into an image labeling model for labeling processing, so as to obtain a label of the image to be processed, where the image labeling model is obtained by the model training method.
The image processing apparatus is configured to execute the technical solution provided by the foregoing image processing method embodiment, and the implementation principle and technical effect of the image processing apparatus are similar to those in the foregoing method embodiment, and are not described herein again.
The embodiment of the application further provides the electronic equipment. Fig. 7 is a schematic view of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device 700 may include:
a processor 711, a memory 712, and an interaction interface 713;
wherein the processor 711 is communicatively coupled to the memory 712; the memory 712 is used to store computer-executable instructions that are executable by the processor 711;
wherein the processor 711 is configured to perform the technical solution of the aforementioned model training method or image processing method via executing computer executable instructions.
Alternatively, the memory 712 may be separate or integrated with the processor 711.
Optionally, when the memory 712 is a separate device from the processor 711, the electronic device 700 may further include:
and the bus is used for connecting the devices.
Alternatively, the memory may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory is used for storing programs, and the processor executes a program after receiving an execution instruction. Further, the software programs and modules in the aforementioned memory may also include an operating system, which may include various software components and/or drivers for managing system tasks (e.g., memory management, storage device control, power management, etc.) and may communicate with various hardware or software components to provide an operating environment for other software components.
Alternatively, the processor may be an integrated circuit chip having signal processing capabilities. The Processor may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The embodiment of the present application further provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when the computer-executable instructions are executed by a processor, the computer-executable instructions are used to implement the technical solution of the model training method or the image processing method provided in the foregoing method embodiment.
The embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program is configured to implement the technical solution of the model training method or the image processing method provided in the foregoing method embodiments.
Those of ordinary skill in the art will understand that all or a portion of the steps of the above method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium; when executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (10)

1. A method of model training, comprising:
preprocessing an image in the sample data set to obtain a multi-label word vector of the image and a multi-label adjacency matrix of the image;
clustering the multi-label word vectors and the multi-label adjacency matrix by adopting a graph wavelet neural network model to obtain a classification model;
extracting features from the image to be processed, and outputting a feature matrix of the image to be processed;
training the classification model based on the feature matrix to obtain a multi-label probability distribution model of the image to be processed;
and adopting a loss function to carry out convergence processing on the probability distribution model to obtain an image annotation model, wherein the image annotation model is used for carrying out annotation processing on a target image to obtain a label of the target image.
2. The model training method of claim 1, wherein the graph wavelet neural network model comprises a 2-layer graph wavelet neural network, and the clustering process is performed on the multi-label word vector and the multi-label adjacency matrix by using the graph wavelet neural network model to obtain a classification model, comprising:
clustering the multi-label word vector and the multi-label adjacency matrix by adopting a first-layer graph wavelet neural network in the 2-layer graph wavelet neural network to obtain output vectors;
and clustering the output vectors by adopting a second-layer graph wavelet neural network in the 2-layer graph wavelet neural network to obtain the classification model.
3. The model training method of claim 2, wherein the first-layer graph wavelet neural network uses the nonlinear activation function silu, and the second-layer graph wavelet neural network uses the nonlinear activation function softmax.
4. The model training method of claim 1, wherein preprocessing the images in the sample data set to obtain a multi-label adjacency matrix for the images comprises:
determining a first parameter and a second parameter of the multi-label adjacency matrix according to a first label and a second label of the image, wherein the first parameter is used for representing the number of times that the first label and the second label appear in the sample data set at the same time, and the second parameter is used for representing the number of times that the first label appears in the sample data set;
determining a conditional probability matrix of the image according to the first parameter and the second parameter;
binarizing the conditional probability matrix to obtain a binarized adjacency matrix;
and re-weighting the binarized adjacency matrix to obtain the multi-label adjacency matrix.
5. The model training method according to any one of claims 1 to 4, wherein training the classification model based on the feature matrix to obtain the multi-label probability distribution model of the image to be processed comprises:
multiplying the feature matrix by the classification model to obtain the probability distribution model.
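Claim 5 reduces to a single matrix product. A toy sketch with assumed dimensions (512-dimensional features, 80 labels); applying a sigmoid to turn scores into per-label probabilities is a common convention, not something the claim states.

    import torch

    feats = torch.randn(8, 512)           # feature matrix: 8 images x 512-d features
    W = torch.randn(80, 512)              # classification model: 80 labels x 512-d rows
    probs = torch.sigmoid(feats @ W.t())  # 8 x 80 multi-label probability distribution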
6. The model training method according to any one of claims 1 to 4, wherein converging the probability distribution model using the loss function to obtain the image annotation model comprises:
determining network hyper-parameters of the probability distribution model;
and converging the probability distribution model according to the loss function and the network hyper-parameters to obtain the image annotation model.
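One plausible reading of claim 6 in code, with hypothetical hyper-parameter values and an Adam optimizer standing in for whatever the application actually uses:

    import torch

    hparams = {"lr": 1e-4, "weight_decay": 5e-4, "epochs": 100}  # assumed values
    model = torch.nn.Linear(512, 80)  # stand-in for the probability distribution model
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=hparams["lr"],
                                 weight_decay=hparams["weight_decay"])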
7. The model training method according to any one of claims 1 to 4, wherein converging the probability distribution model using the loss function to obtain the image annotation model comprises:
monitoring the value of the loss function and the precision of the probability distribution model;
and if the value of the loss function is smaller than a first threshold or the precision of the probability distribution model is larger than a second threshold, outputting the probability distribution model as the image annotation model.
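The stopping rule of claim 7 as a sketch; both threshold values are hypothetical:

    def converged(loss_value, precision, loss_thresh=0.05, prec_thresh=0.95):
        # Output the model once either criterion is met (thresholds assumed).
        return loss_value < loss_thresh or precision > prec_thresh

    # inside the training loop, for example:
    # if converged(loss.item(), precision):
    #     torch.save(model.state_dict(), "image_annotation_model.pt")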
8. An image processing method, comprising:
acquiring an image to be processed;
inputting the image to be processed into an image annotation model for annotation to obtain a label of the image to be processed, wherein the image annotation model is trained using the model training method of any one of claims 1 to 7.
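An illustrative inference sketch for claim 8, reusing the assumed backbone and classifier matrix W from the sketches above; the 0.5 decision threshold is an assumption:

    import torch

    @torch.no_grad()
    def annotate(image, backbone, W, label_names, threshold=0.5):
        # image: 1 x 3 x H x W tensor; W: C x k classifier matrix
        feats = backbone(image)                   # 1 x k feature vector
        probs = torch.sigmoid(feats @ W.t())[0]   # C per-label probabilities
        keep = (probs > threshold).nonzero(as_tuple=True)[0].tolist()
        return [label_names[i] for i in keep]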
9. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the model training method of any one of claims 1 to 7 or the image processing method of claim 8.
10. A computer-readable storage medium having computer-executable instructions stored thereon, wherein the computer-executable instructions, when executed by a processor, are configured to implement the model training method of any one of claims 1 to 7 or the image processing method of claim 8.
CN202211161244.9A 2022-09-23 2022-09-23 Model training method, image processing method, device and storage medium Pending CN115240037A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211161244.9A CN115240037A (en) 2022-09-23 2022-09-23 Model training method, image processing method, device and storage medium
PCT/CN2023/098759 WO2024060684A1 (en) 2022-09-23 2023-06-07 Model training method, image processing method, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211161244.9A CN115240037A (en) 2022-09-23 2022-09-23 Model training method, image processing method, device and storage medium

Publications (1)

Publication Number Publication Date
CN115240037A (en) 2022-10-25

Family

ID=83667275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211161244.9A Pending CN115240037A (en) 2022-09-23 2022-09-23 Model training method, image processing method, device and storage medium

Country Status (2)

Country Link
CN (1) CN115240037A (en)
WO (1) WO2024060684A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104700116B * 2015-03-13 2018-03-06 西安电子科技大学 Terrain classification method for polarimetric SAR images based on multi-layer quantum ridgelet representation
RU2019125602A (en) * 2019-08-13 2021-02-15 Общество С Ограниченной Ответственностью "Тексел" COMPLEX SYSTEM AND METHOD FOR REMOTE SELECTION OF CLOTHES
CN115240037A (en) * 2022-09-23 2022-10-25 卡奥斯工业智能研究院(青岛)有限公司 Model training method, image processing method, device and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552803A (en) * 2020-04-08 2020-08-18 西安工程大学 Text classification method based on graph wavelet network model
CN112199536A (en) * 2020-10-15 2021-01-08 华中科技大学 Cross-modality-based rapid multi-label image classification method and system
CN113255798A (en) * 2021-06-02 2021-08-13 苏州浪潮智能科技有限公司 Classification model training method, device, equipment and medium
CN113378965A (en) * 2021-06-25 2021-09-10 齐鲁工业大学 Multi-label image identification method and system based on DCGAN and GCN
CN113657425A (en) * 2021-06-28 2021-11-16 华南师范大学 Multi-label image classification method based on multi-scale and cross-modal attention mechanism
CN113657171A (en) * 2021-07-20 2021-11-16 国网上海市电力公司 Low-voltage distribution network platform region topology identification method based on graph wavelet neural network
CN114266924A (en) * 2021-12-23 2022-04-01 深圳大学 Multi-mode-based amine area tumor image classification method and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周杭驰, "Research on Image Classification and Annotation Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024060684A1 (en) * 2022-09-23 2024-03-28 卡奥斯工业智能研究院(青岛)有限公司 Model training method, image processing method, device, and storage medium
CN116402113A (en) * 2023-06-08 2023-07-07 之江实验室 Task execution method and device, storage medium and electronic equipment
CN116402113B (en) * 2023-06-08 2023-10-03 之江实验室 Task execution method and device, storage medium and electronic equipment
CN117611933A (en) * 2024-01-24 2024-02-27 卡奥斯工业智能研究院(青岛)有限公司 Image processing method, device, equipment and medium based on classified network model

Also Published As

Publication number Publication date
WO2024060684A1 (en) 2024-03-28

Similar Documents

Publication Publication Date Title
Duong et al. Automated fruit recognition using EfficientNet and MixNet
Shi et al. An attribution-based pruning method for real-time mango detection with YOLO network
Bell et al. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks
US10713563B2 (en) Object recognition using a convolutional neural network trained by principal component analysis and repeated spectral clustering
CN115240037A (en) Model training method, image processing method, device and storage medium
US20200074227A1 (en) Neural network-based action detection
CN112183577A (en) Training method of semi-supervised learning model, image processing method and equipment
CN111178251A (en) Pedestrian attribute identification method and system, storage medium and terminal
Nawaz et al. AI-based object detection latest trends in remote sensing, multimedia and agriculture applications
Nandhini et al. Object Detection Algorithm Based on Multi-Scaled Convolutional Neural Networks
Seneviratne Contrastive Representation Learning for Natural World Imagery: Habitat prediction for 30,000 species.
CN114820463A (en) Point cloud detection and segmentation method and device, and electronic equipment
Ouf Leguminous seeds detection based on convolutional neural networks: Comparison of faster R-CNN and YOLOv4 on a small custom dataset
Ouadiay et al. Simultaneous object detection and localization using convolutional neural networks
CN115620122A (en) Training method of neural network model, image re-recognition method and related equipment
CN116503399B (en) Insulator pollution flashover detection method based on YOLO-AFPS
CN111353577B (en) Multi-task-based cascade combination model optimization method and device and terminal equipment
Marasović et al. Person classification from aerial imagery using local convolutional neural network features
CN114462559B (en) Target positioning model training method, target positioning method and device
CN115565115A (en) Outfitting intelligent identification method and computer equipment
CN115797691A (en) Target detection method and device based on small sample learning and storage medium
Grabowski et al. Squeezing adaptive deep learning methods with knowledge distillation for on-board cloud detection
Ahmed et al. Classification of semantic segmentation using fully convolutional networks based unmanned aerial vehicle application
US11347968B2 (en) Image enhancement for realism
Latif et al. Implementation of hybrid algorithm for the UAV images preprocessing based on embedded heterogeneous system: The case of precision agriculture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2022-10-25