CN114037857B - Image classification precision improving method - Google Patents

Image classification precision improving method

Info

Publication number
CN114037857B
CN114037857B CN202111229240.5A
Authority
CN
China
Prior art keywords
network
instance
gating
pruning
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111229240.5A
Other languages
Chinese (zh)
Other versions
CN114037857A (en)
Inventor
Shi Mengnan
Liu Chang
Ye Qixiang
Jiao Jianbin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chinese Academy of Sciences filed Critical University of Chinese Academy of Sciences
Priority to CN202111229240.5A priority Critical patent/CN114037857B/en
Publication of CN114037857A publication Critical patent/CN114037857A/en
Application granted granted Critical
Publication of CN114037857B publication Critical patent/CN114037857B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses an image classification precision improving method, characterized by comprising the following steps: acquiring an image dataset; setting a convolutional neural network; performing dynamic network pruning on the convolutional neural network with a feature-gating coupling method to obtain an optimized network; and inputting the images to be classified into the optimized network and classifying them with the optimized network. The dynamic network pruning method comprises: step 1, obtaining the feature space and the gating space in the dynamic pruning network; step 2, obtaining the instance neighborhood relationship in the feature space; step 3, aligning the instance neighborhood relationship between the gating space and the feature space; step 4, acquiring the total objective loss function of the dynamic pruning network; and step 5, updating the network parameters. The feature-gating coupling method for dynamic network pruning disclosed by the invention can greatly reduce the distortion of the gated features and significantly improve the performance of a dynamic pruning network.

Description

Image classification precision improving method
Technical Field
The invention relates to an image classification precision improving method, and belongs to the technical field of image classification.
Background
In order to achieve higher image classification accuracy, Convolutional Neural Networks (CNNs) are designed to be larger and deeper, but the computational cost is also greatly increased.
In order to reduce the computational cost, researchers have proposed network pruning methods, which are now widely used and remove network parameters that contribute little to the classification accuracy. In this way, the computational cost can be reduced as much as possible while the pruned network retains the representation capability of the original network, so that the loss in classification accuracy is minimized and a compact network model is obtained.
Existing network pruning methods can be roughly divided into two categories: static pruning and dynamic pruning. Static channel pruning obtains a static, simplified model by deleting feature channels that contribute little to the overall performance; dynamic pruning obtains a sub-network adapted to the input picture instance, reducing the computational cost at runtime.
In the conventional dynamic pruning method, an attached gating module is used to generate a channel-level binary mask, i.e., a gating vector, which indicates whether each channel is deleted or retained. The gating module exploits instance-level redundancy according to the feature variations of different inputs, i.e., channels that identify particular features can be adaptively opened or closed for different input instances.
However, existing network pruning methods typically ignore the consistency between the feature distribution and the gating distribution. For example, a pair of instances may have similar features but different gates; since the gated features are generated by channel-wise multiplication of the feature vector and the gating vector, the disparity between the two distributions may cause distortion in the gated feature space, which may pull noise instances into a similar instance pair or push its members apart, thereby reducing the representation ability of the pruned network.
Therefore, further research on feature-gating coupling is needed in order to solve the above technical problems.
Disclosure of Invention
In order to overcome the above problems, the present inventors have conducted intensive studies to design an image classification accuracy improving method, including:
acquiring an image dataset;
setting a convolutional neural network;
performing dynamic network pruning on the convolutional neural network with a feature-gating coupling method to obtain an optimized network;
inputting the images to be classified into the optimized network and classifying them with the optimized network,
the feature-gated coupling method comprises the following steps:
step 1, obtaining a characteristic space and a gating space in a dynamic pruning network;
step 2, obtaining an example neighborhood relationship in a feature space;
step 3, aligning the example neighborhood relationship between the gating space and the feature space;
step 4, acquiring a total target loss function of the dynamic pruning network;
and 5, updating the network parameters.
Preferably, step 2 comprises the following sub-steps:
step 21, pooling the instance features to obtain instance pooling vectors $\tilde{x}_i^l$;
step 22, obtaining the instance similarity matrix $S^l$ from the instance pooling vectors $\tilde{x}_i^l$;
step 23, determining the nearest-neighbor instances of each instance from the similarity matrix $S^l$, and taking the set of nearest-neighbor instance indices as the self-supervision signal.
Preferably, in step 22, the similarity between different instances is obtained by measuring the instance pooling vectors $\tilde{x}_i^l$ of different instances, the measure being a dot product.
Preferably, in step 23, the i-th row of $S^l$ is sorted, the column indices of its k largest elements are obtained, and the set of instances corresponding to those indices is used as the self-supervision signal of instance i.
Preferably, in step 3, the positive instances are pulled together and the scattered negative instances are pushed apart by a contrastive loss function, so as to align the instance neighborhood relationship between the gating space and the feature space.
Preferably, step 3 comprises the following sub-steps:
step 31, acquiring the probability that instance j is a positive instance of instance i;
step 32, acquiring a contrastive loss function from the positive-instance probabilities and minimizing it, thereby aligning the instance neighborhood relationship of the gating space with that of the feature space.
Preferably, in step 31, the probability that an input instance is identified in the gating space as a positive instance of instance i is:

$$p^l(j\mid i)=\frac{\exp\big((\pi_j^l)^{\mathrm T}\pi_i^l/\tau\big)}{\sum_{k=1}^{N}\exp\big((\pi_k^l)^{\mathrm T}\pi_i^l/\tau\big)}$$

where l denotes the l-th layer, $\pi_i^l$ is the gating probability output by the gating module for instance i, $\pi_j^l$ is the gating probability output by the gating module for instance j, and τ is a temperature hyperparameter.
Preferably, in step 4, the total objective loss function is:

$$\mathcal{L}=\mathcal{L}_{CE}+\sum_{l\in\Omega}\big(\eta\,\mathcal{L}_{FGC}^{l}+\rho\,\mathcal{L}_{L0}^{l}\big)$$

where $\mathcal{L}_{CE}$ is the cross-entropy loss function, $\mathcal{L}_{FGC}^{l}$ denotes the contrastive loss function of the l-th layer, and $\mathcal{L}_{L0}^{l}$ is the L0-norm loss function of the l-th layer; η and ρ are coefficients, and Ω is the set of network layer indices.
The present invention also provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform one of the methods described above.
The invention also provides a computer-readable storage medium having stored thereon computer instructions for causing the computer to perform one of the methods according to the above.
The invention has the advantages that:
(1) feature distributions and corresponding gating distributions in the dynamic pruning network can be aligned;
(2) feature-gating coupling is achieved by iteratively performing neighborhood relationship exploration and feature-gating alignment, and distortion of gating features can be greatly reduced;
(3) the performance of the dynamic pruning network can be significantly improved.
Drawings
Fig. 1 shows a schematic flow chart of a feature-gated coupling method for dynamic network pruning according to a preferred embodiment of the present invention.
Detailed Description
The invention is explained in more detail below with reference to the figures and examples. The features and advantages of the present invention will become more apparent from the description.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In order to solve the problem that the conventional dynamic network pruning methods may suffer from distortion of the gated features, the present invention minimizes this distortion by aligning the feature distribution with the gating distribution.
The image classification precision improving method provided by the invention comprises the following steps:
acquiring an image dataset;
setting a convolutional neural network;
performing dynamic network pruning on the convolutional neural network with a feature-gating coupling method to obtain an optimized network;
inputting the images to be classified into the optimized network and classifying them with the optimized network,
in the present invention, the image data set is used for training a neural network, and the method for acquiring the image data set is not particularly limited in the present invention, and may be any open image classification training set, or an image data set designed by a person skilled in the art according to actual needs.
In the present invention, the specific structure of the convolutional neural network is not limited, and those skilled in the art can select an appropriate convolutional neural network structure according to actual needs.
The feature-gating coupling method, as shown in Fig. 1, comprises the following steps:
step 1, obtaining a characteristic space and a gating space in a dynamic pruning network;
step 2, obtaining an example neighborhood relationship in a feature space;
step 3, aligning the example neighborhood relationship between the gating space and the feature space;
step 4, acquiring a total target loss function of the dynamic pruning network;
and 5, updating the network parameters.
In step 1, the image instances input to the convolutional neural network are mapped to the feature space and the gating space according to the dynamic network pruning method BAS, for whose description reference may be made to the article: B. Ehteshami Bejnordi, T. Blankevoort, M. Welling, Batch-shaping for learning conditional channel gated networks, in: Proceedings of the International Conference on Learning Representations (ICLR), 2020.
The feature space contains the instance features of the current convolutional layer. According to the present invention, dynamic network pruning can be applied at any convolutional layer of the convolutional neural network. When dynamic network pruning is applied at the l-th convolutional layer, the instance features are expressed as:

$$x_i^l=\mathrm{ReLU}\big(w^l * x_i^{l-1}\big)$$

where i indexes the instances and l indexes the layers of the convolutional neural network; $x_i^{l-1}$ is the input of the l-th convolutional layer, $x_i^l\in\mathbb{R}^{C_l\times W_l\times H_l}$ is the output of the l-th convolutional layer, $C_l$ is the number of output channels of the l-th convolutional layer, $w^l$ is the convolution weight matrix of the l-th convolutional layer, $*$ denotes the convolution operator, and ReLU denotes the activation function.
The gating space comprises the gating vectors of the current convolutional layer. Further, when dynamic network pruning is applied at the l-th convolutional layer, the gating vector and the instance features are multiplied channel-wise to obtain the gated features:

$$\hat{x}_i^l=G\big(x_i^{l-1}\big)\odot x_i^l$$

where G denotes the gating module, $\odot$ denotes channel-level multiplication, and $\hat{x}_i^l$ denotes the gated features.
In a preferred embodiment, the gating module G may be expressed as:

$$G(\cdot)=\mathrm{BinConcrete}\big(\mathrm{Linear}(P(\cdot))\big)$$

where P denotes a global average pooling layer used to generate a spatial descriptor of the input features, Linear(·) denotes two fully connected layers used to generate the gating probabilities, and BinConcrete is an activation function.
Further, the gating probability corresponding to the l-th convolutional layer of the convolutional neural network is expressed as:

$$\pi_i^l=\mathrm{Linear}\big(P(x_i^{l-1})\big)$$

and the gating vector corresponding to the l-th convolutional layer of the convolutional neural network is:

$$g_i^l=\mathrm{BinConcrete}\big(\pi_i^l\big)$$

Further, combining the above expressions, the gated features corresponding to the l-th convolutional layer of the convolutional neural network can also be expressed as:

$$\hat{x}_i^l=\mathrm{BinConcrete}\big(\mathrm{Linear}(P(x_i^{l-1}))\big)\odot\mathrm{ReLU}\big(w^l * x_i^{l-1}\big)$$
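By way of illustration only, the gated convolution described above can be sketched in PyTorch as follows. The class names, the hidden width of the fully connected layers, and the straight-through sigmoid approximation used in place of BinConcrete are assumptions of this sketch and are not specified by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureGate(nn.Module):
    """Sketch of a channel gating module: G(x) = BinConcrete(Linear(P(x))).

    P is a global average pooling layer producing a spatial descriptor,
    Linear() is two fully connected layers producing per-channel gating
    probabilities pi, and BinConcrete is approximated here by a hard
    threshold with a sigmoid straight-through gradient (an assumption).
    """

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, out_channels),
        )

    def forward(self, x_prev: torch.Tensor):
        pooled = F.adaptive_avg_pool2d(x_prev, 1).flatten(1)  # P(x^{l-1}), shape (B, C_in)
        pi = self.fc(pooled)                                  # gating probabilities pi^l
        soft = torch.sigmoid(pi)
        hard = (soft > 0.5).float()
        g = hard + soft - soft.detach()                       # binary gate with straight-through gradient
        return pi, g


class GatedConvBlock(nn.Module):
    """Convolution followed by channel-level gating of its output features."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, 3, padding=1, bias=False)
        self.gate = FeatureGate(in_channels, out_channels)

    def forward(self, x_prev: torch.Tensor):
        feat = F.relu(self.conv(x_prev))            # x_i^l = ReLU(w^l * x_i^{l-1})
        pi, g = self.gate(x_prev)
        gated = feat * g[:, :, None, None]          # channel-level multiplication g_i^l (.) x_i^l
        return gated, feat, pi
```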
in step 2, since the example neighborhood relationships are different in feature spaces of different semantic levels, for example, in low-level feature spaces, examples with similar colors or textures may be closer, while in high-level feature spaces, examples in the same class may be clustered together, so manual annotation cannot provide adaptive supervision of the example neighborhood relationships across different network stages, and how to obtain the example neighborhood relationships in the feature spaces is a difficult point of the present invention.
The invention provides a method for adaptively acquiring an example field relation in each feature space, which can be used in any layer of a network to acquire an automatic supervision signal for feature-gating distribution alignment.
Specifically, the method comprises the following sub-steps:
Step 21, pooling the instance features to obtain instance pooling vectors $\tilde{x}_i^l$.
For the l-th convolutional layer equipped with dynamic network pruning, the features $x_i^l$ of the i-th instance in that layer are pooled into a vector $\tilde{x}_i^l$ with a global average pooling layer. The pooled vector $\tilde{x}_i^l$ is called the instance pooling vector; pooling reduces the feature dimension from $C_l\times W_l\times H_l$ to $C_l\times 1\times 1$, improving the efficiency and effectiveness of subsequent processing.
Step 22, obtaining the instance similarity matrix $S^l$ from the instance pooling vectors $\tilde{x}_i^l$.
The similarity matrix $S^l$ characterizes the similarity between different instances in layer l. The similarity between different instances is characterized by measuring their instance pooling vectors $\tilde{x}_i^l$; preferably the measure is a dot product, and the similarity between two instances can be expressed as:

$$s_{ij}^l=\big(\tilde{x}_i^l\big)^{\mathrm T}\,\tilde{x}_j^l$$

where i and j denote different instances, $s_{ij}^l$ is the similarity between the i-th and j-th instances in the l-th convolutional layer, and T denotes transposition.
Further, the similarities of all instances in the l-th convolutional layer are assembled into a matrix, the similarity matrix $S^l\in\mathbb{R}^{N\times N}$, whose elements are $s_{ij}^l$; N denotes the total number of instances used for training.
Step 23, determining the nearest-neighbor instances of each instance from the similarity matrix $S^l$, and taking the set of nearest-neighbor instance indices as the self-supervision signal.
The i-th row of $S^l$ is sorted and the column indices of its k largest elements are obtained; the instances corresponding to these indices are the nearest neighbors of instance i, and the set of their indices is the self-supervision signal of instance i, which is used to regularize the gating module. It is expressed as:

$$\mathcal{N}_i^l=\mathrm{topk}\big(S^l_{i,:}\big)$$

where $\mathcal{N}_i^l$ denotes the self-supervision signal of instance i, and topk denotes the column indices of the k largest elements in the i-th row of $S^l$.
Preferably, k is 100-500, more preferably 200.
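For illustration, steps 21-23 can be sketched in PyTorch as follows; the function name and the absence of any normalization of the pooled vectors are assumptions, since the description above only specifies global average pooling, dot-product similarity and a top-k selection.

```python
import torch
import torch.nn.functional as F


def neighborhood_signal(features: torch.Tensor, k: int = 200):
    """Steps 21-23: pool the instance features, build the similarity matrix
    and return the indices of the k nearest neighbours of every instance.

    features: (N, C_l, H_l, W_l) instance features of the l-th gated layer.
    Returns (pooled, sim, nn_idx), where nn_idx has shape (N, k).
    """
    # Step 21: global average pooling reduces C_l x W_l x H_l to C_l x 1 x 1
    pooled = F.adaptive_avg_pool2d(features, 1).flatten(1)    # (N, C_l)

    # Step 22: pairwise dot-product similarity matrix S^l
    sim = pooled @ pooled.t()                                 # (N, N)

    # Step 23: column indices of the k largest entries of each row form the
    # self-supervision signal N_i^l of instance i
    nn_idx = sim.topk(k, dim=1).indices
    return pooled, sim, nn_idx
```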
In a preferred embodiment, in step 21, instead of recomputing the instance pooling vectors of all previously input picture instances each time a new picture instance is input, the obtained instance pooling vectors are stored in a feature bank $\mathcal{B}^l\in\mathbb{R}^{N\times D}$, where D is the dimension of the pooled feature, i.e., the number of channels $C_l$. When the similarity is computed for subsequently input picture instances, the stored vectors are read directly from $\mathcal{B}^l$.
Further, the feature bank $\mathcal{B}^l$ is updated in a momentum manner, which can be expressed as:

$$\mathcal{B}_i^l \leftarrow m\,\mathcal{B}_i^l+(1-m)\,\tilde{x}_i^l$$

where the momentum coefficient m is set to 0.3-0.7, preferably 0.5; all vectors in the bank are initialized to unit random vectors.
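A possible realization of the feature bank and its momentum update is sketched below; treating the bank as a plain tensor indexed by the dataset index of each instance is an assumption of this sketch.

```python
import torch


class FeatureBank:
    """Stores one pooled vector per training instance and updates it with momentum."""

    def __init__(self, num_instances: int, dim: int, momentum: float = 0.5):
        bank = torch.randn(num_instances, dim)
        self.bank = bank / bank.norm(dim=1, keepdim=True)  # unit random vectors
        self.m = momentum

    @torch.no_grad()
    def update(self, indices: torch.Tensor, pooled: torch.Tensor):
        # B_i^l <- m * B_i^l + (1 - m) * pooled_i  (momentum update)
        self.bank[indices] = self.m * self.bank[indices] + (1 - self.m) * pooled.detach()

    def all(self) -> torch.Tensor:
        return self.bank
```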
In step 3, according to the self-supervision signal of instance i, the nearest-neighbor set of instance i is obtained, denoted $\mathcal{N}_i^l$. The instances in the nearest-neighbor set are positive instances to be pulled closer in the gating space, while the instances outside the nearest-neighbor set are negative instances to be pushed apart.
A positive instance refers to a nearest-neighbor instance of instance i; instances other than the nearest neighbors are called negative instances.
In a preferred embodiment, step 3 comprises the following sub-steps:
Step 31, acquiring the probability that instance j is a positive instance of instance i.
Specifically, the probability that an instance input to the convolutional neural network is identified, in the gating space corresponding to the l-th convolutional layer, as a positive instance of instance i is:

$$p^l(j\mid i)=\frac{\exp\big((\pi_j^l)^{\mathrm T}\pi_i^l/\tau\big)}{\sum_{k=1}^{N}\exp\big((\pi_k^l)^{\mathrm T}\pi_i^l/\tau\big)}$$

where l denotes the l-th convolutional layer, $\pi_i^l$ is the gating probability output by the gating module for instance i, $\pi_j^l$ is the gating probability output by the gating module for instance j, and τ is a temperature hyperparameter, preferably set to 0.01-0.2, for example 0.07.
Step 32, acquiring a contrastive loss function from the positive-instance probabilities and minimizing it, thereby aligning the instance neighborhood relationship of the gating space with that of the feature space.
Specifically, the contrastive loss function $\mathcal{L}_{FGC}^l$ is:

$$\mathcal{L}_{FGC}^{l}=-\sum_{j\in\mathcal{N}_i^l}\log p^l(j\mid i)$$

By minimizing the contrastive loss function $\mathcal{L}_{FGC}^l$, the nearest neighbors in the feature space are pulled close in the gating space, so that the instance neighborhood relationship of the feature space is reproduced in the gating space, i.e., the instance neighborhood relationships of the gating space and the feature space are aligned.
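A sketch of steps 31-32 is given below, assuming the gating probabilities of all instances are read from a gated memory bank of the same form as the feature bank above and that the nearest-neighbor indices come from the neighborhood_signal sketch; averaging over the instances of a mini-batch is also an assumption.

```python
import torch
import torch.nn.functional as F


def feature_gating_contrastive_loss(pi_batch: torch.Tensor,
                                    batch_idx: torch.Tensor,
                                    gate_bank: torch.Tensor,
                                    nn_idx: torch.Tensor,
                                    tau: float = 0.07) -> torch.Tensor:
    """Pull the feature-space nearest neighbours of each instance together in gating space.

    pi_batch : (B, C_l) gating probabilities of the current mini-batch.
    batch_idx: (B,) dataset indices of the mini-batch instances.
    gate_bank: (N, C_l) stored gating probabilities of all N training instances.
    nn_idx   : (N, k) self-supervision signal obtained in the feature space.
    """
    # p^l(j | i): softmax over dot products with all stored gating probabilities
    logits = pi_batch @ gate_bank.t() / tau        # (B, N)
    log_prob = F.log_softmax(logits, dim=1)

    # positives are the feature-space nearest neighbours of each instance i
    pos = nn_idx[batch_idx]                        # (B, k)
    return -log_prob.gather(1, pos).mean()
```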
The contrastive loss function $\mathcal{L}_{FGC}^l$ designed in the present invention is a variant of the InfoNCE loss function; minimizing it maximizes a lower bound on the mutual information between the feature distribution and the gating distribution. The underlying principle can be explained by the following analysis.
The mutual information between the instance feature distribution and the gating feature distribution is defined as:

$$I\big(\tilde{x}^l;\pi^l\big)=\sum_{\tilde{x}^l,\pi^l}p\big(\tilde{x}^l,\pi^l\big)\log\frac{p\big(\pi^l\mid\tilde{x}^l\big)}{p\big(\pi^l\big)}$$

The ratio of $p(\pi^l\mid\tilde{x}^l)$ to $p(\pi^l)$ can be represented by a probability density ratio $f(\tilde{x}^l,\pi^l)\propto p(\pi^l\mid\tilde{x}^l)/p(\pi^l)$, where $\pi^l$ is the gating probability corresponding to $\tilde{x}^l$ and $\pi_+^l$ is the gating probability of the corresponding positive instance.
With this density ratio, the contrastive loss function can be rewritten as:

$$\mathcal{L}_{FGC}^{l}=-\mathbb{E}\left[\log\frac{f\big(\tilde{x}_+^l,\pi_i^l\big)}{\sum_{k=1}^{N}f\big(\tilde{x}_k^l,\pi_i^l\big)}\right]$$

where $\tilde{x}_+^l$ denotes the nearest-neighbor instance features in the feature space.
The mutual information between $\tilde{x}^l$ and $\pi^l$ is then constrained by the following inequality:

$$I\big(\tilde{x}^l;\pi^l\big)\geq \log N-\mathcal{L}_{FGC}^{l}$$

This inequality shows that minimizing the contrastive loss function $\mathcal{L}_{FGC}^l$ maximizes a lower bound on the mutual information between the instance feature distribution and the gating feature distribution; accordingly, increasing this mutual information facilitates the alignment of the two distributions.
In a preferred embodiment, in step 31, a gated memory bank $\mathcal{G}^l$ is used to store the gating probability $\pi_i^l$ of each instance i. In step 32, according to the self-supervision signal $\mathcal{N}_i^l$ of instance i, the corresponding positive instances $\pi_j^l,\ j\in\mathcal{N}_i^l$, are extracted from the gated memory bank $\mathcal{G}^l$ and used to evaluate the positive-instance probability and the contrastive loss defined above, thereby avoiding repeated computation and reducing the amount of calculation.
Further, after a new instance is input, the gated memory bank $\mathcal{G}^l$ is updated in a momentum manner, which can be expressed as:

$$\mathcal{G}_i^l \leftarrow m\,\mathcal{G}_i^l+(1-m)\,\pi_i^l$$

where the momentum coefficient m is set to 0.3-0.7, preferably 0.5, and all vectors in the bank are initialized to unit random vectors.
In step 4, the total objective loss function is set as:

$$\mathcal{L}=\mathcal{L}_{CE}+\sum_{l\in\Omega}\big(\eta\,\mathcal{L}_{FGC}^{l}+\rho\,\mathcal{L}_{L0}^{l}\big)$$

where $\mathcal{L}_{CE}$ is the cross-entropy loss function, $\mathcal{L}_{FGC}^{l}$ denotes the contrastive loss function of the l-th convolutional layer, and $\mathcal{L}_{L0}^{l}$ is the L0-norm loss function of the l-th convolutional layer, which is the same as the L0-norm loss function in BAS; η and ρ are coefficients, and Ω is the set of indices of the convolutional layers equipped with feature-gating coupling in the convolutional neural network.
Further, for each gated layer,

$$\mathcal{L}_{L0}^{l}=\big\lVert g_i^l\big\rVert_0$$

where $\lVert\cdot\rVert_0$ is the L0 norm.
According to the invention, the value of the coefficient ρ can be determined by a person skilled in the art through repeated tests as required; ρ controls the sparsity of the pruned model, i.e., the larger ρ, the sparser the model and the lower the accuracy, while the smaller ρ, the lower the sparsity and the higher the accuracy.
Preferably, the value of the coefficient η is between 0.002 and 0.004, more preferably 0.003.
According to the present invention, the total objective loss function includes a cross-entropy loss function, an L0-norm loss function, and a contrastive loss function. The cross-entropy loss function serves the image classification task; the L0-norm loss function enhances the sparsity of the gating vectors; and the contrastive loss function is used for feature-gating distribution alignment. With this arrangement, dynamic network pruning couples the features with the gating, and the distortion of the gated features is minimized.
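Assembled in code, the total objective could look like the sketch below; writing the L0 term as the mean absolute gate activation is a common differentiable surrogate and an assumption here, not the exact BAS formulation.

```python
import torch
import torch.nn.functional as F


def total_loss(logits, labels, contrastive_losses, gates, eta=0.003, rho=0.4):
    """L = L_CE + sum over gated layers of (eta * L_FGC^l + rho * L_L0^l).

    contrastive_losses: list of per-layer feature-gating contrastive losses.
    gates             : list of per-layer gate tensors g^l, one per gated layer.
    """
    loss = F.cross_entropy(logits, labels)          # classification term L_CE
    for l_fgc, g in zip(contrastive_losses, gates):
        l0 = g.abs().mean()                         # surrogate for ||g^l||_0 (assumption)
        loss = loss + eta * l_fgc + rho * l0
    return loss
```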
In step 5, the parameters of the dynamic pruning network are updated through gradient back-propagation according to the total objective loss function of the dynamic pruning network; if the network has converged, training terminates and the final model is returned; otherwise, steps 1-5 are repeated for iterative updating.
The specific method of this step is the same as the traditional neural network parameter updating method, and is not described in detail in the present invention.
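A high-level training loop matching step 5 might look as follows; the model interface that returns the per-layer contrastive losses and gates, and the reuse of the total_loss sketch above, are assumptions.

```python
import torch


def train(model, loader, optimizer, max_epochs=400, device="cuda"):
    """Step 5: update the network parameters by back-propagating the total objective."""
    model.to(device).train()
    for epoch in range(max_epochs):
        for images, labels, idx in loader:          # idx: dataset indices of the instances
            images, labels = images.to(device), labels.to(device)
            # hypothetical interface: logits plus per-layer contrastive losses and gates
            logits, contrastive_losses, gates = model(images, idx)
            loss = total_loss(logits, labels, contrastive_losses, gates)
            optimizer.zero_grad()
            loss.backward()                         # gradient back-propagation
            optimizer.step()
    return model
```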
Various embodiments of the above-described methods of the present invention may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the methods described herein may be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The methods and apparatus described herein may be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described herein), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The Server may be a cloud Server, which is also called a cloud computing Server or a cloud host, and is a host product in a cloud computing service system, so as to solve the defects of high management difficulty and weak service extensibility in a traditional physical host and a VPS service ("Virtual Private Server", or "VPS" for short). The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
Examples
Example 1
Experiments were performed on a standard image classification database: the image dataset was CIFAR-10, which contains a total of 60,000 color image instances, of which 50,000 are used for training and 10,000 for testing. The experimental environment was an NVIDIA RTX 3090 GPU, and the experiments were implemented in PyTorch.
A convolutional neural network was set up and experiments were performed with the ResNet-20 model, which was trained for 400 epochs with 256 image instances per batch, using an SGD optimizer with Nesterov accelerated gradient, momentum set to 0.9 and weight decay set to 5e-4. The initial learning rate was set to 0.1 with a multi-step decay strategy, i.e., the learning rate is multiplied by 0.1 when the training epoch reaches 200, 275 and 350, respectively.
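The optimizer and learning-rate schedule described above correspond to a standard PyTorch setup of roughly the following form (a sketch; model denotes the gated ResNet-20 and is assumed to be already constructed):

```python
import torch

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,                  # initial learning rate
    momentum=0.9,
    weight_decay=5e-4,
    nesterov=True,           # Nesterov accelerated gradient
)
# multi-step decay: multiply the learning rate by 0.1 at epochs 200, 275 and 350
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[200, 275, 350], gamma=0.1
)
```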
A dynamic pruning network is arranged in the last two residual modules of the ResNet-20 model and pruned dynamically with the feature-gating coupling method to obtain the optimized network, wherein the dynamic pruning method comprises the following steps:
step 1, obtaining a characteristic space and a gating space in a dynamic pruning network;
step 2, obtaining an example neighborhood relation in a feature space;
step 3, aligning the example neighborhood relationship between the gating space and the feature space;
step 4, acquiring a total target loss function of the dynamic pruning network;
and 5, updating the network parameters.
In step 1, the last two residual modules of the ResNet-20 model correspond to the l-th and (l+1)-th convolutional layers of the convolutional neural network; the l-th convolutional layer is taken as an example below, and the pruning process of the (l+1)-th layer is the same as that of the l-th layer.
The features of the l-th convolutional layer are expressed as:

$$x_i^l=\mathrm{ReLU}\big(w^l * x_i^{l-1}\big)$$

The gated features corresponding to the l-th convolutional layer of the ResNet-20 model are expressed as:

$$\hat{x}_i^l=G\big(x_i^{l-1}\big)\odot x_i^l,\qquad G(\cdot)=\mathrm{BinConcrete}\big(\mathrm{Linear}(P(\cdot))\big)$$

where P denotes a global average pooling layer, Linear(·) denotes two fully connected layers, and BinConcrete is an activation function.
Step 2 comprises the following sub-steps:
Step 21, pooling the instance features to obtain instance pooling vectors $\tilde{x}_i^l$: for the l-th convolutional layer of the convolutional neural network, the features $x_i^l$ of the i-th instance are pooled into a vector $\tilde{x}_i^l$ with a global average pooling layer.
Step 22, obtaining the instance similarity matrix $S^l$, whose elements are the similarities between different instances:

$$s_{ij}^l=\big(\tilde{x}_i^l\big)^{\mathrm T}\,\tilde{x}_j^l$$

where $s_{ij}^l$ denotes the similarity between the i-th and j-th instances in the l-th layer, and T denotes transposition.
Step 23, determining the nearest-neighbor instances of each instance from the similarity matrix $S^l$, and taking the set of nearest-neighbor instance indices as the self-supervision signal; the self-supervision signal of instance i is:

$$\mathcal{N}_i^l=\mathrm{topk}\big(S^l_{i,:}\big)$$

where $\mathcal{N}_i^l$ denotes the self-supervision signal of instance i, and topk denotes the column indices of the k largest elements in the i-th row of $S^l$; here k is taken as 200.
Step 3 comprises the following sub-steps:
Step 31, acquiring the probability that instance j is a positive instance of instance i; the probability that an instance input to the convolutional neural network is identified, in the gating space corresponding to the l-th convolutional layer, as a positive instance of instance i is:

$$p^l(j\mid i)=\frac{\exp\big((\pi_j^l)^{\mathrm T}\pi_i^l/\tau\big)}{\sum_{k=1}^{N}\exp\big((\pi_k^l)^{\mathrm T}\pi_i^l/\tau\big)}$$

where l denotes the l-th convolutional layer of the convolutional neural network, $\pi_i^l$ is the gating probability output by the gating module for instance i, $\pi_j^l$ is the gating probability output by the gating module for instance j, and τ = 0.07.
Step 32, acquiring the contrastive loss function from the positive-instance probabilities and minimizing it to align the instance neighborhood relationship of the gating space with that of the feature space; the contrastive loss function is expressed as:

$$\mathcal{L}_{FGC}^{l}=-\sum_{j\in\mathcal{N}_i^l}\log p^l(j\mid i)$$

Minimizing the contrastive loss function $\mathcal{L}_{FGC}^l$ aligns the instance neighborhood relationship between the gating space and the feature space.
In step 4, the total objective loss function is set as:

$$\mathcal{L}=\mathcal{L}_{CE}+\sum_{l\in\Omega}\big(\eta\,\mathcal{L}_{FGC}^{l}+\rho\,\mathcal{L}_{L0}^{l}\big)$$

where $\mathcal{L}_{CE}$ is the cross-entropy loss function, $\mathcal{L}_{FGC}^{l}$ denotes the contrastive loss function of the l-th layer, and $\mathcal{L}_{L0}^{l}$ is the L0-norm loss function of the l-th layer; η and ρ are coefficients with η = 0.003 and ρ = 0.4, and Ω is the set of network layer indices, here the l-th and (l+1)-th layers.
Further, classification experiments are carried out on the classification database with the optimized network.
Example 2
The same experiment as in example 1 was carried out, except that the experiment was carried out using the ResNet-32 model.
Example 3
The same experiment as that of example 1 was performed, except that the experiment was performed using the ResNet-56 model, a dynamic pruning network was set in the last four residual modules in the ResNet-56, the dynamic pruning process was the same as that of example 1, and the set of network layer numbers in step 4 was four layers.
Example 4
The same experiment as in example 1 was performed, except that the experimental database was CIFAR-100; the model was trained for 400 epochs with 128 image instances per batch, the initial learning rate was set to 0.1, and a multi-step decay strategy was also employed, with the learning rate multiplied by 0.2 when the training epoch reaches 60, 120 and 160, respectively.
Example 5
The same experiment as in example 4 was performed except that the experiment was performed using the ResNet-32 model.
Example 6
The same experiment as that of example 4 was performed, except that the experiment was performed using the ResNet-56 model, a dynamic pruning network was set in the last four residual modules in the ResNet-56, the dynamic pruning process was the same as that of example 4, and the set of network layer numbers in step 4 was four layers.
Example 7
The same experiment as in example 1 was performed, except that the experimental database was ImageNet. Since the training process takes a large amount of time, the experiment was performed with the ResNet-18 model, trained for 130 epochs in total with 256 image instances per batch and weight decay set to 1e-4; the learning rate was initialized to 0.1, again with a multi-step decay strategy, multiplied by 0.1 when the training epoch reaches 40, 70 and 100, respectively.
Example 8
The same experiment as in example 1 was performed, except that in step 23, k was taken as 5, 20, 100, 512, 1024, 2048, 4096, respectively.
The results are shown in Table 1:
Table 1 (contents reproduced as an image in the original document)
As can be seen from Table 1, when the number of nearest neighbors k is 200, a better pruning effect is obtained.
Example 9
The same experiment as in example 1 was carried out except that in step 4, the coefficients η were 5e-4, 0.001, 0.002, 0.003, 0.005, 0.01, and 0.02, respectively.
The results are shown in Table 2:
Table 2 (contents reproduced as an image in the original document)
As can be seen from Table 2, the pruning effect obtained is best when the coefficient η is 0.003.
Comparative example
Comparative example 1
The same experiment as in example 1 was conducted, except that pruning was performed with the SFP, FPGM, DSA, Hinge, DHP, FBS and BAS methods, respectively.
The SFP method is described in: Y. He, G. Kang, X. Dong, Y. Fu, Y. Yang, Soft filter pruning for accelerating deep convolutional neural networks, in: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2018, pp. 2234-2240.
The FPGM method is described in: Y. He, P. Liu, Z. Wang, Z. Hu, Y. Yang, Filter pruning via geometric median for deep convolutional neural networks acceleration, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4340-4349.
The DSA method is described in: X. Ning, T. Zhao, W. Li, P. Lei, Y. Wang, H. Yang, DSA: More efficient budgeted pruning via differentiable sparsity allocation, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 592-607.
The Hinge method is described in: Y. Li, S. Gu, C. Mayer, L. Van Gool, R. Timofte, Group sparsity: The hinge between filter pruning and decomposition for network compression, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8015-8024.
The DHP method is described in: Y. Li, S. Gu, K. Zhang, L. Van Gool, R. Timofte, DHP: Differentiable meta pruning via hypernetworks, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 608-624.
The FBS method is described in: X. Gao, Y. Zhao, L. Dudziak, R. D. Mullins, C. Xu, Dynamic channel pruning: Feature boosting and suppression, in: Proceedings of the International Conference on Learning Representations (ICLR), 2019.
The BAS method is described in: B. Ehteshami Bejnordi, T. Blankevoort, M. Welling, Batch-shaping for learning conditional channel gated networks, in: Proceedings of the International Conference on Learning Representations (ICLR), 2020.
Comparative example 2
The same experiment as in example 2 was performed, except that pruning was performed with the Baseline, SFP, FPGM, FBS and BAS methods, respectively, instead of the dynamic pruning method of example 2.
The Baseline method is described in: K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.
Comparative example 3
The same experiment as in example 3 was performed except that the dynamic pruning method in example 3 was replaced with the Baseline method, SFP method, FPGM method, HRank method, DSA method, Hinge method, DHP method, FBS method, BAS method, respectively, for pruning.
The HRank method is described in: M. Lin, R. Ji, Y. Wang, Y. Zhang, B. Zhang, Y. Tian, L. Shao, HRank: Filter pruning using high-rank feature map, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 1526-1535.
Comparative example 4
The same experiment as in example 4 was performed except that the dynamic pruning method in example 4 was replaced with the Baseline method and the BAS method, respectively, for pruning.
Comparative example 5
The same experiment as in example 5 was performed except that pruning was performed using the Baseline method, CAC method, BAS method instead of the dynamic pruning method in example 5, respectively.
The CAC method is described in: Z. Chen, T. Xu, C. Du, C. Liu, H. He, Dynamical channel pruning by conditional accuracy change for deep neural networks, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 32(2), 2021, pp. 799-.
Comparative example 6
The same experiment as in example 6 was performed except that pruning was performed using the Baseline method, CAC method, BAS method instead of the dynamic pruning method in example 6, respectively.
Comparative example 7
The same experiment as in example 7 was performed, except that the dynamic pruning method in example 7 was replaced with the Baseline, SFP, FPGM, DSA, LCCN, CGNet, FBS and BAS methods, respectively, for pruning.
The LCCN method is described in: X. Dong, J. Huang, Y. Yang, S. Yan, More is less: A more complicated network with less inference complexity, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1895-1903.
The CGNet method is described in: W. Hua, Y. Zhou, C. De Sa, Z. Zhang, G. E. Suh, Channel gating neural networks, in: Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2019, pp. 1884-1894.
Experimental example 1
The pruning results of examples 1-3 and comparative examples 1-3 were collected and are shown in Table 3:
Table 3 (contents reproduced as an image in the original document)
The two sets of results in examples 1, 2, and 3 are the performance at the minimum classification error and the maximum compression ratio, respectively, and it can be seen from the two sets of results that examples 1-3 can achieve a better tradeoff between classification error and compression ratio.
In this experimental example, the classification error is the proportion of misclassified samples in the total number of samples. The Top-1 classification error is calculated as:

$$\mathrm{Err}_{top\text{-}1}=1-\frac{1}{N}\sum_{i=1}^{N}\mathbb{1}\big(y_i=\mathrm{top1}(z_i)\big)$$

where $y_i$ is the class label of the i-th sample, N is the total number of samples, $\mathrm{top1}(z_i)$ is the index of the largest element of the vector $z_i$, and $z_i$ is the network output:

$$z_i=\mathrm{NN}\big(x_i^0\big)\in\mathbb{R}^{N_c}$$

where $x_i^0$ denotes the i-th sample input to the network, i.e., the input of the 0-th convolutional layer; NN denotes the convolutional neural network, composed of a number of convolutional layers; $z_i$ is the output of the network and $N_c$ is the number of classes. The indicator $\mathbb{1}(y_i=\mathrm{top1}(z_i))$ counts the correctly identified samples: it takes the value 1 when the top-1 prediction $\mathrm{top1}(z_i)$ is consistent with the class label $y_i$, and 0 otherwise.
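In code, the Top-1 classification error reduces to the following sketch, where logits stacks the network outputs z_i over the whole test set (an assumed calling convention):

```python
import torch


def top1_error(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Top-1 error: fraction of samples whose highest-scoring class is wrong.

    logits: (N, N_c) network outputs z_i.  labels: (N,) class labels y_i.
    """
    pred = logits.argmax(dim=1)                         # top1(z_i)
    return 1.0 - (pred == labels).float().mean().item()
```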
The pruning ratio is calculated as:

$$\mathrm{Pruned}=\frac{1}{N}\sum_{i=1}^{N}\frac{\sum_{l\in\Omega} r_i^l\cdot \mathrm{comp}^l}{\mathrm{comp}_{NN\_original}}$$

where $r_i^l$ denotes the ratio of zero elements of the gating vector $g_i^l$ of the l-th convolutional layer, which can also be regarded as the computation pruning ratio of the current layer; $\mathrm{comp}^l$ denotes the computation amount of the l-th convolutional layer, and $\mathrm{comp}_{NN\_original}$ denotes the computation amount of the original (unpruned) convolutional network. The pruned computation of each sample is obtained from the pruning ratio of each gated convolutional layer weighted by its computation amount, and the pruning ratio of the network is obtained by averaging over the samples and dividing by the computation amount of the original network.
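The per-sample computation pruning ratio can be estimated as sketched below; the lists of per-layer gates and FLOP counts are assumed inputs, and the values returned for all test samples would then be averaged as described above.

```python
import torch


def pruned_ratio(gates, layer_flops, total_flops):
    """Fraction of the original network computation pruned for one sample.

    gates      : list of gate vectors g^l recorded for this sample, one per gated layer.
    layer_flops: list of FLOP counts of the corresponding convolutional layers.
    total_flops: FLOP count of the original (unpruned) network.
    """
    saved = 0.0
    for g, flops in zip(gates, layer_flops):
        zero_ratio = (g == 0).float().mean().item()   # ratio of closed channels r^l
        saved += zero_ratio * flops
    return saved / total_flops
```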
As can be seen from Table 3, the methods in embodiments 1, 2 and 3 obtain lower classification errors at higher pruning rates, significantly improve pruning performance, and are superior to the existing advanced network pruning methods.
Experimental example 2
The pruning results of examples 4-6 and comparative examples 4-6 were collected and are shown in Table 4:
Table 4 (contents reproduced as an image in the original document)
As with examples 1-3, the two sets of results in examples 4-6 are performance at minimum classification error and maximum compression ratio, respectively, and it can be seen from the two sets of results that examples 4-6 can achieve a better tradeoff between classification error and compression ratio.
As can be seen from Table 4, the methods in embodiments 4, 5 and 6 obtain lower classification errors at higher pruning rates, significantly improve pruning performance, and are superior to the existing advanced network pruning methods.
Experimental example 3
The pruning results of example 7 and comparative example 7 were collected and are shown in Table 5:
Table 5 (contents reproduced as an image in the original document)
As with examples 1-3, the two sets of results in example 7 are performance at minimum classification error and maximum compression ratio, respectively, and it can be seen from the two sets of results that example 7 can achieve a better tradeoff between classification error and compression ratio.
In this experimental example, the Top-5 classification error is calculated as:

$$\mathrm{Err}_{top\text{-}5}=1-\frac{1}{N}\sum_{i=1}^{N}\mathbb{1}\big(y_i\in\mathrm{top5}(z_i)\big)$$

where $\mathrm{top5}(z_i)$ denotes the set of indices of the 5 largest elements of the vector $z_i$; if $y_i$ belongs to this set, $\mathbb{1}(y_i\in\mathrm{top5}(z_i))$ takes the value 1, and 0 otherwise. The remaining parameters have the same meaning as the corresponding parameters in the Top-1 classification error.
As can be seen from Tables 3, 4 and 5, the methods in embodiments 1 to 7 obtain lower classification errors on different datasets at higher pruning rates, and are superior to the existing advanced network pruning methods.
In the description of the present invention, it should be noted that the terms "upper", "lower", "inner", "outer", "front", "rear", etc. indicate orientations or positional relationships based on operational states of the present invention, and are only for convenience of description and simplification of description, but do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," "third," and "fourth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise specifically stated or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; the connection may be direct or indirect via an intermediate medium, and may be a communication between the two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The present invention has been described above in connection with preferred embodiments, but these embodiments are merely exemplary and merely illustrative. On the basis of the above, the invention can be subjected to various substitutions and modifications, and the substitutions and the modifications are all within the protection scope of the invention.

Claims (5)

1. An image classification accuracy improving method is characterized by comprising the following steps:
acquiring an image dataset;
setting a convolutional neural network;
performing dynamic network pruning on the convolutional neural network with a feature-gating coupling method to obtain an optimized network;
inputting the images to be classified into the optimized network and classifying the images with the optimized network,
the feature-gated coupling method comprises the steps of:
step 1, obtaining a characteristic space and a gating space in a dynamic pruning network;
step 2, obtaining an example neighborhood relationship in a feature space;
step 3, aligning the example neighborhood relationship between the gated space and the feature space;
step 4, acquiring a total target loss function of the dynamic pruning network;
step 5, updating network parameters;
step 2 comprises the following sub-steps:
step 21, pooling the instance features to obtain instance pooling vectors $\tilde{x}_i^l$;
step 22, obtaining the instance similarity matrix $S^l$ from the instance pooling vectors $\tilde{x}_i^l$;
step 23, determining the nearest-neighbor instances of each instance from the similarity matrix $S^l$, and taking the set of nearest-neighbor instance indices as the self-supervision signal;
in step 22, the similarity between different instances is obtained by measuring the instance pooling vectors $\tilde{x}_i^l$ of different instances, the measure being a dot product;
in step 23, the i-th row of $S^l$ is sorted to obtain the column indices of its k largest elements, and the set of instances corresponding to those indices is used as the self-supervision signal of instance i;
step 3 comprises the following sub-steps:
step 31, acquiring the probability that instance j is a positive instance of instance i;
step 32, acquiring a contrastive loss function from the positive-instance probabilities and minimizing it, thereby aligning the instance neighborhood relationship of the gating space with that of the feature space;
the positive instance refers to a nearest-neighbor instance of instance i;
the contrastive loss function $\mathcal{L}_{FGC}^l$ is:

$$\mathcal{L}_{FGC}^{l}=-\sum_{j\in\mathcal{N}_i^l}\log p^l(j\mid i)$$

where $\mathcal{N}_i^l$ denotes the nearest-neighbor set of instance i, and $p^l(j\mid i)$ is the probability that an input instance is identified in the gating space as a positive instance of instance i.
2. The image classification accuracy improvement method according to claim 1, wherein in step 31, the probability that an input instance is identified in the gating space as a positive instance of instance i is:

$$p^l(j\mid i)=\frac{\exp\big((\pi_j^l)^{\mathrm T}\pi_i^l/\tau\big)}{\sum_{k=1}^{N}\exp\big((\pi_k^l)^{\mathrm T}\pi_i^l/\tau\big)}$$

where l denotes the l-th convolutional layer of the convolutional neural network, $\pi_i^l$ is the gating probability output by the gating module for instance i, $\pi_j^l$ is the gating probability output by the gating module for instance j, and τ is a temperature hyperparameter.
3. The image classification accuracy improvement method according to claim 1, wherein in step 4, the total objective loss function is:

$$\mathcal{L}=\mathcal{L}_{CE}+\sum_{l\in\Omega}\big(\eta\,\mathcal{L}_{FGC}^{l}+\rho\,\mathcal{L}_{L0}^{l}\big)$$

where $\mathcal{L}_{CE}$ is the cross-entropy loss function, $\mathcal{L}_{FGC}^{l}$ denotes the contrastive loss function of the l-th layer, and $\mathcal{L}_{L0}^{l}$ is the L0-norm loss function of the l-th layer; η and ρ are coefficients, and Ω is the set of network layer indices.
4. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-3.
5. A computer-readable storage medium having computer instructions stored thereon for causing the computer to perform the method of any one of claims 1-3.
CN202111229240.5A 2021-10-21 2021-10-21 Image classification precision improving method Active CN114037857B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111229240.5A CN114037857B (en) 2021-10-21 2021-10-21 Image classification precision improving method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111229240.5A CN114037857B (en) 2021-10-21 2021-10-21 Image classification precision improving method

Publications (2)

Publication Number Publication Date
CN114037857A CN114037857A (en) 2022-02-11
CN114037857B true CN114037857B (en) 2022-09-23

Family

ID=80135096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111229240.5A Active CN114037857B (en) 2021-10-21 2021-10-21 Image classification precision improving method

Country Status (1)

Country Link
CN (1) CN114037857B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210620A (en) * 2019-06-04 2019-09-06 北京邮电大学 A kind of channel pruning method for deep neural network
CN111242277A (en) * 2019-12-27 2020-06-05 中国电子科技集团公司第五十二研究所 Convolutional neural network accelerator supporting sparse pruning and based on FPGA design
CN111368699A (en) * 2020-02-28 2020-07-03 交叉信息核心技术研究院(西安)有限公司 Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111626330A (en) * 2020-04-23 2020-09-04 南京邮电大学 Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation
CN112508955A (en) * 2021-02-08 2021-03-16 中国科学院自动化研究所 Method for detecting living cell morphology based on deep neural network and related product
CN113239981A (en) * 2021-04-23 2021-08-10 中国科学院大学 Image classification method of local feature coupling global representation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2711153C2 (en) * 2018-05-23 2020-01-15 Общество С Ограниченной Ответственностью "Яндекс" Methods and electronic devices for determination of intent associated with uttered utterance of user
CN112734025B (en) * 2019-10-28 2023-07-21 复旦大学 Neural network parameter sparsification method based on fixed base regularization
CN113077044A (en) * 2021-03-18 2021-07-06 北京工业大学 General lossless compression and acceleration method for convolutional neural network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210620A (en) * 2019-06-04 2019-09-06 北京邮电大学 A kind of channel pruning method for deep neural network
CN111242277A (en) * 2019-12-27 2020-06-05 中国电子科技集团公司第五十二研究所 Convolutional neural network accelerator supporting sparse pruning and based on FPGA design
CN111368699A (en) * 2020-02-28 2020-07-03 交叉信息核心技术研究院(西安)有限公司 Convolutional neural network pruning method based on patterns and pattern perception accelerator
CN111626330A (en) * 2020-04-23 2020-09-04 南京邮电大学 Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation
CN112508955A (en) * 2021-02-08 2021-03-16 中国科学院自动化研究所 Method for detecting living cell morphology based on deep neural network and related product
CN113239981A (en) * 2021-04-23 2021-08-10 中国科学院大学 Image classification method of local feature coupling global representation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on model compression based on convolutional neural networks; Yao Yang; China Master's Theses Full-text Database, Information Science and Technology; 2021-04-15 (No. 04); pp. I132-677 *

Also Published As

Publication number Publication date
CN114037857A (en) 2022-02-11

Similar Documents

Publication Publication Date Title
WO2020083073A1 (en) Non-motorized vehicle image multi-label classification method, system, device and storage medium
US10354170B2 (en) Method and apparatus of establishing image search relevance prediction model, and image search method and apparatus
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
WO2015165372A1 (en) Method and apparatus for classifying object based on social networking service, and storage medium
CN111709493B (en) Object classification method, training device, object classification equipment and storage medium
CN112232087B (en) Specific aspect emotion analysis method of multi-granularity attention model based on Transformer
CN110929848A (en) Training and tracking method based on multi-challenge perception learning model
US20230185998A1 (en) System and method for ai-assisted system design
US20200082213A1 (en) Sample processing method and device
CN112488301B (en) Food inversion method based on multitask learning and attention mechanism
CN111353534B (en) Graph data category prediction method based on adaptive fractional order gradient
CN112101364A (en) Semantic segmentation method based on parameter importance incremental learning
CN114417058A (en) Video material screening method and device, computer equipment and storage medium
JP2022117941A (en) Image searching method and device, electronic apparatus, and computer readable storage medium
WO2021178981A9 (en) Hardware-friendly multi-model compression of neural networks
CN114581868A (en) Image analysis method and device based on model channel pruning
CN111079011A (en) Deep learning-based information recommendation method
CN111126501B (en) Image identification method, terminal equipment and storage medium
CN113190696A (en) Training method of user screening model, user pushing method and related devices
CN114037857B (en) Image classification precision improving method
CN111651660A (en) Method for cross-media retrieval of difficult samples
CN112244863A (en) Signal identification method, signal identification device, electronic device and readable storage medium
CN116433980A (en) Image classification method, device, equipment and medium of impulse neural network structure
CN110990630A (en) Video question-answering method based on graph modeling visual information and guided by using questions
CN115601578A (en) Multi-view clustering method and system based on self-walking learning and view weighting

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant