CN116630697B - Image classification method based on biased selection pooling - Google Patents

Publication number: CN116630697B (application CN202310552011.XA; published as CN116630697A)
Authority: CN (China)
Prior art keywords: pooling, biased, alpha, parameters, image classification
Legal status: Active
Application number: CN202310552011.XA
Other languages: Chinese (zh)
Other versions: CN116630697A
Inventors: 任璐, 李浩, 柳文章, 宋坤
Current assignee: Anhui University
Original assignee: Anhui University
Application filed by Anhui University
Priority to CN202310552011.XA
Publication of CN116630697A (application publication)
Application granted; publication of CN116630697B


Classifications

    • G06V 10/764: arrangements for image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
    • G06V 10/82: arrangements for image or video recognition or understanding using neural networks
    • G06N 3/0464: computing arrangements based on biological models; neural networks; convolutional networks [CNN, ConvNet]
    • G06N 3/084: neural network learning methods; backpropagation, e.g. using gradient descent
    • G06N 3/09: neural network learning methods; supervised learning
    • Y02T 10/40: climate change mitigation technologies related to transportation; engine management systems

Abstract

The invention discloses an image classification method based on biased selection pooling, comprising the following steps: data preprocessing and model definition, including defining a set of hyperparameters [α_1, α_2, ..., α_k] and initializing the parameters of a mask [β_1, β_2, ..., β_k]; defining an optimizer, a loss function and a learning-rate decay strategy, and setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch; and feeding the training set into the model for forward propagation, extracting local image features through the convolution layers. By adjusting the hyperparameters, the invention realizes different feature-extraction targets, solves the problem of inaccurately retained information caused by using only maximum pooling or average pooling in traditional image classification, and improves classification accuracy; compared with maximum pooling and average pooling it is more flexible and adapts better to varied data characteristics and tasks.

Description

Image classification method based on biased selection pooling
Technical Field
The invention relates to the technical field of deep learning, in particular to an image classification method based on biased selection pooling.
Background
With the rapid development of artificial intelligence, deep neural networks have achieved excellent performance on tasks such as computer vision, speech recognition and autonomous driving. Deep convolutional neural networks typically use pooling to reduce the size of the feature map and, in turn, the size of the model. The pooling layer is introduced to enlarge the receptive field and reduce the computational cost of subsequent convolutions; as the feature map shrinks, the number of parameters and the amount of computation decrease, which also helps prevent overfitting to a certain extent.
Max pooling and mean pooling are the two pooling operations most common in deep learning. In the forward pass of max pooling, the node with the strongest response in each region of the feature map is passed to the next layer, which captures the edges and texture structure of the image well; mean pooling averages the nodes of the selected region, which reduces the shift of the estimated mean and is therefore good at capturing the background features of the image.
Since there is no clear boundary between edge information and background information, much of the information on the feature map of a neural network lies between the two. Relying only on these two pooling modes therefore causes the network to ignore that intermediate information, so the retained information is not accurate enough; and as model depth increases, both modes lose still more information, reducing the expressive power and generalization ability of the model.
Disclosure of Invention
The invention aims to provide an image classification method based on biased selection pooling, so as to solve the problem in the prior art that the information extracted by maximum pooling and average pooling is not accurate enough, and to adaptively adjust the pooling weights according to the feature distribution of the input data.
In order to achieve the above purpose, the present invention provides the following technical solution: an image classification method based on biased selection pooling, comprising the following steps:
S1: preprocess the dataset to obtain a training set, and define the model;
S2: define a set of hyperparameters [α_1, α_2, ..., α_k] and initialize the parameters of a mask [β_1, β_2, ..., β_k]; define an optimizer, a loss function and a learning-rate decay strategy, and set hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
S3: take the training set as the input of the model, start iterative training and perform forward propagation;
S4: obtain k corresponding groups of outputs [z_1, z_2, ..., z_k] by adjusting the hyperparameters, so as to control the selection over the pooling regions;
S5: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask to obtain the output y_m of the m-th pooling region, and apply biased selection pooling to all pooling regions to obtain the output feature map y of the biased selection pooling layer;
S6: after forward propagation ends, classify the feature map y output by the model with a classifier, compute the loss, back-propagate to update the parameters of the model and the mask, and adjust the learning rate;
S7: repeat steps S3-S6 until all iterations are finished.
Preferably, obtaining the corresponding k groups of outputs [z_1, z_2, ..., z_k] in step S4 by adjusting the hyperparameters to control the selection over the pooling regions specifically comprises the following steps:
S41: the image passes through convolution layers to extract local features; when it reaches a pooling layer, the feature map is sent to the biased selection pooling layer, and the feature values of the m-th pooling region are recorded as I_m = [x_1, x_2, ..., x_n];
specifically, if the size of the pooling region in S41 is h×w, then n = h×w, and m ranges from 1 to the number of pooling regions.
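As an illustration of S41, a minimal sketch of splitting a feature map into its non-overlapping pooling regions, each flattened to I_m = [x_1, ..., x_n]; the 2×2 window with stride 2 matches the configuration used in the embodiment, and all names are illustrative:

```python
import numpy as np

def pooling_regions(fmap, h=2, w=2):
    """Split an H x W feature map into non-overlapping h x w pooling
    regions; region m is flattened to I_m = [x_1, ..., x_n], n = h*w."""
    H, W = fmap.shape
    regions = []
    for r in range(0, H - h + 1, h):        # stride equal to the window size
        for c in range(0, W - w + 1, w):
            regions.append(fmap[r:r+h, c:c+w].reshape(-1))
    return np.stack(regions)                # shape: (number of regions, n)

fmap = np.arange(16.0).reshape(4, 4)
I = pooling_regions(fmap)                   # 4 regions, each with n = 4 values
```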
Moreover, the biased selection pooling layer is compatible with maximum pooling and average pooling through the setting of α: with the number of α values set to 1, if α is set to 0 the biased selection pooling layer is an average pooling layer; if α is set to a sufficiently large number, the biased selection pooling layer is a maximum pooling layer.
S42: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], compute the weight of each feature value under each hyperparameter, obtaining k sets of weights; that is, using α_1 yields a corresponding set of weights [w_1^(1), w_2^(1), ..., w_n^(1)];
Specifically, the biased selection function in S42 is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + exp(α_i x_2) + ... + exp(α_i x_n))
where, for the n feature values of a pooling region, each α_i corresponds to one set of weights [w_1^(i), w_2^(i), ..., w_n^(i)]; thus the k different hyperparameters α_i correspond to k sets of weights and k different output feature values z_i, where i ∈ [1, k], j ∈ [1, n].
Specifically, the basic properties of the biased selection function in S42 are:
z|_{α=0} = (x_1 + x_2 + ... + x_n)/n,  z|_{α→+∞} = max(x_1, ..., x_n),  z|_{α→−∞} = min(x_1, ..., x_n)
where, for I_m = [x_1, x_2, ..., x_n]: when α = 0, each feature value x_j has weight coefficient 1/n, and the value of z corresponds to mean pooling; when α → +∞, the largest feature value x_max has weight coefficient 1 and the remaining feature values have weight coefficient 0, so z is equivalent to maximum pooling; similarly, when α → −∞, z is equivalent to minimum pooling; for other values of α, the weight coefficient of each feature value is w_j = exp(α x_j) / (exp(α x_1) + ... + exp(α x_n)).
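These limiting properties can be checked numerically. The sketch below assumes the softmax-weighted form of the biased selection function (consistent with the later remark in S6 that softmax is differentiable); the function name is illustrative:

```python
import numpy as np

def biased_select(x, alpha):
    """z = sum_j w_j x_j with w = softmax(alpha * x) over one pooling region."""
    s = alpha * x
    w = np.exp(s - s.max())   # shift by max(alpha * x) for numerical stability
    w /= w.sum()
    return float(w @ x)

region = np.array([1.0, 2.0, 3.0, 4.0])
z_mean = biased_select(region, 0.0)    # alpha = 0  -> every weight is 1/n: mean pooling
z_max  = biased_select(region, 50.0)   # alpha >> 0 -> approaches max pooling
z_min  = biased_select(region, -50.0)  # alpha << 0 -> approaches min pooling
```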
S43: for the feature values I_m = [x_1, x_2, ..., x_n], use each hyperparameter α_i together with its corresponding weights to compute the weighted sum of the feature values of the pooling region, obtaining the corresponding k groups of output [z_1, z_2, ..., z_k] feature maps.
Preferably, in S5 the weighted summation of the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask is:
y_m = β_1 z_1 + β_2 z_2 + ... + β_k z_k
Preferably, the mask in S5 is updated along with the neural network parameters, and the value of each mask parameter β should lie between 0 and 1.
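A minimal sketch of the fusion step for one pooling region, with the mask clipped into [0, 1] as described; all values here are illustrative:

```python
import numpy as np

# k = 3 per-alpha outputs [z_1, z_2, z_3] for one pooling region (illustrative)
z = np.array([2.5, 3.2, 3.9])
# raw mask parameters after a gradient update; one has drifted below 0
beta = np.array([0.5, -0.1, 0.7])

beta = np.clip(beta, 0.0, 1.0)   # limit the mask values to [0, 1]
y_m = float(beta @ z)            # y_m = beta_1*z_1 + beta_2*z_2 + beta_3*z_3
```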
Preferably, the loss in S6 is calculated as:
Loss = L_cls + R,  R = (β_1 + β_2 + ... + β_k − 1)^2
where L_cls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise it is a positive number representing the degree of deviation between the sum of the weights and 1.
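A sketch of one loss form consistent with this description: a cross-entropy classification term plus a residual R that is zero when the mask weights sum to 1 and positive otherwise. The squared-deviation form of R is an assumption of this sketch; the text does not spell out the exact expression:

```python
import numpy as np

def total_loss(logits, label, beta):
    """Cross-entropy classification loss plus the mask residual term.
    R = (sum(beta) - 1)^2 is assumed: zero when the mask weights sum
    to 1, positive and growing with the deviation otherwise."""
    e = np.exp(logits - logits.max())       # stable softmax
    log_prob = np.log(e / e.sum())
    ce = -log_prob[label]                   # cross-entropy for the true class
    R = (beta.sum() - 1.0) ** 2             # mask residual term
    return float(ce + R)

logits = np.array([2.0, 0.5, 0.1])
loss_ok  = total_loss(logits, 0, np.array([0.3, 0.3, 0.4]))  # weights sum to 1, R ~ 0
loss_off = total_loss(logits, 0, np.array([0.6, 0.6, 0.4]))  # sum 1.6, R = 0.36
```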
Preferably, in the update phase of training in S6, the gradients of all network parameters are updated according to the computed error derivatives; in the biased selection pooling layer, the gradient update is proportional to the weights computed during forward propagation.
An image classification device based on biased selection pooling, comprising:
a definition module, used for data preprocessing and defining the model, defining a set of hyperparameters [α_1, α_2, ..., α_k] and initializing the parameters of a mask [β_1, β_2, ..., β_k]; it is also used for defining an optimizer, a loss function and a learning-rate decay strategy, and for setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
a processing module, used for feeding the training set into the model for forward propagation, where the local features of the image are extracted through the convolution layers; after forward propagation ends, it classifies the features output by the model with a classifier, computes the loss, back-propagates, updates the parameters of the mask, and adjusts the learning rate until all iterations are finished.
Compared with the prior art, the invention has the following beneficial effects:
1. compared with maximum pooling and mean pooling, the invention assigns each feature value a corresponding weight through α and fuses all feature values in the output, so information is better retained. Meanwhile, adding the mask avoids the drawback of being able to use only one strategy per pooling operation, which better improves the generalization ability of the model;
2. the invention is also compatible with maximum pooling and mean pooling: when only one α is set, the mask layer can be removed and, by setting α to 0 or to a large value, biased selection pooling becomes equivalent to mean pooling or maximum pooling respectively. Biased selection pooling thus offers more modes of operation, stronger selectivity and a more flexible way of extracting features;
3. the selection over the pooling regions can be controlled by adjusting the hyperparameters to accommodate different scenarios and datasets. Meanwhile, the biased selection pooling method can be combined with other neural network structures, such as convolutional neural networks and recurrent neural networks, to improve model performance;
4. the weighted summation in the biased selection function is performed over all inputs, so the model tolerates noise better and its output is more stable; the method thus assists in building accurate, stable models with good generalization.
Drawings
FIG. 1 is the main flow chart of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 2 is a flowchart of the specific method of obtaining k corresponding groups of outputs by adjusting the hyperparameters in the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the algorithm framework of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the biased selection function of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 5 is a flowchart of the specific steps of another image classification method based on biased selection pooling according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The main execution body of the method in this embodiment is a terminal, which may be a device such as a mobile phone, a tablet computer, a PDA, a notebook or a desktop computer, or of course another device with similar functions; this embodiment is not limited thereto.
Referring to FIGS. 1 to 5, the present invention provides an image classification method based on biased selection pooling, applied to the fields of pattern recognition and deep learning, comprising:
S1: preprocess the dataset to obtain a training set, and define the model;
the model comprises several convolution layers for extracting features, with pooling layers between them to reduce the size of the activation maps, and finally a fully connected layer used as the classification layer.
S2: define a set of hyperparameters [α_1, α_2, ..., α_k] and initialize the parameters of a mask [β_1, β_2, ..., β_k]; define an optimizer, a loss function and a learning-rate decay strategy, and set hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
the number and the values of the hyperparameters α need to be defined in advance and have a certain influence on the expressive power of the model, so several experiments are needed to determine the best value range and the best combination of hyperparameter values for different data distributions; through the setting of α, the biased selection pooling layer is compatible with maximum pooling and mean pooling, and can even use the maximum-value and mean-value feature maps at the same time;
with the number of α values set to 1, if α is set to 0 the biased selection pooling layer is a mean pooling layer; if α is set to a sufficiently large number, it is a maximum pooling layer.
S3: take the training set as the input of the model, start iterative training and perform forward propagation;
S4: obtain k corresponding groups of outputs [z_1, z_2, ..., z_k] by adjusting the hyperparameters, so as to control the selection over the pooling regions;
wherein obtaining the k groups of outputs [z_1, z_2, ..., z_k] in S4 by adjusting the hyperparameters to control the selection over the pooling regions specifically comprises the following steps:
S41: the image passes through convolution layers to extract local features; when it reaches a pooling layer, the feature map is sent to the biased selection pooling layer, and the feature values of the m-th pooling region are recorded as I_m = [x_1, x_2, ..., x_n];
specifically, if the size of the pooling region in S41 is h×w, then n = h×w, and m ranges from 1 to the number of pooling regions; in S41, the biased selection pooling layer is compatible with maximum pooling and mean pooling through the setting of α: with the number of α values set to 1 and α set to 0, the biased selection pooling layer is a mean pooling layer; if α is set to a sufficiently large number, it is a maximum pooling layer;
S42: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], compute the weight of each feature value under each hyperparameter, obtaining k sets of weights; that is, using α_1 yields a corresponding set of weights [w_1^(1), w_2^(1), ..., w_n^(1)];
specifically, the biased selection function in S42 is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + exp(α_i x_2) + ... + exp(α_i x_n))
where each α_i corresponds to one set of weights [w_1^(i), w_2^(i), ..., w_n^(i)]; thus the k different hyperparameters α_i correspond to k sets of weights and k different output feature values z_i, where i ∈ [1, k], j ∈ [1, n].
Specifically, the basic properties of the biased selection function in S42 are:
z|_{α=0} = (x_1 + x_2 + ... + x_n)/n,  z|_{α→+∞} = max(x_1, ..., x_n),  z|_{α→−∞} = min(x_1, ..., x_n)
where, for I_m = [x_1, x_2, ..., x_n]: when α = 0, each feature value x_j has weight coefficient 1/n and z corresponds to mean pooling; when α → +∞, the largest feature value x_max has weight coefficient 1, the remaining feature values have weight coefficient 0, and z is equivalent to maximum pooling; similarly, when α → −∞, z is equivalent to minimum pooling; for other values of α, the weight coefficient of each feature value is w_j = exp(α x_j) / (exp(α x_1) + ... + exp(α x_n)).
S43: for the feature values I_m = [x_1, x_2, ..., x_n], use each hyperparameter α_i together with its corresponding weights to compute the weighted sum of the feature values of the pooling region, obtaining the corresponding k groups of output [z_1, z_2, ..., z_k] feature maps.
In this embodiment, as shown in FIG. 2, which is a flowchart of the specific method of obtaining the k groups of outputs by adjusting the hyperparameters, the present invention, compared with maximum pooling and mean pooling, assigns each feature value a corresponding weight through α and outputs fused feature values, so information is better retained; the selection over the pooling regions is controlled by adjusting the hyperparameters so as to adapt to different scenarios and datasets;
S5: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask to obtain the output y_m of the m-th pooling region, and apply biased selection pooling to all pooling regions to obtain the output feature map y of the biased selection pooling layer;
in S5 the weighted summation of the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask is:
y_m = β_1 z_1 + β_2 z_2 + ... + β_k z_k
since each z_i is output under a different weight group, after the mask is added, end-to-end training of the model can better balance the contributions of the different feature maps and make the output selective;
the mask in S5 is updated along with the neural network parameters, and the value of each mask parameter β lies between 0 and 1; the mask is added so that the model can select the corresponding pooling weights according to the feature distribution of the input data. Since the mask is updated with the network parameters while β must lie between 0 and 1, the mask values need to be limited by clipping to keep them in a reasonable range and avoid harming model performance.
In this embodiment, adding the mask avoids the drawback of being able to use only one strategy per pooling operation, which better improves the generalization ability of the model. The invention is also compatible with maximum pooling and mean pooling: when only one α is set, the mask layer can be removed and, by setting α to 0 or to a large value, biased selection pooling becomes equivalent to mean pooling or maximum pooling respectively, so biased selection pooling offers more modes of operation, stronger selectivity and more flexible feature extraction. The biased selection pooling method can also be combined with other neural network structures, such as convolutional and recurrent neural networks, to improve model performance. Since the weighted summation in the biased selection function is performed over all inputs, the model tolerates noise better and its output is more stable, assisting the construction of accurate, stable models with good generalization.
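Putting the preceding steps together, a minimal end-to-end sketch of the biased selection pooling forward pass (2×2 window, stride 2, as in the embodiment; the softmax-weighted form of the biased selection function is assumed, and all names are illustrative):

```python
import numpy as np

def biased_pool_forward(fmap, alphas, beta, size=2):
    """Forward pass of a biased selection pooling layer: each region is
    pooled once per alpha via softmax(alpha * x) weights, and the k
    outputs are fused with the mask beta (clipped into [0, 1])."""
    H, W = fmap.shape
    out = np.empty((H // size, W // size))
    for r in range(0, H, size):
        for c in range(0, W, size):
            x = fmap[r:r+size, c:c+size].reshape(-1)   # region I_m
            z = []
            for a in alphas:                           # k outputs [z_1, ..., z_k]
                s = a * x
                w = np.exp(s - s.max())
                z.append((w / w.sum()) @ x)
            out[r // size, c // size] = np.clip(beta, 0.0, 1.0) @ np.array(z)
    return out

fmap = np.arange(16.0).reshape(4, 4)
# alpha = 0 gives a mean-like output, alpha = 50 a max-like one; beta fuses them
y = biased_pool_forward(fmap, alphas=[0.0, 50.0], beta=np.array([0.5, 0.5]))
```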
S6: after forward propagation ends, classify the feature map y output by the model with a classifier, compute the loss, back-propagate to update the parameters of the model and the mask, and adjust the learning rate;
the loss in S6 is calculated as:
Loss = L_cls + R,  R = (β_1 + β_2 + ... + β_k − 1)^2
where L_cls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise it is a positive number representing the degree of deviation between the sum of the weights and 1.
In the update phase of training in S6, the gradients of all network parameters are updated according to the computed error derivatives; in the biased selection pooling layer, the gradient update is proportional to the weights computed during forward propagation. Unlike max pooling, since softmax is differentiable, a gradient is computed for every node with non-zero weight within the region.
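This gradient behaviour can be illustrated with a finite-difference check. The analytic form dz/dx_t = w_t (1 + α (x_t − z)) below is derived here for the softmax-weighted pooling function and is an addition of this sketch, not a formula stated in the text; note that it is proportional to the forward-pass weight w_t, so every node with non-zero weight receives gradient:

```python
import numpy as np

def pool_and_grad(x, alpha):
    """Return z = softmax(alpha*x) . x and its analytic gradient
    dz/dx_t = w_t * (1 + alpha * (x_t - z))."""
    s = alpha * x
    w = np.exp(s - s.max())
    w /= w.sum()
    z = w @ x
    return z, w * (1.0 + alpha * (x - z))

x = np.array([1.0, 2.0, 3.0, 4.0])
z, g = pool_and_grad(x, alpha=1.0)

# central finite-difference check of the analytic gradient
eps = 1e-6
num = np.empty_like(x)
for t in range(len(x)):
    xp, xm = x.copy(), x.copy()
    xp[t] += eps
    xm[t] -= eps
    num[t] = (pool_and_grad(xp, 1.0)[0] - pool_and_grad(xm, 1.0)[0]) / (2 * eps)
```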
S7: repeat steps S3-S6 until all iterations are finished.
For a better understanding of the foregoing embodiments, as shown in FIG. 5, the present invention further provides a flowchart of the specific steps of an image classification method based on biased selection pooling, comprising at least:
step 201: define the model, comprising 3 convolution layers, 1 biased selection pooling layer, 2 convolution layers and 2 fully connected layers; the convolution kernel size of the convolution layers is 3, with padding 1 and stride 1; the kernel size of the biased selection pooling layer is 2, with padding 0 and stride 2; perform data preprocessing and define the related components, including the learning-rate decay strategy, the choice of optimizer, and so on;
step 202: initialize the training round i, i = 1, ..., M;
step 203: forward-propagate the model, extracting features through the convolution layers until reaching the j-th biased selection pooling layer;
step 204: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], first compute the weight of each feature value under each hyperparameter to obtain k sets of weights, then obtain k groups of output feature maps by weighted summation of the feature values;
wherein the biased selection function is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + ... + exp(α_i x_n))
By giving different hyperparameters α, the outputs [z_1, z_2, ..., z_k] under different weights can be obtained simultaneously; as α increases, the weight coefficients of the nodes with stronger responses become larger and influence the result more; conversely, the smaller α is, the larger the weight coefficients of the nodes with smaller feature values become, and the more they influence the result;
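This effect of α can be illustrated directly: over a grid of α values (the one used later in the experiments), the pooled output of a fixed region increases monotonically with α, moving from near the minimum toward the maximum. The softmax-weighted form of the biased selection function is assumed:

```python
import numpy as np

def biased_select(x, alpha):
    """Softmax(alpha * x)-weighted sum over one pooling region."""
    s = alpha * x
    w = np.exp(s - s.max())
    return float((w / w.sum()) @ x)

x = np.array([0.5, 1.0, 2.0, 4.0])
alphas = [-10.0, -3.0, -1.0, 0.0, 1.0, 3.0, 10.0]
zs = [biased_select(x, a) for a in alphas]
# larger alpha -> nodes with stronger responses get larger weights, pulling z upward
```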
step 205: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the mask to obtain the output of the biased selection pooling layer;
step 206: the structural framework of biased selection pooling is shown in FIG. 3, a schematic diagram of the algorithm framework according to an embodiment of the present invention, and the properties of the biased selection function are shown in FIG. 4, a schematic diagram of the biased selection function according to an embodiment of the present invention; after pooling by the biased selection pooling layer is finished, the pooled feature map is sent to the next convolution layer.
Step 207, judging whether the forward propagation in the model is finished, if so, proceeding to step 208; otherwise, continuing the process of step 203-step 207;
step 208, after the model is transmitted in the forward direction, starting backward transmission, and updating parameters and masks in the model;
step 209, determining whether the current iteration is performed, i.e. whether i=m is satisfied, if so, proceeding to step 210; otherwise, i=i+1, returning to step 202 to start the next iteration;
and 210, after model training, performing model test, and ending.
For a better understanding of the above embodiments, the following describes the technical effects of the present invention in combination with related experiments.
(1) Experimental details
Experimental hardware environment: Ubuntu, Quadro RTX 6000 graphics card.
Code execution environment: PyCharm 2022.2.4, Python 3.8.
The datasets used for the experiments were CIFAR-10, CIFAR-100 and ImageNet-1k. CIFAR-10 is a computer-vision dataset for general object recognition containing 60,000 images in total; CIFAR-100 has 100 classes, each image carrying a fine-grained label and a coarse-grained label, with the 100 classes further grouped into 20 superclasses, also containing 60,000 images in total; ImageNet-1k is a large-scale labeled image dataset organized according to the WordNet hierarchy, with images belonging to 1000 different categories.
Data preprocessing: for the CIFAR-10 and CIFAR-100 datasets, the training-set images are padded by 4 on each side and randomly cropped back to 32×32, then randomly flipped horizontally with p set to 0.5. The images of the ImageNet-1k dataset are all resized to 448×448; its training-set images are randomly flipped with p set to 0.5 and then randomly rotated with degree set to (-30, 30). All image data are converted to tensors and normalized.
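A minimal numpy sketch of the CIFAR-style augmentation just described (pad by 4, random 32×32 crop, horizontal flip with p = 0.5); in practice this would be done with a vision library, and all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, pad=4, crop=32, p_flip=0.5):
    """Pad-then-random-crop plus random horizontal flip on a
    channels-last H x W x C image array."""
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)))  # zero-pad H and W
    r = rng.integers(0, padded.shape[0] - crop + 1)         # random crop offset
    c = rng.integers(0, padded.shape[1] - crop + 1)
    out = padded[r:r + crop, c:c + crop]
    if rng.random() < p_flip:
        out = out[:, ::-1]                                  # horizontal flip
    return out

img = rng.random((32, 32, 3))
aug = augment(img)
```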
Definition of the model: 3 convolution layers, 1 pooling layer, 2 convolution layers and 2 fully connected layers; the convolution kernel size of the convolution layers is 3, with padding 1 and stride 1; the kernel size of the biased selection pooling layer is 2, with padding 0 and stride 2. Comparative tests are performed with the pooling layer using maximum pooling, mean pooling and biased selection pooling respectively.
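The spatial sizes this model produces can be checked with the standard convolution/pooling size formula; the walkthrough below assumes a 32×32 input such as CIFAR (the input size is not stated for this model definition):

```python
def out_size(n, k, p, s):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 32                             # assumed CIFAR-sized input
for _ in range(3):
    n = out_size(n, 3, 1, 1)       # three conv layers (k=3, p=1, s=1): 32 -> 32
n = out_size(n, 2, 0, 2)           # biased selection pooling (k=2, p=0, s=2): 32 -> 16
for _ in range(2):
    n = out_size(n, 3, 1, 1)       # two more conv layers: 16 -> 16
```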
Super parameter setting: for [ alpha ] 12 ,...,α k ]Is set to [ -10, -3, -1,0,1,3,10 []The mask initial value is set to 1.
(2) Analysis of experimental results
The pooling layer was implemented with maximum pooling, mean pooling and biased selection pooling respectively, and comparative experiments were performed on the three datasets; the experimental results are shown in Table 1.
TABLE 1 (classification accuracy on the test sets, %)

Pooling method              CIFAR-10    CIFAR-100    ImageNet-1k
Max pooling                 83.1        68.7         63.3
Mean pooling                82.9        69.5         63.1
Biased selection pooling    84.7        70.3         64.7
Referring to Table 1, the classification accuracy of the invention on the CIFAR-10, CIFAR-100 and ImageNet-1k test sets is 84.7%, 70.3% and 64.7% respectively; compared with maximum pooling the classification accuracy improves by 1.6, 1.6 and 1.4 percentage points, and compared with mean pooling by 1.8, 0.8 and 1.6 percentage points. In summary, the proposed method effectively addresses the inadequate information extraction of maximum pooling and mean pooling in traditional deep convolutional neural networks; it fully fuses the information of the different feature values, and by setting different hyperparameters it can use different pooling modes, further improving the stability of the model and its tolerance to noise.
On the basis of the embodiment, the invention also provides an image classification device based on biased selective pooling, which is used for supporting the image classification method based on biased selective pooling of the embodiment, and comprises the following steps:
a definition module, used for data preprocessing and model definition, for defining a set of hyperparameters [α1, α2, ..., αk] and initializing the mask parameters [β1, β2, ..., βk]; the module is also used for defining the optimizer, the loss function and the learning-rate decay strategy, and for setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
a processing module, used for feeding the training set into the model for forward propagation, in which the convolutional layers extract local features from the images; after forward propagation, the features output by the model are classified by a classifier, the loss is computed, backpropagation is performed, the mask parameters are updated and the learning rate is adjusted, until all iterations are completed.
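The mask-weighted fusion of the per-α pooling outputs that the processing module relies on can be sketched as follows; the function name and the numeric values are illustrative only:

```python
def fuse_outputs(zs, betas):
    # y_m = beta_1 * z_1 + ... + beta_k * z_k: the weighted sum of the
    # k per-alpha pooling outputs for one region, as described in S5.
    return sum(b * z for b, z in zip(betas, zs))

zs = [2.5, 3.9, 1.1]      # per-alpha outputs for one region (illustrative)
betas = [0.2, 0.7, 0.1]   # mask parameters, ideally summing to 1
y_m = fuse_outputs(zs, betas)
```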
Further, the image classification device based on biased selective pooling can execute the image classification method based on biased selective pooling; for the specific implementation, reference may be made to the method embodiment, which is not repeated here.
On the basis of the above embodiment, the present invention further provides an image classification apparatus based on biased selective pooling, including:
the device comprises a processor and a memory, wherein the processor is in communication connection with the memory;
in this embodiment, the memory may be implemented in any suitable manner, for example: the memory may be a read-only memory, a mechanical hard disk, a solid-state drive, a USB flash drive or the like; the memory is used for storing executable instructions to be executed by at least one of the processors;
in this embodiment, the processor may be implemented in any suitable manner; for example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller or an embedded microcontroller; the processor is configured to execute the executable instructions to implement the image classification method based on biased selective pooling as described above.
On the basis of the above embodiment, the present invention further provides a computer readable storage medium, in which a computer program is stored, which when executed by a processor implements the image classification method based on biased selection pooling as described above.
Those of ordinary skill in the art will appreciate that the modules and method steps of the examples described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus, device and module described above may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or units may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or apparatuses, which may be in electrical, mechanical or other form.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk or other various media capable of storing program instructions.
In addition, it should be noted that the combination of the technical features described in the present invention is not limited to the combination described in the claims or the combination described in the specific embodiments, and all the technical features described in the present invention may be freely combined or combined in any manner unless contradiction occurs between them.
It should be noted that the above-mentioned embodiments are merely examples of the present invention, and it is obvious that the present invention is not limited to the above-mentioned embodiments, and many similar variations are possible. All modifications attainable or obvious from the present disclosure set forth herein should be deemed to be within the scope of the present disclosure.
The foregoing is merely illustrative of the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. An image classification method based on biased selection pooling is characterized by comprising the following steps:
s1: preprocessing data of a data set to obtain a training set and defining a model;
s2: defining a set of hyperparameters [α1, α2, ..., αk], initializing the mask parameters [β1, β2, ..., βk], defining an optimizer, a loss function and a learning-rate decay strategy, and setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
s3: taking the training set as the input of the model, starting iterative training and carrying out forward propagation;
s4: adjusting the hyperparameters [α1, α2, ..., αk] to obtain the corresponding k groups of outputs [z1, z2, ..., zk], thereby controlling the selection over the pooling region;
the method specifically comprises the following steps:
s41: local features are extracted from the image by the convolutional layers; when the feature map reaches the pooling layer, it is sent into the biased selective pooling layer, and the eigenvalues of the m-th pooling region are denoted Im = [x1, x2, ..., xn];
if the size of the pooling region is h×w, the number of eigenvalues in the region is n = h×w, and m ranges from 1 to the number of pooling regions; by setting α, the biased pooling layer is compatible with both max pooling and mean pooling: if α is set to 0, every eigenvalue receives an equal weight and the biased pooling layer is a mean pooling layer; if α is set to a sufficiently large number, the biased pooling layer is a max pooling layer;
s42: based on the biased selection function and the given set of hyperparameters [α1, α2, ..., αk], the weight of each eigenvalue under each hyperparameter is calculated, obtaining k sets of weights; that is, when αi is used, a corresponding set of weights [wi1, wi2, ..., win] is obtained, where wij denotes the weight of eigenvalue xj under αi, i ∈ [1, k] and j ∈ [1, n];
S43: for the input eigenvalues Im = [x1, x2, ..., xn], each hyperparameter αi, together with its corresponding weights [wi1, wi2, ..., win], is used to weight and sum the eigenvalues of the pooling region, obtaining the corresponding k output feature maps [z1, z2, ..., zk];
s5: through the input mask parameters [β1, β2, ..., βk], the outputs [z1, z2, ..., zk] are weighted and summed to obtain the m-th pooling region output ym; performing biased selective pooling over all pooling regions yields the output feature map y of the biased pooling layer;
s6: after forward propagation is finished, classifying the feature map y output by the model through a classifier, computing the loss and performing backpropagation to update the parameters of the model and of the mask, and adjusting the learning rate;
s7: repeating steps S3-S6 until all iterations are completed.
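Taken together, steps S41-S5 amount to the following illustrative sketch (plain Python, outside the claims); the softmax-style per-α weighting is an assumption inferred from the limit properties stated in the following claims, and the function name is hypothetical:

```python
import math

def biased_select_pool2d(fmap, alphas, betas):
    # 2x2, stride-2 biased selective pooling of a 2-D feature map.
    # Per alpha, a weighted sum of the region is computed (S42/S43);
    # the k per-alpha outputs are fused by the mask parameters (S5).
    h, w = len(fmap), len(fmap[0])
    out = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            region = [fmap[r][c], fmap[r][c + 1],
                      fmap[r + 1][c], fmap[r + 1][c + 1]]
            zs = []
            for a in alphas:  # one pooled output per alpha
                exps = [math.exp(a * x) for x in region]
                s = sum(exps)
                zs.append(sum((e / s) * x for e, x in zip(exps, region)))
            # mask-weighted fusion of the k outputs
            row.append(sum(b * z for b, z in zip(betas, zs)))
        out.append(row)
    return out

fmap = [[1.0, 2.0, 5.0, 6.0],
        [3.0, 4.0, 7.0, 8.0]]
pooled = biased_select_pool2d(fmap, alphas=[0.0, 10.0], betas=[0.5, 0.5])
# each output mixes a mean-like and a max-like response of its 2x2 region
```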
2. The biased selective pooling-based image classification method according to claim 1, wherein the biased selection function in S42 is:

wij = exp(αi·xj) / Σt exp(αi·xt), t = 1, ..., n

wherein, for the n eigenvalues of a pooling region, each αi corresponds to a set of weights [wi1, wi2, ..., win]; thus the k different hyperparameters αi correspond to k sets of weights and k different output eigenvalues zi, where i ∈ [1, k] and j ∈ [1, n].
3. The biased selective pooling-based image classification method according to claim 1, wherein the basic properties of the biased selection function in S42 are:

z = Σj wj·xj, with wj = exp(α·xj) / Σt exp(α·xt)

wherein, for Im = [x1, x2, ..., xn]: when α = 0, the weight coefficient of every eigenvalue xj is 1/n, and the value of z corresponds to mean pooling; when α → +∞, the weight coefficient of the maximum eigenvalue xmax is 1 and the weight coefficients of the remaining eigenvalues are 0, so z is equivalent to max pooling; similarly, when α → −∞, z is equivalent to min pooling; when α takes other values, the weight coefficient of each eigenvalue is exp(α·xj) / Σt exp(α·xt).
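The limiting behaviour in this claim can be checked directly on a softmax-style weighting, which is an assumed form consistent with the stated properties:

```latex
% Assumed biased selection weighting and its limits
z(\alpha) \;=\; \sum_{j=1}^{n} w_j(\alpha)\, x_j,
\qquad
w_j(\alpha) \;=\; \frac{e^{\alpha x_j}}{\sum_{t=1}^{n} e^{\alpha x_t}}.

% \alpha = 0: every weight equals 1/n, so z(0) = \frac{1}{n}\sum_j x_j  (mean pooling)
% \alpha \to +\infty: w_j \to 1 for x_j = x_{\max} and 0 otherwise, so z \to x_{\max}  (max pooling)
% \alpha \to -\infty: z \to x_{\min}  (min pooling)
```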
4. The biased selective pooling-based image classification method according to claim 1, wherein the weighted summation of the outputs [z1, z2, ..., zk] by the input mask parameters in S5 is:

ym = β1·z1 + β2·z2 + ... + βk·zk.
5. The biased selective pooling-based image classification method according to claim 1, wherein the mask parameters in S5 are updated along with the updates of the neural network parameters, and the size of each mask parameter β should lie between 0 and 1.
6. The biased selective pooling-based image classification method according to claim 1, wherein the loss in S6 is calculated as:

L = Lcls + R, with R = (β1 + β2 + ... + βk − 1)²

wherein Lcls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise the residual term is a positive number representing the degree of deviation between the weight sum and 1.
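Because the loss expression itself is not reproduced in the text above, the following sketch assumes a squared-deviation residual, which matches the described behaviour (zero when the mask weights sum to 1, positive otherwise); the additive combination and the weighting factor lam are likewise assumptions:

```python
def mask_residual(betas):
    # R = (sum(beta) - 1)^2: zero exactly when the mask weights sum to 1,
    # otherwise positive and growing with the deviation (assumed form).
    return (sum(betas) - 1.0) ** 2

def total_loss(cls_loss, betas, lam=1.0):
    # Assumed combined objective: classification loss plus weighted residual.
    return cls_loss + lam * mask_residual(betas)

r_balanced = mask_residual([0.5, 0.5])   # weights sum to 1 -> residual 0
r_off = mask_residual([1.0, 1.0])        # weights sum to 2 -> residual 1
```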
7. The biased selective pooling-based image classification method according to claim 1, wherein, in the update phase of the training in S6, the gradients of all network parameters are updated according to the calculated error derivatives, and in the biased selective pooling layer the gradient updates are proportional to the weights calculated during forward propagation.
CN202310552011.XA 2023-05-17 2023-05-17 Image classification method based on biased selection pooling Active CN116630697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310552011.XA CN116630697B (en) 2023-05-17 2023-05-17 Image classification method based on biased selection pooling


Publications (2)

Publication Number Publication Date
CN116630697A CN116630697A (en) 2023-08-22
CN116630697B (en) 2024-04-05

Family

ID=87620622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310552011.XA Active CN116630697B (en) 2023-05-17 2023-05-17 Image classification method based on biased selection pooling

Country Status (1)

Country Link
CN (1) CN116630697B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110069958A (en) * 2018-01-22 2019-07-30 北京航空航天大学 A kind of EEG signals method for quickly identifying of dense depth convolutional neural networks
CN114863348A (en) * 2022-06-10 2022-08-05 西安电子科技大学 Video target segmentation method based on self-supervision
CN115100076A (en) * 2022-07-24 2022-09-23 西安电子科技大学 Low-light image defogging method based on context-aware attention
CN115424076A (en) * 2022-09-16 2022-12-02 安徽大学 Image classification method based on self-adaptive pooling mode

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019079180A1 (en) * 2017-10-16 2019-04-25 Illumina, Inc. Deep convolutional neural networks for variant classification
US11922316B2 (en) * 2019-10-15 2024-03-05 Lg Electronics Inc. Training a neural network using periodic sampling over model weights


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Reinforcement Learning With Task Decomposition for Cooperative Multiagent Systems; Changyin Sun et al.; IEEE Transactions on Neural Networks and Learning Systems; 2020-06-17; full text *
Image classification method with parameterized pooling convolutional neural network; Jiang Zetao; Qin Jiaqi; Zhang Shaoqin; Acta Electronica Sinica; 2020-09-15 (No. 09); full text *
Research on CNN-based identification of rotor crack and misalignment states; Zhao Wang; China Excellent Master's Theses Electronic Journals; 2021-07-15; full text *


Similar Documents

Publication Publication Date Title
CN106845529B (en) Image feature identification method based on multi-view convolution neural network
WO2019136772A1 (en) Blurred image restoration method, apparatus and device, and storage medium
CN111507419B (en) Training method and device of image classification model
CN109754078A (en) Method for optimization neural network
CN109690576A (en) The training machine learning model in multiple machine learning tasks
CN109492674B (en) Generation method and device of SSD (solid State disk) framework for target detection
CN110852447A (en) Meta learning method and apparatus, initialization method, computing device, and storage medium
US11176672B1 (en) Machine learning method, machine learning device, and machine learning program
CN112580728B (en) Dynamic link prediction model robustness enhancement method based on reinforcement learning
CN112101207B (en) Target tracking method and device, electronic equipment and readable storage medium
CN114511576B (en) Image segmentation method and system of scale self-adaptive feature enhanced deep neural network
CN111626379B (en) X-ray image detection method for pneumonia
CN110991621A (en) Method for searching convolutional neural network based on channel number
CN111260660A (en) 3D point cloud semantic segmentation migration method based on meta-learning
CN112488313A (en) Convolutional neural network model compression method based on explicit weight
CN112837320A (en) Remote sensing image semantic segmentation method based on parallel hole convolution
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN106651790A (en) Image de-blurring method, device and equipment
WO2022077894A1 (en) Image classification and apparatus, and related components
CN116630697B (en) Image classification method based on biased selection pooling
CN114758130B (en) Image processing and model training method, device, equipment and storage medium
CN114677547B (en) Image classification method based on self-holding characterization expansion type incremental learning
CN110866866A (en) Image color-matching processing method and device, electronic device and storage medium
CN116246126A (en) Iterative unsupervised domain self-adaption method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant