CN116630697B - Image classification method based on biased selection pooling - Google Patents

Publication number: CN116630697B (application CN202310552011.XA; published as CN116630697A)
Authority: CN (China)
Prior art keywords: pooling, biased, alpha, parameters, image classification
Legal status: Active
Application number: CN202310552011.XA
Other languages: Chinese (zh)
Other versions: CN116630697A
Inventors: 任璐, 李浩, 柳文章, 宋坤
Current assignee: Anhui University
Original assignee: Anhui University
Application filed by Anhui University
Priority to CN202310552011.XA
Publication of CN116630697A (application publication)
Application granted; publication of CN116630697B


Classifications

    • G06V 10/764: arrangements for image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
    • G06V 10/82: arrangements for image or video recognition or understanding using neural networks
    • G06N 3/0464: computing arrangements based on biological models; neural networks; convolutional networks [CNN, ConvNet]
    • G06N 3/084: neural network learning methods; backpropagation, e.g. using gradient descent
    • G06N 3/09: neural network learning methods; supervised learning
    • Y02T 10/40: climate change mitigation technologies related to transportation; engine management systems

Abstract

The invention discloses an image classification method based on biased selection pooling, comprising the following steps: data preprocessing and model definition, including defining a set of hyperparameters [α_1, α_2, ..., α_k] and initializing the parameters of a mask [β_1, β_2, ..., β_k]; defining an optimizer, a loss function and a learning-rate decay strategy, and setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch; and feeding the training set into the model for forward propagation, extracting local image features through the convolution layers. By adjusting the hyperparameters, the invention realizes different feature-extraction targets, solves the problem of inaccurately retained information caused by using only maximum pooling or average pooling in traditional image classification, and improves classification accuracy; compared with maximum pooling and average pooling it is more flexible and adapts better to varied data characteristics and tasks.

Description

Image classification method based on biased selection pooling
Technical Field
The invention relates to the technical field of deep learning, in particular to an image classification method based on biased selection pooling.
Background
With the rapid development of artificial intelligence, deep neural networks have achieved excellent performance on tasks such as computer vision, speech recognition and autonomous driving. Deep convolutional neural networks typically use pooling to reduce the size of the feature map and, in turn, the size of the model. The pooling layer is introduced to enlarge the receptive field and reduce the computational cost of subsequent convolutions; as the feature map shrinks, the number of parameters and the amount of computation decrease, which also helps prevent overfitting to a certain extent.
Max pooling and mean pooling are the two pooling operations most common in deep learning. In the forward pass of max pooling, the node with the strongest response in each region of the feature map is passed to the next layer, which captures the edges and texture structure of the image well; mean pooling averages the nodes of the selected region, which reduces the shift of the estimated mean and is therefore good at capturing the background features of the image.
Since there is no clear boundary between edge information and background information, much of the information on the feature map of a neural network lies between the two. Relying only on these two pooling modes therefore causes the network to ignore that intermediate information, so the retained information is not accurate enough; and as model depth increases, both modes lose still more information, reducing the expressive power and generalization ability of the model.
Disclosure of Invention
The invention aims to provide an image classification method based on biased selection pooling, so as to solve the problem in the prior art that the information extracted by maximum pooling and average pooling is not accurate enough, and to adaptively adjust the pooling weights according to the feature distribution of the input data.
In order to achieve the above purpose, the present invention provides the following technical solution: an image classification method based on biased selection pooling, comprising the following steps:
S1: preprocess the dataset to obtain a training set, and define the model;
S2: define a set of hyperparameters [α_1, α_2, ..., α_k] and initialize the parameters of a mask [β_1, β_2, ..., β_k]; define an optimizer, a loss function and a learning-rate decay strategy, and set hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
S3: take the training set as the input of the model, start iterative training and perform forward propagation;
S4: obtain k corresponding groups of outputs [z_1, z_2, ..., z_k] by adjusting the hyperparameters, so as to control the selection over the pooling regions;
S5: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask to obtain the output y_m of the m-th pooling region, and apply biased selection pooling to all pooling regions to obtain the output feature map y of the biased selection pooling layer;
S6: after forward propagation ends, classify the feature map y output by the model with a classifier, compute the loss, back-propagate to update the parameters of the model and the mask, and adjust the learning rate;
S7: repeat steps S3-S6 until all iterations are finished.
Preferably, obtaining the corresponding k groups of outputs [z_1, z_2, ..., z_k] in step S4 by adjusting the hyperparameters to control the selection over the pooling regions specifically comprises the following steps:
S41: the image passes through convolution layers to extract local features; when it reaches a pooling layer, the feature map is sent to the biased selection pooling layer, and the feature values of the m-th pooling region are recorded as I_m = [x_1, x_2, ..., x_n];
specifically, if the size of the pooling region in S41 is h×w, then n = h×w, and m ranges from 1 to the number of pooling regions.
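As an illustration of S41, a minimal sketch of splitting a feature map into its non-overlapping pooling regions, each flattened to I_m = [x_1, ..., x_n]; the 2×2 window with stride 2 matches the configuration used in the embodiment, and all names are illustrative:

```python
import numpy as np

def pooling_regions(fmap, h=2, w=2):
    """Split an H x W feature map into non-overlapping h x w pooling
    regions; region m is flattened to I_m = [x_1, ..., x_n], n = h*w."""
    H, W = fmap.shape
    regions = []
    for r in range(0, H - h + 1, h):        # stride equal to the window size
        for c in range(0, W - w + 1, w):
            regions.append(fmap[r:r+h, c:c+w].reshape(-1))
    return np.stack(regions)                # shape: (number of regions, n)

fmap = np.arange(16.0).reshape(4, 4)
I = pooling_regions(fmap)                   # 4 regions, each with n = 4 values
```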
Moreover, the biased selection pooling layer is compatible with maximum pooling and average pooling through the setting of α: with the number of α values set to 1, if α is set to 0 the biased selection pooling layer is an average pooling layer; if α is set to a sufficiently large number, the biased selection pooling layer is a maximum pooling layer.
S42: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], compute the weight of each feature value under each hyperparameter, obtaining k sets of weights; that is, using α_1 yields a corresponding set of weights [w_1^(1), w_2^(1), ..., w_n^(1)];
Specifically, the biased selection function in S42 is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + exp(α_i x_2) + ... + exp(α_i x_n))
where, for the n feature values of a pooling region, each α_i corresponds to one set of weights [w_1^(i), w_2^(i), ..., w_n^(i)]; thus the k different hyperparameters α_i correspond to k sets of weights and k different output feature values z_i, where i ∈ [1, k], j ∈ [1, n].
Specifically, the basic properties of the biased selection function in S42 are:
z|_{α=0} = (x_1 + x_2 + ... + x_n)/n,  z|_{α→+∞} = max(x_1, ..., x_n),  z|_{α→−∞} = min(x_1, ..., x_n)
where, for I_m = [x_1, x_2, ..., x_n]: when α = 0, each feature value x_j has weight coefficient 1/n, and the value of z corresponds to mean pooling; when α → +∞, the largest feature value x_max has weight coefficient 1 and the remaining feature values have weight coefficient 0, so z is equivalent to maximum pooling; similarly, when α → −∞, z is equivalent to minimum pooling; for other values of α, the weight coefficient of each feature value is w_j = exp(α x_j) / (exp(α x_1) + ... + exp(α x_n)).
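These limiting properties can be checked numerically. The sketch below assumes the softmax-weighted form of the biased selection function (consistent with the later remark in S6 that softmax is differentiable); the function name is illustrative:

```python
import numpy as np

def biased_select(x, alpha):
    """z = sum_j w_j x_j with w = softmax(alpha * x) over one pooling region."""
    s = alpha * x
    w = np.exp(s - s.max())   # shift by max(alpha * x) for numerical stability
    w /= w.sum()
    return float(w @ x)

region = np.array([1.0, 2.0, 3.0, 4.0])
z_mean = biased_select(region, 0.0)    # alpha = 0  -> every weight is 1/n: mean pooling
z_max  = biased_select(region, 50.0)   # alpha >> 0 -> approaches max pooling
z_min  = biased_select(region, -50.0)  # alpha << 0 -> approaches min pooling
```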
S43: for the feature values I_m = [x_1, x_2, ..., x_n], use each hyperparameter α_i together with its corresponding weights to compute the weighted sum of the feature values of the pooling region, obtaining the corresponding k groups of output [z_1, z_2, ..., z_k] feature maps.
Preferably, in S5 the weighted summation of the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask is:
y_m = β_1 z_1 + β_2 z_2 + ... + β_k z_k
Preferably, the mask in S5 is updated along with the neural network parameters, and the value of each mask parameter β should lie between 0 and 1.
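A minimal sketch of the fusion step for one pooling region, with the mask clipped into [0, 1] as described; all values here are illustrative:

```python
import numpy as np

# k = 3 per-alpha outputs [z_1, z_2, z_3] for one pooling region (illustrative)
z = np.array([2.5, 3.2, 3.9])
# raw mask parameters after a gradient update; one has drifted below 0
beta = np.array([0.5, -0.1, 0.7])

beta = np.clip(beta, 0.0, 1.0)   # limit the mask values to [0, 1]
y_m = float(beta @ z)            # y_m = beta_1*z_1 + beta_2*z_2 + beta_3*z_3
```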
Preferably, the loss in S6 is calculated as:
Loss = L_cls + R,  R = (β_1 + β_2 + ... + β_k − 1)^2
where L_cls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise it is a positive number representing the degree of deviation between the sum of the weights and 1.
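A sketch of one loss form consistent with this description: a cross-entropy classification term plus a residual R that is zero when the mask weights sum to 1 and positive otherwise. The squared-deviation form of R is an assumption of this sketch; the text does not spell out the exact expression:

```python
import numpy as np

def total_loss(logits, label, beta):
    """Cross-entropy classification loss plus the mask residual term.
    R = (sum(beta) - 1)^2 is assumed: zero when the mask weights sum
    to 1, positive and growing with the deviation otherwise."""
    e = np.exp(logits - logits.max())       # stable softmax
    log_prob = np.log(e / e.sum())
    ce = -log_prob[label]                   # cross-entropy for the true class
    R = (beta.sum() - 1.0) ** 2             # mask residual term
    return float(ce + R)

logits = np.array([2.0, 0.5, 0.1])
loss_ok  = total_loss(logits, 0, np.array([0.3, 0.3, 0.4]))  # weights sum to 1, R ~ 0
loss_off = total_loss(logits, 0, np.array([0.6, 0.6, 0.4]))  # sum 1.6, R = 0.36
```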
Preferably, in the update phase of training in S6, the gradients of all network parameters are updated according to the computed error derivatives; in the biased selection pooling layer, the gradient update is proportional to the weights computed during forward propagation.
An image classification device based on biased selection pooling, comprising:
a definition module, used for data preprocessing and defining the model, defining a set of hyperparameters [α_1, α_2, ..., α_k] and initializing the parameters of a mask [β_1, β_2, ..., β_k]; it is also used for defining an optimizer, a loss function and a learning-rate decay strategy, and for setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
a processing module, used for feeding the training set into the model for forward propagation, where the local features of the image are extracted through the convolution layers; after forward propagation ends, it classifies the features output by the model with a classifier, computes the loss, back-propagates, updates the parameters of the mask, and adjusts the learning rate until all iterations are finished.
Compared with the prior art, the invention has the following beneficial effects:
1. compared with maximum pooling and mean pooling, the invention assigns each feature value a corresponding weight through α and fuses all feature values in the output, so information is better retained. Meanwhile, adding the mask avoids the drawback of being able to use only one strategy per pooling operation, which better improves the generalization ability of the model;
2. the invention is also compatible with maximum pooling and mean pooling: when only one α is set, the mask layer can be removed and, by setting α to 0 or to a large value, biased selection pooling becomes equivalent to mean pooling or maximum pooling respectively. Biased selection pooling thus offers more modes of operation, stronger selectivity and a more flexible way of extracting features;
3. the selection over the pooling regions can be controlled by adjusting the hyperparameters to accommodate different scenarios and datasets. Meanwhile, the biased selection pooling method can be combined with other neural network structures, such as convolutional neural networks and recurrent neural networks, to improve model performance;
4. the weighted summation in the biased selection function is performed over all inputs, so the model tolerates noise better and its output is more stable; the method thus assists in building accurate, stable models with good generalization.
Drawings
FIG. 1 is the main flow chart of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 2 is a flowchart of the specific method of obtaining k corresponding groups of outputs by adjusting the hyperparameters in the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the algorithm framework of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the biased selection function of the image classification method based on biased selection pooling according to an embodiment of the present invention;
FIG. 5 is a flowchart of the specific steps of another image classification method based on biased selection pooling according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The main execution body of the method in this embodiment is a terminal, which may be a device such as a mobile phone, a tablet computer, a PDA, a notebook or a desktop computer, or of course another device with similar functions; this embodiment is not limited thereto.
Referring to FIGS. 1 to 5, the present invention provides an image classification method based on biased selection pooling, applied to the fields of pattern recognition and deep learning, comprising:
S1: preprocess the dataset to obtain a training set, and define the model;
the model comprises several convolution layers for extracting features, with pooling layers between them to reduce the size of the activation maps, and finally a fully connected layer used as the classification layer.
S2: define a set of hyperparameters [α_1, α_2, ..., α_k] and initialize the parameters of a mask [β_1, β_2, ..., β_k]; define an optimizer, a loss function and a learning-rate decay strategy, and set hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
the number and the values of the hyperparameters α need to be defined in advance and have a certain influence on the expressive power of the model, so several experiments are needed to determine the best value range and the best combination of hyperparameter values for different data distributions; through the setting of α, the biased selection pooling layer is compatible with maximum pooling and mean pooling, and can even use the maximum-value and mean-value feature maps at the same time;
with the number of α values set to 1, if α is set to 0 the biased selection pooling layer is a mean pooling layer; if α is set to a sufficiently large number, it is a maximum pooling layer.
S3: take the training set as the input of the model, start iterative training and perform forward propagation;
S4: obtain k corresponding groups of outputs [z_1, z_2, ..., z_k] by adjusting the hyperparameters, so as to control the selection over the pooling regions;
wherein obtaining the k groups of outputs [z_1, z_2, ..., z_k] in S4 by adjusting the hyperparameters to control the selection over the pooling regions specifically comprises the following steps:
S41: the image passes through convolution layers to extract local features; when it reaches a pooling layer, the feature map is sent to the biased selection pooling layer, and the feature values of the m-th pooling region are recorded as I_m = [x_1, x_2, ..., x_n];
specifically, if the size of the pooling region in S41 is h×w, then n = h×w, and m ranges from 1 to the number of pooling regions; in S41, the biased selection pooling layer is compatible with maximum pooling and mean pooling through the setting of α: with the number of α values set to 1 and α set to 0, the biased selection pooling layer is a mean pooling layer; if α is set to a sufficiently large number, it is a maximum pooling layer;
S42: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], compute the weight of each feature value under each hyperparameter, obtaining k sets of weights; that is, using α_1 yields a corresponding set of weights [w_1^(1), w_2^(1), ..., w_n^(1)];
specifically, the biased selection function in S42 is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + exp(α_i x_2) + ... + exp(α_i x_n))
where each α_i corresponds to one set of weights [w_1^(i), w_2^(i), ..., w_n^(i)]; thus the k different hyperparameters α_i correspond to k sets of weights and k different output feature values z_i, where i ∈ [1, k], j ∈ [1, n].
Specifically, the basic properties of the biased selection function in S42 are:
z|_{α=0} = (x_1 + x_2 + ... + x_n)/n,  z|_{α→+∞} = max(x_1, ..., x_n),  z|_{α→−∞} = min(x_1, ..., x_n)
where, for I_m = [x_1, x_2, ..., x_n]: when α = 0, each feature value x_j has weight coefficient 1/n and z corresponds to mean pooling; when α → +∞, the largest feature value x_max has weight coefficient 1, the remaining feature values have weight coefficient 0, and z is equivalent to maximum pooling; similarly, when α → −∞, z is equivalent to minimum pooling; for other values of α, the weight coefficient of each feature value is w_j = exp(α x_j) / (exp(α x_1) + ... + exp(α x_n)).
S43: for the feature values I_m = [x_1, x_2, ..., x_n], use each hyperparameter α_i together with its corresponding weights to compute the weighted sum of the feature values of the pooling region, obtaining the corresponding k groups of output [z_1, z_2, ..., z_k] feature maps.
In this embodiment, as shown in FIG. 2, which is a flowchart of the specific method of obtaining the k groups of outputs by adjusting the hyperparameters, the present invention, compared with maximum pooling and mean pooling, assigns each feature value a corresponding weight through α and outputs fused feature values, so information is better retained; the selection over the pooling regions is controlled by adjusting the hyperparameters so as to adapt to different scenarios and datasets;
S5: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask to obtain the output y_m of the m-th pooling region, and apply biased selection pooling to all pooling regions to obtain the output feature map y of the biased selection pooling layer;
in S5 the weighted summation of the outputs [z_1, z_2, ..., z_k] with the parameters of the input mask is:
y_m = β_1 z_1 + β_2 z_2 + ... + β_k z_k
since each z_i is output under a different weight group, after the mask is added, end-to-end training of the model can better balance the contributions of the different feature maps and make the output selective;
the mask in S5 is updated along with the neural network parameters, and the value of each mask parameter β lies between 0 and 1; the mask is added so that the model can select the corresponding pooling weights according to the feature distribution of the input data. Since the mask is updated with the network parameters while β must lie between 0 and 1, the mask values need to be limited by clipping to keep them in a reasonable range and avoid harming model performance.
In this embodiment, adding the mask avoids the drawback of being able to use only one strategy per pooling operation, which better improves the generalization ability of the model. The invention is also compatible with maximum pooling and mean pooling: when only one α is set, the mask layer can be removed and, by setting α to 0 or to a large value, biased selection pooling becomes equivalent to mean pooling or maximum pooling respectively, so biased selection pooling offers more modes of operation, stronger selectivity and more flexible feature extraction. The biased selection pooling method can also be combined with other neural network structures, such as convolutional and recurrent neural networks, to improve model performance. Since the weighted summation in the biased selection function is performed over all inputs, the model tolerates noise better and its output is more stable, assisting the construction of accurate, stable models with good generalization.
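Putting the preceding steps together, a minimal end-to-end sketch of the biased selection pooling forward pass (2×2 window, stride 2, as in the embodiment; the softmax-weighted form of the biased selection function is assumed, and all names are illustrative):

```python
import numpy as np

def biased_pool_forward(fmap, alphas, beta, size=2):
    """Forward pass of a biased selection pooling layer: each region is
    pooled once per alpha via softmax(alpha * x) weights, and the k
    outputs are fused with the mask beta (clipped into [0, 1])."""
    H, W = fmap.shape
    out = np.empty((H // size, W // size))
    for r in range(0, H, size):
        for c in range(0, W, size):
            x = fmap[r:r+size, c:c+size].reshape(-1)   # region I_m
            z = []
            for a in alphas:                           # k outputs [z_1, ..., z_k]
                s = a * x
                w = np.exp(s - s.max())
                z.append((w / w.sum()) @ x)
            out[r // size, c // size] = np.clip(beta, 0.0, 1.0) @ np.array(z)
    return out

fmap = np.arange(16.0).reshape(4, 4)
# alpha = 0 gives a mean-like output, alpha = 50 a max-like one; beta fuses them
y = biased_pool_forward(fmap, alphas=[0.0, 50.0], beta=np.array([0.5, 0.5]))
```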
S6: after forward propagation ends, classify the feature map y output by the model with a classifier, compute the loss, back-propagate to update the parameters of the model and the mask, and adjust the learning rate;
the loss in S6 is calculated as:
Loss = L_cls + R,  R = (β_1 + β_2 + ... + β_k − 1)^2
where L_cls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise it is a positive number representing the degree of deviation between the sum of the weights and 1.
In the update phase of training in S6, the gradients of all network parameters are updated according to the computed error derivatives; in the biased selection pooling layer, the gradient update is proportional to the weights computed during forward propagation. Unlike max pooling, since softmax is differentiable, a gradient is computed for every node with non-zero weight within the region.
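This gradient behaviour can be illustrated with a finite-difference check. The analytic form dz/dx_t = w_t (1 + α (x_t − z)) below is derived here for the softmax-weighted pooling function and is an addition of this sketch, not a formula stated in the text; note that it is proportional to the forward-pass weight w_t, so every node with non-zero weight receives gradient:

```python
import numpy as np

def pool_and_grad(x, alpha):
    """Return z = softmax(alpha*x) . x and its analytic gradient
    dz/dx_t = w_t * (1 + alpha * (x_t - z))."""
    s = alpha * x
    w = np.exp(s - s.max())
    w /= w.sum()
    z = w @ x
    return z, w * (1.0 + alpha * (x - z))

x = np.array([1.0, 2.0, 3.0, 4.0])
z, g = pool_and_grad(x, alpha=1.0)

# central finite-difference check of the analytic gradient
eps = 1e-6
num = np.empty_like(x)
for t in range(len(x)):
    xp, xm = x.copy(), x.copy()
    xp[t] += eps
    xm[t] -= eps
    num[t] = (pool_and_grad(xp, 1.0)[0] - pool_and_grad(xm, 1.0)[0]) / (2 * eps)
```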
S7: repeat steps S3-S6 until all iterations are finished.
For a better understanding of the foregoing embodiments, as shown in FIG. 5, the present invention further provides a flowchart of the specific steps of an image classification method based on biased selection pooling, comprising at least:
step 201: define the model, comprising 3 convolution layers, 1 biased selection pooling layer, 2 convolution layers and 2 fully connected layers; the convolution kernel size of the convolution layers is 3, with padding 1 and stride 1; the kernel size of the biased selection pooling layer is 2, with padding 0 and stride 2; perform data preprocessing and define the related components, including the learning-rate decay strategy, the choice of optimizer, and so on;
step 202: initialize the training round i, i = 1, ..., M;
step 203: forward-propagate the model, extracting features through the convolution layers until reaching the j-th biased selection pooling layer;
step 204: based on the biased selection function and the given set of hyperparameters [α_1, α_2, ..., α_k], first compute the weight of each feature value under each hyperparameter to obtain k sets of weights, then obtain k groups of output feature maps by weighted summation of the feature values;
wherein the biased selection function is:
z_i = w_1^(i) x_1 + w_2^(i) x_2 + ... + w_n^(i) x_n, where w_j^(i) = exp(α_i x_j) / (exp(α_i x_1) + ... + exp(α_i x_n))
By giving different hyperparameters α, the outputs [z_1, z_2, ..., z_k] under different weights can be obtained simultaneously; as α increases, the weight coefficients of the nodes with stronger responses become larger and influence the result more; conversely, the smaller α is, the larger the weight coefficients of the nodes with smaller feature values become, and the more they influence the result;
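This effect of α can be illustrated directly: over a grid of α values (the one used later in the experiments), the pooled output of a fixed region increases monotonically with α, moving from near the minimum toward the maximum. The softmax-weighted form of the biased selection function is assumed:

```python
import numpy as np

def biased_select(x, alpha):
    """Softmax(alpha * x)-weighted sum over one pooling region."""
    s = alpha * x
    w = np.exp(s - s.max())
    return float((w / w.sum()) @ x)

x = np.array([0.5, 1.0, 2.0, 4.0])
alphas = [-10.0, -3.0, -1.0, 0.0, 1.0, 3.0, 10.0]
zs = [biased_select(x, a) for a in alphas]
# larger alpha -> nodes with stronger responses get larger weights, pulling z upward
```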
step 205: weight and sum the outputs [z_1, z_2, ..., z_k] with the parameters of the mask to obtain the output of the biased selection pooling layer;
step 206: the structural framework of biased selection pooling is shown in FIG. 3, a schematic diagram of the algorithm framework according to an embodiment of the present invention, and the properties of the biased selection function are shown in FIG. 4, a schematic diagram of the biased selection function according to an embodiment of the present invention; after pooling by the biased selection pooling layer is finished, the pooled feature map is sent to the next convolution layer.
Step 207, judging whether the forward propagation in the model is finished, if so, proceeding to step 208; otherwise, continuing the process of step 203-step 207;
step 208, after the model is transmitted in the forward direction, starting backward transmission, and updating parameters and masks in the model;
step 209, determining whether the current iteration is performed, i.e. whether i=m is satisfied, if so, proceeding to step 210; otherwise, i=i+1, returning to step 202 to start the next iteration;
and 210, after model training, performing model test, and ending.
For a better understanding of the above embodiments, the following describes the technical effects of the present invention in combination with related experiments.
(1) Experimental details
Experimental hardware environment: Ubuntu, Quadro RTX 6000 graphics card.
Code execution environment: PyCharm 2022.2.4, Python 3.8.
The datasets used for the experiments were CIFAR-10, CIFAR-100 and ImageNet-1k. CIFAR-10 is a computer-vision dataset for general object recognition containing 60,000 images in total; CIFAR-100 has 100 classes, each image carrying a fine-grained label and a coarse-grained label, with the 100 classes further grouped into 20 superclasses, also containing 60,000 images in total; ImageNet-1k is a large-scale labeled image dataset organized according to the WordNet hierarchy, with images belonging to 1000 different categories.
Data preprocessing: for the CIFAR-10 and CIFAR-100 datasets, the training-set images are padded by 4 on each side and randomly cropped back to 32×32, then randomly flipped horizontally with p set to 0.5. The images of the ImageNet-1k dataset are all resized to 448×448; its training-set images are randomly flipped with p set to 0.5 and then randomly rotated with degree set to (-30, 30). All image data are converted to tensors and normalized.
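A minimal numpy sketch of the CIFAR-style augmentation just described (pad by 4, random 32×32 crop, horizontal flip with p = 0.5); in practice this would be done with a vision library, and all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, pad=4, crop=32, p_flip=0.5):
    """Pad-then-random-crop plus random horizontal flip on a
    channels-last H x W x C image array."""
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)))  # zero-pad H and W
    r = rng.integers(0, padded.shape[0] - crop + 1)         # random crop offset
    c = rng.integers(0, padded.shape[1] - crop + 1)
    out = padded[r:r + crop, c:c + crop]
    if rng.random() < p_flip:
        out = out[:, ::-1]                                  # horizontal flip
    return out

img = rng.random((32, 32, 3))
aug = augment(img)
```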
Definition of the model: 3 convolution layers, 1 pooling layer, 2 convolution layers and 2 fully connected layers; the convolution kernel size of the convolution layers is 3, with padding 1 and stride 1; the kernel size of the biased selection pooling layer is 2, with padding 0 and stride 2. Comparative tests are performed with the pooling layer using maximum pooling, mean pooling and biased selection pooling respectively.
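The spatial sizes this model produces can be checked with the standard convolution/pooling size formula; the walkthrough below assumes a 32×32 input such as CIFAR (the input size is not stated for this model definition):

```python
def out_size(n, k, p, s):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 32                             # assumed CIFAR-sized input
for _ in range(3):
    n = out_size(n, 3, 1, 1)       # three conv layers (k=3, p=1, s=1): 32 -> 32
n = out_size(n, 2, 0, 2)           # biased selection pooling (k=2, p=0, s=2): 32 -> 16
for _ in range(2):
    n = out_size(n, 3, 1, 1)       # two more conv layers: 16 -> 16
```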
Super parameter setting: for [ alpha ] 12 ,...,α k ]Is set to [ -10, -3, -1,0,1,3,10 []The mask initial value is set to 1.
(2) Analysis of experimental results
The pooling layer was implemented with maximum pooling, mean pooling and biased selection pooling respectively, and comparative experiments were performed on the three datasets; the experimental results are shown in Table 1.
TABLE 1 (classification accuracy on the test sets, %)

Pooling method              CIFAR-10    CIFAR-100    ImageNet-1k
Max pooling                 83.1        68.7         63.3
Mean pooling                82.9        69.5         63.1
Biased selection pooling    84.7        70.3         64.7
Referring to Table 1, the classification accuracy of the invention on the CIFAR-10, CIFAR-100 and ImageNet-1k test sets is 84.7%, 70.3% and 64.7% respectively; compared with maximum pooling the classification accuracy improves by 1.6, 1.6 and 1.4 percentage points, and compared with mean pooling by 1.8, 0.8 and 1.6 percentage points. In summary, the proposed method effectively addresses the inadequate information extraction of maximum pooling and mean pooling in traditional deep convolutional neural networks; it fully fuses the information of the different feature values, and by setting different hyperparameters it can use different pooling modes, further improving the stability of the model and its tolerance to noise.
On the basis of the embodiment, the invention also provides an image classification device based on biased selective pooling, which is used for supporting the image classification method based on biased selective pooling of the embodiment, and comprises the following steps:
a definition module, used for data preprocessing and model definition, for defining a set of hyperparameters [α1, α2, ..., αk] and initializing the mask parameters [β1, β2, ..., βk]; the module is also used for defining the optimizer, the loss function and the learning-rate decay strategy, and for setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
a processing module, used for feeding the training set into the model for forward propagation, in which the convolutional layers extract local features from the images; after forward propagation, the features output by the model are classified by a classifier, the loss is computed, backpropagation is performed, the mask parameters are updated and the learning rate is adjusted, until all iterations are completed.
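The mask-weighted fusion of the per-α pooling outputs that the processing module relies on can be sketched as follows; the function name and the numeric values are illustrative only:

```python
def fuse_outputs(zs, betas):
    # y_m = beta_1 * z_1 + ... + beta_k * z_k: the weighted sum of the
    # k per-alpha pooling outputs for one region, as described in S5.
    return sum(b * z for b, z in zip(betas, zs))

zs = [2.5, 3.9, 1.1]      # per-alpha outputs for one region (illustrative)
betas = [0.2, 0.7, 0.1]   # mask parameters, ideally summing to 1
y_m = fuse_outputs(zs, betas)
```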
Further, the image classification device based on biased selective pooling can execute the image classification method based on biased selective pooling; for the specific implementation, reference may be made to the method embodiment, which is not repeated here.
On the basis of the above embodiment, the present invention further provides an image classification apparatus based on biased selective pooling, including:
the device comprises a processor and a memory, wherein the processor is in communication connection with the memory;
in this embodiment, the memory may be implemented in any suitable manner, for example: the memory may be a read-only memory, a mechanical hard disk, a solid-state drive, a USB flash drive or the like; the memory is used for storing executable instructions to be executed by at least one of the processors;
in this embodiment, the processor may be implemented in any suitable manner; for example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller or an embedded microcontroller; the processor is configured to execute the executable instructions to implement the image classification method based on biased selective pooling as described above.
On the basis of the above embodiment, the present invention further provides a computer readable storage medium, in which a computer program is stored, which when executed by a processor implements the image classification method based on biased selection pooling as described above.
Those of ordinary skill in the art will appreciate that the modules and method steps of the examples described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus, device and module described above may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or units may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or apparatuses, which may be in electrical, mechanical or other form.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk or other various media capable of storing program instructions.
In addition, it should be noted that the combination of the technical features described in the present invention is not limited to the combination described in the claims or the combination described in the specific embodiments, and all the technical features described in the present invention may be freely combined or combined in any manner unless contradiction occurs between them.
It should be noted that the above-mentioned embodiments are merely examples of the present invention, and it is obvious that the present invention is not limited to the above-mentioned embodiments, and many similar variations are possible. All modifications attainable or obvious from the present disclosure set forth herein should be deemed to be within the scope of the present disclosure.
The foregoing is merely illustrative of the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. An image classification method based on biased selection pooling is characterized by comprising the following steps:
s1: preprocessing data of a data set to obtain a training set and defining a model;
s2: defining a set of hyperparameters [α1, α2, ..., αk], initializing the mask parameters [β1, β2, ..., βk], defining an optimizer, a loss function and a learning-rate decay strategy, and setting hyperparameters including the learning rate lr, the number of iterations epoch and the batch size batch;
s3: taking the training set as the input of the model, starting iterative training and carrying out forward propagation;
s4: adjusting the hyperparameters [α1, α2, ..., αk] to obtain the corresponding k groups of outputs [z1, z2, ..., zk], thereby controlling the selection over the pooling region;
the method specifically comprises the following steps:
s41: local features are extracted from the image by the convolutional layers; when the feature map reaches the pooling layer, it is sent into the biased selective pooling layer, and the eigenvalues of the m-th pooling region are denoted Im = [x1, x2, ..., xn];
if the size of the pooling region is h×w, the number of eigenvalues in the region is n = h×w, and m ranges from 1 to the number of pooling regions; by setting α, the biased pooling layer is compatible with both max pooling and mean pooling: if α is set to 0, every eigenvalue receives an equal weight and the biased pooling layer is a mean pooling layer; if α is set to a sufficiently large number, the biased pooling layer is a max pooling layer;
s42: based on the biased selection function and the given set of hyperparameters [α1, α2, ..., αk], the weight of each eigenvalue under each hyperparameter is calculated, obtaining k sets of weights; that is, when αi is used, a corresponding set of weights [wi1, wi2, ..., win] is obtained, where wij denotes the weight of eigenvalue xj under αi, i ∈ [1, k] and j ∈ [1, n];
S43: for the input eigenvalues Im = [x1, x2, ..., xn], each hyperparameter αi, together with its corresponding weights [wi1, wi2, ..., win], is used to weight and sum the eigenvalues of the pooling region, obtaining the corresponding k output feature maps [z1, z2, ..., zk];
s5: through the input mask parameters [β1, β2, ..., βk], the outputs [z1, z2, ..., zk] are weighted and summed to obtain the m-th pooling region output ym; performing biased selective pooling over all pooling regions yields the output feature map y of the biased pooling layer;
s6: after forward propagation is finished, classifying the feature map y output by the model through a classifier, computing the loss and performing backpropagation to update the parameters of the model and of the mask, and adjusting the learning rate;
s7: repeating steps S3-S6 until all iterations are completed.
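Taken together, steps S41-S5 amount to the following illustrative sketch (plain Python, outside the claims); the softmax-style per-α weighting is an assumption inferred from the limit properties stated in the following claims, and the function name is hypothetical:

```python
import math

def biased_select_pool2d(fmap, alphas, betas):
    # 2x2, stride-2 biased selective pooling of a 2-D feature map.
    # Per alpha, a weighted sum of the region is computed (S42/S43);
    # the k per-alpha outputs are fused by the mask parameters (S5).
    h, w = len(fmap), len(fmap[0])
    out = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            region = [fmap[r][c], fmap[r][c + 1],
                      fmap[r + 1][c], fmap[r + 1][c + 1]]
            zs = []
            for a in alphas:  # one pooled output per alpha
                exps = [math.exp(a * x) for x in region]
                s = sum(exps)
                zs.append(sum((e / s) * x for e, x in zip(exps, region)))
            # mask-weighted fusion of the k outputs
            row.append(sum(b * z for b, z in zip(betas, zs)))
        out.append(row)
    return out

fmap = [[1.0, 2.0, 5.0, 6.0],
        [3.0, 4.0, 7.0, 8.0]]
pooled = biased_select_pool2d(fmap, alphas=[0.0, 10.0], betas=[0.5, 0.5])
# each output mixes a mean-like and a max-like response of its 2x2 region
```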
2. The biased selective pooling-based image classification method according to claim 1, wherein the biased selection function in S42 is:

wij = exp(αi·xj) / Σt exp(αi·xt), t = 1, ..., n

wherein, for the n eigenvalues of a pooling region, each αi corresponds to a set of weights [wi1, wi2, ..., win]; thus the k different hyperparameters αi correspond to k sets of weights and k different output eigenvalues zi, where i ∈ [1, k] and j ∈ [1, n].
3. The biased selective pooling-based image classification method according to claim 1, wherein the basic properties of the biased selection function in S42 are:

z = Σj wj·xj, with wj = exp(α·xj) / Σt exp(α·xt)

wherein, for Im = [x1, x2, ..., xn]: when α = 0, the weight coefficient of every eigenvalue xj is 1/n, and the value of z corresponds to mean pooling; when α → +∞, the weight coefficient of the maximum eigenvalue xmax is 1 and the weight coefficients of the remaining eigenvalues are 0, so z is equivalent to max pooling; similarly, when α → −∞, z is equivalent to min pooling; when α takes other values, the weight coefficient of each eigenvalue is exp(α·xj) / Σt exp(α·xt).
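The limiting behaviour in this claim can be checked directly on a softmax-style weighting, which is an assumed form consistent with the stated properties:

```latex
% Assumed biased selection weighting and its limits
z(\alpha) \;=\; \sum_{j=1}^{n} w_j(\alpha)\, x_j,
\qquad
w_j(\alpha) \;=\; \frac{e^{\alpha x_j}}{\sum_{t=1}^{n} e^{\alpha x_t}}.

% \alpha = 0: every weight equals 1/n, so z(0) = \frac{1}{n}\sum_j x_j  (mean pooling)
% \alpha \to +\infty: w_j \to 1 for x_j = x_{\max} and 0 otherwise, so z \to x_{\max}  (max pooling)
% \alpha \to -\infty: z \to x_{\min}  (min pooling)
```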
4. The biased selective pooling-based image classification method according to claim 1, wherein the weighted summation of the outputs [z1, z2, ..., zk] by the input mask parameters in S5 is:

ym = β1·z1 + β2·z2 + ... + βk·zk.
5. The biased selective pooling-based image classification method according to claim 1, wherein the mask parameters in S5 are updated along with the updates of the neural network parameters, and the size of each mask parameter β should lie between 0 and 1.
6. The biased selective pooling-based image classification method according to claim 1, wherein the loss in S6 is calculated as:

L = Lcls + R, with R = (β1 + β2 + ... + βk − 1)²

wherein Lcls is the classification loss and R is a residual term evaluating the deviation between the sum of all mask weights and 1: if the sum of all mask weights equals 1, the residual term is 0; otherwise the residual term is a positive number representing the degree of deviation between the weight sum and 1.
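Because the loss expression itself is not reproduced in the text above, the following sketch assumes a squared-deviation residual, which matches the described behaviour (zero when the mask weights sum to 1, positive otherwise); the additive combination and the weighting factor lam are likewise assumptions:

```python
def mask_residual(betas):
    # R = (sum(beta) - 1)^2: zero exactly when the mask weights sum to 1,
    # otherwise positive and growing with the deviation (assumed form).
    return (sum(betas) - 1.0) ** 2

def total_loss(cls_loss, betas, lam=1.0):
    # Assumed combined objective: classification loss plus weighted residual.
    return cls_loss + lam * mask_residual(betas)

r_balanced = mask_residual([0.5, 0.5])   # weights sum to 1 -> residual 0
r_off = mask_residual([1.0, 1.0])        # weights sum to 2 -> residual 1
```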
7. The biased selective pooling-based image classification method according to claim 1, wherein, in the update phase of the training in S6, the gradients of all network parameters are updated according to the calculated error derivatives, and in the biased selective pooling layer the gradient updates are proportional to the weights calculated during forward propagation.
CN202310552011.XA 2023-05-17 2023-05-17 Image classification method based on biased selection pooling Active CN116630697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310552011.XA CN116630697B (en) 2023-05-17 2023-05-17 Image classification method based on biased selection pooling


Publications (2)

Publication Number Publication Date
CN116630697A CN116630697A (en) 2023-08-22
CN116630697B (en) 2024-04-05

Family

ID=87620622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310552011.XA Active CN116630697B (en) 2023-05-17 2023-05-17 Image classification method based on biased selection pooling

Country Status (1)

Country Link
CN (1) CN116630697B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110069958A (en) * 2018-01-22 2019-07-30 北京航空航天大学 A kind of EEG signals method for quickly identifying of dense depth convolutional neural networks
CN114863348A (en) * 2022-06-10 2022-08-05 西安电子科技大学 Video target segmentation method based on self-supervision
CN115100076A (en) * 2022-07-24 2022-09-23 西安电子科技大学 Low-light image defogging method based on context-aware attention
CN115424076A (en) * 2022-09-16 2022-12-02 安徽大学 Image classification method based on self-adaptive pooling mode

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019079180A1 (en) * 2017-10-16 2019-04-25 Illumina, Inc. Deep convolutional neural networks for variant classification
US11922316B2 (en) * 2019-10-15 2024-03-05 Lg Electronics Inc. Training a neural network using periodic sampling over model weights


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Reinforcement Learning With Task Decomposition for Cooperative Multiagent Systems; Changyin Sun et al.; IEEE Transactions on Neural Networks and Learning Systems; 2020-06-17; full text *
Image classification method with parameterized pooling convolutional neural network; Jiang Zetao; Qin Jiaqi; Zhang Shaoqin; Acta Electronica Sinica; 2020-09-15 (No. 09); full text *
Research on CNN-based identification of rotor crack and misalignment states; Zhao Wang; China Excellent Master's Theses Electronic Journals; 2021-07-15; full text *


Similar Documents

Publication Publication Date Title
CN106845529B (en) Image feature identification method based on multi-view convolution neural network
WO2019136772A1 (en) Blurred image restoration method, apparatus and device, and storage medium
CN111507419B (en) Training method and device of image classification model
CN109754078A (en) Method for optimization neural network
CN109690576A (en) The training machine learning model in multiple machine learning tasks
CN109492674B (en) Generation method and device of SSD (solid State disk) framework for target detection
CN110852447A (en) Meta learning method and apparatus, initialization method, computing device, and storage medium
US11176672B1 (en) Machine learning method, machine learning device, and machine learning program
CN112580728B (en) Dynamic link prediction model robustness enhancement method based on reinforcement learning
CN112101207B (en) Target tracking method and device, electronic equipment and readable storage medium
CN114511576B (en) Image segmentation method and system of scale self-adaptive feature enhanced deep neural network
CN111626379B (en) X-ray image detection method for pneumonia
CN110991621A (en) Method for searching convolutional neural network based on channel number
CN111260660A (en) 3D point cloud semantic segmentation migration method based on meta-learning
CN112488313A (en) Convolutional neural network model compression method based on explicit weight
CN112837320A (en) Remote sensing image semantic segmentation method based on parallel hole convolution
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN106651790A (en) Image de-blurring method, device and equipment
WO2022077894A1 (en) Image classification and apparatus, and related components
CN116630697B (en) Image classification method based on biased selection pooling
CN114758130B (en) Image processing and model training method, device, equipment and storage medium
CN114677547B (en) Image classification method based on self-holding characterization expansion type incremental learning
CN110866866A (en) Image color-matching processing method and device, electronic device and storage medium
CN116246126A (en) Iterative unsupervised domain self-adaption method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant