CN115797709B

CN115797709B - Image classification method, device, equipment and computer readable storage medium

Info

Publication number: CN115797709B
Application number: CN202310067363.6A
Authority: CN
Inventors: 沈艳梅; 宿栋栋; 刘伟; 阚宏伟; 王彦伟; 李仁刚
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2023-01-19
Filing date: 2023-01-19
Publication date: 2023-04-25
Anticipated expiration: 2043-01-19
Also published as: CN115797709A

Abstract

The application relates to the technical field of deep neural networks and discloses an image classification method, an image classification device, image classification equipment and a computer readable storage medium, wherein an acquired image data set is analyzed by utilizing a pre-training neural network model to obtain a characteristic image and a classification predicted value; carrying out correlation analysis on the same-category characteristic images to determine correlation coefficients among the same-category characteristic images; and determining the classification prediction deviation between the same-category characteristic images according to the classification prediction values corresponding to the same-category characteristic images. And carrying out parameter adjustment on the pre-training neural network model according to the association relation between the correlation coefficient and the classification prediction deviation, and the label value and the classification prediction value corresponding to each characteristic image, so as to analyze the acquired images to be classified by utilizing the trained neural network model and determine the image types contained in the images to be classified. And the model is finely adjusted based on the correlation between different samples, so that the accuracy of image classification and identification is improved.

Description

Image classification method, device, equipment and computer readable storage medium

Technical Field

The present disclosure relates to the field of deep neural networks, and in particular, to an image classification method, apparatus, device, and computer readable storage medium.

Background

Deep neural networks are widely used to solve various complex problems, whether classification, segmentation, object detection problems in the image field, natural language processing, speech recognition, etc., because of their powerful data processing and nonlinear data fitting capabilities. However, complex neural network structures may cause overfitting problems in addition to achieving improved performance. Overfitting means that neural network model learning too minimizes errors in the training dataset, but the performance of the test data is poor. Besides the excessively high complexity of the model, the problems of model overfitting are aggravated by reasons such as small sample number, unbalanced distribution of category number, unbalanced classification difficulty, unbalanced sample noise and the like.

In the conventional scheme for improving the model overfitting problem, improvement is mainly carried out from two aspects of model training configuration and model construction, the improvement effect of the model training configuration method is usually smaller, and the model construction can radically influence model training, so that the improvement effect is better, wherein the improvement of the model construction is mainly based on the construction of a loss function.

The current technical solutions for the construction of the loss function can be divided into two categories. One is weight fine-tuning to improve sample imbalance, and the other is to construct a loss function based on the regularization term of the distance between the sample feature images. The former focuses on the sample imbalance problem, and the latter focuses on the sample internal correlation. However, these schemes have limited improvement effects, and the neural network model still has a problem of over-fitting, so that the accuracy of image classification and identification is affected.

It can be seen that how to improve the accuracy of image classification and identification is a problem that needs to be solved by those skilled in the art.

Disclosure of Invention

An object of the embodiments of the present application is to provide an image classification method, apparatus, device, and computer readable storage medium, which can improve accuracy of image classification and identification.

In order to solve the above technical problems, an embodiment of the present application provides an image classification method, including:

analyzing the acquired image data set by utilizing a pre-training neural network model to obtain a characteristic image and a classification predicted value;

carrying out correlation analysis on the same-category characteristic images to determine correlation coefficients among the same-category characteristic images;

determining classification prediction deviation between the same-category characteristic images according to the classification prediction values corresponding to the same-category characteristic images;

According to the association relation between the correlation coefficient and the classification prediction deviation, and the label value and the classification prediction value corresponding to each characteristic image, carrying out parameter adjustment on the pre-training neural network model to obtain a trained neural network model;

and analyzing the acquired images to be classified by using the trained neural network model to determine the image types contained in the images to be classified.

Optionally, the performing parameter adjustment on the pre-training neural network model according to the association relationship between the correlation coefficient and the classification prediction bias, and the label value and the classification prediction value corresponding to each feature image, so as to obtain a trained neural network model includes:

fitting the correlation coefficient between the same-category characteristic images and the classification prediction deviation by using a support vector machine algorithm to construct a regular term for representing the correlation between samples;

constructing an error loss term based on the label value corresponding to each characteristic image and the classification predicted value;

taking the error loss term and the regular term as loss functions;

and training the pre-trained neural network model by using the loss function to obtain a trained neural network model.

Optionally, the fitting the correlation coefficient between the same class of feature images and the classification prediction bias by using a support vector machine algorithm to construct a regularization term for characterizing the correlation between samples includes:

taking the correlation coefficient and the classification prediction deviation corresponding to each category as training samples;

training the constructed regression model of the support vector machine by using the training sample to obtain a linear relation between the correlation coefficient and the classification prediction deviation;

and constructing the regular term based on the deviation between the classification prediction deviation fitted by the linear relation and the classification prediction value output by the pre-training neural network model.

Optionally, the constructing the regularization term based on a deviation between the classification prediction bias fitted by the linear relation and the classification prediction value output by the pre-trained neural network model includes:

fitting classification prediction deviation corresponding to any two characteristic images under the same class by using the linear relation;

acquiring the respective corresponding classification predicted values of any two feature images output by the pre-training neural network model;

taking the square of the difference value of the classification predicted values corresponding to any two feature images as model prediction deviation;

And taking the square of the difference value between the classification prediction deviation and the model prediction deviation as a regularization term.

Optionally, the constructing an error loss term based on the label value and the classification predicted value corresponding to each of the feature images includes:

and carrying out mean square error operation on the label value corresponding to each characteristic image and the classification predicted value to obtain an error loss term.

and performing cross entropy operation on the label value corresponding to each characteristic image and the classification predicted value to obtain an error loss term.

Optionally, the performing correlation analysis on the same-category feature images to determine the correlation coefficient between the same-category feature images includes:

calculating the image mean value corresponding to each of all the characteristic images in the target category; wherein the target category is any one of all categories contained in the pre-trained neural network model;

and constructing a correlation coefficient between any two characteristic images under the target category based on the image mean value and any two characteristic images under the target category.

Optionally, the constructing the correlation coefficient between any two feature images under the target category based on the image mean value and any two feature images under the target category includes:

inputting the image mean value and any two characteristic images in the target category into a preset correlation coefficient calculation formula to obtain a correlation coefficient between any two characteristic images in the target category;

the correlation coefficient calculation formula is as follows:

；

wherein,cthe category of the object is indicated and,irepresenting the first under target classiA number of samples of the sample were taken,jrepresenting the first under target classjA number of samples of the sample were taken,kthe number of channels is indicated and the number of channels is indicated,

representing target categoriescLower (th)iSample and the firstjThe correlation coefficient between the individual samples is such that,

represent the firstiCharacteristic image of individual samples, +.>

Represent the firstiImage mean value of characteristic images of individual samples, +.>

Represent the firstjCharacteristic image of individual samples, +.>

Represent the firstjThe image means of the feature images of the individual samples,ma row subscript representing a pixel point of the feature image,ncolumn subscripts of pixels representing the feature image.

Optionally, the determining, according to the classification prediction values corresponding to the same-category feature images, the classification prediction deviation between the same-category feature images includes:

Taking the square of L2 norms of classification predicted values corresponding to any two feature images in a target category as classification predicted deviation corresponding to any two feature images in the target category; the target category is any one of all categories contained in the pre-training neural network model.

Optionally, for the acquisition process of the image dataset, the method comprises:

acquiring an initial image dataset;

the initial image dataset is preprocessed to obtain the image dataset.

Optionally, the preprocessing the initial image dataset to obtain the image dataset includes:

cutting, turning and/or rotating the image contained in the initial image data set to obtain a new image data set;

according to the input image size of the pre-trained neural network model, and resizing the new image data set and the image contained in the initial image data set to obtain the image data set.

Optionally, after the resizing the new image dataset and the image contained in the initial image dataset according to the input image size of the pre-trained neural network model, the method further comprises:

Calculating brightness mean and variance corresponding to the size-adjusted image;

and carrying out normalization processing on the image with the adjusted size based on the brightness mean value and the variance so as to obtain a final image data set.

Optionally, before the correlation analysis is performed on the same-category feature images to determine the correlation coefficient between the same-category feature images, the method further includes:

and performing dimension reduction processing on the characteristic image according to a principal component analysis method to obtain the latest characteristic image.

Optionally, the performing the dimension reduction processing on the feature image according to the principal component analysis method to obtain the latest feature image includes:

performing decentering treatment on the target characteristic image to obtain a decentered characteristic image; wherein the target feature image is any one of the feature images;

forming an image matrix from the de-centered characteristic images under all channels;

performing eigenvalue decomposition on a covariance matrix corresponding to the image matrix to obtain eigenvalues and eigenvectors;

selecting a preset number of target characteristic values with the largest characteristic values, and forming a characteristic transformation matrix by the characteristic vectors corresponding to the target characteristic values; the preset number is smaller than the number of channels of the pre-training neural network model;

And converting the target feature images under all channels into the latest target feature images according to the feature transformation matrix.

Optionally, the performing the decentering processing on the target feature image to obtain a decentered feature image includes:

calculating image average values corresponding to the target feature images under all channels;

subtracting the image average value from the target characteristic image under each channel to obtain an off-center characteristic image. Optionally, the converting the target feature images under all channels into the latest target feature images according to the feature transformation matrix includes:

multiplying the feature transformation matrix with the target feature images under all channels to obtain the latest target feature image

Optionally, the analyzing the acquired image to be classified by using the trained neural network model to determine an image category contained in the image to be classified includes:

under the condition that an image to be classified is obtained, adjusting the size of the image to be classified according to the size of the input image of the trained neural network model;

inputting the images to be classified with the adjusted sizes into the trained neural network model to obtain classification predicted values corresponding to the images to be classified;

And taking the category corresponding to the classification predicted value with the highest value as the image category contained in the image to be classified.

The embodiment of the application also provides an image classification device, which comprises an analysis unit, a coefficient determination unit, a deviation determination unit, an adjustment unit and a category determination unit;

the analysis unit is used for analyzing the acquired image data set by utilizing the pre-training neural network model so as to obtain a characteristic image and a classification predicted value;

the coefficient determining unit is used for carrying out correlation analysis on the same-category characteristic images so as to determine the correlation coefficient between the same-category characteristic images;

the deviation determining unit is used for determining classification prediction deviation between the same-category characteristic images according to the classification prediction values corresponding to the same-category characteristic images;

the adjustment unit is used for performing parameter adjustment on the pre-training neural network model according to the association relation between the correlation coefficient and the classification prediction deviation, the label value corresponding to each characteristic image and the classification prediction value, so as to obtain a trained neural network model;

the category determining unit is used for analyzing the acquired images to be classified by using the trained neural network model so as to determine the image categories contained in the images to be classified.

Optionally, the adjusting unit comprises a fitting subunit, a constructing subunit, a training subunit and a fitting subunit;

the fitting subunit is used for fitting the correlation coefficient between the same-category characteristic images and the classification prediction deviation by using a support vector machine algorithm so as to construct a regular term for representing the correlation between samples;

the construction subunit is configured to construct an error loss term based on the label value and the classification predicted value corresponding to each of the feature images;

the sub-unit is used for taking the error loss term and the regular term as a loss function;

and the training subunit is used for training the pre-training neural network model by utilizing the loss function so as to obtain a trained neural network model.

Optionally, the fitting subunit is configured to use a correlation coefficient and a classification prediction deviation corresponding to each category as a training sample;

Optionally, the fitting subunit is configured to fit, by using the linear relationship, classification prediction deviations corresponding to any two feature images under the same class;

Optionally, the construction subunit is configured to perform a mean square error operation on the label value and the classification prediction value corresponding to each feature image, so as to obtain an error loss term.

Optionally, the construction subunit is configured to perform cross entropy operation on the tag value corresponding to each feature image and the classification prediction value, so as to obtain an error loss term.

Optionally, the coefficient determining unit includes a calculating subunit and a constructing subunit;

the calculating subunit is used for calculating the image mean value corresponding to each of all the characteristic images under the target category; wherein the target category is any one of all categories contained in the pre-trained neural network model;

The construction subunit is configured to construct a correlation coefficient between any two feature images under the target category based on the image mean value and any two feature images under the target category. Optionally, the construction subunit is configured to input the image mean value and any two feature images under the target category to a preset correlation coefficient calculation formula, so as to obtain a correlation coefficient between any two feature images under the target category;

the correlation coefficient calculation formula is as follows:

the method comprises the steps of carrying out a first treatment on the surface of the wherein,cthe category of the object is indicated and,irepresenting the first under target classiA number of samples of the sample were taken,jrepresenting the first under target classjA number of samples of the sample were taken,kindicates the number of channels>

Representing target categoriescLower (th)iSample and the firstjCorrelation coefficient between samples, < >>

Represent the firstiCharacteristic image of individual samples, +.>

Represent the firstjCharacteristic image of individual samples, +.>

Optionally, the deviation determining unit is configured to take the square of the L2 norm of the classification prediction value corresponding to any two feature images in the target category as the classification prediction deviation corresponding to any two feature images in the target category; the target category is any one of all categories contained in the pre-training neural network model. Optionally, for an acquisition process of the image dataset, the apparatus comprises an acquisition unit and a preprocessing unit;

The acquisition unit is used for acquiring an initial image data set;

the preprocessing unit is used for preprocessing the initial image data set to obtain the image data set.

Optionally, the preprocessing unit includes a processing subunit and an adjusting subunit;

the processing subunit is used for cutting, overturning and/or rotating the image contained in the initial image data set so as to obtain a new image data set;

the adjusting subunit is configured to adjust the sizes of the new image dataset and the image included in the initial image dataset according to the input image size of the pre-trained neural network model, so as to obtain the image dataset.

Optionally, the system further comprises a computing unit and a normalization processing unit;

the calculating unit is used for calculating the brightness mean and variance corresponding to the image after the size adjustment;

the normalization processing unit is used for performing normalization processing on the image with the adjusted size based on the brightness mean value and the variance so as to obtain a final image data set.

Optionally, the system further comprises a dimension reduction processing unit;

the dimension reduction processing unit is used for carrying out dimension reduction processing on the characteristic image according to a principal component analysis method so as to obtain the latest characteristic image.

Optionally, the dimension reduction processing unit comprises a decentralization subunit, a composition subunit, a decomposition subunit, a selection subunit and a conversion subunit;

the de-centering subunit is used for performing de-centering treatment on the target characteristic image to obtain a de-centering characteristic image; wherein the target feature image is any one of the feature images;

the composition subunit is used for composing the de-centered characteristic images under all channels into an image matrix;

the decomposition subunit is used for carrying out eigenvalue decomposition on the covariance matrix corresponding to the image matrix so as to obtain eigenvalues and eigenvectors;

the selecting subunit is configured to select a preset number of target feature values with the largest feature values, and form a feature transformation matrix from feature vectors corresponding to the target feature values; the preset number is smaller than the number of channels of the pre-training neural network model;

the conversion subunit is configured to convert the target feature images under all channels into the latest target feature images according to the feature transformation matrix.

Optionally, the de-centering subunit is configured to calculate an image average value corresponding to the target feature images under all channels; subtracting the image average value from the target characteristic image under each channel to obtain an off-center characteristic image.

Optionally, the conversion subunit is configured to multiply the feature transformation matrix with the target feature images under all channels to obtain the latest target feature image

Optionally, the analysis unit includes a resizing subunit, an input subunit, and a serving subunit;

the size adjustment subunit is configured to adjust, when an image to be classified is acquired, the size of the image to be classified according to the size of the input image of the trained neural network model;

the input subunit is used for inputting the images to be classified with the adjusted sizes into the trained neural network model so as to obtain classification predicted values corresponding to the images to be classified;

and the sub-unit is used for taking the category corresponding to the classification predicted value with the highest value as the image category contained in the image to be classified.

The embodiment of the application also provides electronic equipment, which comprises:

a memory for storing a computer program;

a processor for executing the computer program to implement the steps of the image classification method as described above.

Embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the image classification method as described above.

According to the technical scheme, the obtained image data set is analyzed by utilizing the pre-training neural network model so as to obtain a characteristic image and a classification predicted value; the feature image is an image containing the target object, and the classification predicted value reflects the score of each category to which the target object belongs in the feature image. In order to fully mine the correlation between the characteristic images, correlation analysis can be carried out on the characteristic images of the same category so as to determine the correlation coefficient between the characteristic images of the same category; and determining the classification prediction deviation between the same-category characteristic images according to the classification prediction values corresponding to the same-category characteristic images. The multi-channel characteristic image corresponding to each image can be regarded as one sample, and the association relationship between the correlation coefficient and the classification prediction deviation can reflect the correlation between different samples. And carrying out parameter adjustment on the pre-trained neural network model according to the association relation between the correlation coefficient and the classification prediction deviation, and the label value and the classification prediction value corresponding to each characteristic image so as to obtain the trained neural network model. And analyzing the acquired images to be classified by using the trained neural network model, so that the image types contained in the images to be classified can be determined. In the technical scheme, in order to solve the problem of over-fitting of the neural network model, the pre-trained neural network model is subjected to fine-tuning training based on correlation among different samples. By analyzing the correlation between the characteristic images of the same category, the influence of the correlation between samples on the output category is fully considered, so that the overfitting problem of the neural network model on the sample category labels is improved, the generalization capability of the neural network model is further enhanced, and the accuracy of image classification and identification is effectively improved.

Drawings

For a clearer description of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of an image classification method according to an embodiment of the present application;

fig. 2 is a flowchart of a method for acquiring an image dataset according to an embodiment of the present application;

FIG. 3 is a flowchart of a method for performing dimension reduction processing on a feature image according to an embodiment of the present application;

fig. 4 is a schematic structural diagram of an image classification device according to an embodiment of the present application;

fig. 5 is a block diagram of an electronic device according to an embodiment of the present application.

Description of the embodiments

The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments herein without making any inventive effort are intended to fall within the scope of the present application.

The terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the foregoing drawings are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may include other steps or elements not expressly listed.

In order to provide a better understanding of the present application, those skilled in the art will now make further details of the present application with reference to the drawings and detailed description.

Next, an image classification method provided in the embodiments of the present application will be described in detail. Fig. 1 is a flowchart of an image classification method according to an embodiment of the present application, where the method includes:

s101: and analyzing the acquired image data set by using the pre-training neural network model to obtain a characteristic image and a classification predicted value.

The image dataset may originate from the acquisition means and the common dataset.

The pre-training neural network model may employ a convolutional neural network model. The general structure of the convolutional neural network model can be divided into a convolutional hidden layer structure and a fully connected layer structure. The convolution hidden layer structure is used for feature extraction and obtains feature images of a plurality of channels. For an image, the feature images of the corresponding channels are one sample. And the full-connection layer structure is used as a classifier model to output a classification predicted value, so that classification is realized.

The construction of the neural network model may include model structure configuration and loss function definition. The configuration of the neural network structure comprises the number of layers of the neural network, the number and dimension of neurons in each layer, the type of activation function and the like, and the neural network structure is divided into different types according to the structural difference, and ResNet, googleNet, VGG, imageNet, inceptionNet and the like are commonly used. In consideration of learning efficiency, a trained neural network model may be directly used.

The loss function is an optimized objective function in the neural network for measuring the deviation between the generated classification predictions and the actual observed training objectives. The neural network training or optimizing process is a process of minimizing a loss function, and the smaller the loss function is, the closer the classification predicted value of the model is to the true value, and the better the model accuracy is. Two commonly used loss functions: a mean square error function and a cross entropy (Sigmoid) loss function. The mean square error function is commonly used for linear regression. Cross entropy is used to measure the difference between two probability distributions.

In embodiments of the present application, the image dataset may be analyzed using a neural network model that is trained in a conventional manner. The neural network model that is trained in a conventional manner may be referred to simply as a pre-trained neural network model.

S102: and carrying out correlation analysis on the same-category characteristic images to determine correlation coefficients among the same-category characteristic images.

In the conventional mode, only the deviation of the classification predicted value and the label value is considered in the training process of the neural network model, and the correlation between samples is not concerned, so that the problem of over-fitting is easily caused. The feature images of the same category have certain correlation, and the higher the feature correlation is, the closer the classification prediction result is. Therefore, in the embodiment of the application, the class relativity can be used as an optimization term of the neural network model to correct the model classification predicted value. Class relatedness refers to relatedness between samples under the same class.

The feature images under each category are processed in a similar manner, and the description is developed by taking any one of all categories contained in the pre-training neural network model, namely the target category as an example.

In a specific implementation, image means corresponding to all the feature images under the target category can be calculated; and constructing a correlation coefficient between any two characteristic images under the target category based on the image mean value and any two characteristic images under the target category.

In the embodiment of the application, a correlation coefficient calculation formula can be constructed, and when the correlation coefficient between any two feature images in the target category needs to be determined, the image mean value and any two feature images in the target category can be input into the preset correlation coefficient calculation formula to obtain the correlation coefficient between any two feature images in the target category. By category cFor example, categorycThe corresponding feature image set is

，TFor the number of samples, the characteristic image of each sample isMEach channel is of the size offw×fhIs a picture of the image of (a). Taking the VGG16 model as an example,M=128，fw=fh=112. calculating the correlation coefficient between the characteristic images of the same category, and classifyingcMiddle samplei，jThe correlation coefficient between them is marked as->

Calculating a correlation coefficient for each channel, < >>

IncludedMFor each correlation coefficient, then for the categorycIn total +.>

A plurality of inter-sample correlation coefficient vectors, each vector includingMAnd a correlation coefficient.

For channelskThe calculation formula of the correlation coefficient between the characteristic images is as follows:

the method comprises the steps of carrying out a first treatment on the surface of the wherein,cthe category of the object is indicated and,irepresenting the first under target classiA number of samples of the sample were taken,jrepresenting the first under target classjA number of samples of the sample were taken,krepresent the firstkMultiple channels (I)>

Represent the firstiCharacteristic image of individual samples, +.>

Represent the firstiThe image means of the feature images of the individual samples,

represent the firstjCharacteristic image of individual samples, +.>

Represent the firstjThe image means of the feature images of the individual samples,mimage representing characteristic imageThe row subscript of the pixel point,ncolumn subscripts of pixels representing the feature image. S103: and determining the classification prediction deviation between the same-category characteristic images according to the classification prediction values corresponding to the same-category characteristic images.

In the embodiment of the present application, the square of the L2 norm of the classification prediction value corresponding to any two feature images in the target class may be used as the classification prediction deviation corresponding to any two feature images in the target class. Assume that the total category number isCModel classification prediction result isCDimension vector, samplei，jClassification prediction results of (2) are recorded asy _i 、y _j Then the classification prediction bias is

。CThe value of (c) is equal to the output class of the model setting, e.g.,Cmay be set to 1000.

S104: and carrying out parameter adjustment on the pre-trained neural network model according to the association relation between the correlation coefficient and the classification prediction deviation, and the label value and the classification prediction value corresponding to each characteristic image so as to obtain the trained neural network model.

The correlation of the same category features affects the classification prediction bias, so that a relation exists between the correlation coefficient between the images of the same category features and the classification prediction bias, and in the embodiment of the application, a support vector machine regression model (SVM regression) can be adopted to carry out fitting modeling on the correlation relationship between the images of the same category features.

In specific implementation, a support vector machine algorithm can be utilized to fit the correlation coefficient and the classification prediction deviation between the characteristic images of the same category so as to construct a regular term for representing the correlation between samples; constructing an error loss term based on the label value and the classification predicted value corresponding to each characteristic image; taking the error loss term and the regular term as loss functions; and training the pre-trained neural network model by using the loss function to obtain a trained neural network model.

For the construction of the regular term, the correlation coefficient and the classification prediction deviation corresponding to each category can be used as training samples; training the constructed regression model of the support vector machine by using a training sample to obtain a linear relation between the correlation coefficient and the classification prediction deviation; and constructing a regular term based on the deviation between the classification prediction deviation fitted by the linear relation and the classification prediction value output by the pre-training neural network model.

The objective function of the SVM regression model is:

；

st.y _t -（w ^T φ(x _t )+b）≤ε

（w ^T φ(x _t )+b）-y _t ≤ε

wherein,w，brepresenting the SVM model weights and bias parameters respectively,φrepresenting a kernel function transformation, for non-linear fitting,εthe super-parameter is represented by a parameter,x _t representing the observed sample points, corresponding to the correlation coefficients in the present application

，y _t Representing observed output values corresponding to the neural network model classification prediction bias +.>

。

After the optimization and the solving of the SVM model are completed, an SVM regression model is obtained,g(x _t )=w ^T φ(x _t ) +b, input class dependent coefficient of relativityx _t To SVM regression model, outputting the deviation of the fitted classification prediction resultg(x _t )。

In the embodiment of the application, based on the deviation between the classification prediction deviation fitted by the linear relation and the classification prediction value output by the pre-training neural network model, the implementation mode for constructing the regular term can comprise the step of fitting the classification prediction deviation corresponding to any two characteristic images under the same category by using the linear relation; acquiring the respective corresponding classification predicted values of any two characteristic images output by the pre-training neural network model; taking the square of the difference value of the classification predicted values corresponding to any two characteristic images as model prediction deviation; and taking the square of the difference value between the classification prediction deviation and the model prediction deviation as a regularization term.

After the regular term is constructed, the regular term is added to the loss function of the neural network model, so that the problem of overfitting of the neural network model can be solved.

The error loss term can be constructed in two ways, and one way can be to perform the mean square error operation on the label value and the classification predicted value corresponding to each characteristic image so as to obtain the error loss term. And in another mode, the label value and the classification predicted value corresponding to each characteristic image can be subjected to cross entropy operation to obtain an error loss term.

In the embodiment of the application, the loss function comprises an error loss term and a regular term, the error term describes the deviation between the network predicted value and the target value, and the error loss term can be a mean square error function or a cross entropy function. The regular term can prevent excessive fitting of training data and improve generalization capability. The following is a specific construction of a loss function provided in the embodiment of the present application:

the method comprises the steps of carrying out a first treatment on the surface of the wherein,frepresenting the function of the neural network model,Nthe number of samples is represented and,x _i representing a picture of an input sample,f(x _i ) Representing the predicted values of the neural network model,Y _i representing the tag value, then the first item +.>

The error between the model predicted value and the tag value is described.iA sample index is indicated as a sample index, N _i Representing a sampleiThe number of samples of the sample feature image set of the corresponding class, assuming samplesiThe corresponding category isliThe sample feature image set of this class is noted asS _li ，iA sample index is indicated as a sample index,N _i representing a sampleiThe number of samples of the sample feature image set of the corresponding class, assuming samplesiThe corresponding category isliThe sample feature image set of this class is noted asS _li ，N _i Then is a setS _li And then (2) the size ofjRepresenting a collectionS _li Sample subscript of [ ]f(x _i )-f(x _j ) ) representation of a samplei、jDeviation of model predictive value of +.>

Sample representing SVM regression model fittingi、jDeviation of model predictive value of (2) will +.>

And taking the intra-class sample correlation into consideration as a regular term, the model predicted value can be corrected, and the over-fitting problem is improved. In addition to this, the process is carried out,λcalled penalty coefficient, is a coefficient that adjusts the relationship between the empirical error term and regularization term, balancing the weight relationship of the two. After obtaining the loss function comprising the error loss term and the regular term, the loss function is applied to training of the neural network model, the convolution hidden layer parameters are fixed, and the training process only updates the full-connection layer parameters, so that the trained neural network model is obtained.

S105: and analyzing the acquired images to be classified by using the trained neural network model to determine the image types contained in the images to be classified.

After the neural network model is trained, the size of the image to be classified can be adjusted according to the size of the input image of the trained neural network model under the condition that the image to be classified is acquired; inputting the images to be classified with the adjusted sizes into a trained neural network model to obtain classification predicted values corresponding to the images to be classified; and taking the category corresponding to the classification predicted value with the highest value as the image category contained in the image to be classified.

In the embodiment of the application, the acquired initial image data set can be transformed so as to improve the diversity of samples. Fig. 2 is a flowchart of a method for acquiring an image dataset according to an embodiment of the present application, where the method includes:

s201: an initial image dataset is acquired.

The image dataset is typically derived from the acquisition means and the common dataset, and to enhance the diversity of the image dataset, the initial image dataset may be preprocessed to obtain the image dataset.

The preprocessing procedure for the initial image data set may be to enhance randomness of the initial image data set, and a specific implementation may be referred to in the description of S202 and S203.

S202: the image contained in the initial image dataset is cropped, flipped and/or rotated to obtain a new image dataset.

Images containing different targets can be obtained through clipping, and enhancement of image data is achieved.

The turning can be to turn the image up and down, left and right. The rotation may be selecting a point on the image as the center point, rotating the image around any angle about the center point to obtain a new image, thereby increasing the randomness of the image dataset.

S203: and resizing the new image dataset and the image contained in the initial image dataset according to the input image size of the pre-trained neural network model.

The new image dataset and the images contained in the initial image dataset are resized to the input image size of the pre-trained neural network model, such as by using the VGG16 model, the picture size must be resized (224 ).

S204: calculating brightness mean and variance corresponding to the size-adjusted image; and carrying out normalization processing on the image with the adjusted size based on the brightness mean value and the variance so as to obtain a final image data set.

The process of normalizing the image by using the luminance mean and variance belongs to a relatively mature technology, and is not described in detail herein.

In the embodiment of the application, the randomness of the image data set can be enhanced through the processes of cutting, overturning, rotating and the like of the initial image data set. By normalizing the image based on the mean and variance of brightness, the neural network model can be more easily converged.

In the embodiment of the application, in order to reduce the influence caused by the interference factors in the feature images, the feature images can be subjected to dimension reduction processing according to the principal component analysis method to obtain the latest feature images, so that correlation analysis and classification prediction deviation determination are performed on the latest feature images. Fig. 3 is a flowchart of a method for performing dimension reduction processing on a feature image according to an embodiment of the present application, where the method includes:

S301: and performing decentering treatment on the target characteristic image to obtain a decentered characteristic image.

The target feature image is any one of feature images.

In a specific implementation, an image average value corresponding to the target feature image under all channels can be calculated; the image average value is subtracted from the target feature image under each channel to obtain a de-centered feature image. Assume, 1 stiThe characteristic image of each channel isF _i The average value of the characteristic image is

Decentralised feature image->

。

S302: and forming an image matrix from the off-center characteristic images under all the channels. The pre-training neural network model comprisesMThe image matrix formed by the decentered characteristic images comprisesMElements, i.e

. S303: and carrying out eigenvalue decomposition on the covariance matrix corresponding to the image matrix to obtain eigenvalues and eigenvectors.

The covariance matrix corresponding to the image matrix is

。

Performing eigenvalue decomposition on the covariance matrix to obtain eigenvaluesλAnd feature vectorPWill characteristic valueλThe feature values and feature vector sets are arranged in order from large to small as follows:

λ={λ ₁ ，λ ₂ ，…，λ _M }；

P={p ₁ ，p ₂ ，…，p _M }。

s304: selecting a preset number of target feature values with the largest feature values, and forming a feature transformation matrix by feature vectors corresponding to the target feature values.

The preset number is smaller than the number of channels of the pre-training neural network model.

To be used forMThe number of channels may be as followsK，K<M. In the order of characteristic values from big to small, in the selectionλMaximum ofKThe feature values and the corresponding feature vectors are used as row vectors to form a feature transformation matrix

。/>

S305: and converting the target feature images under all channels into the latest target feature images according to the feature transformation matrix.

In a specific implementation, the feature transformation matrix may be multiplied with the target feature images under all channels to obtain the latest target feature image.

In combination with the above description, the feature transformation matrix is

The latest target feature image is +.>

，/>

Is composed ofKAnd a characteristic image formed by the channels. Through the feature transformation matrix, the method can be used forMFeature image transformation of individual channelsKOn each channel, the quality of the characteristic images is improved, and the subsequent operation amount for analyzing the characteristic images is reduced, so that the efficiency of image classification processing is improved.

Fig. 4 is a schematic structural diagram of an image classification device according to an embodiment of the present application, which includes an analysis unit 41, a coefficient determination unit 42, a deviation determination unit 43, an adjustment unit 44, and a category determination unit 45;

An analysis unit 41 for analyzing the acquired image dataset by using the pre-trained neural network model to obtain a feature image and a classification prediction value;

a coefficient determining unit 42, configured to perform correlation analysis on the same-category feature images to determine a correlation coefficient between the same-category feature images;

a deviation determining unit 43, configured to determine a classification prediction deviation between the feature images of the same category according to the classification prediction values corresponding to the feature images of the same category;

the adjusting unit 44 is configured to perform parameter adjustment on the pre-trained neural network model according to the association relationship between the correlation coefficient and the classification prediction bias, and the label value and the classification prediction value corresponding to each feature image, so as to obtain a trained neural network model;

the category determining unit 45 is configured to analyze the acquired image to be classified by using the trained neural network model, so as to determine an image category included in the image to be classified.

the fitting subunit is used for fitting the correlation coefficient and the classification prediction deviation between the characteristic images of the same category by using a support vector machine algorithm so as to construct a regular term for representing the correlation between samples;

A construction subunit, configured to construct an error loss term based on the label value and the classification prediction value corresponding to each feature image;

as a subunit, the error loss term and the regularization term are used as loss functions;

and the training subunit is used for training the pre-training neural network model by using the loss function so as to obtain a trained neural network model.

Optionally, the fitting subunit is configured to use a correlation coefficient and a classification prediction deviation corresponding to each category as a training sample; training the constructed regression model of the support vector machine by using a training sample to obtain a linear relation between the correlation coefficient and the classification prediction deviation; and constructing a regular term based on the deviation between the classification prediction deviation fitted by the linear relation and the classification prediction value output by the pre-training neural network model.

Optionally, the fitting subunit is configured to fit classification prediction deviations corresponding to any two feature images under the same class by using a linear relationship; acquiring the respective corresponding classification predicted values of any two characteristic images output by the pre-training neural network model; taking the square of the difference value of the classification predicted values corresponding to any two characteristic images as model prediction deviation; and taking the square of the difference value between the classification prediction deviation and the model prediction deviation as a regularization term.

Optionally, the construction subunit is configured to perform cross entropy operation on the label value and the classification prediction value corresponding to each feature image, so as to obtain an error loss term.

Optionally, the coefficient determination unit comprises a calculation subunit and a construction subunit;

the calculating subunit is used for calculating the image mean value corresponding to each of all the characteristic images in the target category; the target category is any one of all categories contained in the pre-training neural network model;

the construction subunit is used for constructing a correlation coefficient between any two characteristic images under the target category based on the image mean value and any two characteristic images under the target category.

Optionally, the construction subunit is configured to input the image mean value and any two feature images under the target class into a preset correlation coefficient calculation formula, so as to obtain a correlation coefficient between any two feature images under the target class;

the correlation coefficient calculation formula is:

the method comprises the steps of carrying out a first treatment on the surface of the wherein,cthe category of the object is indicated and,irepresenting the first under target classiA number of samples of the sample were taken, jRepresenting the first under target classjA number of samples of the sample were taken,kindicates the number of channels>

represent the firstiCharacteristic image of individual samples, +.>

Represent the firstjCharacteristic image of individual samples, +.>

Represent the firstjThe image means of the feature images of the individual samples,ma row subscript representing a pixel point of the feature image,ncolumn subscripts of pixels representing the feature image. Optionally, the deviation determining unit is configured to take the square of the L2 norm of the classification predicted value corresponding to any two feature images in the target category as the classification predicted deviation corresponding to any two feature images in the target category; the target category is any one of all categories contained in the pre-training neural network model.

Optionally, for an acquisition process of an image dataset, the apparatus comprises an acquisition unit and a preprocessing unit;

an acquisition unit for acquiring an initial image dataset;

and the preprocessing unit is used for preprocessing the initial image data set to obtain the image data set.

A processing subunit, configured to perform clipping, flipping, and/or rotation processing on an image included in the initial image dataset to obtain a new image dataset;

and the adjusting subunit is used for adjusting the sizes of the new image data set and the images contained in the initial image data set according to the input image size of the pre-training neural network model so as to obtain the image data set.

and the normalization processing unit is used for performing normalization processing on the image with the adjusted size based on the brightness mean value and the variance so as to obtain a final image data set.

Optionally, the system further comprises a dimension reduction processing unit;

and the dimension reduction processing unit is used for carrying out dimension reduction processing on the characteristic image according to the principal component analysis method so as to obtain the latest characteristic image.

the decentering subunit is used for decentering the target characteristic image to obtain a decentered characteristic image; the target feature image is any one of feature images;

The composition subunit is used for composing the de-centered characteristic images under all the channels into an image matrix;

the selecting subunit is used for selecting a preset number of target characteristic values with the largest characteristic values, and forming a characteristic transformation matrix by the characteristic vectors corresponding to the target characteristic values; the preset number is smaller than the number of channels of the pre-training neural network model;

and the conversion subunit is used for converting the target feature images under all channels into the latest target feature images according to the feature transformation matrix.

Optionally, the de-centering subunit is configured to calculate an image average value corresponding to the target feature images under all channels; the image average value is subtracted from the target feature image under each channel to obtain a de-centered feature image.

the size adjustment subunit is used for adjusting the size of the image to be classified according to the size of the input image of the trained neural network model under the condition that the image to be classified is acquired;

The description of the features of the embodiment corresponding to fig. 4 may be referred to the related description of the embodiment corresponding to fig. 1 to 3, and will not be repeated here.

Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application, as shown in fig. 5, where the electronic device includes: a memory 20 for storing a computer program;

the processor 21 is configured to implement the steps of the image classification method according to the above embodiment when executing the computer program.

The electronic device provided in this embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.

Processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc. The processor 21 may be implemented in at least one hardware form of DSP (Digital Signal Processing ), FPGA (Field-Programmable Gate Array, field programmable gate array), PLA (Programmable Logic Array ). The processor 21 may also comprise a main processor, which is a processor for processing data in an awake state, also called CPU (Central Processing Unit ); a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit, image processor) for rendering and drawing of content required to be displayed by the display screen. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence ) processor for processing computing operations related to machine learning.

Memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201, which, when loaded and executed by the processor 21, is capable of implementing the relevant steps of the image classification method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, where the storage manner may be transient storage or permanent storage. The operating system 202 may include Windows, unix, linux, among others. The data 203 may include, but is not limited to, an image dataset, an image to be classified, and the like.

In some embodiments, the electronic device may further include a display 22, an input-output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.

Those skilled in the art will appreciate that the structure shown in fig. 5 is not limiting of the electronic device and may include more or fewer components than shown.

It will be appreciated that the image classification method of the above embodiment, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution contributing to the prior art, or in a software product stored in a storage medium, performing all or part of the steps of the methods of the various embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random-access Memory (Random Access Memory, RAM), an electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, a magnetic disk, or an optical disk, etc. various media capable of storing program codes.

Based on this, the embodiment of the present invention further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the image classification method as described above.

The functions of each functional module of the computer readable storage medium according to the embodiments of the present invention may be specifically implemented according to the method in the embodiments of the method, and the specific implementation process may refer to the relevant description of the embodiments of the method, which is not repeated herein.

The above describes in detail an image classification method, apparatus, device and computer readable storage medium provided in the embodiments of the present application. In the description, each embodiment is described in a progressive manner, and each embodiment is mainly described by the differences from other embodiments, so that the same similar parts among the embodiments are mutually referred. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The above describes in detail an image classification method, apparatus, device and computer readable storage medium provided in the present application. The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that it would be obvious to those skilled in the art that various improvements and modifications can be made to the present application without departing from the principles of the present invention, and such improvements and modifications fall within the scope of the claims of the present application.

Claims

1. An image classification method, comprising:

carrying out correlation analysis on the same-category characteristic images to determine correlation coefficients among the same-category characteristic images; the correlation between the characteristic images under the same category is used as an optimization term of the neural network model;

analyzing the acquired images to be classified by using the trained neural network model to determine the image types contained in the images to be classified;

and performing parameter adjustment on the pre-trained neural network model according to the association relation between the correlation coefficient and the classification prediction deviation, and the label value and the classification prediction value corresponding to each feature image, so as to obtain a trained neural network model, wherein the method comprises the following steps:

taking the error loss term and the regular term as loss functions;

training the pre-trained neural network model by using the loss function to obtain a trained neural network model;

The step of carrying out correlation analysis on the same-category characteristic images to determine the correlation coefficient between the same-category characteristic images comprises the following steps:

2. The method of image classification according to claim 1, wherein fitting the correlation coefficients between the same class of feature images and the classification prediction bias using a support vector machine algorithm to construct a regularization term for characterizing inter-sample correlation comprises:

3. The image classification method according to claim 2, wherein constructing the regularization term based on a deviation between the classification prediction bias fitted by the linear relationship and the classification prediction value output by the pre-trained neural network model comprises:

4. The image classification method according to claim 1, wherein the constructing an error loss term based on the label value and the classification prediction value corresponding to each of the feature images includes:

5. The image classification method according to claim 1, wherein the constructing an error loss term based on the label value and the classification prediction value corresponding to each of the feature images includes:

6. The image classification method according to claim 1, wherein the constructing a correlation coefficient between any two feature images in the target category based on the image mean and any two feature images in the target category includes:

the correlation coefficient calculation formula is as follows:

；

represent the firstiCharacteristic image of individual samples, +.>

Represent the firstjCharacteristic image of individual samples, +.>

Represent the firstjThe image means of the feature images of the individual samples,ma row subscript representing a pixel point of the feature image, nColumn subscripts of pixels representing the feature image.

7. The image classification method according to claim 1, wherein determining the classification prediction bias between the same-category feature images according to the classification prediction values corresponding to the same-category feature images respectively comprises:

8. The image classification method according to claim 1, characterized in that for the acquisition process of the image dataset, the method comprises:

acquiring an initial image dataset;

the initial image dataset is preprocessed to obtain the image dataset.

9. The image classification method of claim 8, wherein preprocessing the initial image dataset to obtain the image dataset comprises:

And according to the input image size of the pre-training neural network model, carrying out size adjustment on the new image data set and the image contained in the initial image data set so as to obtain the image data set.

10. The image classification method of claim 9, further comprising, after said resizing the new image dataset and the image contained by the initial image dataset according to the input image size of the pre-trained neural network model:

11. The image classification method according to claim 1, further comprising, before said correlation analysis of the same-category feature images to determine correlation coefficients between the same-category feature images:

12. The image classification method according to claim 11, wherein the performing the dimension-reduction processing on the feature image according to the principal component analysis method to obtain the latest feature image comprises:

13. The method of image classification as claimed in claim 12, wherein said performing a de-centering process on the target feature image to obtain a de-centered feature image comprises:

subtracting the image average value from the target characteristic image under each channel to obtain an off-center characteristic image.

14. The method of image classification according to claim 12, wherein said converting the target feature images under all channels into the latest target feature images according to the feature transformation matrix comprises:

Multiplying the feature transformation matrix with the target feature images under all channels to obtain the latest target feature image.

15. The image classification method according to any one of claims 1 to 14, wherein analyzing the acquired image to be classified using the trained neural network model to determine the image class included in the image to be classified comprises:

16. An image classification device is characterized by comprising an analysis unit, a coefficient determination unit, a deviation determination unit, an adjustment unit and a category determination unit;

The coefficient determining unit is used for carrying out correlation analysis on the same-category characteristic images so as to determine the correlation coefficient between the same-category characteristic images; the correlation between the characteristic images under the same category is used as an optimization term of the neural network model;

the category determining unit is used for analyzing the acquired images to be classified by utilizing the trained neural network model so as to determine the image categories contained in the images to be classified;

the adjusting unit comprises a fitting subunit, a constructing subunit, a training subunit and a fitting subunit; the fitting subunit is used for fitting the correlation coefficient between the same-category characteristic images and the classification prediction deviation by using a support vector machine algorithm so as to construct a regular term for representing the correlation between samples; the construction subunit is configured to construct an error loss term based on the label value and the classification predicted value corresponding to each of the feature images; the sub-unit is used for taking the error loss term and the regular term as a loss function; the training subunit is configured to train the pre-trained neural network model by using the loss function, so as to obtain a trained neural network model;

The coefficient determining unit comprises a calculating subunit and a constructing subunit; the calculating subunit is used for calculating the image mean value corresponding to each of all the characteristic images under the target category; wherein the target category is any one of all categories contained in the pre-trained neural network model; the construction subunit is configured to construct a correlation coefficient between any two feature images under the target category based on the image mean value and any two feature images under the target category.

17. An electronic device, comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the steps of the image classification method according to any one of claims 1 to 15.

18. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the steps of the image classification method according to any one of claims 1 to 15.