CN117456284B

CN117456284B - Image classification method, device, equipment and storage medium

Info

Publication number: CN117456284B
Application number: CN202311765676.5A
Authority: CN
Inventors: 袁明冬; 阮威健; 胡金晖; 张力元
Original assignee: Smart City Research Institute Of China Electronics Technology Group Corp
Current assignee: Smart City Research Institute Of China Electronics Technology Group Corp
Priority date: 2023-12-21
Filing date: 2023-12-21
Publication date: 2024-05-10
Anticipated expiration: 2043-12-21
Also published as: CN117456284A

Abstract

The application is applicable to the technical field of image processing and provides an image classification method, apparatus, device and storage medium. The method comprises the following steps: acquiring a training image set comprising a plurality of training image samples, and performing image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples; performing image classification training by using the low-rank sparse projection features of the plurality of training image samples to obtain a target image classifier; and inputting the low-rank sparse projection features of a plurality of image samples to be classified into the target image classifier to generate the categories corresponding to the plurality of image samples to be classified. In this scheme, the low-rank sparse projection features are extracted by applying low-rank and sparse constraints to the preset feature extraction function, which enhances the robustness of the image features to noise, occlusion and the like and improves the accuracy of image classification.

Description

Image classification method, device, equipment and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to an image classification method, apparatus, device, and storage medium.

Background

Image classification is an important task in the fields of image processing and computer vision. Its aim is to assign images to different categories or labels so that a computer can identify images automatically; face recognition, for example, is a widely applied technology of this kind.

Image classification mainly relies on extracting image features so that a computer can learn the differences among images. Related schemes mainly include principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projection (LPP) and the like, in which image data are converted into high-dimensional one-dimensional vectors by concatenating columns or rows before processing, so as to achieve image classification.

However, such methods easily destroy the internal topological structure information of the image, so that the accuracy of image classification is low; they are also inefficient when processing high-dimensional data, which makes them difficult to popularize and apply.

Disclosure of Invention

The embodiment of the application provides an image classification method, an image classification device, image classification equipment and a storage medium, which can solve the technical problem of how to improve the accuracy of image classification.

In a first aspect, an embodiment of the present application provides an image classification method, including:

acquiring a training image set, the training image set including a plurality of training image samples;

performing image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples. By applying low-rank and sparse constraints to the projection matrix, a low-rank sparse projection matrix is obtained, which enhances the robustness of the feature extraction matrix to noise, occlusion and the like;

performing image classification training by using the low-rank sparse projection features of the plurality of training image samples to obtain a target image classifier;

acquiring an image set to be classified, the image set to be classified including a plurality of image samples to be classified;

extracting low-rank sparse projection features of the plurality of image samples to be classified, inputting the low-rank sparse projection features of the plurality of image samples to be classified into the target image classifier, and generating the categories corresponding to the plurality of image samples to be classified. In this step, a low-rank norm is used to measure the similarity of the two-dimensional image feature matrices, which further enhances the classification performance of the algorithm.

In a second aspect, embodiments of the present application provide an image classification apparatus having a function to implement the method of the first aspect or any possible implementation thereof. In particular, the apparatus comprises means for implementing the method of the first aspect or any possible implementation thereof.

In one embodiment thereof, the apparatus comprises:

an acquisition unit, configured to acquire a training image set, the training image set including a plurality of training image samples;

a processing unit, configured to perform image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples.

The processing unit is further configured to perform image classification training by using the low-rank sparse projection features of the plurality of training image samples to obtain a target image classifier.

The acquisition unit is further configured to acquire an image set to be classified, the image set to be classified including a plurality of image samples to be classified.

The processing unit is further configured to extract low-rank sparse projection features of the plurality of image samples to be classified, input the low-rank sparse projection features of the plurality of image samples to be classified into the target image classifier, and generate the categories corresponding to the plurality of image samples to be classified.

In a third aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to cause the computer device to implement a method according to any one of the implementation manners of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a computer device causes the computer device to implement a method according to any implementation manner of the first aspect.

In a fifth aspect, embodiments of the present application provide a computer program product for, when run on a computer device, causing the computer device to perform the method of any one of the implementations of the first aspect described above.

Compared with the prior art, the embodiments of the application have the following beneficial effects. In the training stage, a low-rank sparse projection matrix is obtained by applying low-rank and sparse constraints to the preset feature extraction function (namely the projection matrix). This reduces the data redundancy of the extracted image features and enhances their robustness to noise, occlusion and the like, which improves both the training efficiency of the target image classifier and the accuracy of image classification; that is, the optimization performed in the training stage helps to obtain a target image classifier with higher classification accuracy. In the image classification stage, the features of the image to be classified are extracted with the low-rank sparse projection matrix, and the trained target image classifier then classifies the image, so that a more accurate image classification result is obtained.

Drawings

Fig. 1 is an overall flow chart of an image classification method according to an embodiment of the present application.

Fig. 2 is a flow chart of an image classification method according to an embodiment of the present application.

Fig. 3 is a flowchart of another image classification method according to an embodiment of the present application.

Fig. 4 is an application scenario schematic diagram of an image classification method according to an embodiment of the present application.

Fig. 5 is a schematic structural diagram of an apparatus according to an embodiment of the present application.

Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

Image classification is an important technology in the field of computer vision, and with the development of machine learning and deep learning, a computer can automatically identify and classify images, so that automatic image processing and understanding are realized. The technology has wide application in a plurality of fields including medical image analysis, security monitoring, unmanned driving, industrial quality inspection and the like.

The main flow of image classification is described in its entirety below in connection with fig. 1.

As shown in fig. 1, the image classification process can be divided into a data acquisition stage, a feature extraction stage and an image classification stage. In the data acquisition stage, the training image set and the test image set can be acquired directly, or the test image set can be obtained by applying data preprocessing, such as cropping or occlusion, to the training image set. In the feature extraction stage, the prepared training image set is input into a preset image feature extraction model to extract target image features. In the image classification stage, the image classifier is trained with the extracted target image features to obtain a trained image classifier. To test the recognition effect of the target image classifier, the images in the test image set may be input into the trained image classifier, which outputs the image class of each test image. The recognition accuracy of the image classifier can then be judged from the accuracy of the output image classes.
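Obtaining a test image set from the training image set by occlusion can be sketched as follows. This is a minimal illustration on synthetic data using NumPy; the function name `occlude`, the block position and size, and the fill value are assumptions made for the example and are not specified by the application.

```python
import numpy as np

def occlude(image, top, left, size, fill=0.0):
    """Return a copy of a 2D image with a square block overwritten by `fill`."""
    out = image.copy()
    out[top:top + size, left:left + size] = fill
    return out

rng = np.random.default_rng(0)
train_images = rng.random((5, 8, 8))   # 5 synthetic 8x8 "training images"

# Hypothetical test set: every training image with a 4x4 block occluded
test_images = np.stack([occlude(img, top=2, left=3, size=4) for img in train_images])
```

The training images are left untouched because `occlude` works on a copy, so the same array can still be used for training.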

It can be understood that images collected in real life are not regular: they may suffer from occlusion, illumination changes, angular deformation and other problems, and may differ greatly from the training images, so the accuracy of image classification is difficult to guarantee. At the same time, the volume of image data is often very large, and such large amounts of high-dimensional data pose a challenge for the image classification process.

In view of the above problems, the application provides an image classification method in which features are extracted from the image data through a low-rank sparse projection matrix. The high-dimensional data can thereby be reduced in dimension, and the global and local information of the image are both taken into account, so that an image classifier that is more robust to noise and occlusion is trained and the accuracy of image classification is improved.

In order to further illustrate the technical scheme of the application, the following description is given by specific examples.

As shown in FIG. 2, the method includes the following steps S201 to S205.

S201, acquiring a training image set.

The training image set includes a plurality of training image samples. It will be appreciated that a training image set refers to a set of image data used to train an image classification model, and that training image samples refer to individual image samples in the training image set. The training image set is used to train an image classification model to enable it to identify and classify new image data. Training image samples are specific data points used to train the image classification model. By learning these samples, the image classification model can gradually increase the accuracy and generalization ability for new data.

Assuming that the training image set is represented by the set A, we have $A = \{A_1, A_2, \ldots, A_N\}$, where $A_i \in \mathbb{R}^{m \times n}$ ($i = 1, 2, \ldots, N$), $N$ represents the number of training image samples, $m$ represents the row dimension of the image matrix, and $n$ represents the column dimension of the image matrix.

S202, performing image feature extraction processing on a plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples.

An image feature extraction function may be understood as an algorithm or method for extracting meaningful features from an image. The preset image feature extraction function herein refers to a function capable of extracting low-rank sparse projection features from a plurality of training image samples, and may also be understood as a low-rank sparse projection matrix.

Image features refer to some representative information extracted from an image. Common image features include the following:

Color features: the color information of each pixel in the image, which can be described by means of a histogram and the like;

Texture features: the texture information of different areas in the image, which can be described by means of a gray-level co-occurrence matrix, wavelet transform and the like;

Shape features: the shape information of objects in the image, which can be described by means of edge detection, contour extraction and the like;

Scale features: the size information of objects in the image, which can be described by means of scale-space analysis and the like;

Orientation features: the orientation information of objects in the image, which can be described by means of a histogram of oriented gradients and the like;

Optical flow features: the motion information of objects in the image, which can be described by means of an optical flow field and the like.

The specific range and description of the image features are not limited herein, and may be selected according to actual situations.

Typically, the image feature extraction function performs a series of mathematical operations on the input image to transform the information in the image into feature vectors or feature descriptors that are distinctive and expressive. In the application, a preset image feature extraction function is used to perform image feature extraction processing on the plurality of training image samples to obtain their low-rank sparse projection features, which are described in detail below together with the underlying principles.

A projection matrix is a concept in linear algebra that is used to project a vector onto a subspace. The projection matrix is generally denoted by P; for a vector v in an n-dimensional vector space, P can project v onto a k-dimensional subspace (k < n).
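As a minimal numerical sketch of projecting a vector onto a k-dimensional subspace, the following example assumes NumPy and a randomly generated orthonormal basis; the names `basis`, `coords` and `projector` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Orthonormal basis of a random k-dimensional subspace of R^n (reduced QR)
basis, _ = np.linalg.qr(rng.standard_normal((n, k)))   # shape (n, k)
v = rng.standard_normal(n)

coords = basis.T @ v              # k-dimensional representation of v
projected = basis @ coords        # the projection of v, expressed in R^n
projector = basis @ basis.T       # n x n orthogonal projection matrix
```

Here `coords` is the reduced (k-dimensional) description of `v`, while `projector` is idempotent and has rank k, which are the defining properties of an orthogonal projection onto the subspace.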

Sparsity and low rank of projection matrices are two properties commonly used in matrix decomposition and dimension reduction methods for extracting important features of data and reducing the dimension of data.

Sparsity refers to the property that most elements in a matrix are zero, with only a small number of non-zero elements. For the projection matrix, sparsity means that most of its elements are zero. Sparsity helps to identify key features in the data and to reduce redundant information, while also reducing computational complexity and improving the generalization capability of the model.

Low-rankness refers to the property that the rank of a matrix is relatively low. For the projection matrix, low rank means that the projection matrix can represent the variation in the data with fewer basis vectors. Low rank helps to extract the main features of the data and to reduce its dimensionality, while also reducing the influence of noise and improving the robustness of the model.

In view of the above, it can be understood that the meaning of the low-rank sparse projection matrix is to project a vector with a high dimension into a low-dimensional space, and the feature vector after projection has sparsity (i.e., most elements are zero and only the most important elements remain), and the image feature extracted by using the low-rank sparse projection matrix can be referred to as a low-rank sparse projection feature.
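The two properties can be checked numerically. The sketch below assumes NumPy and uses a small hand-written matrix `P` as a hypothetical stand-in for a learned projection matrix; it is row-sparse (only two non-zero rows) and has rank 1.

```python
import numpy as np

def sparsity(matrix, tol=1e-12):
    """Fraction of entries whose magnitude is at most `tol`."""
    return float(np.mean(np.abs(matrix) <= tol))

# Hypothetical 6x4 projection matrix: two non-zero rows, one a multiple of the other
P = np.zeros((6, 4))
P[1] = [1.0, 2.0, 3.0, 4.0]
P[4] = [2.0, 4.0, 6.0, 8.0]

rng = np.random.default_rng(0)
A = rng.random((5, 6))            # synthetic 5x6 image matrix
features = A @ P                  # 5x4 projected feature matrix

print(sparsity(P), np.linalg.matrix_rank(P), features.shape)
```

Because only rows 1 and 4 of `P` are non-zero, `features` depends only on columns 1 and 4 of `A`; this is the feature selection effect of row sparsity.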

The low-rank sparse projection features can help to reduce the dimension of high-dimensional image data and remove noise in the image, so that the image is clearer and easier to analyze and process, and also can help to extract important features in the image and remove redundant information in the image, thereby better understanding and analyzing the content of the image. Meanwhile, the method can help to reduce the storage space and the transmission bandwidth of the image data, thereby achieving the purposes of data compression and resource saving, reducing the computational complexity of image processing and analysis, and improving the efficiency and the speed of an algorithm.

The sparsity and low rank of the projection matrix are utilized to realize efficient extraction and dimension reduction of the image features, so that the subsequent image classification task is facilitated.

It will be appreciated that the image may also be pre-processed, including graying, denoising, scale normalization, etc., prior to feature extraction of the plurality of image samples using the low-rank sparse projection matrix, in order to extract more stable and reliable image features.
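The preprocessing steps mentioned above can be sketched as follows. This is only an illustration assuming NumPy: the luminance weights, the 3x3 mean filter used as a denoiser, and min-max scaling are common choices, not steps prescribed by the application, and the function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Luminance-weighted grayscale of an (h, w, 3) image (BT.601 weights)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def mean_filter3(img):
    """3x3 mean filter with edge padding, used here as a simple denoiser."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    acc = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def normalize(img):
    """Min-max scaling to [0, 1]; a constant image maps to all zeros."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=float)
    return (img - lo) / (hi - lo)

rng = np.random.default_rng(0)
rgb = rng.random((4, 5, 3))                    # synthetic colour image
gray = to_gray(rgb)
clean = normalize(mean_filter3(gray))          # preprocessed 4x5 image matrix
```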

S203, performing image classification training by using low-rank sparse projection features of a plurality of training image samples to obtain a target image classifier.

The image classifier refers to a model that classifies images using a machine learning algorithm. Common image classifiers include the following:

Support Vector Machines (SVMs), which are a method based on maximum interval classification, can be used for both linear and nonlinear classification problems. In image classification, SVMs are commonly used for training and classification tasks of classifiers;

The Convolutional Neural Network (CNN) is a special neural network structure, and can automatically extract features from images and perform classification tasks;

Decision Tree (Decision Tree), which is a Tree-based classification method, can decompose a complex classification problem into a plurality of simple classification problems. In image classification, decision trees may be used for feature selection and classification tasks;

Random Forest (Random Forest), which is a classification method based on a plurality of decision trees, can effectively avoid the problem of overfitting by randomly selecting features and samples for training;

K nearest neighbors (KNN), which is a classification method based on distance measurement: the distances between a sample to be classified and the training samples are calculated, and the K nearest neighbors are selected to determine the class.

In practical applications, an appropriate classifier may be selected for training and classifying according to specific task requirements, which is not limited herein.

The target image classifier refers to a classifier model for image classification obtained through training.

It will be appreciated that image classification training using low-rank sparse projection features of multiple training image samples may result in a classifier for target image classification (i.e., a target classifier). It can also be understood that, because the training image set includes a plurality of image samples of different categories, the image classifier is trained by extracting low-rank sparse projection features of the image samples, and in the training process, the image classifier learns feature differences between images of different categories, thereby realizing a classification task. After training, the resulting classifier can be used to classify new images.

It can be appreciated that, because the low-rank sparse projection features of the plurality of training image samples are utilized for training, the obtained classifier can more accurately classify new images, has better generalization capability, has better robustness to noise and redundant information, can adapt to different image data sets, and can effectively classify unknown images.

S204, acquiring an image set to be classified.

The image set to be classified comprises a plurality of image samples to be classified.

The image set to be classified refers to a group of image samples to be classified, and can be understood as a test image set.

The data source in the image set may belong to the same category as the image data in the training set, or may belong to a different category from the image data in the training set, which is not limited herein.

Assuming that the image set to be classified is represented by the set B, we have $B = \{B_1, B_2, \ldots, B_M\}$, where $B_j \in \mathbb{R}^{m \times n}$ ($j = 1, 2, \ldots, M$), $M$ represents the number of image samples to be classified, $m$ represents the row dimension of the image matrix, and $n$ represents the column dimension of the image matrix.

S205, extracting low-rank sparse projection features of a plurality of image samples to be classified, inputting the low-rank sparse projection features of the plurality of image samples to be classified into a target image classifier, and generating categories corresponding to the plurality of image samples to be classified.

The method comprises the steps of inputting images to be classified into a low-rank sparse projection matrix for feature extraction, inputting low-rank sparse projection features of a plurality of image samples to be classified into a trained target image classifier, and processing the images through the target image classifier to obtain the category to which the images belong. In other words, this is a process of inputting low-rank sparse projection features of images into a classifier and obtaining their categories, and the trained target image classifier can automatically identify the categories of the images to be classified.

For example, assuming that all images can be divided into two types, namely a first type and a second type, a low-rank sparse projection characteristic of one image to be classified is input into a target image classifier, and the image can be obtained to belong to the first type or the second type or neither of the first type and the second type after the processing of the target image classifier, namely the type of the image to be classified is obtained.
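The two-class example can be sketched in code as follows, assuming NumPy. Several choices here are assumptions made only for illustration: a nearest-neighbor rule is used as the classifier, the distance between two feature matrices is the nuclear norm of their difference, and `P` is a random stand-in rather than a learned low-rank sparse projection matrix.

```python
import numpy as np

def nuclear_norm(matrix):
    """Sum of the singular values of a matrix."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())

def classify_1nn(train_feats, train_labels, query_feat):
    """Label of the training feature matrix closest to the query in nuclear norm."""
    dists = [nuclear_norm(query_feat - f) for f in train_feats]
    return train_labels[int(np.argmin(dists))]

rng = np.random.default_rng(0)
m, n, d = 8, 6, 3
P = rng.standard_normal((n, d))   # stand-in projection matrix (NOT learned here)

# Two synthetic classes, each a noisy copy of its own template image
templates = {"first": rng.standard_normal((m, n)),
             "second": rng.standard_normal((m, n))}
train_feats, train_labels = [], []
for label, template in templates.items():
    for _ in range(4):
        image = template + 0.05 * rng.standard_normal((m, n))
        train_feats.append(image @ P)          # m x d feature matrix
        train_labels.append(label)

query = templates["second"] + 0.05 * rng.standard_normal((m, n))
predicted = classify_1nn(train_feats, train_labels, query @ P)
```

The feature matrices are kept two-dimensional throughout, so the comparison is made between matrices rather than between flattened vectors.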

Image feature extraction processing is performed on the plurality of training image samples by using the low-rank sparse projection matrix to obtain their low-rank sparse projection features. Adding a low-rank constraint to the projection matrix effectively reduces the dimensionality of the data, thereby reducing data redundancy and improving the efficiency of data compression and representation. The low-rank and sparse constraints enhance the robustness of the model to noise and interference and improve its handling of outliers, making the model more reliable; they also reduce the risk of overfitting and improve generalization, so that the model is better suited to unseen data. In addition, the low-rank and sparse constraints help the model to learn more discriminative and robust feature representations, thereby improving the performance of image processing and analysis tasks.

In one implementation, the preset image feature extraction function includes a global reconstruction function, and image feature extraction processing is performed on a plurality of training image samples according to the preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples, including:

and constructing a global reconstruction function according to the plurality of training image samples, wherein the global reconstruction function is used for extracting global low-rank sparse projection features of the plurality of training image samples.

Global reconstruction can be understood as a global reconstruction relationship: in image processing, the entire image is reconstructed or restored by some method or algorithm so as to maintain the global structure and global features of the image. It is generally used to extract global features of the image, such as color, brightness and contrast, so as to ensure the overall quality and consistency of the image.

In one example, the global reconstruction function may be represented by the following formula (1):

（1）

Wherein, $A_i$ (an element of the training image set A) represents a training image sample;

$P$ is the low-dimensional low-rank sparse projection matrix;

$\|\cdot\|_*$ represents the nuclear norm, and $\|\cdot\|_{2,1}$ represents the L2,1 norm;

$Q$ is the reconstruction matrix;

for an arbitrary matrix $X$, the nuclear norm is expressed as $\|X\|_* = \sum_i \sigma_i(X)$, where $\sigma_i(X)$ is the i-th singular value of the matrix $X$; $\|X\|_F^2$ is the square of the F norm of the matrix and represents the sum of squares of all elements in the matrix;

for an arbitrary matrix $X$, the L2,1 norm is expressed as $\|X\|_{2,1} = \sum_i \sqrt{\sum_j x_{ij}^2} = \sum_i \|x^i\|_2$, which is used to constrain the matrix to be row-sparse and therefore has a feature selection effect, where $x_{ij}$ is the (i, j)-th element of the matrix $X$ and $x^i$ is the i-th row vector of the matrix $X$; $\mathrm{tr}(\cdot)$ represents the trace, i.e. the sum of the elements on the main diagonal of a matrix; the matrix appearing in the trace form of the L2,1 norm is a diagonal matrix;

the remaining two symbols represent the regularization parameters.
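The norms used in formula (1) can be computed directly. The sketch below assumes NumPy; the function names are hypothetical, and the diagonal matrix in `l21_trace_form` uses the scaling D_ii = 1 / ||x^i||_2, which is one common convention and an assumption here, since the diagonal elements are not legible in this text.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of singular values."""
    return float(np.linalg.svd(X, compute_uv=False).sum())

def l21_norm(X):
    """Sum of the Euclidean norms of the rows."""
    return float(np.linalg.norm(X, axis=1).sum())

def frobenius_sq(X):
    """Sum of squares of all elements."""
    return float(np.sum(X ** 2))

def l21_trace_form(X, eps=1e-12):
    """tr(X^T D X) with D_ii = 1 / (||x^i||_2 + eps); the scaling of D is an assumption."""
    row_norms = np.linalg.norm(X, axis=1)
    D = np.diag(1.0 / (row_norms + eps))
    return float(np.trace(X.T @ D @ X))

# Row-sparse, rank-one example matrix
X = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [6.0, 8.0]])
print(nuclear_norm(X), l21_norm(X), frobenius_sq(X), l21_trace_form(X))
```

For this matrix the row norms are 5, 0 and 10, so the L2,1 norm is 15 and the squared F norm is 125; because the matrix has rank one, its nuclear norm equals its F norm, the square root of 125. The small `eps` only guards against division by zero for all-zero rows.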

It can be understood that, in the above formula (1), singular value decomposition is performed on the data matrix through the nuclear norm, and the global features of the image are then extracted by using the singular values; low-rank and sparse constraints are further imposed on the projection matrix, and the image features finally extracted can be called global low-rank sparse projection features.

The global low-rank sparse projection features extracted in this way keep the data dimension low and remove noise and redundant information, which improves the expressive capability of the data and the classification accuracy. They capture the global structure and key characteristics of the data, so that the data can be better understood and analyzed. Meanwhile, the global features extracted through the nuclear norm have a certain robustness to changes and disturbances in the data and adapt better to different data distributions and feature variations. In addition, the method has high computational efficiency, can process large-scale data sets and has good scalability.

In one implementation, the preset image feature extraction function includes a local neighborhood preserving function, and performs image feature extraction processing on a plurality of training image samples according to the preset image feature extraction function to obtain image features of the plurality of training image samples, and further includes:

And constructing a local neighborhood preserving function according to the plurality of training image samples, wherein the local neighborhood preserving function is used for extracting local low-rank sparse projection features of the plurality of training image samples.

Local neighborhood preservation can be understood as a local neighborhood preserving relationship: in image processing, the local structure and detail features of the image are preserved so that the characteristics of the image within a local area remain unchanged. It is generally used to extract local features of the image, such as textures, edges and corner points, so as to ensure that the local details of the image are retained.

In one example, the local neighborhood preserving function may be represented by the following formula (2):

（2）

Wherein, $A_i$ (an element of the training image set A) represents a training image sample;

$P$ is the low-dimensional low-rank sparse projection matrix;

$W$ is a matrix with k non-zero elements per row, and $W$ is updated in the iteration;

the remaining two symbols represent the regularization parameters.
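A matrix with k non-zero elements per row can be built, for example, from a k-nearest-neighbor graph over the training samples. The sketch below assumes NumPy; the binary weights, the Frobenius distance between image matrices and the name `knn_weight_matrix` are assumptions for illustration only. In the application the corresponding matrix is updated during the iteration, which this sketch does not attempt to reproduce.

```python
import numpy as np

def knn_weight_matrix(samples, k):
    """Binary k-NN graph: row i has ones at the k samples closest to sample i."""
    num = len(samples)
    flat = np.asarray(samples, dtype=float).reshape(num, -1)
    # Pairwise squared Euclidean (Frobenius) distances between samples
    sq = (flat ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * flat @ flat.T
    np.fill_diagonal(dist, np.inf)            # a sample is not its own neighbour
    W = np.zeros((num, num))
    nearest = np.argsort(dist, axis=1)[:, :k]
    W[np.arange(num)[:, None], nearest] = 1.0
    return W

rng = np.random.default_rng(0)
samples = rng.random((10, 4, 4))              # 10 synthetic 4x4 image matrices
W = knn_weight_matrix(samples, k=3)
```

Such a graph is in general not symmetric: sample j may be among the nearest neighbors of sample i without the converse being true.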

It can be understood that, in formula (2), the nuclear norm is used as the matrix error metric. Minimizing the nuclear norm yields a low-rank projection matrix, which reduces the data dimension; by imposing low-rank and sparsity constraints on the projection matrix, a sparse local feature extraction matrix that is more robust to noise, occlusion and the like can be obtained. The image features finally extracted can be called local low-rank sparse projection features.

The local features extracted by the method can retain more important image details, have stronger robustness to noise, illumination change and the like, and are beneficial to subsequent image classification processing.

In combination with the above, the global reconstruction relationship ensures the quality and consistency of the image as a whole, while the local neighborhood preserving relationship preserves the details and characteristics of the image, so that the image is more realistic. By considering the global and local information of the image at the same time, the overall features and local details of the image are better retained, which improves the accuracy of image classification. The method uses a low-rank norm for both the global and the local error measurement and balances the two relationships, so that a better image processing effect can be obtained and a foundation is laid for correct subsequent image classification.

In one implementation, a preset image feature extraction function is constructed by combining a global reconstruction function and a local neighborhood preserving function;

And carrying out alternate iterative solution processing on the preset image feature extraction function to obtain a low-rank sparse projection matrix, wherein the low-rank sparse projection matrix is used for outputting low-rank sparse projection features of training image samples.

In one example, the above-mentioned preset image feature extraction function may be represented by the following formula (3):

（3）

It can be seen that the preset image feature extraction function has three variables to be solved, P, Q and W. A low-rank sparse projection matrix can be obtained by an alternating iterative solving process, in which any two variables are fixed and the third is solved. This is described below in combination with the specific mathematical calculation process.
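The idea of fixing all variables but one and solving for the remaining one can be illustrated on a much simpler problem than formula (3). The sketch below assumes NumPy and applies alternating least squares to a two-variable factorization, minimizing the squared F norm of X - U V^T; the objective, the variable names `U` and `V`, and the function name are assumptions for illustration and are not the objective of the application.

```python
import numpy as np

def alternating_least_squares(X, rank, n_iter=50, seed=0):
    """Minimise ||X - U V^T||_F^2 by alternately solving for U and V."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((X.shape[0], rank))
    V = rng.standard_normal((X.shape[1], rank))
    history = []
    for _ in range(n_iter):
        # Fix V and solve the least-squares problem V U^T = X^T for U
        U = np.linalg.lstsq(V, X.T, rcond=None)[0].T
        # Fix U and solve the least-squares problem U V^T = X for V
        V = np.linalg.lstsq(U, X, rcond=None)[0].T
        history.append(float(np.linalg.norm(X - U @ V.T) ** 2))
    return U, V, history

rng = np.random.default_rng(1)
X = rng.standard_normal((7, 2)) @ rng.standard_normal((2, 5))   # exact rank-2 matrix
U, V, history = alternating_least_squares(X, rank=2)
```

Each sub-problem has a closed-form solution once the other variable is fixed, so the objective recorded in `history` does not increase from one iteration to the next, up to rounding error.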

1. Initialize the variables: one variable is initialized as an identity matrix of the corresponding dimension, and another variable is initialized by formula (2); set the hyperparameters (greater than zero).

2. The unknown variables P, Q and W are determined iteratively.

According to the definition of the nuclear norm and the definition of the L2,1 norm, formula (3) can be further transformed into the following formula (4):

（4）

Wherein the auxiliary matrices appearing in formula (4) include a diagonal matrix with prescribed diagonal elements.
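For orientation, the reweighting identities that are commonly used for this kind of transformation are sketched below. They are standard results and are not taken from formula (4), whose exact form may differ; the symbols $A$, $G$, $D$, $\sigma_i$ and $a^i$ are assumed names introduced only for this sketch, and the identities assume that $AA^{\top}$ is invertible and that no row of $A$ is zero.

```latex
% Nuclear norm as a weighted trace (G is treated as fixed within one iteration)
\|A\|_{*} \;=\; \sum_{i} \sigma_i(A)
          \;=\; \operatorname{tr}\!\bigl((AA^{\top})^{1/2}\bigr)
          \;=\; \operatorname{tr}\!\bigl(A^{\top} G A\bigr),
\qquad G = (AA^{\top})^{-1/2}.

% L2,1 norm as a weighted trace with a diagonal weighting matrix D
\|A\|_{2,1} \;=\; \sum_{i} \|a^{i}\|_{2}
            \;=\; 2\,\operatorname{tr}\!\bigl(A^{\top} D A\bigr),
\qquad D_{ii} = \frac{1}{2\,\|a^{i}\|_{2}}.
```

Treating $G$ and $D$ as constants within an iteration turns both non-smooth norms into quadratic trace terms, which is what makes setting a derivative to zero tractable in each sub-step.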

The first term in formula (4) can be expanded to the following formula (5):

（5）

The second term in formula (4) can be expanded to the following formula (6):

（6）

The third term in formula (4) can be expanded to the following formula (7):

（7）

Ignoring the constant term, the preset image feature extraction function (4) is equivalent to the following formula (8):

（8）

(1) Fix Q and W, and solve for P.

Taking the derivative of formula (8) with respect to P and setting the derivative to 0 yields the following formula (9):

（9）

Wherein,

，

(2) Fix P and W, and solve for Q.

Taking the derivative of formula (8) with respect to Q and setting the derivative to 0 yields the following formula (10):

（10）

(3) Fix P and Q, and solve for W.

Here, a Lagrange function can be constructed using equation (2) to solve for W.

(4) Check whether the convergence condition or the maximum number of iterations has been reached. If the convergence condition is reached, stop iterating and output the low-rank sparse projection matrix; otherwise, update the auxiliary variables and repeat steps (1), (2) and (3) in sequence until convergence or the maximum number of iterations is reached.

Checking whether the convergence condition is reached allows the algorithm to stop as soon as the solution has stabilized, within a bounded number of iterations. This saves computational resources while keeping the algorithm effective. Meanwhile, the iterative solving process can be flexibly controlled according to the convergence condition; for example, whether to continue iterating can be determined from the change in the error. The finally obtained low-rank sparse projection matrix can be used for outputting the low-rank sparse projection features of the training image samples.
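The control flow of such an alternating iteration can be sketched as follows. This is only an illustration of the "fix the others, solve for one, then check convergence" pattern: since formulas (8) to (10) are not reproduced here, the sketch substitutes a much simpler stand-in objective (a plain low-rank factorization of a matrix `X` into `P` and `Q`), and the function name `alternating_solve` and the parameters `max_iter` and `tol` are hypothetical.

```python
import numpy as np

def alternating_solve(X, rank, max_iter=200, tol=1e-8, seed=0):
    """Toy alternating least squares for min ||X - P @ Q.T||_F^2.

    Stand-in objective only; it is NOT the patent's formula (8).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    P = rng.standard_normal((m, rank))
    Q = rng.standard_normal((n, rank))
    history = []  # objective value after each full sweep
    for it in range(max_iter):
        # Fix Q, solve for P in closed form (least squares).
        P = X @ np.linalg.pinv(Q.T)
        # Fix P, solve for Q in closed form (least squares).
        Q = X.T @ np.linalg.pinv(P.T)
        obj = np.linalg.norm(X - P @ Q.T, "fro") ** 2
        history.append(obj)
        # Convergence check: stop when the objective change is negligible.
        if it > 0 and abs(history[-2] - obj) <= tol * max(1.0, history[-2]):
            break
    return P, Q, history
```

Because each sub-step minimizes the objective exactly with the other variable held fixed, the recorded objective cannot increase from one sweep to the next, which is what makes a change-in-error stopping rule meaningful.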

It can be understood that, by obtaining the preset low-rank sparse projection matrix, the low-rank sparse projection features of the training image samples can be extracted, and by using the low-rank sparse projection features to train the target image classifier, such as a Support Vector Machine (SVM), a neural network, a decision tree, and the like, the finally trained target image classifier can realize the classification task of the images to be classified.

In one implementation, extracting low-rank sparse projection features of a plurality of image samples to be classified, and inputting the low-rank sparse projection features of the plurality of image samples to be classified into a target image classifier, generating categories corresponding to the plurality of image samples to be classified, includes:

performing feature extraction processing on a plurality of image samples to be classified by using a low-rank sparse projection matrix to obtain low-rank sparse projection features of the plurality of image samples to be classified;

and generating, by the target image classifier, the categories corresponding to the plurality of image samples to be classified according to the low-rank sparse projection features of the plurality of image samples to be classified.

Assume that the low-rank sparse projection matrix is used to extract a low-dimensional projection feature matrix for each image in the training image set and for each image in the test image set.

The low-rank distance between two image feature matrices can be defined herein as the nuclear norm of their difference. For each test image sample, the index of the training image sample whose feature matrix is nearest, in low-rank distance, to the feature matrix of the test image sample is calculated. If the label of that training image sample is c, the test image sample is assigned to class c, that is, it belongs to the c-th class of images.

In one implementation, the target image classifier is a nearest neighbor classifier of low rank metric.

A nearest neighbor classifier of low rank metric may be understood as a nearest neighbor classifier that uses a low-rank metric as its distance metric. In this classifier, the distance is not a simple Euclidean distance or Manhattan distance but a metric based on a low-rank matrix norm, computed on the features obtained by dimension reduction or feature selection of the data. That is, classification prediction can be performed by calculating the low-rank distance between the test sample and each training sample and taking the label of the nearest neighbor.

The nearest neighbor classifier of low rank metric is advantageous in processing high dimensional data because it can better capture the inherent structure and correlation of the data, thereby improving classification accuracy.
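A minimal sketch of such a classifier is given below, assuming that the low-rank distance between two feature matrices is the nuclear norm of their difference. The function names `low_rank_distance` and `nearest_neighbor_predict` are hypothetical, and the feature matrices are assumed to have already been extracted.

```python
import numpy as np

def low_rank_distance(A, B):
    """Nuclear norm (sum of singular values) of the difference A - B."""
    return float(np.linalg.norm(A - B, ord="nuc"))

def nearest_neighbor_predict(train_feats, train_labels, test_feat):
    """Return the label of the training feature matrix closest to test_feat."""
    dists = [low_rank_distance(test_feat, Y) for Y in train_feats]
    return train_labels[int(np.argmin(dists))]

# Tiny usage example with two hand-made 3x3 "feature matrices".
train_feats = [np.zeros((3, 3)), 5.0 * np.eye(3)]
train_labels = ["a", "b"]
test_feat = 4.5 * np.eye(3)
predicted = nearest_neighbor_predict(train_feats, train_labels, test_feat)
```

In this toy case the test matrix lies at nuclear-norm distance 13.5 from the first training matrix and 1.5 from the second, so the second label is returned.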

In one implementation, key point detection processing is performed on a plurality of training image samples and/or a plurality of image samples to be classified to obtain the key points corresponding to those samples; image cropping processing is then performed on the plurality of training image samples and/or the plurality of image samples to be classified, centered on their corresponding key points, to generate a plurality of training image samples and/or a plurality of image samples to be classified with the same size.

Key points (keypoints) of an image generally refer to particular points that have distinct features in the image and can be used to represent local features of the image. The key points are usually extracted through a feature extraction algorithm, such as the Scale-Invariant Feature Transform (SIFT) algorithm, the Speeded-Up Robust Features (SURF) algorithm, and the like.

For example, given a picture of a cat, the SIFT algorithm can be used to extract key points in the picture, such as the eyes, ears, nose and whiskers, which have obvious features in the image and can be used to describe the local features of the cat. The specific key points may be selected according to the actual situation and are not limited herein.

Key points generally represent important features in an image, and clipping centered around the key points can preserve information of these important features, thereby enhancing the robustness and discrimination of the image. By cutting the key points, the training image samples and/or the image samples to be classified can be consistent in size, so that the calculation complexity in subsequent processing can be reduced, and the subsequent feature extraction and image classification processing are facilitated.
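The cropping step can be sketched as follows, assuming the key point has already been detected and is given as integer `(row, col)` coordinates. The function name `crop_around_keypoint` is hypothetical, and shifting the window back inside the image near the borders is one possible design choice, not something the text prescribes.

```python
import numpy as np

def crop_around_keypoint(image, keypoint, size):
    """Crop a size x size patch centred on keypoint, kept inside the image."""
    h, w = image.shape[:2]
    if size > h or size > w:
        raise ValueError("crop size exceeds image size")
    row, col = keypoint
    half = size // 2
    # Clamp the top-left corner so the window never leaves the image.
    top = min(max(row - half, 0), h - size)
    left = min(max(col - half, 0), w - size)
    return image[top:top + size, left:left + size]

# Usage: every crop has the same shape regardless of the key point position.
image = np.arange(100).reshape(10, 10)
centre_patch = crop_around_keypoint(image, (5, 5), 4)
corner_patch = crop_around_keypoint(image, (9, 9), 4)
```

Clamping keeps all output patches the same size, which is the property needed by the subsequent feature extraction; padding the image instead would be an equally valid alternative.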

The above-described image classification method is generally described below in conjunction with fig. 3.

As shown in FIG. 3, FIG. 3 includes the following steps S301-S306.

S301, acquiring a standard training image set, and constructing a test image set to be classified.

By way of example and not limitation, the test image set herein may be image data different from the standard training image set, or may be image data obtained by subjecting the standard training image set to an occlusion process, an angle distortion process, or a light shading process.
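As an illustration of how such a degraded test set might be derived from clean images, the sketch below applies a square occlusion and a brightness change. The function names `occlude` and `shade`, the fill value, and the 0 to 255 intensity range are assumptions made for this example only.

```python
import numpy as np

def occlude(image, top, left, size, fill=0.0):
    """Return a copy of image with a size x size block replaced by fill."""
    out = image.astype(float).copy()
    out[top:top + size, left:left + size] = fill
    return out

def shade(image, gain):
    """Scale brightness by gain and clip to the assumed 0..255 range."""
    return np.clip(image.astype(float) * gain, 0.0, 255.0)

# Usage on a uniform 8x8 grey image.
img = np.full((8, 8), 200.0)
occluded = occlude(img, top=2, left=2, size=3)
darker = shade(img, 0.5)
brighter = shade(img, 2.0)
```

Both helpers return new arrays, so the standard training images remain unchanged.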

S302, establishing an objective function for representing the global reconstruction relationship of the image sample.

S303, establishing an objective function representing the local neighborhood preserving relationship of the image sample.

S304, integrating global and local information, establishing a final objective function, and solving for the optimal low-rank sparse projection matrix.

The present application fully considers the global structure and the locally adaptive manifold structure information, so that the internal topological structure information of the image is preserved while the discrimination capability of the extracted features is enhanced. Compared with traditional image feature extraction methods, the application not only uses the low-rank norm for global and local error measurement, but also applies low-rank and sparse constraints to the projection matrix, so that a low-rank and sparse projection matrix is obtained and the robustness of the feature extraction matrix to noise, occlusion and the like is enhanced.

S305, acquiring low-dimensional projection feature matrices of the training image set and the test image set by utilizing the projection matrix.

S306, classifying the test image by using the low-rank metric and the nearest neighbor classifier to obtain the classification category of the test image.

In the traditional scheme, the main error term is measured mainly by the square of the F norm, which handles noisy data and abnormal data poorly; in addition, when image features are extracted, often only the image reconstruction error or the neighborhood preserving projection reconstruction error is considered, which tends to reduce the image recognition rate; in the classification and recognition stage, the similarity measurement of the projection matrix is mainly carried out with the F norm, so robust classification performance cannot be obtained.
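The difference between these error measures can be made concrete with a small numerical sketch. The helper names below are hypothetical, and the 6 x 6 error matrix with a localized block of corrupted pixels is made up for illustration; it is not data from the application.

```python
import numpy as np

def squared_f_norm(E):
    """Sum of squares of all elements (square of the F norm)."""
    return float(np.sum(E ** 2))

def nuclear_norm(E):
    """Sum of singular values."""
    return float(np.linalg.norm(E, ord="nuc"))

def l21_norm(E):
    """Sum of the L2 norms of the rows."""
    return float(np.sum(np.linalg.norm(E, axis=1)))

# Error matrix with a localized (rank-1) corrupted block of 2 x 3 pixels.
E = np.zeros((6, 6))
E[1:3, 2:5] = 1.0

# Making the corruption 10 times stronger multiplies the squared F norm
# by 100 but the nuclear norm only by 10.
f_ratio = squared_f_norm(10 * E) / squared_f_norm(E)
nuc_ratio = nuclear_norm(10 * E) / nuclear_norm(E)
```

The quadratic growth of the squared F norm means that a few strongly corrupted pixels can dominate the objective, whereas the nuclear norm grows only linearly with the magnitude of the corruption.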

Compared with the traditional scheme, in the image feature extraction stage the application works on the two-dimensional image matrix and fully considers the global structure and the locally adaptive manifold structure information, so that the internal topological structure information of the image is preserved while the discrimination capability of the extracted features is enhanced; the low-rank norm is used for global and local error measurement, and low-rank and sparse constraints are applied to the projection matrix, so that a low-rank and sparse projection matrix is obtained and the robustness of the feature extraction matrix to noise, occlusion and the like is enhanced; in the image classification and recognition stage, the application adopts the low-rank norm (i.e. the nuclear norm) to measure the similarity of the two-dimensional image feature matrices, thereby further enhancing the classification performance of the algorithm.

In summary, the low-rank norm is used as matrix error measurement, and the low-rank and sparse constraint is applied to the projection matrix, so that the performance of the model can be effectively improved, the robustness and generalization capability of the model are enhanced, and the method is better applied to image classification tasks.

A specific application scenario of image classification is described below in connection with fig. 4.

As shown in fig. 4, taking a face image as an example, when the image category of a test image is to be determined, the low-rank sparse projection matrix mentioned above may be used to extract the low-rank sparse projection features of the test image. The similarity between the low-rank sparse projection features of the test image and those of each training image is then measured, that is, the low-rank distance between the two is determined. Finally, the nearest neighbor classifier assigns the test image to the image category of the training image with the smallest low-rank distance. Fig. 4 only shows the case in which the test image belongs to category (a); the specific category depends on the actual situation and is not limited herein.

The method of the embodiment of the present application is mainly described above with reference to the drawings. It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown in order, these steps are not necessarily performed in the order shown in the figures. Unless explicitly stated herein, the steps are not strictly limited to the order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with at least some of the other steps or stages.

An apparatus according to an embodiment of the present application will be described with reference to the accompanying drawings. For brevity, the description of the apparatus is shortened where appropriate; for the relevant content, refer to the corresponding description of the above method, which is not repeated.

As shown in fig. 5, the apparatus 1000 includes the following units.

An obtaining unit 1001 is configured to obtain a training image set, where the training image set includes a plurality of training image samples.

The processing unit 1002 is configured to perform image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function, so as to obtain low-rank sparse projection features of the plurality of training image samples.

The processing unit 1002 is further configured to perform image classification training by using low-rank sparse projection features of the plurality of training image samples, to obtain a target image classifier.

The obtaining unit 1001 is further configured to obtain an image set to be classified, where the image set to be classified includes a plurality of image samples to be classified.

The processing unit 1002 is further configured to extract low-rank sparse projection features of the plurality of image samples to be classified, input the low-rank sparse projection features of the plurality of image samples to be classified into the target image classifier, and generate categories corresponding to the plurality of image samples to be classified.

In one implementation, the obtaining unit 1001 may also be configured to perform the method in step S301.

In one implementation, the processing unit 1002 may be further configured to perform the methods in steps S302 to S306.

In an implementation, the apparatus 1000 further comprises a storage unit 1003, which storage unit 1003 may be used to store instructions and/or data, thereby implementing the method in the above-described embodiment.

It should be noted that, because the content of information interaction and execution process between the above units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.

Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 6, the computer device 3000 of this embodiment includes: at least one processor 3100 (only one is shown in fig. 6), a memory 3200, and a computer program 3210 stored in the memory 3200 and executable on the at least one processor 3100, which processor 3100, when executing the computer program 3210, causes the computer apparatus to carry out the steps in the embodiments described above.

The processor 3100 may be a central processing unit (Central Processing Unit, CPU), but the processor 3100 may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

Memory 3200 may in some embodiments be an internal storage unit of computer device 3000, such as a hard disk or memory of computer device 3000. Memory 3200 may also be an external storage device of computer device 3000 in other embodiments, such as a plug-in hard disk provided on computer device 3000, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card), or the like. Further, memory 3200 may also include both internal and external storage units of computer device 3000. The memory 3200 is used to store an operating system, application programs, a boot loader (Boot Loader), data, and other programs, such as the program code of computer programs. The memory 3200 may also be used to temporarily store data that has been output or is to be output.

It will be apparent to those skilled in the art that the above-described functional units are merely illustrated in terms of division for convenience and brevity, and that in practical applications, the above-described functional units and modules may be allocated to different functional units or modules according to needs, i.e., the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the above-described functions. The functional units in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present application. The specific working process of the units in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

The embodiment of the application also provides a computer readable storage medium, and the computer readable storage medium stores a computer program, and when the computer program is executed by computer equipment, the computer equipment realizes the steps in the method embodiments.

Embodiments of the present application provide a computer program product enabling a computer device to carry out the above-mentioned methods when the computer program product is run on the computer device.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the methods of the above embodiments by means of a computer program, which may be stored in a computer readable storage medium and which, when executed by a processor, causes a computer device to implement the steps of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing device/terminal apparatus, a recording medium, computer memory, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), electrical carrier signals, telecommunications signals, and software distribution media, such as a USB flash drive, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electrical carrier signals and telecommunications signals.

It should be understood that the sequence number of each step in the foregoing embodiments does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present application. In the description, for purposes of explanation and not limitation, specific details such as the particular system architecture and techniques are set forth in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.

It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.

Furthermore, in the description of the application and the claims that follow, the terms "comprise," "include," "have" and variations thereof are used to mean "include but are not limited to," unless specifically noted otherwise.

In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus, computer device and method may be implemented in other manners. For example, the apparatus, computer device embodiments described above are merely illustrative, e.g., the partitioning of elements is merely a logical functional partitioning, and there may be additional partitioning in actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.

The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims

1. An image classification method, comprising:

acquiring a training image set, wherein the training image set comprises a plurality of training image samples;

Performing image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples, wherein the preset image feature extraction function comprises a global reconstruction function and a local neighborhood preserving function, and the global reconstruction function and the local neighborhood preserving function perform low-rank nuclear norm and sparse norm regularization processing on a projection matrix corresponding to the low-rank sparse projection features at the same time;

The global reconstruction function is expressed as:

；

Wherein the symbols in the above expression respectively denote: a training image sample; the low-dimensional low-rank sparse projection matrix; the nuclear norm; the L2,1 norm; and the reconstruction matrix.

The nuclear norm of an arbitrary matrix is expressed as the sum of its singular values, the i-th term being the i-th singular value of the matrix; the squared F norm of a matrix represents the sum of squares of all elements in the matrix; the L2,1 norm of an arbitrary matrix is expressed as the sum of the L2 norms of its row vectors, defined through the ij-th elements and the i-th row vector of the matrix; the trace operator represents the trace of a matrix; a further matrix is a diagonal matrix; and α, β are used to represent regularization parameters;

The local neighborhood preserving function is expressed as:

；

Wherein the symbols in the above expression respectively denote: a training image sample; and the low-dimensional low-rank sparse projection matrix. The nuclear norm of an arbitrary matrix is expressed as the sum of its singular values, the i-th term being the i-th singular value of the matrix; the squared F norm of a matrix represents the sum of squares of all elements in the matrix; a further matrix has k non-zero elements per row and is updated in the iteration; and α, β are used to represent regularization parameters;

Performing image classification training by using the low-rank sparse projection features of the plurality of training image samples to obtain a target image classifier;

acquiring an image set to be classified, wherein the image set to be classified comprises a plurality of image samples to be classified;

Extracting low-rank sparse projection features of the plurality of image samples to be classified, inputting the low-rank sparse projection features of the plurality of image samples to be classified into the target image classifier, and generating categories corresponding to the plurality of image samples to be classified.

2. The method according to claim 1, wherein the performing image feature extraction processing on the plurality of training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the plurality of training image samples includes:

And constructing the global reconstruction function according to the plurality of training image samples, wherein the global reconstruction function is used for extracting global low-rank sparse projection features of the plurality of training image samples.

3. The method according to claim 2, wherein the preset image feature extraction function includes a local neighborhood preserving function, the image feature extraction processing is performed on the plurality of training image samples according to the preset image feature extraction function, so as to obtain image features of the plurality of training image samples, and further comprising:

and constructing the local neighborhood preserving function according to the plurality of training image samples, wherein the local neighborhood preserving function is used for extracting local low-rank sparse projection features of the plurality of training image samples.

4. A method according to claim 3, characterized in that the method further comprises:

Constructing the preset image feature extraction function by combining the global reconstruction function and the local neighborhood preserving function;

And carrying out iterative solution processing on the preset image feature extraction function to obtain a low-rank sparse projection matrix, wherein the low-rank sparse projection matrix is used for outputting low-rank sparse projection features of the training image samples.

5. The method of claim 4, wherein the extracting low-rank sparse projection features of the plurality of image samples to be classified and inputting the low-rank sparse projection features of the plurality of image samples to be classified into the target image classifier generates a class corresponding to the plurality of image samples to be classified comprises:

Performing feature extraction processing on the plurality of image samples to be classified by using the low-rank sparse projection matrix to obtain low-rank sparse projection features of the plurality of image samples to be classified;

6. The method of claim 5, wherein the target image classifier is a nearest neighbor classifier of a low rank metric.

7. The method according to claim 1, wherein the method further comprises:

Performing key point detection processing on the training image samples and/or the image samples to be classified to obtain key points corresponding to the training image samples and/or the image samples to be classified;

And carrying out image cutting processing on the training image samples and/or the image samples to be classified by taking key points corresponding to the training image samples and/or the image samples to be classified as centers to generate the training image samples and/or the image samples to be classified with the same size.

8. An image classification apparatus, comprising:

The acquisition unit is used for acquiring a training image set, wherein the training image set comprises a plurality of training image samples;

The processing unit is used for carrying out image feature extraction processing on the training image samples according to a preset image feature extraction function to obtain low-rank sparse projection features of the training image samples, the preset image feature extraction function comprises a global reconstruction function and a local neighborhood preserving function, and the global reconstruction function and the local neighborhood preserving function simultaneously carry out low-rank nuclear norm regularization processing and sparse norm regularization processing on a projection matrix corresponding to the low-rank sparse projection features;

The global reconstruction function is expressed as:

；

The local neighborhood preserving function is expressed as:

；

The processing unit is further used for performing image classification training by utilizing the low-rank sparse projection features of the plurality of training image samples to obtain a target image classifier;

The acquisition unit is further used for acquiring an image set to be classified, wherein the image set to be classified comprises a plurality of image samples to be classified;

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, when executing the computer program, causing the computer device to implement the method of any one of claims 1 to 7.

10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a computer device, implements the method according to any of claims 1 to 7.