CN111881312B - Image data set classification and division method

Info

Publication number: CN111881312B
Application number: CN202010722578.3A
Authority: CN
Inventors: 邓嘉新; 王亚强; 曹亮
Original assignee: Chengdu Cheng Xin High Tech Information Technology Co ltd; Chengdu University of Information Technology
Current assignee: Chengdu Cheng Xin High Tech Information Technology Co ltd; Chengdu University of Information Technology
Priority date: 2020-07-24
Filing date: 2020-07-24
Publication date: 2022-07-05
Anticipated expiration: 2040-07-24
Also published as: CN111881312A

Abstract

The invention discloses a method for classifying and dividing an image data set, comprising the following steps: building a pre-training model, generating a projection matrix group R, scaling the picture image data, and importing the image data into the pre-training model; performing feature extraction on the picture image data through the pre-training model to generate a feature vector x of a certain dimension; performing L2 regularization on each generated feature vector x, scaling it to a unit vector in a high-dimensional spherical space; for each projection matrix Ri, multiplying the feature vector by the matrix to obtain a result vector, and taking the index of the maximum value in the result vector as a hash value hi of the feature vector; and combining the hash values hi calculated for all projection matrices into a group of hash values, taking this group as the hash of the image feature vector, and dividing images with the same hash into one class. With this scheme, image features can be effectively extracted, the image data set is accurately divided, and the probability of hash collision is reduced.

Description

Image data set classification and division method

Technical Field

The invention relates to the field of image data set classification processing in deep learning, and in particular to a method for classifying and dividing an image data set.

Background

With the development of multimedia technology, image classification has become a research focus in the field of computer vision. Image classification divides images into different preset categories according to certain attributes of the images. How to represent images effectively is the key to improving classification accuracy, and the selection and extraction of features remains a difficult problem in image classification. With the rapid development of the mobile internet, human society has entered the big data era. Although traditional feature-learning methods such as SIFT and HOG can extract some image features and achieve good results in image classification, such manual feature design has certain defects. Existing image classification techniques do not divide an image data set accurately enough, and when a hash algorithm is used to process image data, the probability of hash collision cannot be effectively reduced.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide the image data set classification and division method, which can effectively extract image features, accurately divide the image data set and reduce the probability of Hash collision.

The purpose of the invention is realized by the following technical scheme:

A method for classifying and dividing an image data set comprises the following steps:

S1, building a pre-training model, randomly generating a projection matrix group R, scaling the image data of pictures without classification marks, and importing the image data into the pre-training model;

S2, performing feature extraction on the imported image data of the pictures without classification marks through the pre-training model to generate a feature vector x of a certain dimension;

S3, performing L2 regularization on each generated feature vector x by using a regularization formula, scaling the feature vector to a unit vector in a high-dimensional spherical space;

S4, for each projection matrix Ri, multiplying the feature vector by the matrix according to a hash function calculation formula to obtain a result vector, and taking the index corresponding to the maximum value in the result vector as a hash value hi of the feature vector;

S5, combining all the hash values hi calculated with the projection matrix group R in step S4 into a group of hash values, and using this group as the hash of the image feature vector; images with the same hash are then classified into one class.

Specifically, each projection matrix in the projection matrix group of step S1 has a size of 2048 × B, where B is the set hash bucket size.

Specifically, the pre-training model in step S1 is a ResNet50 model.

Specifically, the size of the scaled picture image data in step S1 is 224 × 224.

Specifically, the regularization formula in step S3 is:

$$x' = \frac{x}{\sqrt{\sum_i x_i^2}}$$

where $x_i$ represents the i-th feature in the vector x, and $x'$ denotes the resulting unit vector.

Specifically, the hash function calculation formula in step S4 is: $h_i = \operatorname{argmax}(x \times R_i)$.

Specifically, the hash of the image feature vector in step S5 is represented as: $\text{hash} = [h_0, h_1, h_2, \ldots]$.

The invention has the beneficial effects that: by the scheme, the image features can be effectively extracted, the image data set is accurately divided, and the probability of Hash collision is reduced.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a diagram of a pre-trained model architecture of the present invention.

FIG. 3 is a schematic diagram of the image feature vector classification process according to the present invention.

Detailed Description

In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.

Classifying and dividing an image data set means that each unit in the result belongs to exactly one class: a unit may not belong to one class and also to another, nor may it be omitted. For example, when images are classified as people, cats, dogs, tables, and so on, the images in each class belong strictly to that class and to no other, and the numbers of images in all classes sum to the total number of images.
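The division requirement above (every image in exactly one class, none omitted) can be checked mechanically. The following Python sketch is illustrative only; the function name `is_partition` and the sample image names are hypothetical and do not appear in the original text, and the image identifiers are assumed to be unique.

```python
from collections import Counter

def is_partition(classes, all_items):
    """Return True if `classes` (a list of lists) covers every item in
    `all_items` exactly once: no overlap, no omission, no extra items."""
    counts = Counter(item for members in classes for item in members)
    no_overlap = all(n == 1 for n in counts.values())
    return no_overlap and set(counts) == set(all_items)

images = ["img0", "img1", "img2", "img3"]

good = [["img0", "img2"], ["img1"], ["img3"]]                  # valid division
overlapping = [["img0", "img2"], ["img0", "img1"], ["img3"]]   # img0 in two classes
missing = [["img0"], ["img1"], ["img3"]]                       # img2 omitted

print(is_partition(good, images))
print(is_partition(overlapping, images))
print(is_partition(missing, images))
```

For a valid division, the class sizes also add up to the total number of images, which matches the counting condition stated above.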

In this embodiment, the invention provides a method for classifying and dividing an image data set consisting of images without classification marks. As shown in FIG. 1, the method includes the following steps:

Step 1, building a pre-trained ResNet50 model, randomly generating a projection matrix group R, scaling the image data of the pictures without classification marks to 224 × 224, and importing the image data into the pre-trained model. Each projection matrix in the group has a size of 2048 × B, where B is the set hash bucket size.
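A minimal sketch of generating the projection matrix group is given here for illustration. The original text states only that the matrices are randomly generated; the standard normal distribution, the fixed seed, and the names `make_projection_group` and `num_matrices` are assumptions made for this example. Small dimensions are used so the example runs quickly, whereas the embodiment uses 2048 rows.

```python
import random

def make_projection_group(num_matrices, dim, num_buckets, seed=0):
    """Return `num_matrices` random matrices, each with `dim` rows and
    `num_buckets` columns (entries drawn from N(0, 1); this is an assumption)."""
    rng = random.Random(seed)  # fixed seed so the same group is reused for all images
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(num_buckets)] for _ in range(dim)]
        for _ in range(num_matrices)
    ]

# Illustrative sizes only; the embodiment uses dim = 2048 and num_buckets = B.
R = make_projection_group(num_matrices=3, dim=8, num_buckets=4)
print(len(R), len(R[0]), len(R[0][0]))
```

Generating the group once and reusing it matters: two images can only be compared by hash if both were hashed with the same matrices.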

Step 2, performing feature extraction on the imported image data of the pictures without classification marks through the pre-trained model to generate a 2048-dimensional feature vector x.

Step 3, performing L2 regularization on each generated feature vector x by using a regularization formula, scaling the feature vector to a unit vector in a high-dimensional spherical space. The regularization formula is:

$$x' = \frac{x}{\sqrt{\sum_i x_i^2}}$$

where $x_i$ represents the i-th feature in the vector x, and $x'$ denotes the resulting unit vector.
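The scaling to a unit vector can be sketched in a few lines of Python: each component is divided by the Euclidean norm of the vector. The function name `l2_normalize` and the handling of the zero vector are assumptions made for this example; the original text does not address a zero feature vector.

```python
import math

def l2_normalize(x):
    """Scale x to unit Euclidean length: x / sqrt(sum_i x_i^2)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        # Assumption: the zero vector has no direction, so it is rejected here.
        raise ValueError("cannot normalize the zero vector")
    return [v / norm for v in x]

unit = l2_normalize([3.0, 4.0])  # the norm is 5, so the result is [0.6, 0.8]
print(unit)
```

After this step only the direction of the feature vector remains, so two images whose features differ only by an overall scale receive the same unit vector.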

Step 4, for each projection matrix Ri, using the hash function calculation formula $h_i = \operatorname{argmax}(x \times R_i)$: the feature vector is multiplied by the matrix to obtain a result vector, and the index corresponding to the maximum value in the result vector is taken as the hash value hi of the feature vector.
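The hash for a single projection matrix can be illustrated with a small hand-checkable example. The function name `argmax_hash`, the 2 × 3 example matrix, and the tie-breaking rule (lowest index wins) are assumptions introduced here; the original text does not say how ties are resolved.

```python
def argmax_hash(x, R_i):
    """Return the index of the largest entry of the row vector x * R_i.
    x has length D; R_i is a D x B matrix given as a list of D rows."""
    num_buckets = len(R_i[0])
    result = [
        sum(x[d] * R_i[d][b] for d in range(len(x)))
        for b in range(num_buckets)
    ]
    # max() returns the first maximal index, so ties go to the lowest bucket.
    return max(range(num_buckets), key=lambda b: result[b])

x = [0.6, 0.8]               # a unit vector (D = 2)
R_i = [[1.0, 0.0, -1.0],     # D = 2 rows, B = 3 buckets
       [0.0, 1.0, 1.0]]
h_i = argmax_hash(x, R_i)    # x * R_i = [0.6, 0.8, 0.2], so the hash is 1
print(h_i)
```

The result vector has one entry per bucket, so the hash value always lies in the range 0 to B - 1.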

Step 5, combining all the hash values hi calculated with the projection matrix group R in Step 4 into a group of hash values, expressed as: $\text{hash} = [h_0, h_1, h_2, \ldots]$. Images with the same hash are then classified into one class.
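Combining the per-matrix hash values and grouping images by the combined hash might look as follows. This is a sketch under stated assumptions: the names `combined_hash` and `group_by_hash`, the two small fixed matrices, and the four sample feature vectors are all hypothetical and chosen so the result can be verified by hand.

```python
from collections import defaultdict

def argmax_hash(x, R_i):
    """Index of the largest entry of x * R_i (R_i is a list of D rows)."""
    num_buckets = len(R_i[0])
    result = [sum(x[d] * R_i[d][b] for d in range(len(x)))
              for b in range(num_buckets)]
    return max(range(num_buckets), key=lambda b: result[b])

def combined_hash(x, R):
    """Tuple (h_0, h_1, ...) with one hash value per projection matrix in R."""
    return tuple(argmax_hash(x, R_i) for R_i in R)

def group_by_hash(features, R):
    """Map each combined hash to the list of image names that share it."""
    groups = defaultdict(list)
    for name, x in features.items():
        groups[combined_hash(x, R)].append(name)
    return dict(groups)

# Two fixed 2 x 2 matrices stand in for the randomly generated group.
R = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, -1.0], [1.0, 1.0]]]
features = {"a": [0.8, 0.6], "b": [0.6, 0.8], "c": [1.0, 0.0], "d": [-0.6, 0.8]}

groups = group_by_hash(features, R)
print(groups)  # {(0, 0): ['a', 'c'], (1, 0): ['b'], (1, 1): ['d']}
```

In this example "b" and "d" receive the same value under the first matrix alone, but the second matrix separates them, which is the effect the combined hash is intended to provide.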

In this embodiment, image features are extracted mainly by a pre-trained ResNet50 model. ResNet50 is a neural network formed by stacking a series of convolutional layers, and its structure is shown in FIG. 2. Except for [3x3 maxpool, 64], which denotes the maximum pooling layer, each entry denotes one convolutional layer. For example, [7x7, 64/2] denotes a convolutional layer with a convolution kernel size of 7x7, 64 channels, and a step size of 2; unless otherwise specified, the other layers have a step size of 1. The network has 50 layers in total.

For each picture, before input, the picture is scaled to 224x224, and after feature extraction, a feature vector of 2048 dimensions is generated.

Then, L2 regularization is applied to each feature vector x, which scales the feature vector to a unit vector in the high-dimensional spherical space.

Then, a projection matrix of size 2048 × B is randomly generated, where B is the set hash bucket size. The projection matrix is globally unique, that is, the same matrix is applied to every feature vector. The feature vector is multiplied by the matrix to obtain a 1 × B result vector, and the index corresponding to the maximum value in the result vector is taken as the hash value of the feature vector.

To reduce the probability of hash collision, multiple projection matrices may be used for projection to obtain a group of hash values; feature vectors that have the same group of hash values are classified into one class. The classification process is shown in FIG. 3. With this image data dividing method, image features can be effectively extracted, the image data set can be accurately divided, and the probability of hash collision is reduced.
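The effect of using several projection matrices can be illustrated with an idealized calculation. Assume, purely for illustration, that for two unrelated feature vectors each hash value $h_i$ is uniformly distributed over the B buckets, and that the K projection matrices behave independently (K is a symbol introduced here and is not used in the original text). Then:

$$P(\text{collision under one matrix}) = \frac{1}{B}, \qquad P(\text{collision under all } K \text{ matrices}) = \left(\frac{1}{B}\right)^{K} = B^{-K}$$

For example, with B = 16 and K = 4 this gives $16^{-4} = 1/65536 \approx 1.5 \times 10^{-5}$. Real feature vectors are not uniformly distributed, so this figure is only an order-of-magnitude guide under the stated assumptions.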

The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications are within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A method for classifying and partitioning an image data set, the method comprising the steps of:

S4, for each projection matrix Ri, calculating the matrix by using a hash function calculation formula to obtain a result vector, and taking the index corresponding to the maximum value in the result vector as a hash value hi of the feature vector; the hash function calculation formula in step S4 is: $h_i = \operatorname{argmax}(x \times R_i)$;

S5, all the hash values hi calculated by the projection matrix set R in step S4 are combined to obtain a set of hash values and the hash values are used as hashes of the image feature quantities, and at this time, the images with the same hash value are classified into one type.

2. The method for classifying and dividing an image data set according to claim 1, wherein the size of the projection matrix set in step S1 is 2048 × B, where B is the set hash bucket size.

3. The method for classifying and dividing an image data set according to claim 1, wherein the pre-training model in step S1 is a ResNet50 model.

4. The method for classifying and dividing an image data set according to claim 1, wherein the size of the scaled image data in step S1 is 224 × 224.

5. The method for classifying and dividing an image data set according to claim 1, wherein the regularization formula in step S3 is:

$$x' = \frac{x}{\sqrt{\sum_i x_i^2}}$$

where $x_i$ represents the i-th feature in the vector x, and $x'$ denotes the resulting unit vector.

6. The method for classifying and dividing an image data set according to claim 1, wherein the hash of the image feature vector in step S5 is represented by: $\text{hash} = [h_0, h_1, h_2, \ldots]$.