Pearl classification method based on deep learning
Technical Field
The invention relates to the technical field of image processing and pattern recognition, in particular to a pearl classification method based on deep learning.
Background
China is a major producer of freshwater pearls, accounting for 95% of world output. A city in Zhejiang Province is the largest base for freshwater pearl cultivation, processing and sale in China; its output accounts for more than half of the national total, and it is known as the 'county of Chinese pearls'. The freshwater pearl cultivation area reaches 380,000 mu, with 1,500 pearl processing enterprises.
After collecting large numbers of pearls, most pearl enterprises must sort them into grades by hand. This requires a very large workforce and a high level of professional skill from the sorters. Manual classification is affected by many factors; especially when the pearls are small and numerous, the results are unstable and strongly subjective. Many pearl enterprises therefore urgently need a way to classify pearls rapidly and accurately by machine.
Chinese patent application No. 201210411979.2 discloses an online automatic grading device for pearl color and luster based on monocular multi-view machine vision. The device comprises a production line for automatically detecting and classifying pearls, a monocular multi-view machine vision device for capturing images of the pearls under inspection, and a microprocessor that performs image processing, detection, identification and classification on those images and coordinates the action mechanisms on the production line. The production line comprises a feeding action mechanism, a delivery inspection action mechanism, a blanking action mechanism, a grading action mechanism and a grading execution mechanism. That invention detects the color and luster of the pearl under inspection but does not detect its threads, although threads are an important criterion in actual manual classification.
Early solutions to image classification tasks mainly comprised two parts: one is to manually design features and classify the designed features with a classifier such as a support vector machine (SVM); the other is to construct a shallow learning system.
Support Vector Machine (SVM) is a machine learning algorithm first proposed by Vapnik et al. in 1995. SVM is based on the VC (Vapnik-Chervonenkis) dimension theory of statistical learning theory and on the structural risk minimization principle. Given limited sample information, it seeks the best compromise between model complexity (namely the learning precision on specific training samples) and learning capability (namely the capability of identifying any sample without error), so as to obtain the best classification and identification capability.
Because the number and position of threads differ from pearl to pearl, and because the clarity of the threads in captured pearl images is affected by the pearl's color and luster, manually designing features is difficult, and the designed features are not necessarily suitable for the current classification task.
Disclosure of Invention
To solve the problems that existing machine-vision-based pearl classification systems have difficulty detecting and classifying pearl threads with high precision, and that manually designed features do not readily capture the thread features of pearls, the invention provides a pearl classification method based on deep learning.
The technical solution adopted by the invention to achieve this aim is as follows:
A pearl classification method based on deep learning comprises the following steps:
1) collecting pearl images as sample data, wherein a set number of images is collected for each pearl;
2) resizing the collected pearl images to a set size, and preprocessing the pearl images to remove image noise;
3) dividing sample data into training data and test data;
4) setting initial parameters of each network layer of the deep convolutional network, inputting the training data divided in the step 3) into the deep convolutional network, and training the network;
5) extracting the characteristics of training data and test data by using the trained deep convolutional network;
6) constructing an SVM classifier by using the characteristics of the training data extracted in the step 5), and classifying the test data by using the SVM classifier.
Further, the pearl images obtained in the step 1) comprise two types, namely a pearl image with threads and a pearl image without threads.
Further, bilateral filtering is adopted in the step 2) to remove noise of the image. Of course, other denoising methods may be employed.
Still further, the step 4) comprises the following steps:
4.1) determining the structure of the deep convolutional network;
4.2) initializing parameters to be trained in the network by using different small random numbers;
4.3) inputting the training data divided in the step 3) into the deep convolutional network, calculating the error between the output of the deep convolutional network and the actual class label, and adjusting the weight and the bias term of each layer of the deep convolutional network through an error back propagation algorithm until the network is stable or the set maximum iteration number is reached.
Furthermore, in the step 1), each pearl comprises 5 images which are respectively a top view, a left view, a right view, a rear view and a front view; the deep convolution network constructed in the step 4) is composed of 5 convolution networks, and 5 views of the pearl are respectively used as the input of the 5 convolution networks to extract the thread characteristics.
Preferably, the 5 convolutional networks have the same structure and respectively include 2 convolutional layers, 2 downsampling layers and 1 full-connection layer.
The output features of the 5 convolutional networks are accumulated by an ELTWISE layer and combined into 1 feature vector. The feature vector is input to an FC fully connected layer, which maps it to two dimensions, and the result is input to a SOFTMAX layer. The SOFTMAX layer consists of 2 independent neurons, corresponding to threaded pearls and unthreaded pearls.
Further, extracting features with the trained deep convolutional network in step 5) means that the sample data are used as the input of the trained deep convolutional network, and the output of the ELTWISE layer of the deep convolutional network is used as the thread features of the sample pearl.
Further, the step 6) includes the steps of:
6.1) preprocessing the characteristics of the training data extracted in the step 5);
pretreatment: normalizing the obtained feature data, and mapping the value of each dimension of the feature data to the interval [0, 1]; the conversion function is

$$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$$

where $x$ is an original feature value, $x'$ is its normalized value, $x_{max}$ is the maximum value of the feature data, and $x_{min}$ is the minimum value of the feature data;
6.2) constructing an SVM classifier by using the training data after 6.1) normalization;
6.3) normalizing the test data according to the mapping relation during the normalization of the training data;
6.4) classifying the normalized test data by using an SVM classifier, and judging whether the pearl has threads.
The technical conception of the invention is as follows. Deep learning has been a hot topic in recent years. The concept derives from research on artificial neural networks; a multi-layer perceptron with multiple hidden layers is one deep learning structure. Deep learning combines low-level features into more abstract high-level representations (attribute classes or features) in order to discover distributed feature representations of the data. The convolutional network is one deep learning structure and has achieved major breakthroughs in problems such as image classification. A convolutional network consists of one or more convolutional layers and fully connected layers at the top, together with the associated weights and down-sampling layers. This structure lets the convolutional network exploit the two-dimensional structure of the input data, so an image can be used directly as the network input, which avoids the complex feature extraction and data reconstruction processes of traditional recognition algorithms. Its weight-sharing structure is closer to a biological neural network, which reduces model complexity and the number of weights.
The beneficial effects of the invention are as follows: 1) the thread features of pearls are extracted with a deep convolutional network, which takes full advantage of the self-learning ability of deep learning; good features are learned automatically by the machine, so the complex process of manually designing and extracting features, together with its shortcomings, is avoided, the procedure is simplified and time is saved; 2) 5 convolutional networks are constructed for the 5 views of a pearl, so the image information of the whole pearl is fully used and the problem that an image from a single viewing angle is not representative of the whole pearl is avoided; 3) the deep convolutional network extracts local features of the images and then combines them, so the extracted features have a certain degree of translation and rotation invariance, which to some extent reduces the influence of random pearl placement during image acquisition; 4) the deep convolutional network is combined with a support vector machine (SVM): the deep convolutional network extracts features and the SVM performs classification and recognition, which supports classification accuracy.
Drawings
Fig. 1 is a flow chart of the whole classification method.
Fig. 2 is a schematic diagram of the overall structure of the deep convolutional network.
Fig. 3 is a schematic diagram of a specific network structure of a convolutional network.
Detailed Description
The following detailed description of embodiments of the invention is provided in connection with the accompanying drawings.
Referring to fig. 1 to 3, a pearl classification method based on deep learning includes the following steps:
1) acquiring pearl images as sample data, wherein 5 images are acquired for each pearl, namely a top view, a left view, a right view, a rear view and a front view;
the acquired images are of two types: images of threaded pearls and images of unthreaded pearls.
2) Resizing the collected pearl images to 150 × 150 pixels, and preprocessing the pearl images to remove image noise;
bilateral filtering is used to remove image noise; because the collected image is a color image, the R, G and B color channels are each filtered separately; denote the pearl image as I and the current pixel as (x, y); the brightness values of the R, G and B components of the current pixel are denoted $I_r(x, y)$, $I_g(x, y)$ and $I_b(x, y)$, respectively; for a channel $c \in \{r, g, b\}$, the brightness difference between the current pixel and the neighboring pixel (x-i, y-j) is denoted $\Delta I_c(i, j)$ and is calculated as:

$$\Delta I_c(i, j) = I_c(x, y) - I_c(x-i, y-j)$$

the brightness value of pixel (x, y) after filtering is denoted F(x, y), and for channel $c$ it is calculated as:

$$F(x, y) = \frac{\sum_{i,j} I_c(x-i, y-j)\, \exp\left(-\frac{i^2 + j^2}{2\sigma_d^2}\right) \exp\left(-\frac{\Delta I_c(i, j)^2}{2\sigma_r^2}\right)}{\sum_{i,j} \exp\left(-\frac{i^2 + j^2}{2\sigma_d^2}\right) \exp\left(-\frac{\Delta I_c(i, j)^2}{2\sigma_r^2}\right)}$$

where $\sigma_r$ and $\sigma_d$ are the smoothing parameters, the sums run over the filtering window centered on (x, y), and m and n are the length and width of the filtering window, respectively; F is computed separately for the R, G and B color channels to obtain the filtered image.
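The bilateral filtering step can be sketched in plain Python as follows. This is only an illustrative sketch of the standard bilateral filter applied to one channel at a time. The function names `bilateral_filter_channel` and `bilateral_filter_rgb`, the square window given by `radius`, the default parameter values, and the choice to skip neighbors that fall outside the image are all assumptions, not details taken from the method described here.

```python
import math


def bilateral_filter_channel(channel, radius=1, sigma_d=1.0, sigma_r=25.0):
    """Filter one color channel, given as a list of rows of numbers.

    Assumptions: a square (2 * radius + 1) window and no padding;
    neighbors outside the image are simply left out of the sums.
    """
    height, width = len(channel), len(channel[0])
    filtered = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = channel[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for j in range(-radius, radius + 1):
                for i in range(-radius, radius + 1):
                    ny, nx = y - j, x - i
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # neighbor lies outside the image
                    neighbor = channel[ny][nx]
                    # Spatial weight: decays with distance from the center pixel.
                    spatial = math.exp(-(i * i + j * j) / (2.0 * sigma_d ** 2))
                    # Range weight: decays with the brightness difference.
                    tonal = math.exp(-((center - neighbor) ** 2) / (2.0 * sigma_r ** 2))
                    weighted_sum += spatial * tonal * neighbor
                    weight_total += spatial * tonal
            # The center pixel always contributes weight 1, so weight_total > 0.
            filtered[y][x] = weighted_sum / weight_total
    return filtered


def bilateral_filter_rgb(red, green, blue, **params):
    """Filter the R, G and B channels independently with the same parameters."""
    return tuple(bilateral_filter_channel(c, **params) for c in (red, green, blue))
```

A small `sigma_r` keeps sharp brightness edges (such as a thread groove) almost untouched, while a large `sigma_r` makes the filter behave more like a plain Gaussian blur.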
3) Dividing sample data into training data and test data;
the sample data are divided at a ratio of training data to test data of 8:2, wherein the training data are used to train the parameters of the deep convolutional network and the test data are used to test the performance of the classifier.
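A minimal sketch of the 8:2 division is given below. The function name `split_samples`, the random shuffling, and the fixed seed are assumptions made for illustration; the only detail taken from the text is the 8:2 ratio. Each element of `samples` is assumed to stand for one pearl (all 5 of its views together), so that views of the same pearl never end up on both sides of the split.

```python
import random


def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Example with 100 hypothetical pearl identifiers.
train_ids, test_ids = split_samples(range(100))
```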
4) Setting initial parameters of each network layer of the deep convolutional network, inputting the training data obtained in the step 3) into the deep convolutional network, and training the network;
4.1) determining the structure of the deep convolutional network;
the structure of the deep convolutional network is shown in Fig. 2; the top view, left view, right view, rear view and front view of a preprocessed pearl are used as the inputs of the 5 convolutional networks CNN1 to CNN5, respectively;
the ELTWISE layer sums the outputs of the 5 convolutional networks element-wise to obtain a feature vector of the pearl image, which is used as the input of the FC fully connected layer;
the FC layer maps the feature vector to a two-dimensional vector, which is used as the input of the SOFTMAX layer;
the SOFTMAX layer normalizes the output of the fully connected layer so that its values lie in the interval [0, 1]; it consists of 2 independent neurons, corresponding to threaded pearls and unthreaded pearls;
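The fusion stage described above (element-wise accumulation, a fully connected mapping to two dimensions, then softmax) can be sketched as follows. The helper names, the 3-dimensional toy vectors and the hand-picked weights are hypothetical and serve only to show the data flow; they are not the trained values or the framework calls of the actual network.

```python
import math


def eltwise_sum(vectors):
    """Add several equal-length vectors element by element."""
    return [sum(values) for values in zip(*vectors)]


def fully_connected(vector, weights, biases):
    """One output per weight row: dot(row, vector) + bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Map scores to positive values that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: five per-view feature vectors (3 dimensions here for brevity).
view_features = [[0.1, 0.0, 0.2]] * 5
fused = eltwise_sum(view_features)
toy_weights = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]  # hypothetical values
toy_biases = [0.0, 0.0]
class_probabilities = softmax(fully_connected(fused, toy_weights, toy_biases))
```

The two entries of `class_probabilities` play the role of the two SOFTMAX neurons, one per class.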
the 5 convolutional networks CNN 1-5 adopt the same network structure, as shown in FIG. 3, including:
the CONV1 convolutional layer, which uses 20 convolution kernels of size 5 × 5; each kernel is multiplied element-wise with the corresponding region of the image sample data, the products are summed, and a bias term is added to obtain the feature maps of the CONV1 convolutional layer;
the POOLING1 downsampling layer, which exploits the local correlation of images to sub-sample them, reducing the amount of data to process while retaining useful information; the feature maps of the CONV1 convolutional layer are sub-sampled with 20 sampling kernels of size 2 × 2 to obtain the feature maps of the POOLING1 downsampling layer; common sampling methods for a downsampling layer include average downsampling, random downsampling and maximum downsampling, and maximum downsampling is adopted here. Average downsampling outputs the average of all elements in the sampling window; random downsampling outputs the value of a randomly selected element in the sampling window; maximum downsampling outputs the maximum of all elements in the sampling window;
the CONV2 convolutional layer, which convolves 50 kernels of size 5 × 5 with the feature maps of the POOLING1 downsampling layer, sums the results and adds a bias term to obtain the feature maps of the CONV2 convolutional layer;
the POOLING2 downsampling layer, which sub-samples the feature maps of the CONV2 convolutional layer with 50 sampling kernels of size 2 × 2 to obtain the feature maps of the POOLING2 downsampling layer;
the FC fully connected layer, which maps the feature maps of the POOLING2 downsampling layer to a 500-dimensional vector;
the RELU activation layer, which applies the activation function max(x, 0): x is output when x is greater than 0, and 0 is output otherwise, where x is a value of the output vector of the FC fully connected layer;
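To make the layer sequence concrete, the sketch below traces the feature map sizes for a 150 × 150 input and illustrates the three downsampling rules and the RELU function on a single window. The strides and padding are not stated in the text, so the sketch assumes a convolution stride of 1 with no padding and a pooling stride of 2. The `round_up` switch is also an assumption: it covers the fact that some frameworks round the pooled size up rather than down. All function names are hypothetical.

```python
import math
import random


def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial size after a convolution (assumed stride 1, no padding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_output_size(size, kernel=2, stride=2, round_up=False):
    """Spatial size after pooling; rounding direction varies by framework."""
    steps = (size - kernel) / stride
    return (math.ceil(steps) if round_up else math.floor(steps)) + 1


def trace_feature_map_sizes(input_size=150, round_up=False):
    """Return (layer name, number of feature maps, side length) per layer."""
    size = conv_output_size(input_size, 5)
    trace = [("CONV1", 20, size)]
    size = pool_output_size(size, round_up=round_up)
    trace.append(("POOLING1", 20, size))
    size = conv_output_size(size, 5)
    trace.append(("CONV2", 50, size))
    size = pool_output_size(size, round_up=round_up)
    trace.append(("POOLING2", 50, size))
    return trace


def pool_window(window, mode="max", rng=random):
    """Apply one downsampling rule to the values of a single sampling window."""
    if mode == "max":
        return max(window)
    if mode == "average":
        return sum(window) / len(window)
    if mode == "random":
        return rng.choice(window)
    raise ValueError("unknown pooling mode: " + mode)


def relu(x):
    """RELU activation: x if x > 0, otherwise 0."""
    return max(x, 0)
```

Under these assumptions the side length goes 150, 146, 73, 69 and then 34 (or 35 when rounding up), so the FC layer would map 50 × 34 × 34 (or 50 × 35 × 35) values to 500 dimensions.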
4.2) initializing parameters to be trained in the network by using different small random numbers;
4.3) inputting the training data divided in the step 3) into the deep convolutional network, calculating the error between the output of the deep convolutional network and the actual class label, and adjusting the weight and bias items of each layer of the deep convolutional network through an error back propagation algorithm until the network is stable or reaches the set maximum iteration times;
after the deep convolutional network is constructed and its trainable parameters are initialized with different small random numbers, the training data divided in step 3) are used as the input of the network; the predicted output of the network is compared with the true class labels of the training data, the error between them is back-propagated, and the parameters of the deep convolutional network are updated by stochastic gradient descent according to this error, so that the error between the predicted output and the true class labels gradually decreases;
training is repeated many times over the training data, and the parameters of the deep convolutional network are updated at each training iteration so that the error between the predicted output and the true class labels of the training data keeps decreasing; when the error falls below a preset value or the number of training iterations exceeds the preset maximum, the currently learned model parameters are taken as the trained model parameters, yielding the trained deep convolutional network.
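The training procedure of steps 4.2) and 4.3) can be illustrated with a deliberately simplified stand-in: a single fully connected layer followed by softmax, trained by stochastic gradient descent on toy 2-dimensional features. It is not the deep convolutional network itself; it only shows small random initialization, per-sample gradient updates, and the two stopping conditions (error below a preset value, or maximum number of iterations reached). The function names, learning rate, tolerance and toy data are all assumptions.

```python
import math
import random


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scores_for(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def train_sgd(samples, labels, n_classes=2, lr=0.1, max_iters=1000, tol=0.05, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 4.2): initialize every trainable parameter with a small random number.
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(n_classes)]
    biases = [rng.uniform(-0.01, 0.01) for _ in range(n_classes)]
    order = list(range(len(samples)))
    mean_loss = float("inf")
    iteration = 0
    for iteration in range(1, max_iters + 1):
        rng.shuffle(order)  # visit the samples in random order
        total_loss = 0.0
        for idx in order:
            x, y = samples[idx], labels[idx]
            probs = softmax(scores_for(x, weights, biases))
            total_loss += -math.log(max(probs[y], 1e-12))  # cross-entropy error
            for k in range(n_classes):
                # Gradient of the cross-entropy error w.r.t. score k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                for d in range(dim):
                    weights[k][d] -= lr * grad * x[d]
                biases[k] -= lr * grad
        mean_loss = total_loss / len(samples)
        if mean_loss < tol:  # error below the preset value: stop early
            break
    return weights, biases, mean_loss, iteration


def predict(x, weights, biases):
    scores = scores_for(x, weights, biases)
    return scores.index(max(scores))


# Toy, linearly separable data standing in for extracted features.
toy_features = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.0]]
toy_labels = [0, 0, 1, 1]
weights, biases, final_loss, iterations = train_sgd(toy_features, toy_labels)
```

In the full network the same loop applies, except that the gradient is propagated backward through every layer rather than through one linear layer.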
5) Extracting the characteristics of training data and test data by using the trained deep convolutional network;
taking training data and test data as the input of a trained deep convolution network, and taking the output of an ELTWISE layer of the deep convolution network as the thread characteristics of the training data and the test data;
6) constructing an SVM classifier by using the thread characteristics of the training data extracted in the step 5), and classifying the test data by using the SVM classifier;
6.1) preprocessing the characteristics of the training data extracted in the step 5);
pretreatment: normalizing the obtained feature data, and mapping the value of each dimension of the feature data to the interval [0, 1]; the conversion function is

$$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$$

where $x$ is an original feature value, $x'$ is its normalized value, $x_{max}$ is the maximum value of the feature data, and $x_{min}$ is the minimum value of the feature data;
6.2) constructing an SVM classifier by using the training data after 6.1) normalization;
6.3) normalizing the test data according to the mapping relation during the normalization of the training data;
6.4) classifying the normalized test data by using an SVM classifier, and judging whether the pearl has threads.
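Steps 6.1) and 6.3) can be sketched as follows: the minimum and maximum are learned from the training features only, and the same mapping is then reused for the test features. Computing the minimum and maximum separately for each feature dimension, and mapping a constant dimension to 0, are assumptions; the function names are hypothetical.

```python
def fit_min_max(train_features):
    """Return per-dimension minima and maxima of the training features."""
    columns = list(zip(*train_features))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(features, mins, maxs):
    """Scale features with a previously fitted min-max mapping."""
    scaled = []
    for row in features:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant dimension -> 0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Toy feature vectors standing in for ELTWISE-layer outputs.
train_features = [[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]
test_features = [[4.0, 10.0]]
mins, maxs = fit_min_max(train_features)
train_scaled = apply_min_max(train_features, mins, maxs)
test_scaled = apply_min_max(test_features, mins, maxs)
```

Because the mapping is fitted on the training data alone, a test value can fall slightly outside [0, 1], as the first test dimension does here. The scaled training vectors and their labels would then be passed to an SVM implementation of choice, and the scaled test vectors to its prediction routine.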