CN111401156B - Image identification method based on Gabor convolution neural network - Google Patents

Image identification method based on Gabor convolution neural network

Info

Publication number
CN111401156B
CN111401156B (granted publication of application CN202010134463.2A)
Authority
CN
China
Prior art keywords
gabor
neural network
image
convolutional neural
module
Prior art date
Legal status (assumption; not a legal conclusion)
Active
Application number
CN202010134463.2A
Other languages
Chinese (zh)
Other versions
CN111401156A (en
Inventor
Da Feipeng
Zhuang Lei
Current Assignee (listed assignee may be inaccurate)
Southeast University
Original Assignee
Southeast University
Priority date (assumption; not a legal conclusion)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202010134463.2A priority Critical patent/CN111401156B/en
Publication of CN111401156A publication Critical patent/CN111401156A/en
Application granted granted Critical
Publication of CN111401156B publication Critical patent/CN111401156B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image identification method based on a Gabor convolutional neural network, comprising the following processing steps: (1) selecting Gabor wavelets with different parameters to construct a Gabor feature extraction module; (2) building a parallel convolution module from weight-sharing convolutional layers; (3) designing a spatial transformation pooling module that takes the element-wise maximum; (4) constructing a Gabor convolutional layer from the Gabor feature extraction module, the parallel convolution module and the spatial transformation pooling module; (5) selecting a reference network for building the Gabor convolutional neural network and determining a scheme for replacing ordinary convolutional layers in the reference network with Gabor convolutional layers; (6) training the Gabor convolutional neural network with the SGD algorithm with momentum and performing image recognition. The method has low algorithmic complexity, is more robust to spatial transformations, and improves both recognition accuracy and speed.

Description

Image identification method based on Gabor convolution neural network
Technical Field
The invention belongs to the technical field of image recognition and relates to an image recognition method based on a Gabor convolutional neural network, which combines Gabor-based traditional image-processing knowledge with deep-learning parameter learning and is particularly suitable for scenes with spatial transformations such as large rotations and scale changes.
Background
Image recognition is an important field of artificial intelligence and is receiving increasing attention. Recognition technologies mainly include optical character recognition, face recognition, vehicle recognition and biomedical image recognition, with important applications in public security and criminal investigation, natural resource analysis, weather forecasting, environmental monitoring, and the study of physiological lesions. The Gabor wavelet, a traditional pattern recognition tool, has been widely applied in image processing; in recent years, convolutional neural networks have greatly advanced image recognition in computer vision, for example in character recognition and face recognition. Research on image recognition that combines Gabor wavelets with convolutional neural networks therefore has substantial practical value and theoretical novelty.
A deep convolutional neural network can learn expressive features, but because the classical convolutional neural network lacks modules specifically designed for handling rotation and scale changes, it is difficult for the network to learn features that are highly robust to spatial transformations, which limits its application in real scenes involving rotations and scale changes. Existing methods for improving network robustness either have high computational cost or build shallow networks from limited orientation information; they struggle to extract expressive deep features, the features they extract are not robust enough to rotation, and it is difficult to build efficient deep networks with them.
Disclosure of Invention
The technical problem: to overcome the deficiencies of the prior art, the invention provides an image identification method based on a Gabor convolutional neural network (a neural network built from Gabor convolutional layers), which can fully extract deep image features without increasing the computational complexity of the network, and offers higher recognition accuracy and robustness to spatial transformations.
The technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:
An image identification method based on a Gabor convolutional neural network, wherein the Gabor convolutional neural network is obtained by replacing at least one convolutional layer of a convolutional neural network (CNN) with a Gabor convolutional layer, and the Gabor convolutional layer consists of a Gabor feature extraction module, a parallel convolution module and a spatial transformation pooling module connected in sequence. The Gabor feature extraction module consists of M Gabor wavelet extraction units with different direction and scale parameters, the parallel convolution module consists of M weight-sharing convolutional layers, and the spatial transformation pooling module takes the element-wise maximum of the outputs of the parallel convolution module.
the image recognition method comprises the following steps:
step 1: preprocessing a sample image in a sample set, wherein the preprocessing comprises image graying and image space transformation;
step 2: training the Gabor convolutional neural network on the sample set preprocessed in step 1, minimizing its cost function with the SGD algorithm with momentum:

J(θ) = (1/m) Σ_{i=1}^{m} ℓ(h_θ(x^{(i)}), y^{(i)})

wherein: J(θ) is the cost function; θ are the parameters of the Gabor convolutional neural network; m is the number of samples; x^{(i)} is the ith sample; h_θ(x^{(i)}) is the predicted class label of the ith sample; y^{(i)} is the class label of the ith sample; ℓ(·,·) is the classification loss between the predicted and true labels;
and step 3: after the image to be recognized is preprocessed in the step 1, the preprocessed image is input into the Gabor convolutional neural network trained in the step 2, and a recognition result of the image to be recognized is obtained.
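As a sketch of the optimizer described in step 2, the minimal example below applies the SGD-with-momentum update rule (v ← γv − η∇J(θ), θ ← θ + v) to a toy least-squares cost standing in for J(θ). The learning rate, momentum coefficient and toy model are illustrative choices, not values from the patent, and a full-batch gradient is used for simplicity.

```python
import numpy as np

def sgd_momentum(grad_fn, theta, lr=0.1, gamma=0.9, steps=200):
    """SGD-with-momentum update: v <- gamma*v - lr*grad(J), theta <- theta + v."""
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = gamma * v - lr * grad_fn(theta)
        theta = theta + v
    return theta

# Toy stand-in for J(theta): least-squares fit of y = 2x + 1 (not the patent's network).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 2.0 * x + 1.0

def grad(theta):
    w, b = theta
    err = (w * x + b) - y                # h_theta(x^(i)) - y^(i) per sample
    return np.array([np.mean(err * x), np.mean(err)])  # gradient of the mean squared error / 2

theta = sgd_momentum(grad, np.zeros(2))  # converges near (w, b) = (2, 1)
```

The momentum term accumulates past gradients, which speeds up convergence along consistent descent directions; in the patent's setting `grad_fn` would be the backpropagated gradient of the Gabor convolutional neural network's cost.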
Further, Gabor wavelets with the same direction and different scales are selected to construct a Gabor convolutional layer robust to scale transformation, or Gabor wavelets with different directions and different scales are selected to construct a Gabor convolutional layer robust to spatial transformation.
Further, a cross-validation method is used to select the optimal Gabor convolutional neural network, specifically: the sample set is randomly divided into k parts, k−1 parts are used as the training set and 1 part as the validation set; the training and validation sets are rotated k times in turn to train and validate the Gabor convolutional neural network, and the trained network with the smallest validation error is selected and saved as the optimal Gabor convolutional neural network.
Has the beneficial effects that: compared with the prior art, the invention has the following beneficial effects:
the invention provides an image identification method based on a Gabor convolutional neural network, which is characterized in that a Gabor convolutional layer is constructed by designing a Gabor characteristic extraction module, a parallel convolution module and a spatial transform pooling module, and compared with a common convolutional layer, the Gabor convolutional layer firstly extracts multi-direction and multi-scale shallow layer characteristics of an image through Gabor wavelets with different parameters and different characteristics, then learns the deep layer characteristics of the image through the parallel convolution module, and finally obtains the characteristics of spatial transform robustness through spatial change pooling operation. And replacing the common convolutional layer with a Gabor convolutional layer on the basis of the reference convolutional neural network to construct a Gabor convolutional neural network and extract the robust features of the image. Compared with other image identification technologies based on the neural network, the method has the following advantages:
1) The method utilizes the complementarity between the shallow feature and the deep feature, and effectively improves the precision and the robustness of the algorithm;
2) The traditional image processing experience information of the Gabor wavelet is introduced, so that the shallow layer characteristics of the image are effectively extracted, a foundation is laid for the subsequent efficient network learning, and the training efficiency of the network is improved;
3) Constructing a Gabor convolutional layer by using Gabor wavelets, a parallel convolutional module and a spatial pooling module, further replacing the corresponding convolutional layer, and improving the feature extraction capability of the network;
4) The Gabor convolutional neural network is constructed through a Gabor convolutional layer, a pooling layer, a full-connection layer and the like, and the robustness of the algorithm to space transformation such as rotation, scale transformation, translation and the like is effectively improved.
Drawings
FIG. 1 is an overall flow chart of an image recognition method based on a Gabor convolutional neural network provided by the invention;
FIG. 2 shows the selected Gabor wavelets with different parameters: (a) the Gabor wavelet with scale 0 and direction 0; (b) the Gabor wavelet with scale 0 and a second direction; (c) the Gabor wavelet with scale 2 and direction 0; (d) the Gabor wavelet with scale 2 and the same second direction;
FIG. 3 shows the effect of processing an image with Gabor kernels of different parameters: (a) the face image after processing by the Gabor wavelet with scale 0 and direction 0; (b) after the Gabor wavelet with scale 0 and the second direction; (c) after the Gabor wavelet with scale 2 and direction 0; (d) after the Gabor wavelet with scale 2 and the second direction;
FIG. 4 is a structural view of a constructed Gabor convolution layer;
FIG. 5 is a diagram of a constructed Gabor convolutional neural network, in which (a) is a reference four-layer convolutional neural network structure and (b) is a Gabor convolutional neural network structure;
fig. 6 is a diagram showing the processing effect of the MNIST character set, where (a) is the MNIST data set original, and (b) is the processing effect of the Gabor convolutional neural network, and (c) is the processing effect of the general convolutional neural network.
Detailed Description
The invention relates to an image recognition method based on a Gabor convolutional neural network, which performs image recognition with a trained Gabor convolutional neural network. The Gabor convolutional neural network is obtained by replacing at least one convolutional layer of a convolutional neural network (CNN) with a Gabor convolutional layer.
The Gabor convolutional layer consists of a Gabor feature extraction module, a parallel convolution module and a spatial transformation pooling module connected in sequence. The Gabor feature extraction module consists of M Gabor wavelet extraction units with different direction and scale parameters, the parallel convolution module consists of M weight-sharing convolutional layers, and the spatial transformation pooling module takes the element-wise maximum of the outputs of the parallel convolution module.
The image recognition method comprises the following steps:
step 1: preprocessing a sample image in a sample set, wherein the preprocessing comprises image graying and image space transformation;
step 2: training the Gabor convolutional neural network on the sample set preprocessed in step 1, minimizing its cost function with the SGD algorithm with momentum:

J(θ) = (1/m) Σ_{i=1}^{m} ℓ(h_θ(x^{(i)}), y^{(i)})

wherein: J(θ) is the cost function; θ are the parameters of the Gabor convolutional neural network; m is the number of samples; x^{(i)} is the ith sample; h_θ(x^{(i)}) is the predicted class label of the ith sample; y^{(i)} is the class label of the ith sample; ℓ(·,·) is the classification loss between the predicted and true labels;
and step 3: after the image to be recognized is preprocessed in the step 1, the preprocessed image is input into the Gabor convolutional neural network trained in the step 2, and a recognition result of the image to be recognized is obtained.
The technical solution of the present invention is further illustrated by the following specific examples:
the first step, the Gabor convolution layer construction phase, includes:
step 1: gabor wavelets with different parameters are selected to construct a Gabor feature extraction module, gabor features in different directions and different scales of the image can be extracted by selecting Gabor wavelets with different parameters in M directions and scales, and the method specifically comprises the following steps:
M=U×V
wherein: u is the number of the directions of the Gabor wavelet; v is the scale number of Gabor wavelets;
step 2: building a parallel convolution module through the convolution layers shared by the M weights, and further learning the shallow layer features extracted by the Gabor feature extraction module;
and step 3: designing a spatial transformation pooling module by performing element-wise maximum value operation on the output of the parallel convolution module, and reducing characteristic dimensions;
and 4, step 4: and (4) sequentially connecting the Gabor feature extraction module, the parallel convolution module and the spatial transformation pooling module designed in the step (1-3), and constructing a corresponding Gabor convolution layer according to the robustness requirements of different identification tasks on spatial transformation.
The second step, the Gabor convolutional neural network construction stage, includes:
Step 5: select a reference network for building the Gabor convolutional neural network according to the difficulty of the recognition task;
Step 6: determine the scheme for replacing ordinary convolutional layers in the reference network with Gabor convolutional layers;
The third step, the offline training stage, includes:
Step 7: preprocess the two-dimensional images, including image graying and image spatial transformation;
Step 8: after all samples in the sample set have been processed as in step 7, extract the depth features of the images with the Gabor convolutional neural network constructed in steps 5-6, and obtain feature vectors through the fully connected layer;
and step 9: the method for training the Gabor convolutional neural network by utilizing the minimum cost function of the SGD algorithm with momentum specifically comprises the following steps:
Figure RE-GDA0002484003440000042
wherein: j (theta) is a cost function; theta is a parameter of the Gabor convolution neural network; m is the number of samples; x is the number of (i) Represents the ith sample; h is θ (x (i) ) A prediction class label representing the ith sample; y is (i) Is the category label of the ith sample.
The fourth step, the online testing stage:
Step 10: after all samples in the test set have been processed as in step 7, input them into the model trained in the offline training stage to obtain the class of each image to be tested;
Further, step 1 specifically includes the following steps:
Step 1.1: select M Gabor wavelets with different parameters to construct the Gabor feature extraction module, specifically:

g_{μ,ν}(z) = (‖k_{μ,ν}‖² / σ²) · exp(−‖k_{μ,ν}‖² ‖z‖² / (2σ²)) · [exp(i k_{μ,ν}·z) − exp(−σ²/2)]

k_{μ,ν} = k_ν e^{iφ_μ},  k_ν = k_max / f^ν,  φ_μ = πμ/U

G = {g_{μ,ν}(z) : μ ∈ {0,1,...,U−1}, ν ∈ {0,1,...,V−1}}

wherein: z = (x, y) are the spatial position coordinates; the parameter σ = 2π; ‖·‖ denotes the norm operator; k_max is the maximum frequency; f is the spacing factor between kernels in the frequency domain; μ and ν respectively denote the direction and scale of the Gabor kernel, μ ∈ {0,1,...,U−1}, ν ∈ {0,1,...,V−1}; G is the set of selected Gabor kernels with different parameters; U and V are respectively the numbers of directions and scales of the Gabor kernels;
Step 1.2: use the Gabor feature extraction module to extract the Gabor features F_gab of the image (or feature map) in different directions and at different scales, specifically:

O_{μ,ν}(z) = I(z) * g_{μ,ν}(z)

F_gab^{(j)} = F_in^{(j)} * G

wherein: I(z) is the image or feature map; O_{μ,ν}(z) is the convolution of the image with the Gabor kernel of direction μ and scale ν; * denotes convolution; G is the set of M selected Gabor kernels; F_in^{(j)} is the jth channel of the input feature map F_in; F_gab^{(j)} is the jth channel of the Gabor feature F_gab after Gabor kernel convolution. The output F_gab of the Gabor feature extraction module can be regarded as M Gabor features of the input feature map F_in.
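The Gabor kernel bank of step 1.1 can be sketched as follows, using the usual parameterization k_{μ,ν} = k_ν e^{iφ_μ} with k_ν = k_max/f^ν and φ_μ = πμ/U. The grid size and the values of k_max and f are illustrative assumptions; the patent does not fix them.

```python
import numpy as np

def gabor_kernel(mu, nu, U=8, size=11, sigma=2 * np.pi, k_max=np.pi / 2, f=np.sqrt(2)):
    """Sample g_{mu,nu}(z) on a size x size grid.

    k_{mu,nu} = k_nu * e^{i phi_mu}, k_nu = k_max / f**nu, phi_mu = pi*mu/U.
    size, k_max and f are illustrative defaults, not values from the patent.
    """
    k = (k_max / f ** nu) * np.exp(1j * np.pi * mu / U)
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    z = xs + 1j * ys                      # point z = (x, y) encoded as a complex number
    kz = (k * np.conj(z)).real            # dot product k_{mu,nu} . z
    norm_k2 = np.abs(k) ** 2
    envelope = (norm_k2 / sigma ** 2) * np.exp(-norm_k2 * np.abs(z) ** 2 / (2 * sigma ** 2))
    return envelope * (np.exp(1j * kz) - np.exp(-sigma ** 2 / 2))

# Feature extraction module G: M = U * V kernels (here U = 8 directions, V = 1 scale).
U, V = 8, 1
G = np.stack([gabor_kernel(mu, nu, U=U) for nu in range(V) for mu in range(U)])
```

Convolving an input channel with each of the M kernels in `G` (e.g. via FFT or a deep-learning framework's fixed-weight convolution) yields the M Gabor features F_gab of step 1.2.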
Further, step 2 specifically includes the following steps:
Step 2.1: select M weight-sharing convolutional layers to construct the parallel convolution module, ensuring that each Gabor feature can be further learned into more expressive deep features;
Step 2.2: use the weight-sharing parallel convolution module to further learn the extracted Gabor features and obtain the image depth features F_conv, specifically:

F_conv^{(j)} = F_gab^{(j)} * C

wherein: F_gab^{(j)} is the jth Gabor feature; C is the learnable convolution kernel (the M convolutional layers share its weights); F_conv^{(j)} is the output feature map of the jth convolutional layer;
Step 2.3: the output of the parallel convolution module is the collection of the outputs of the M convolutional layers, specifically:

F_conv = {F_conv^{(1)}, F_conv^{(2)}, ..., F_conv^{(M)}}

wherein: F_conv is the output feature map of the parallel convolution module.
Further, step 3 specifically includes the following steps:
Step 3.1: construct the spatial transformation pooling module by taking the element-wise maximum;
Step 3.2: input the output features of the parallel convolution module into the spatial transformation pooling module and fuse them to obtain the robust features F_out, specifically:

F_out = max(F_conv)

wherein: F_out is the output feature map of the spatial transformation pooling module, i.e. the output feature map of the whole Gabor convolutional layer; the maximum is taken element-wise over the M feature maps in F_conv.
Further, step 4 specifically includes the following steps:
Step 4.1: select Gabor wavelets with different parameters to construct the Gabor feature extraction module according to the spatial-transformation robustness requirements of the recognition task;
Step 4.2: connect the Gabor feature extraction module, parallel convolution module and spatial transformation pooling module in sequence to construct the corresponding Gabor convolutional layer: select Gabor wavelets with the same direction and different scales to construct a Gabor convolutional layer robust to scale transformation, or Gabor wavelets with different directions and different scales to construct a Gabor convolutional layer robust to spatial transformation.
Further, the step 5 specifically includes the following steps:
step 5.1: selecting a reference network structure for building a Gabor convolutional neural network according to the difficulty degree of an identification task, selecting an AlexNet network structure for a simple identification task, selecting a ResNet network structure for a more complex identification task, and selecting a LightCNN network structure for a face identification task;
step 5.2: and correspondingly adjusting the number of convolution layers, pooling layers and full-connection layers and adjusting the size of a convolution kernel on the basis of the reference network structure.
Further, step 6 specifically includes the following steps:
Step 6.1: determine the replacement scheme for the Gabor convolutional layers, i.e. replace only the first convolutional layer, the first two convolutional layers, or all convolutional layers of the network;
Step 6.2: adjust the number of convolution kernels of the Gabor convolutional layers to reduce the number of network parameters and improve subsequent training efficiency.
Further, step 7 specifically includes the following steps:
Step 7.1: gray the image to remove redundant information, converting the color image into a grayscale image, specifically:

Gray = 0.299·R + 0.587·G + 0.114·B

wherein: R, G and B are the red, green and blue channel components of the image; Gray is the grayscale image;
Step 7.2: apply spatial transformations to the grayscale image, including translation, rotation and scale change, specifically:

[x′, y′, 1]ᵀ = [[1, 0, d_x], [0, 1, d_y], [0, 0, 1]] · [x, y, 1]ᵀ

wherein: (x, y, 1) is the homogeneous coordinate before translation; (x′, y′, 1) is the coordinate after translation; d_x and d_y are the translations in pixels along the x and y axes;

[x″, y″, 1]ᵀ = [[s_x, 0, 0], [0, s_y, 0], [0, 0, 1]] · [x, y, 1]ᵀ

wherein: (x″, y″, 1) is the coordinate after scaling; s_x and s_y are the scale factors along the x and y axes;

[x‴, y‴, 1]ᵀ = [[cos θ, −sin θ, 0], [sin θ, cos θ, 0], [0, 0, 1]] · [x, y, 1]ᵀ

wherein: (x‴, y‴, 1) is the coordinate after rotation and θ is the rotation angle.
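The preprocessing of step 7 — graying with the luminance weights and the three homogeneous-coordinate transforms — can be sketched as follows. This is a point-wise sketch; resampling a whole image grid through these matrices is omitted.

```python
import numpy as np

def to_gray(rgb):
    """Gray = 0.299*R + 0.587*G + 0.114*B on an (H, W, 3) image."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def translate(d_x, d_y):
    return np.array([[1.0, 0.0, d_x], [0.0, 1.0, d_y], [0.0, 0.0, 1.0]])

def scale(s_x, s_y):
    return np.array([[s_x, 0.0, 0.0], [0.0, s_y, 0.0], [0.0, 0.0, 1.0]])

def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# A point in homogeneous coordinates: rotate 90 degrees, then translate by (1, 2).
p = np.array([1.0, 0.0, 1.0])
p2 = translate(1, 2) @ rotate(np.pi / 2) @ p   # approximately (1, 3, 1)
```

Because all three transforms are 3x3 matrices acting on homogeneous coordinates, any training-time augmentation can be composed into a single matrix product before it is applied.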
Further, the specific method of step 8 is: after preprocessing all samples in the training set as in step 7, input them into the Gabor convolutional neural network built from Gabor convolutional layers, ordinary convolutional layers, pooling layers, BN layers, nonlinear layers, fully connected layers, etc., for training.
Further, step 9 specifically includes the following steps:
Step 9.1: iterate the designed network with the SGD optimization algorithm with momentum, gradually adjusting the learning rate during training;
Step 9.2: select the optimal model by cross-validation, specifically: randomly divide the sample set into k parts, use k−1 parts as the training set and 1 part as the validation set, rotate the training and validation sets k times in turn, and select and save the model with the smallest validation error as the optimal model.
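The k-fold rotation of step 9.2 can be sketched as below; `train_and_eval` is hypothetical and replaced by a stand-in value, since only the splitting and selection logic is being illustrated.

```python
import numpy as np

def kfold_splits(n_samples, k, seed=0):
    """Randomly divide n_samples indices into k folds; yield (train_idx, val_idx)
    with each fold serving once as the validation set."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

# Select the rotation with the smallest validation error.
# train_and_eval(train_idx, val_idx) is hypothetical; a stand-in "error" is used here.
val_errors = []
for train_idx, val_idx in kfold_splits(20, k=5):
    val_errors.append(float(np.mean(val_idx)))   # stand-in for the validation error
best = int(np.argmin(val_errors))                # index of the model to keep
```

In the patent's procedure each rotation trains a full Gabor convolutional neural network, and the checkpoint with the smallest validation error is the one saved.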
Further, step 10 specifically includes the following steps:
Step 10.1: apply the image graying and random spatial transformation of step 7 to the samples in the test set;
Step 10.2: input the processed test-set images into the model trained in the offline training stage to obtain the class of each image to be tested.
The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following detailed description is only illustrative of the present invention and is not intended to limit the scope of the present invention, but only represents selected examples of the present invention.
Example 1:
the experimental platform is an Intel (R) Core (TM) i7-8700K processor and a 16GB memory, the display card is NVIDIA GeForce GTX1080Ti, and the deep learning frame is Pyorch. The experimental data were from the MNIST dataset of the national institute of standards and technology, and the training set contained 60000 handwritten numbers collected from 250 different people, 50% from high school students and 50% from the staff of the census. The test set is also handwritten digital data of the same scale.
In the experiment, 50000 samples are randomly selected from the MNIST dataset as the training set and the remaining 10000 samples as the validation set; the average of 5 runs on the test set is taken as the final result. Gabor kernels with the same scale and different directions are selected to construct the corresponding Gabor convolutional layers. The recognition rates for different numbers of Gabor kernel directions differ slightly; the Rank-1 recognition rates are shown in Table 1:

TABLE 1 Rank-1 recognition rates (%) on the MNIST dataset

Number of directions    n=4     n=8     n=12    n=16
Recognition rate        99.43   99.51   99.50   99.45
This embodiment provides an image recognition method based on a Gabor convolutional neural network; its flow is shown in FIG. 1 and specifically comprises the following steps:
the first step, the Gabor convolution layer construction phase, includes:
Step 1: select M Gabor wavelets with different parameters to construct the Gabor feature extraction module G, choosing Gabor wavelets with 4, 8, 12 and 16 directions respectively, specifically:

g_{μ,ν}(z) = (‖k_{μ,ν}‖² / σ²) · exp(−‖k_{μ,ν}‖² ‖z‖² / (2σ²)) · [exp(i k_{μ,ν}·z) − exp(−σ²/2)]

k_{μ,ν} = k_ν e^{iφ_μ},  k_ν = k_max / f^ν,  φ_μ = πμ/U

G = {g_{μ,ν}(z) : μ ∈ {0,1,...,U−1}, ν ∈ {0,1,...,V−1}}

wherein: z = (x, y) are the spatial position coordinates; the parameter σ = 2π; ‖·‖ denotes the norm operator; k_max is the maximum frequency; f is the spacing factor between kernels in the frequency domain; μ and ν respectively denote the direction and scale of the Gabor kernel; G is the set of selected Gabor kernels with different parameters; U and V are respectively the numbers of directions and scales of the Gabor kernels; U ∈ {4, 8, 12, 16}, V = 1, M = U × V.
Step 2: and (3) building a parallel convolution module through convolution layers shared by M weights so as to ensure that each Gabor characteristic can be further learned to obtain a more expressive deep characteristic.
Step 3: design the spatial transformation pooling module by taking the element-wise maximum, reducing the feature dimension.
And 4, step 4: sequentially connecting the Gabor feature extraction module, the parallel convolution module and the space transformation pooling module designed in the step 1-3 to construct a corresponding Gabor convolution layer, and specifically comprising the following steps:
step 4.1: and sequentially connecting the Gabor feature extraction module, the parallel convolution module and the space transformation pooling module to construct a Gabor convolution layer which is robust to rotation transformation.
Step 4.2: gabor feature extraction module in Gabor convolution layer extracts Gabor features F in different directions of image or feature mapping by using characteristics of Gabor wavelet gab The method specifically comprises the following steps:
O μ,ν (z)=I(z)*g μ,ν (z)
Figure RE-GDA0002484003440000091
wherein: i (z) is an image or feature map; o is μ,ν (z) is the convolution result of the image and the Gabor kernel in the mu direction and the v scale; denotes convolution; g is selected Gabor core with M different parameters;
Figure RE-GDA0002484003440000092
input feature representation F in The jth channel of (1);
Figure RE-GDA0002484003440000093
representing the feature F after Gabor kernel convolution gab The jth channel of (1); output F of Gabor feature extraction module gab Can be regarded as an input characteristic diagram F in M Gabor features. FIG. 2 is Gabor wavelets in different directions and different scales; fig. 3 is a diagram of the effect of Gabor wavelet processing with different parameters.
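As an illustration (not the patent's implementation), the extraction F_gab = F_in * G can be sketched with a naive "same"-padded convolution in NumPy; odd kernel sizes are assumed:

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same'-padded 2-D convolution: O(z) = I(z) * g(z)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    flipped = kernel[::-1, ::-1]            # true convolution flips the kernel
    out = np.empty(img.shape, dtype=np.result_type(img, kernel))
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def gabor_features(feature_map, bank):
    """F_gab for one input channel: its M responses to every kernel in G."""
    return np.stack([conv2d_same(feature_map, g) for g in bank])
```

Stacking over input channels j then gives the full M-fold expansion of F_in described above.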
Step 4.3: the weight-sharing parallel convolution module in the Gabor convolution layer further learns the extracted Gabor features to obtain the image depth feature F_{conv}, specifically:

F_{conv}^{j} = F_{gab}^{j} * C

wherein: F_{gab}^{j} denotes the jth Gabor feature; C denotes a learnable convolution kernel (the M convolution layers share the same weights); F_{conv}^{j} is the output feature map of the jth convolution layer.

The output of the parallel convolution module is the collection of the outputs of the M convolution layers, specifically:

F_{conv} = \{F_{conv}^{j} : j = 1, 2, \ldots, M\}

wherein: F_{conv} denotes the output feature map of the parallel convolution module.
Step 4.4: the spatial transformation pooling module in the Gabor convolution layer fuses the output of the parallel convolution module to obtain the robust feature F_{out}, specifically:

F_{out} = \max(F_{conv})

wherein: F_{out} denotes the output feature map of the spatial transformation pooling module, i.e. the output feature map of the whole Gabor convolution layer, obtained by taking the element-wise maximum over the M branches of F_{conv}. This completes the construction of the Gabor convolution layer, whose structure is shown in fig. 4.
The second step, the Gabor convolutional neural network building phase, includes:
Step 5: according to the difficulty of the recognition task, select a reference network for building the Gabor convolutional neural network, specifically:
Step 5.1: select the reference network structure according to the difficulty of the recognition task: an AlexNet structure for a simple recognition task, a ResNet structure for a more complex recognition task, and a LightCNN structure for a face recognition task. Here the AlexNet network structure is chosen.
Step 5.2: the structure of the reference four-layer convolutional neural network is shown in (a) in fig. 5.
Step 6: determine the scheme for replacing ordinary convolution layers in the reference network with Gabor convolution layers;
Step 6.1: determine the positions of the Gabor convolution layers, i.e. whether to replace only the first convolution layer, the first two convolution layers, or all convolution layers of the network. Considering that the Gabor wavelet resembles the receptive fields of mammalian visual cells, and thus the first convolution layer of a neural network, only the first convolution layer of the network is replaced here.
Step 6.2: adjust the number of convolution kernels of the Gabor convolution layer to reduce the number of network model parameters and improve subsequent training efficiency. The structure of the constructed Gabor convolutional neural network is shown in (b) in fig. 5.
The third step, the offline training phase, includes:
and 7: preprocessing a two-dimensional image, including image graying and image space transformation, specifically comprising:
step 7.1: carrying out image graying to eliminate redundant information and converting a color image into a grayscale image, specifically comprising the following steps:
Gray=0.299*R+0.578*G+0.114*B
wherein: r is an image red channel component; g is the image green channel component; b is the image blue channel component; gray is an image Gray level image;
step 7.2: apply spatial transformations to the grayscale image, including image translation, rotation and scale change, specifically:

(x', y', 1) = (x, y, 1) \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ d_x & d_y & 1 \end{pmatrix}

wherein: (x, y, 1) is the homogeneous coordinate before translation; (x', y', 1) is the coordinate after translation; d_x and d_y are respectively the translation amounts, in pixels, of the image along the x and y axes;

(x'', y'', 1) = (x, y, 1) \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix}

wherein: (x'', y'', 1) is the coordinate after scaling; s_x and s_y are respectively the scale factors of the image along the x and y axes;

(x''', y''', 1) = (x, y, 1) \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}

wherein: (x''', y''', 1) is the coordinate after rotation, and \theta is the rotation angle.
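The three homogeneous transforms of step 7.2 can be written directly as 3×3 matrices acting on row vectors (x, y, 1); a NumPy sketch under that row-vector convention (function names are ours):

```python
import numpy as np

def translate(dx, dy):
    """(x', y', 1) = (x, y, 1) @ T shifts by (dx, dy)."""
    return np.array([[1, 0, 0], [0, 1, 0], [dx, dy, 1]], dtype=float)

def scale(sx, sy):
    """Axis-wise scaling by factors (sx, sy)."""
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def rotate(theta):
    """Rotation about the origin by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)
```

Because the point is a row vector on the left, a pipeline composes left to right, e.g. `p @ rotate(t) @ translate(dx, dy)` rotates first and then shifts.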
Step 8: after all samples in the training set are processed as in step 7, extract the depth features of the image through the Gabor convolutional neural network built in steps 5-6 (composed of Gabor convolution layers, ordinary convolution layers, pooling layers, BN layers, nonlinear layers, etc.), and obtain the feature vector from the fully connected layer.
Step 9: train the Gabor convolutional neural network by minimizing the cost function with the SGD algorithm with momentum, specifically:
Step 9.1: select the SGD optimization algorithm with momentum to iterate on the designed network structure, with the batch size set to 128, the dropout ratio of the fully connected layer set to 0.5, the weight decay set to 0.00005 and the initial learning rate set to 0.001; the learning rate is adjusted gradually during training, being reduced to one tenth of its previous value every 25 iterations. The loss function adopts the cross entropy, defined as:
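A minimal sketch of the step-9.1 update rule and learning-rate schedule, assuming the usual momentum formulation (the momentum coefficient 0.9 is an assumption; batch size, dropout and framework details are omitted):

```python
import numpy as np

def sgd_momentum_step(theta, grad, velocity, lr, momentum=0.9, weight_decay=5e-5):
    """One SGD-with-momentum update; weight decay 0.00005 as in step 9.1."""
    velocity = momentum * velocity - lr * (grad + weight_decay * theta)
    return theta + velocity, velocity

def step_lr(initial_lr=0.001, iteration=0, drop_every=25, factor=0.1):
    """Learning rate reduced to one tenth of its previous value every 25 iterations."""
    return initial_lr * factor ** (iteration // drop_every)
```

In a real run the velocity is kept per parameter tensor and `step_lr` is queried once per iteration (or epoch) before each update.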
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)})

wherein: J(\theta) is the cost function; \theta denotes the parameters of the Gabor convolutional neural network; m is the number of samples; x^{(i)} denotes the ith sample; h_\theta(x^{(i)}) denotes the predicted class label of the ith sample; y^{(i)} is the class label of the ith sample;
step 9.2: select the optimal model by cross validation, specifically: randomly divide the sample set into 5 parts, use 4 parts as the training set and 1 part as the validation set, rotate the training and validation sets 5 times in turn, and select and save the model with the smallest validation error as the optimal model.
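The 5-fold rotation of step 9.2 can be sketched as an index generator (the seed and names are illustrative):

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Random k-fold split: yields (train, validation) index arrays,
    each fold serving exactly once as the validation set."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```

Training once per split and keeping the model with the lowest validation error implements the selection described above.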
The fourth step, the online testing phase, includes:
Step 10: after all samples in the test set are processed as in step 7, input them into the model trained in the offline training phase to obtain the category of the image to be tested;
Step 10.1: perform image graying and random spatial transformation on the samples in the test set as in step 7;
Step 10.2: input the processed test set pictures into the model trained in the offline training phase to obtain the category of the image to be tested.
In this embodiment, on the MNIST data set of the U.S. National Institute of Standards and Technology, the proposed image recognition method based on the Gabor convolutional neural network uses the Gabor wavelet to fully extract the shallow features of the image while further extracting deep robust features with the convolutional neural network. Shallow and deep features are thereby effectively combined, the robustness of the neural network to image spatial transformations is improved, and the accuracy, speed and robustness of the algorithm are balanced, meeting the application requirements of several fields.
The technical means disclosed in the scheme of the present invention are not limited to those disclosed in the above embodiments, but also include technical solutions formed by any combination of the above technical features. It should be noted that modifications and adaptations can be made by those skilled in the art without departing from the principles of the present invention, and such modifications are also to be regarded as within the scope of the present invention.

Claims (3)

1. An image recognition method based on a Gabor convolutional neural network, characterized in that: the Gabor convolutional neural network is obtained by replacing at least one convolution layer of a convolutional neural network (CNN) with a Gabor convolution layer, and the Gabor convolution layer is composed of a Gabor feature extraction module, a parallel convolution module and a spatial transformation pooling module which are connected in sequence; the Gabor feature extraction module consists of M Gabor wavelets with different direction and scale parameters, the parallel convolution module consists of M weight-sharing convolution layers, and the spatial transformation pooling module takes the element-wise maximum of the output of the parallel convolution module;
the image recognition method comprises the following steps:
step 1: preprocessing a sample image in a sample set, wherein the preprocessing comprises image graying and image space transformation;
step 2: training the Gabor convolutional neural network on the sample set preprocessed in step 1, minimizing its cost function with the SGD algorithm with momentum:

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)})

wherein: J(\theta) is the cost function; \theta denotes the parameters of the Gabor convolutional neural network; m is the number of samples; x^{(i)} denotes the ith sample; h_\theta(x^{(i)}) denotes the predicted class label of the ith sample; y^{(i)} is the class label of the ith sample;
and step 3: after the image to be recognized is preprocessed in the step 1, the preprocessed image is input into the Gabor convolutional neural network trained in the step 2, and a recognition result of the image to be recognized is obtained.
2. The image recognition method based on the Gabor convolutional neural network of claim 1, wherein Gabor wavelets with the same direction and different scales are selected to construct a Gabor convolution layer robust to scale transformation, or Gabor wavelets with different directions and the same scale are selected to construct a Gabor convolution layer robust to rotation transformation, or Gabor wavelets with different directions and different scales are selected to construct a Gabor convolution layer robust to spatial transformation.
3. The image recognition method based on the Gabor convolutional neural network of claim 1, wherein the optimal Gabor convolutional neural network is selected by cross validation in step 2, specifically: randomly divide the sample set into k parts, take k-1 parts as the training set and 1 part as the validation set, rotate the training and validation sets k times in turn to train and validate the Gabor convolutional neural network respectively, and select and save the trained network with the smallest validation error as the optimal Gabor convolutional neural network.
CN202010134463.2A 2020-03-02 2020-03-02 Image identification method based on Gabor convolution neural network Active CN111401156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010134463.2A CN111401156B (en) 2020-03-02 2020-03-02 Image identification method based on Gabor convolution neural network


Publications (2)

Publication Number Publication Date
CN111401156A CN111401156A (en) 2020-07-10
CN111401156B true CN111401156B (en) 2022-11-18


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101284A (en) * 2020-09-25 2020-12-18 北京百度网讯科技有限公司 Image recognition method, training method, device and system of image recognition model
CN112487909A (en) * 2020-11-24 2021-03-12 江苏科技大学 Fruit variety identification method based on parallel convolutional neural network
CN112633099B (en) * 2020-12-15 2023-06-20 中国人民解放军战略支援部队信息工程大学 Gaborne-based brain low-level vision zone signal processing method and system
CN114612987A (en) * 2022-03-17 2022-06-10 深圳集智数字科技有限公司 Expression recognition method and device
CN116562358B (en) * 2023-03-16 2024-01-09 中国人民解放军战略支援部队航天工程大学士官学校 Construction method of image processing Gabor kernel convolutional neural network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563999A * 2017-09-05 2018-01-09 Huazhong University of Science and Technology A chip defect recognition method based on convolutional neural networks

Non-Patent Citations (2)

Title
Improved algorithm of convolutional neural network based on Gabor kernel and its application; Yang Jingming et al.; Journal of Yanshan University; September 2018 (No. 05); full text *
Convolutional neural network model based on local features; Shi En et al.; Computer Engineering; February 15, 2018 (No. 02); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant