CN112819096B - Construction method of fossil image classification model based on composite convolutional neural network


Info

Publication number
CN112819096B
Authority
CN
China
Prior art keywords
image
fossil
fossil image
convolution
image classification
Prior art date
Legal status
Active
Application number
CN202110219351.1A
Other languages
Chinese (zh)
Other versions
CN112819096A (en)
Inventor
Zhang Lei (张蕾)
Wang Xiaoyu (王晓宇)
Luo Jie (罗杰)
Bu Qirong (卜起荣)
Feng Jun (冯筠)
Current Assignee
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY
Priority to CN202110219351.1A
Publication of CN112819096A
Application granted
Publication of CN112819096B


Classifications

    • G06F18/24 Classification techniques
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/253 Fusion techniques of extracted features
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Abstract

The invention discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, comprising the following steps. S1: process an original fossil image to obtain a gradient image, and construct a fossil image feature extraction model. S2: the fossil image feature extraction model extracts features from the original fossil image to obtain a depth feature map and from the gradient image to obtain a primary visual feature map; the two maps are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, from which a primary fossil image classification model is constructed. S3: train the primary fossil image classification model. The method extracts the depth features of the original fossil image and the primary visual features of the corresponding gradient image separately, and feature fusion further improves the accuracy of the fossil image classification task.

Description

Construction method of fossil image classification model based on composite convolutional neural network
Technical Field
The invention relates to the technical field of image processing, in particular to a method for constructing a fossil image classification model based on a composite convolutional neural network.
Background
In conventional microfossil classification, the work is usually done by hand; because individual microfossils are too small to be found or classified with the naked eye, they are sorted manually under a microscope. As archaeological excavation and related work progress, new microfossil species are continually discovered, so the total number of sample species grows and the efficiency of manual sorting and picking falls. The whole process is monotonous and tedious, the probability of error rises over time, and long periods of high-intensity observation seriously harm the eyesight and general health of observers.
With the development of artificial intelligence, deep learning research has advanced greatly, plays an increasingly important role in everyday work, and is ever more effective in image classification tasks. Deep learning methods can learn from a small amount of sample data and automatically acquire, through the network model, the features best suited to classification, without manual feature selection, while achieving high classification accuracy. Classifying microfossil images with a convolutional neural network saves a great deal of human effort while ensuring high classification accuracy and very high working efficiency, providing a useful reference for archaeological work.
Existing research falls into two main categories: methods based on image features and methods based on deep learning. Traditional fossil image classification methods based on image features require complex feature extraction and feature optimization steps and have high algorithmic complexity. Existing deep-learning-based fossil image models extract features only from the original image, do not enhance primary visual features from the perspective of image gradient change, use deep network stacks, and easily overfit the fossil image classification task.
Disclosure of Invention
The invention aims to provide a method for constructing a fossil image classification model based on a composite convolutional neural network, addressing the problems of high algorithm complexity, slow training, susceptibility to overfitting, and low detection accuracy in fossil image classification tasks.
To achieve this aim, the invention adopts the following technical scheme:
A method for constructing a fossil image classification model based on a composite convolutional neural network, comprising the following steps:
s1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
s2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain a depth feature map and on the gradient image to obtain a primary visual feature map; the depth feature map and the primary visual feature map are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, from which a primary fossil image classification model is constructed;
s3: training a primary fossil image classification model.
Optionally, the step S2 further includes:
s21: the depth feature map and the primary visual feature map are fused from the channel dimension to obtain a fused feature map;
s22: inputting the fusion feature map into a global average pooling layer for outputting to obtain a feature vector;
s23: the feature vector is convolved and classified by the fully connected layer to obtain image category probability values, and the primary fossil image classification model is constructed from these probability values.
Optionally, processing the original fossil image by using a Canny operator to obtain a gradient image;
the classification of the fully connected layer is performed by a Softmax classifier.
Optionally, the step S3 further includes:
s31: constructing a target loss function;
s32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
s33: for the original fossil images in the dataset to be processed, randomly selecting 80% as training set images and 10% as validation set images;
s34: the original training set images and the preprocessed training set images are input into the fossil image classification model, and a batch gradient descent training method is used to minimize the target loss function value, giving the weights of the trained fossil image classification network;
s35: feeding back the target loss function value on the preprocessed validation set images: if the current value is smaller than the best value obtained so far, updating the stored weights of the fossil image classification network; otherwise keeping the stored weights.
Optionally, the step S31 further includes:
the target loss function L is constructed as:
L = -θα(1-b)^γ log(b) - (1-θ)log(b)
where α ∈ [0,1] is a weight factor used to adjust and balance the importance of samples from different classes; γ > 0 is a modulation factor used to adjust the weight of easily classified samples; θ is a hyper-parameter used to adjust the weights of the two loss terms; and b ∈ [0,1] denotes the model's estimated probability for the sample's true label.
Optionally, the method for constructing the fossil image feature extraction model includes:
s11: establishing a stackable composite convolution residual block, and setting its number of base channels and the dilation rate of its dilated convolution;
s12: stacking the composite convolution residual blocks to form neural architectures of different depths, where the depth is determined by the image dataset of the task at hand;
the composite convolution residual block established in s11 comprises:
a first layer: a convolution kernel of size 1×1, which reduces the computation parameters of the model;
a second layer: a 3×3 convolution operation combining conventional convolution with dilated convolution; the conventional convolution captures continuous structural dependency features while the dilated convolution captures structural dependencies over longer distances;
a third layer: 1×1 convolution kernels that restore the number of feature maps.
Optionally, the number of base channels increases linearly with the depth of the neural architecture.
Optionally, the fossil image feature extraction model has a structure shown in table 1:
TABLE 1 fossil image feature extraction model network structure
A fossil image classification method, comprising: preprocessing a fossil image to be detected to obtain a gradient image;
inputting the fossil image to be detected and its corresponding gradient image into the fossil image classification model constructed by the above method for constructing a fossil image classification model based on a composite convolutional neural network; the model predicts the category of the fossil image, obtaining the class to which it belongs.
A computer-readable storage medium storing computer instructions for causing a computer to perform the method of constructing a fossil image classification model based on a composite convolutional neural network according to the invention.
Compared with the prior art, the invention has the following technical characteristics:
(1) In constructing the feature extraction network, the invention provides a stackable composite convolution residual block that can be stacked into neural architectures of different depths, where the specific depth is determined by the image dataset of the task. The block combines conventional convolution, which captures continuous structural dependency features, with dilated convolution, which captures structural dependencies over longer distances, enlarging the receptive field without increasing the number of parameters.
(2) In constructing the fossil classification model, the depth features of the original fossil image and the primary visual features of the fossil gradient image (such as the structural components and edge textures of the image) are extracted separately, and fusion of these features further improves the accuracy of the fossil image classification task.
(3) Because the classification loss functions used by existing fossil image classification models neither handle class-number imbalance well nor alleviate overfitting, the invention constructs a new target loss function that makes the model pay more attention to hard-to-classify samples while reducing the loss contribution of easy samples, effectively improving the accuracy of the fossil image classification model.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:
FIG. 1 is a flow chart of a method for constructing a fossil image classification model based on a composite convolutional neural network;
FIG. 2 is a flow chart of a method for constructing a fossil image feature extraction model according to the present invention;
FIG. 3 is a schematic diagram of the structure of a composite convolution residual block of the present invention;
FIG. 4 is a schematic diagram of the fossil image feature extraction model of the present invention extracting depth features from an original fossil image.
Detailed Description
For a clearer understanding of technical features, objects, and effects of the present invention, a specific embodiment of the present invention will be described with reference to the accompanying drawings.
Referring to fig. 1, 2 and 3, the method for constructing the fossil image classification model based on the composite convolutional neural network comprises the following steps:
s1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model; the original fossil image used in the invention is in JPEG format, and the gradient image is the image obtained by applying the Canny operator to the grayscale version of the original fossil image.
S2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain depth features and on the gradient image to obtain primary visual features; the depth features and the primary visual features are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image class probability values, from which a primary fossil image classification model is constructed. The primary visual features are features such as color, edge texture, and structural relations extracted from the gradient image corresponding to the original fossil image; the depth features are abstract semantic features extracted from the original fossil image, and the depth features extracted at each stage of the fossil image feature extraction model are visualized in FIG. 4.
S3: training the primary fossil image classification model; the training process includes training the fossil image feature extraction model, which is embedded in the fossil image classification model, yielding the final fossil image classification model.
In an embodiment of the present disclosure, S2 further includes:
s21: the depth feature map and the primary visual feature map are fused from the channel dimension to obtain a fused feature map;
s22: inputting the fusion feature map into a global average pooling layer for outputting to obtain a feature vector;
s23: the feature vector is convolved and classified by the fully connected layer to obtain image category probability values, and the primary fossil image classification model is constructed from these probability values.
Specifically, the Canny operator is used to process the original fossil image to obtain the gradient image; the classification of the fully connected layer is performed by a Softmax classifier.
In an embodiment of the present disclosure, S3 further includes:
s31: constructing a target loss function;
s32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
s33: for the original fossil images in the dataset to be processed, randomly selecting 80% as training set images and 10% as validation set images;
s34: the original training set images and the preprocessed training set images are input into the fossil image classification model, and a batch gradient descent training method is used to minimize the target loss function value, giving the weights of the trained fossil image classification network;
s35: feeding back the target loss function value on the preprocessed validation set images: if the current value is smaller than the best value obtained so far, updating the stored weights of the fossil image classification network; otherwise keeping the stored weights.
Specifically, the target loss function L is expressed as:
L = -θα(1-b)^γ log(b) - (1-θ)log(b)
where α ∈ [0,1] is a weight factor used to adjust and balance the importance of samples from different classes; γ > 0 is a modulation factor used to adjust the weight of easily classified samples; θ is a hyper-parameter used to adjust the weights of the two loss terms; and b ∈ [0,1] denotes the model's estimated probability for the sample's true label.
Referring to fig. 2, in an embodiment of the present disclosure, a method for constructing a fossil image feature extraction model includes:
s11: establishing a stackable composite convolution residual block, and setting its number of base channels and the dilation rate of its dilated convolution; for example, the dilation rate is set to 2 in the invention.
S12: stacking the composite convolution residual blocks to form nerve structures with different depths, wherein the depths of the nerve structures are determined by image data sets of different tasks; since the resolution of the input images in different image datasets is different, the depth of the network needs to be reset according to different datasets; the neural architecture of the fossil image feature extraction model formed in the present invention is: input image, convolution layer, maximum pooling layer, stacking of 8 composite convolution residual blocks and output of a feature map.
The composite convolution residual block established in s11 comprises:
a first layer: a convolution kernel of size 1×1, which reduces the computation parameters of the model;
a second layer: a 3×3 convolution operation combining conventional convolution with dilated convolution; the conventional convolution captures continuous structural dependency features while the dilated convolution captures structural dependencies over longer distances;
a third layer: 1×1 convolution kernels that restore the number of feature maps.
In embodiments of the present disclosure, the number of base channels increases linearly as the neural architecture gets deeper.
In the embodiment of the present disclosure, the fossil image feature extraction model has a structure shown in table 1.
Example 1:
the embodiment discloses a method for constructing a fossil image classification model based on a composite convolutional neural network, which comprises the following steps:
Step 1: preprocess the original fossil image: compute its gradients with the Canny operator, so that edge-texture regions of the image yield large gradient values and smooth regions yield small ones, producing the final gradient image;
the method specifically comprises the following steps:
Step 1.1: convert the RGB color fossil image into a grayscale image according to the formula:
f(x,y) = 0.299R + 0.587G + 0.114B
where f(x,y) denotes the grayscale image generated from the original image, x and y are the pixel coordinates of the image, and R, G, B are the values of the red, green, and blue channels respectively.
Step 1.2, in order to reduce the extraction result of noise on the gradient image as much as possible, denoising the gray image f (x, y) by using gaussian filtering, which is called smoothing image, and setting the selected gaussian function as G (x, y) and the smoothed image as H (x, y), then:
H(x,y)=f(x,y)*G(x,y)
where σ represents the standard deviation of the two-dimensional gaussian function, affecting the quality of the gaussian filtering. "x" is an operator that represents convolution.
Step 1.3, calculating the magnitude and direction of the gradient by using the finite difference of the first-order partial derivatives, wherein the gradient of the image is defined as the degree of change of the gray value of the pixel in the field of computer vision, and the calculated degree of change can be described as the partial derivatives of the corresponding pixel along the x-axis direction and the y-axis direction in differential integration, and then:
because the image can be regarded as a discrete matrix, the differential function is rewritten into a discrete differential operator, and on the basis, a Soble operator, also called a first-order differential operator, can be obtained by combining the Gaussian smoothing of the previous dimension, and the formula is as follows:
on the basis of the above deduction, the calculation process for solving the image gradient is abstracted mathematically as passing the image to be processed through S x ,S y Filtering calculation is carried out on Sobel operators in two directions to obtain gradient graphs G in two corresponding directions x ,G y From this, the gradient G and direction θ of the pixel point can be determined:
wherein G is gradient strength, θ represents gradient direction, arctan is arctangent direction;
Step 1.4: the above steps produce gradient edges composed of many pixels, but this edge information is imprecise and the edges are thick. To obtain accurate edge information composed of single pixels, non-maximum suppression is required.
The method specifically comprises the following steps:
Step 1.4.1: compare the gradient strength of the current pixel with the two neighboring pixels along the positive and negative gradient directions;
Step 1.4.2: if the gradient strength of the current pixel is the largest of the three, keep the pixel as an edge point; otherwise suppress it.
Step 1.5: perform edge detection and linking on the result of step 1.4. Ideally edge detection would process only the set of pixels lying on edges, but in practice noise is always present and breaks the edges, so the edge pixel set cannot describe the edge features completely and effectively; threshold judgment is therefore introduced, applying a suitable threshold range to the pixels on the edges.
The method specifically comprises the following steps:
Step 1.5.1: if the gradient strength of a pixel on the edge is greater than the upper threshold, record it as an edge point;
Step 1.5.2: if its gradient strength is below the lower threshold, record it as a non-edge point;
Step 1.5.3: if it lies between the two thresholds, check whether it is 8-connected to a previously marked edge point; if so, mark it as an edge point, otherwise as a non-edge point;
Step 1.5.4: traverse all pixels on the edges, connecting the unclosed edge points into contours to obtain the gradient image (the whole of step 1 is summarized in the code sketch below);
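For reference, a minimal sketch of the step-1 preprocessing pipeline using OpenCV is given below; it follows the grayscale conversion, Gaussian smoothing, and Canny edge extraction described above, with the Sobel gradients, non-maximum suppression, and double thresholding performed inside cv2.Canny. The kernel size, σ, and the two threshold values are illustrative assumptions, not values fixed by the invention.

```python
import cv2

def make_gradient_image(path: str):
    """Turn an original fossil image (JPEG) into its Canny gradient image."""
    bgr = cv2.imread(path)                                 # OpenCV loads JPEG as BGR
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)           # step 1.1: 0.299R + 0.587G + 0.114B
    smoothed = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.4)  # step 1.2: suppress noise
    # steps 1.3-1.5: Sobel gradients, non-maximum suppression, and double
    # thresholding with 8-connected edge tracking all happen inside cv2.Canny
    return cv2.Canny(smoothed, threshold1=100, threshold2=200)
```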
Step 2: perform feature extraction with the fossil image feature extraction model based on the composite convolutional neural network, and construct the fossil image classification model on this basis.
The method specifically comprises the following steps:
step 2.1, constructing a fossil image feature extraction model based on a composite convolutional neural network;
the method specifically comprises the following steps:
Step 2.1.1: establish a stackable composite convolution residual block, to be stacked into neural architectures of different depths, where the specific depth is determined by the image dataset of the task; since the resolution of the input images differs between image datasets, the depth of the network must be reset for each dataset.
Further, as shown in FIG. 3, the first layer of the main path of the composite convolution residual block is a 1×1 convolution kernel that reduces the computation parameters of the model. The 3×3 convolution operation of the middle layer combines conventional convolution, which captures continuous structural dependency features, with dilated convolution, which captures structural dependencies over longer distances, enlarging the receptive field without increasing the number of parameters. The third layer uses a 1×1 convolution kernel to restore the number of feature maps so that the input can be added to the output, preserving model accuracy while reducing computation parameters.
The computation can be defined as follows. In the first stage:
F_{l,i} = ReLU(W_l * p)
where p is the input feature map, W_l the weights of the convolution kernel, and ReLU the activation function; F_{l,i} is the output feature map of layer l of the model, i.e., the output of the first stage of the composite convolution residual block, containing i feature maps.
In the second stage, half of the feature maps of F_{l,i} are fed into the conventional convolution, whose output F_{l+1_conv,j} contains j feature maps; the other half are fed into the dilated convolution, whose output F_{l+1_dconv,k} contains k feature maps.
In the third stage, the output feature maps of the conventional convolution and the dilated convolution are concatenated along the channel dimension, giving F_{l+2,j+k}, which contains j+k feature maps in total. The final 1×1 convolution operation and the skip connection then yield the output feature map q:
q = ReLU(W_{l+3} * F_{l+2,j+k}) + W_s * p
where W_s is the weight of the skip connection.
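A minimal PyTorch sketch of this composite convolution residual block follows, written directly from the three-stage computation above. The exact placement of activations and the use of a 1×1 projection for W_s when shapes differ are assumptions; the invention fixes only the 1×1 / 3×3 / 1×1 layout, the half-and-half split between conventional and dilated convolution, and the skip connection.

```python
import torch
import torch.nn as nn

class CompositeConvResidualBlock(nn.Module):
    """1x1 reduce -> split 3x3 conventional / 3x3 dilated -> 1x1 restore -> skip."""

    def __init__(self, in_ch: int, mid_ch: int, out_ch: int,
                 dilation: int = 2, stride: int = 1):
        super().__init__()
        half = mid_ch // 2                      # mid_ch is assumed even
        # first layer: 1x1 kernel reduces the computation parameters
        self.reduce = nn.Conv2d(in_ch, mid_ch, kernel_size=1, stride=stride)
        # second layer, one half: conventional 3x3 convolution captures
        # continuous structural dependency features
        self.conv = nn.Conv2d(half, half, kernel_size=3, padding=1)
        # second layer, other half: dilated 3x3 convolution captures
        # longer-distance dependencies, enlarging the receptive field
        # without extra parameters
        self.dconv = nn.Conv2d(half, half, kernel_size=3,
                               padding=dilation, dilation=dilation)
        # third layer: 1x1 kernel restores the number of feature maps
        self.restore = nn.Conv2d(2 * half, out_ch, kernel_size=1)
        # skip connection W_s: identity when shapes already match
        self.skip = (nn.Identity() if in_ch == out_ch and stride == 1 else
                     nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, p: torch.Tensor) -> torch.Tensor:
        f = self.relu(self.reduce(p))                             # F_{l,i}
        a, b = torch.chunk(f, 2, dim=1)                           # split channels in half
        fused = torch.cat([self.conv(a), self.dconv(b)], dim=1)   # F_{l+2,j+k}
        return self.relu(self.restore(fused)) + self.skip(p)      # q
```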
Step 2.1.2, setting the number of basic channels of the composite convolution residual block and the void convolution expansion rate; then stacking the composite convolution residual blocks to determine a final feature extraction model, as shown in table 1;
the number of base channels per composite convolution residual block in step 2.1.2 increases linearly as the network gets deeper;
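Continuing the sketch above, the blocks might be stacked into the feature extraction model as follows: a stem convolution layer, a max pooling layer, and 8 composite convolution residual blocks whose base channel count grows linearly with depth, ending in 1024 feature maps of 7×7 pixels for a 224×224 input. Since Table 1 is not reproduced here, the specific channel schedule and the placement of the stride-2 blocks are assumptions.

```python
class FossilFeatureExtractor(nn.Module):
    """Input image -> convolution -> max pooling -> 8 composite blocks."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),  # 224 -> 112
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),      # 112 -> 56
        )
        chans = [64] + [128 * k for k in range(1, 9)]   # 128, 256, ... 1024: linear growth
        strides = [1, 2, 1, 2, 1, 2, 1, 1]              # three downsamplings: 56 -> 7
        self.blocks = nn.Sequential(*[
            CompositeConvResidualBlock(chans[k], chans[k + 1] // 2,
                                       chans[k + 1], stride=strides[k])
            for k in range(8)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.blocks(self.stem(x))    # 1024 feature maps of 7x7 pixels
```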
step 2.2, constructing a fossil image classification model;
the method specifically comprises the following steps:
Step 2.2.1: respectively feed the original fossil image and the gradient image obtained by the preprocessing in step 1 into the composite-convolutional-neural-network-based fossil image feature extraction model built in step 2.1, extracting 1024 feature maps of 7×7 pixels containing depth features and 1024 feature maps of 7×7 pixels containing primary visual features;
Step 2.2.2: fuse the depth feature maps and the primary visual feature maps along the channel dimension to obtain 2048 feature maps of size 7×7 pixels;
Step 2.2.3: feed the fused feature maps of step 2.2.2 into a global average pooling layer, which performs a global average pooling operation on each input feature map: for each feature map the average of all its pixels is computed and one value output, so the 2048 feature maps yield 2048 values forming a 2048-dimensional vector, the feature vector;
Step 2.2.4: map the distributed feature representation obtained by the global average pooling layer into the sample label space through the fully connected layer, i.e., convolve the pooled output with C convolution kernels of size 1×1×2048 (length 1, width 1, 2048 channels); finally a Softmax classifier produces the probability values of the individual categories and hence the category to which the fossil image belongs, where C is the number of fossil image categories.
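A sketch of step 2.2 as a two-branch model, continuing the sketches above: one feature-extraction sub-network per input, channel-wise fusion, global average pooling, the 1×1×2048 fully connected convolution, and Softmax. Feeding the single-channel gradient image to a three-channel stem by replication is an assumption.

```python
class FossilClassifier(nn.Module):
    """Two-branch composite CNN: original + gradient image -> class probabilities."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.deep_branch = FossilFeatureExtractor()    # original image -> depth features
        self.visual_branch = FossilFeatureExtractor()  # gradient image -> primary visual features
        self.gap = nn.AdaptiveAvgPool2d(1)             # global average pooling
        self.fc = nn.Conv2d(2048, num_classes, kernel_size=1)  # C kernels of 1x1x2048

    def forward(self, image: torch.Tensor, gradient: torch.Tensor) -> torch.Tensor:
        # the gradient image is assumed replicated to 3 channels beforehand
        fused = torch.cat([self.deep_branch(image),
                           self.visual_branch(gradient)], dim=1)   # 2048 x 7 x 7
        logits = self.fc(self.gap(fused)).flatten(1)   # 2048-d feature vector -> C scores
        return torch.softmax(logits, dim=1)            # image category probability values
```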
Step 3, training the constructed fossil image classification model;
step 3.1, constructing a target loss function;
the method specifically comprises the following steps:
Step 3.1.1: in existing research the Cross Entropy (CE) loss function is widely used for training fossil image classification models:
CE = -Σ_{c=1}^{C} a_c log(â_c)
where a is the true label of the sample (one-hot encoded), â the model's prediction for the sample, and C the number of fossil image categories. For ease of presentation this is rewritten as:
CE(b) = -log(b)
where b ∈ [0,1] denotes the model's estimated probability for the sample's true label. The CE loss reflects the gap between the predicted and the true probability distribution; during training we want the two to be as close as possible, so the training objective is to minimize the loss function. However, the CE loss can neither handle the imbalance in the number of samples per class nor alleviate overfitting, so it is chosen for improvement.
Step 3.1.2, constructing an objective loss function L, and using a loss function modified based on focal loss as an objective function, the CE loss function is modified, which can be expressed as:
L=-6α(1-b) γ log(b)-(1-θ)log(b)
wherein alpha is E [0,1]]Is a weight factor for adjusting and balancing the importance degree, count, of different types of samples m Represents the number of m-th class samples, gamma>0 is a modulation factor used to adjust the weights of the easily classified samples, and θ is a hyper-parameter used to adjust the weights of the two loss functions.
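A minimal sketch of this target loss, blending the focal term with plain cross entropy exactly as in the formula. Deriving α per class from the inverse class frequency is an assumption suggested by count_m above; θ and γ are hyper-parameters.

```python
def target_loss(probs: torch.Tensor, labels: torch.Tensor,
                alpha: torch.Tensor, gamma: float = 2.0,
                theta: float = 0.5) -> torch.Tensor:
    """probs: (N, C) Softmax outputs; labels: (N,) true class indices;
    alpha: (C,) per-class weight factors in [0, 1]."""
    b = probs.gather(1, labels.unsqueeze(1)).squeeze(1).clamp_min(1e-7)
    a = alpha[labels]                               # weight of each sample's class
    focal = -a * (1.0 - b) ** gamma * torch.log(b)  # down-weights easy samples
    ce = -torch.log(b)                              # plain cross-entropy term
    return (theta * focal + (1.0 - theta) * ce).mean()

# e.g. alpha inversely proportional to class size (counts from Example 2):
# counts = torch.tensor([1392., 852., 85., 25.]); alpha = counts.min() / counts
```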
Step 3.2, initializing parameters in a fossil image feature extraction model by using a pre-trained ResNet50 depth residual error network on an image large-scale data set as initial parameter values, and initializing network parameters of a full-connection layer according to normal distribution;
Step 3.3: randomly select 80% of the images in the dataset as the training set, 10% as the validation set, and 10% as the test set; preprocess the images in the training and validation sets according to step 1 to obtain the preprocessed training set and validation set images;
Step 3.4: feed the preprocessed training set images, comprising the original fossil images and their corresponding gradient images, into the constructed fossil image classification model and minimize the target loss function value with a batch gradient descent training method, adjusting the parameters of every layer in the network to obtain the weights of the trained composite convolutional neural network;
In this embodiment the training batch size is set to 64, the parameter-update momentum to 0.9, the learning rate to 0.001, and the number of iterations (epochs) to 500; the network weights are trained with the back-propagation algorithm and mini-batch stochastic gradient descent so as to minimize the loss value.
Step 3.5: after each epoch of training, give feedback on the weights via the loss value of the preprocessed validation set images: input the validation set images into the currently trained fossil image classification model and compute their loss value; if the current value is smaller than the best obtained so far, update the stored model weights, otherwise keep the previously stored weights. After 500 iterations of training, training terminates and the best model is saved.
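A condensed sketch of steps 3.2 to 3.5 under the hyper-parameters above, continuing the sketches so far. The dataset loaders yielding (image, gradient, label) batches, the per-class alpha tensor, and the four-class setting from Example 2 are assumptions.

```python
import copy

model = FossilClassifier(num_classes=4)
opt = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)  # momentum 0.9
best_val, best_state = float("inf"), None

for epoch in range(500):                               # 500 iterations (epochs)
    model.train()
    for image, gradient, label in train_loader:        # preprocessed training set
        opt.zero_grad()
        loss = target_loss(model(image, gradient), label, alpha)
        loss.backward()                                # back-propagation
        opt.step()                                     # (mini-)batch gradient descent

    model.eval()                                       # step 3.5: validation feedback
    with torch.no_grad():
        val = sum(target_loss(model(i, g), l, alpha).item()
                  for i, g, l in val_loader) / len(val_loader)
    if val < best_val:                                 # keep the best weights only
        best_val, best_state = val, copy.deepcopy(model.state_dict())
```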
Step 4: classify fossil images with the fossil image classification model based on the composite convolutional neural network.
Step 4.1: for a fossil image to be classified, first preprocess it according to step 1 to obtain its gradient image;
Step 4.2: input the fossil image to be classified and its corresponding gradient image into the fossil image classification model trained in step 3 to predict its category, obtaining the class to which it finally belongs.
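An illustrative use of the trained model for step 4; the image path is hypothetical, and the 224×224 input size, channel replication, and normalization mirror the assumptions made in the sketches above.

```python
import numpy as np

model.load_state_dict(best_state)                      # weights saved in step 3.5
model.eval()

img = cv2.resize(cv2.imread("fossil_to_detect.jpg"), (224, 224))
grad = cv2.resize(make_gradient_image("fossil_to_detect.jpg"), (224, 224))  # step 4.1
grad = np.repeat(grad[:, :, None], 3, axis=2)          # replicate to 3 channels

to_tensor = lambda a: (torch.from_numpy(a).float()
                       .permute(2, 0, 1).unsqueeze(0) / 255.0)

with torch.no_grad():                                  # step 4.2: predict the category
    probs = model(to_tensor(img), to_tensor(grad))
print("predicted class index:", probs.argmax(dim=1).item())
```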
Example 2:
The dataset in this example, provided by the Department of Geology of Northwest University, contains 2354 fossil images in total: 1392 Yunnan beetles, 852 Heteroides pteronyssinus, 85 Guan Yang Trifolius, and 25 Wudingzhong insects. Classification accuracy, taking values in [0,1], is used as the evaluation index of model performance; the higher the value, the better the model.
Table 2 comparison results between different methods
From the results in Table 2 it can be seen that the invention outperforms the compared fossil image classification models on this dataset. To further demonstrate that the proposed innovations benefit the final result, this example compares the effects of four different configurations, as follows:
N1: a single sub-network whose input is the original fossil image, trained end to end with the cross-entropy loss function.
N2: a single sub-network whose input is the original fossil image, trained end to end with the proposed loss function L.
N3: a single sub-network whose input is the gradient image corresponding to the original fossil image, trained end to end with the proposed loss function L.
N4 (the invention): two sub-networks whose inputs are the original fossil image and the corresponding gradient image; the features extracted by the two sub-networks are concatenated and fused, and the whole network is trained end to end with the proposed loss function L.
Table 3 comparative effects of ablation experiments
The results in Table 3 show that the proposed innovations each have a favorable effect on the final result, further improving the performance of the fossil image classification model.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the present disclosure within the scope of the technical concept of the present disclosure, and all the simple modifications belong to the protection scope of the present disclosure.
In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present disclosure does not further describe various possible combinations.
Moreover, the various embodiments of the present disclosure may be combined in any manner that does not depart from its spirit, and such combinations should likewise be regarded as part of the disclosure.

Claims (6)

1. A method for constructing a fossil image classification model based on a composite convolutional neural network, characterized by comprising the following steps:
s1: processing an original fossil image to obtain a gradient image, and constructing a fossil image feature extraction model;
s2: the fossil image feature extraction model performs feature extraction on the original fossil image to obtain a depth feature map and on the gradient image to obtain a primary visual feature map; the depth feature map and the primary visual feature map are fused and then processed sequentially by a global average pooling layer and a fully connected layer to obtain image category probability values, from which a primary fossil image classification model is constructed;
s3: training a primary fossil image classification model;
processing the original fossil image with a Canny operator to obtain the gradient image; the classification of the fully connected layer is performed by a Softmax classifier;
the step S3 further comprises:
s31: constructing a target loss function;
s32: initializing the network parameters of the fossil image feature extraction model with a ResNet50 deep residual network, and initializing the network parameters of the fully connected layer from a normal distribution;
s33: for the original fossil images in the dataset to be processed, randomly selecting 80% as training set images and 10% as validation set images;
s34: the original training set images and the preprocessed training set images are input into the fossil image classification model, and a batch gradient descent training method is used to minimize the target loss function value, giving the weights of the trained fossil image classification network;
s35: feeding back the target loss function value on the preprocessed validation set images: if the current value is smaller than the best value obtained so far, updating the stored weights of the fossil image classification network; otherwise keeping the stored weights;
the S31 further includes:
the target loss function L is constructed as:
L = -θα(1-b)^γ log(b) - (1-θ)log(b)
where α ∈ [0,1] is a weight factor; γ > 0 is a modulation factor; θ is a hyper-parameter; and b ∈ [0,1] denotes the estimated probability;
the method for constructing the fossil image feature extraction model comprises:
s11: establishing a stackable composite convolution residual block, and setting its number of base channels and the dilation rate of its dilated convolution;
s12: stacking the composite convolution residual blocks to form neural architectures of different depths, where the depth is determined by the image dataset of the task;
the composite convolution residual block established in s11 comprises:
a first layer: a convolution kernel of size 1×1, which reduces the computation parameters of the model;
a second layer: a 3×3 convolution operation combining conventional convolution with dilated convolution; the conventional convolution captures continuous structural dependency features while the dilated convolution captures structural dependencies over longer distances;
a third layer: 1×1 convolution kernels that restore the number of feature maps.
2. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1, wherein S2 further comprises:
s21: the depth feature map and the primary visual feature map are fused from the channel dimension to obtain a fused feature map;
s22: inputting the fusion feature map into a global average pooling layer for outputting to obtain a feature vector;
s23: the feature vector is convolved and classified by the fully connected layer to obtain image category probability values, and the primary fossil image classification model is constructed from these probability values.
3. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1 or 2, wherein the number of base channels increases linearly with the depth of the neural architecture.
4. The method for constructing a fossil image classification model based on a composite convolutional neural network according to claim 1 or 2, wherein the fossil image feature extraction model has a structure shown in table 1:
TABLE 1 fossil image feature extraction model network structure
5. A fossil image classification method, comprising: preprocessing a fossil image to be detected to obtain a gradient image;
inputting the fossil image to be detected and its corresponding gradient image into the fossil image classification model constructed by the method for constructing a fossil image classification model based on a composite convolutional neural network according to any one of claims 1-4; the model predicts the category of the fossil image, obtaining the class to which it belongs.
6. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions for causing a computer to perform the method for constructing a fossil image classification model based on a composite convolutional neural network according to any one of claims 1-4.
CN202110219351.1A 2021-02-26 2021-02-26 Construction method of fossil image classification model based on composite convolutional neural network Active CN112819096B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110219351.1A CN112819096B (en) 2021-02-26 2021-02-26 Construction method of fossil image classification model based on composite convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110219351.1A CN112819096B (en) 2021-02-26 2021-02-26 Construction method of fossil image classification model based on composite convolutional neural network

Publications (2)

Publication Number Publication Date
CN112819096A CN112819096A (en) 2021-05-18
CN112819096B true CN112819096B (en) 2024-01-19

Family

ID=75864159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110219351.1A Active CN112819096B (en) 2021-02-26 2021-02-26 Construction method of fossil image classification model based on composite convolutional neural network

Country Status (1)

Country Link
CN (1) CN112819096B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537026B (en) * 2021-07-09 2023-05-23 上海智臻智能网络科技股份有限公司 Method, device, equipment and medium for detecting graphic elements in building plan
CN113610061A (en) * 2021-09-30 2021-11-05 国网浙江省电力有限公司电力科学研究院 Method and system for identifying unstressed conducting wire based on target detection and residual error network
CN115818166B (en) * 2022-11-15 2023-09-26 华能伊敏煤电有限责任公司 Unmanned automatic control method and system for continuous system of wheel bucket
CN116258658B (en) * 2023-05-11 2023-07-28 齐鲁工业大学(山东省科学院) Swin transducer-based image fusion method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160389A (en) * 2019-12-02 2020-05-15 东北石油大学 Lithology identification method based on fusion of VGG
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
CN111160389A (en) * 2019-12-02 2020-05-15 东北石油大学 Lithology identification method based on fusion of VGG
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Rock image classification based on convolutional neural networks; Cheng Guojian, Guo Wenhui, Fan Pengzhao; Journal of Xi'an Shiyou University (Natural Science Edition) (04); full text *
Remote sensing image scene classification based on deep convolutional neural networks; Lu Guojun, Chen Lifang; Journal of Taiyuan Normal University (Natural Science Edition) (01); full text *

Also Published As

Publication number Publication date
CN112819096A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN112819096B (en) Construction method of fossil image classification model based on composite convolutional neural network
CN109255364B (en) Scene recognition method for generating countermeasure network based on deep convolution
CN110210551B (en) Visual target tracking method based on adaptive subject sensitivity
CN108711141B (en) Motion blurred image blind restoration method using improved generation type countermeasure network
CN108038445B (en) SAR automatic target identification method based on multi-view deep learning framework
CN109685072B (en) Composite degraded image high-quality reconstruction method based on generation countermeasure network
CN108875935B (en) Natural image target material visual characteristic mapping method based on generation countermeasure network
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
CN107862668A (en) A kind of cultural relic images restored method based on GNN
CN110175986B (en) Stereo image visual saliency detection method based on convolutional neural network
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN111368896A (en) Hyperspectral remote sensing image classification method based on dense residual three-dimensional convolutional neural network
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
CN112837344B (en) Target tracking method for generating twin network based on condition countermeasure
CN105469098A (en) Precise LINDAR data ground object classification method based on adaptive characteristic weight synthesis
CN111861906A (en) Pavement crack image virtual augmentation model establishment and image virtual augmentation method
CN114092697B (en) Building facade semantic segmentation method with attention fused with global and local depth features
CN111339862B (en) Remote sensing scene classification method and device based on channel attention mechanism
CN107423747A (en) A kind of conspicuousness object detection method based on depth convolutional network
CN109872326B (en) Contour detection method based on deep reinforced network jump connection
CN108268890A (en) A kind of hyperspectral image classification method
CN105913451B (en) A kind of natural image superpixel segmentation method based on graph model
CN111259923A (en) Multi-target detection method based on improved three-dimensional R-CNN algorithm
CN112767277B (en) Depth feature sequencing deblurring method based on reference image
CN110796716B (en) Image coloring method based on multiple residual error network and regularized transfer learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant