CN109145770B

CN109145770B - Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Info

Publication number: CN109145770B
Application number: CN201810863041.1A
Authority: CN
Inventors: 李�瑞; 王儒敬; 谢成军; 张洁; 陈天娇; 陈红波; 胡海瀛
Original assignee: Hefei Institutes of Physical Science of CAS
Current assignee: Hefei Institutes of Physical Science of CAS
Priority date: 2018-08-01
Filing date: 2018-08-01
Publication date: 2022-07-15
Anticipated expiration: 2038-08-01
Also published as: CN109145770A

Abstract

The invention relates to a method for automatically counting wheat spiders based on combination of a multi-scale feature fusion network and a positioning model, which overcomes the defect of high error rate of image detection aiming at small targets compared with the prior art. The invention comprises the following steps: establishing a training sample; constructing a wheat spider detection counting model; acquiring an image to be counted; and obtaining the number of the wheat spiders. The invention realizes the direct identification and counting of the wheat spiders under the field natural environment.

Description

Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Technical Field

The invention relates to the technical field of image recognition, in particular to an automatic counting method for wheat spiders based on combination of a multi-scale feature fusion network and a positioning model.

Background

Wheat is one of main grain crops in China, is easily damaged by various pests in the production process of the wheat, and wheat spiders are one of the main grain crops, and can suck wheat leaf juice and even dry up the wheat leaf juice, so that the yield of the wheat is seriously influenced. The detection of pest population quantity is an important means for pest control, and provides a theoretical basis for pest control decision. Therefore, the identification and counting of wheat spiders in the field are important for improving the yield of wheat.

With the rapid development of computer vision technology and image processing technology, image-based pest automatic identification and counting technology has become a research focus in recent years. Although the method is time-saving and labor-saving, has the advantages of intellectualization and the like, the method cannot be applied to the identification and counting of the wheat spiders in the field. The reason is that: firstly, the individual wheat spiders are only a few millimeters small, and are difficult to detect by using a traditional image recognition technology (SVM) aiming at such small targets; secondly, when the image is collected, the quality of the image is influenced by unstable and uneven illumination of the external environment; moreover, in practical application, the acquired image is often mixed with other impurities, and the background is complex.

Therefore, how to detect small targets such as wheat spiders in a complex environment has become an urgent technical problem to be solved.

Disclosure of Invention

The invention aims to solve the defect of high error rate of image detection for small targets in the prior art, and provides an automatic wheat spider counting method based on combination of a multi-scale feature fusion network and a positioning model to solve the problems.

In order to achieve the purpose, the technical scheme of the invention is as follows:

a method for automatically counting wheat spiders based on combination of a multi-scale feature fusion network and a positioning model comprises the following steps:

Establishing a training sample, namely acquiring more than 2000 images of the wheat spiders in the field natural environment as the training images, and marking the wheat spiders in the images to obtain the training sample;

constructing a wheat spider detection counting model;

constructing a positioning model;

constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure;

training a multi-scale feature fusion network, training the features of the candidate region positioned by the training sample according to the positioning model, and taking the output result of each layer as a prediction result;

acquiring an image to be counted, acquiring a wheat spider image shot in the field, and preprocessing the image to be counted to obtain the image to be counted;

and (3) obtaining the number of the wheat spiders, namely inputting the image to be counted into a wheat spider detection counting model to obtain the number of the wheat spiders in the image.

The construction positioning model comprises the following steps:

setting a color space conversion module, wherein the color space conversion module is used for converting the RGB color space into the YcbCr color space and dividing the RGB color space into R ═ R { R-₁,r₂,...r_n} segmented areas;

calculating the similarity of color information, obtaining 25 histograms of each color channel of the image by normalization in an L1 paradigm, and calculating the similarity of the color space by the following calculation formula:

Wherein f is_color(r_i,r_j) Indicates a divided region r_iAnd r_jThe color space similarity of (1);

denotes the ith channel, the kth histogram vector, i 1,2,3, k 0,1_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nThe ith area;

represents the jth channel, the kth histogram vector, j ═ 1,2,3, and m represents the m histograms;

calculating the similarity of the edge information, calculating the Gaussian differential with the variance sigma being 1 for 8 different directions of each color channel, acquiring 10 histograms of each color of each channel, and normalizing by using an L1 paradigm to calculate the similarity of the edge information, wherein the calculation formula is as follows:

wherein f is_edage(r_i,r_j) Indicates a divided region r_iAnd r_jThe degree of similarity of the edge information of (2),

representing the ith channel, the kth histogram vector,

representing the jth channel, the kth histogram vector, where i 1,2,3, j 1,2,3, k 0,1.., 10; n represents the number of histograms, m represents the number of m histograms, r_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nThe ith area;

calculating the similarity of the sizes of the regions, wherein the calculation formula is as follows:

wherein f is_area(r_i,r_j) Representing the divided region r_iAnd r_jArea size similarity of() Representing the area of the region; area (img) represents the area of the picture, r_iDenotes a divided region R ═ { R ═ R ₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ R { (R)₁,r₂,...r_nJth area;

and fusing the color information similarity, the edge information similarity and the region size similarity, wherein the calculation formula is as follows:

f(r_i,r_j)＝w₁f_color(r_i,r_j)+w₂f_edage(r_i,r_j)+w₃f_area(r_i,r_j)，

wherein, f (r)_i,r_j) Represents a divided region r_iAnd r_jSimilarity after fusion, w₁、w₂、w₃Weights r representing information similarity, edge information similarity, and region size similarity, respectively_iDenotes a divided region R ═ R { (R)₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ { R ═ R₁,r₂,...r_nJth area.

The construction of the multi-scale feature fusion network comprises the following steps:

setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;

setting the input of the layer 1 as a training sample, outputting a layer 1 feature map, taking the layer 1 feature map as the input of the layer 2 and outputting a layer 2 feature map, taking the layer 2 feature map as the input of the layer 3, and taking … to the layer n-1 feature map as the input of the layer n;

and connecting the feature map of the layer 1 and the feature map of the layer 2 … of the nth layer with the corresponding deconvolution layer of the layer 1, the deconvolution layer of the second layer and the deconvolution layer of the layer … of the nth layer through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.

The training multi-scale feature fusion network comprises the following steps:

Inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;

respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;

inputting the feature map of the layer 1 into the layer 2 of the multi-scale neural network, and outputting the feature map of the layer 2 from the layer 1 of the multi-scale neural network until the feature map of the layer n-1 is input into the nth layer of the multi-scale neural network;

performing deconvolution operation on the n-th layer of feature map to generate an n-th deconvolution layer, and performing deconvolution operation on the n-1-th layer to generate an n-1-th deconvolution layer, so as to obtain a 1-st deconvolution layer;

the characteristic diagrams of the 1 st layer, the 2 nd layer, … to the nth layer are connected with the deconvolution layer of the 1 st layer, the deconvolution layer of the 2 nd layer, … to the nth layer by 1 × 1 convolution kernels;

connecting the feature map of the 1 st layer with the deconvolution layer of the 1 st layer through a 1 x 1 convolution kernel, extracting features of the first layer, and generating a prediction result of the first layer; connecting the feature map of the 2 nd layer with the deconvolution layer of the 2 nd layer through a 1 x 1 convolution kernel, extracting features of a second layer, and generating a prediction result of the second layer; …, generating an nth layer of prediction result until the nth layer of feature graph is connected with the nth layer of deconvolution layer through a 1 x 1 convolution kernel and then the nth layer of feature is extracted;

And (3) performing regression processing on the layer 1 prediction result, the layer 2 prediction result, … and the results of the nth layer to generate a final prediction result, wherein the regression function is as follows:

wherein C (lambda) is the final prediction result, lambda represents the training parameter, n represents the number of network layers, y^(j)Representing true classes, p_λ(x^(j)) Representing the result of the j-th layer prediction; x is a radical of a fluorine atom^(j)A feature vector representing a j-th layer;

and obtaining a final score through C (lambda) to predict the coordinates of the category and the position of the category in the graph.

The method for acquiring the number of the wheat spiders comprises the following steps:

inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;

inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.

Advantageous effects

Compared with the prior art, the automatic wheat spider counting method based on the combination of the multi-scale feature fusion network and the positioning model realizes direct identification and counting of wheat spiders in the field natural environment.

The invention eliminates the influence of illumination on detection counting through pretreatment, and simplifies the complex environment; then, positioning a candidate area of the suspected wheat spider by a positioning model method; and performing feature extraction on the candidate region by using a multi-scale feature fusion network, and then finally determining the wheat spider region through multi-prediction result regression. The positioning (determination) of the candidate region greatly reduces the feature extraction time and feature dimension, and enhances the counting instantaneity; meanwhile, regression fusion of multiple prediction results ensures that the wheat spiders of all scales can be accurately detected, and robustness and accuracy of automatic detection and counting are improved.

Drawings

FIG. 1 is a sequence diagram of the method of the present invention;

FIG. 2a is a diagram illustrating the detection result of training samples by using the conventional SVM technique in the prior art;

FIG. 2b is a graph showing the results of detection by the method of the present invention;

FIG. 3 is a schematic diagram of a multi-scale feature fusion network structure according to the present invention.

Detailed Description

So that the manner in which the above recited features of the present invention can be understood and readily understood, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings, wherein:

as shown in fig. 1, the automatic wheat spider counting method based on the combination of the multi-scale feature fusion network and the positioning model includes the following steps:

in the first step, training samples are established. More than 2000 images of the wheat spiders in the field natural environment are obtained as training images, and the wheat spiders in the images are marked to obtain training samples.

And secondly, constructing a wheat spider detection counting model. And constructing a positioning model and a multi-scale feature fusion network, extracting candidate regions of the training sample by using the positioning model, and classifying the candidate regions after extracting the features of the candidate regions through the multi-scale fusion network, wherein if the candidate regions are the candidate regions of the wheat spider, the candidate regions of the wheat spider are cancelled.

First, a positioning model is constructed. In order to reduce the feature extraction time, reduce the feature vector dimension and enhance the real-time performance of automatic counting, a positioning model is firstly used for positioning a candidate region of the wheat spider, and then feature extraction is carried out according to the candidate region.

The method comprises the following steps:

(1) setting a color space conversion module, wherein the color space conversion module is used for converting the RGB color space into the YcbCr color space and dividing the RGB color space into R (R) and { R ═ R₁,r₂,...r_nAnd } division areas.

(2) And calculating the similarity of the color information. The normalization of L1 paradigm is used to obtain 25 histograms of each color channel of the image, and the similarity of the color space is calculated according to the following formula:

wherein, f_color(r_i,r_j) Represents a divided region r_iAnd r_jThe color space similarity of (a);

represents the jth channel, the kth histogram vector, j ═ 1,2,3, and m represents the m histograms.

(3) And calculating the similarity of the edge information. Calculating the Gaussian differential with the variance sigma being 1 for 8 different directions of each color channel, acquiring 10 histograms for each color of each channel, and normalizing by using an L1 paradigm to calculate the similarity of the edge information, wherein the calculation formula is as follows:

Wherein, f_edage(r_i,r_j) Represents a divided region r_iAnd r_jThe degree of similarity of the edge information of (a),

representing the ith channel, the kth histogram vector,

a k-th histogram vector representing the j-th channel, where i 1,2,3, j 1,2,3, k 0,1.., 10; n represents the number of histograms, m represents the number of m histograms, r_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nThe ith area.

(4) Calculating the similarity of the sizes of the regions, wherein the calculation formula is as follows:

wherein f is_area(r_i,r_j) Representing the divided region r_iAnd r_jArea () represents an area of the region; area (img) represents the area of the picture, r_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ { R ═ R₁,r₂,...r_nJth area.

(5) And fusing the color information similarity, the edge information similarity and the region size similarity, wherein the calculation formula is as follows:

f(r_i,r_j)＝w₁f_color(r_i,r_j)+w₂f_edage(r_i,r_j)+w₃f_area(r_i,r_j)，

wherein, f (r)_i,r_j) Indicates a divided region r_iAnd r_jSimilarity after fusion, w₁、w₂、w₃Weights r respectively representing information similarity, edge information similarity, and region size similarity_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ { R ═ R₁,r₂,...r_nJth area.

Through the combination of color information similarity, edge information similarity and region size similarity, r_iAnd r_jAnd continuously generating n regions which are the candidate regions of the wheat spider.

Secondly, constructing a multi-scale feature fusion network, and reconstructing a multi-scale feature fusion network structure. In order to better extract the characteristics of the wheat spiders with each scale and various forms, a multi-scale characteristic fusion network is designed to accurately distinguish accurate regions of candidate regions of the wheat spiders.

As shown in FIG. 3, the multi-scale feature fusion network structure constructs a marginal multi-scale feature network by utilizing the inherent multi-scale and cone-shaped hierarchical structure feature maps, and develops a top-down architecture with lateral connection for constructing a high-level semantic feature map on all scales. The method comprises the following steps:

(1) and setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer.

(2) Setting the input of the layer 1 as a training sample, outputting a layer 1 feature map, taking the layer 1 feature map as the input of the layer 2, outputting a layer 2 feature map, taking the layer 2 feature map as the input of the layer 3, and taking … to the layer n-1 feature map as the input of the layer n.

(3) And connecting the feature map of the layer 1 and the feature map of the layer 2 … of the layer n with the corresponding deconvolution layer of the layer 1, the deconvolution layer of the second layer and the deconvolution layer of the layer … of the layer n through a 1 x 1 convolution kernel to generate the multi-scale feature fusion network.

Generating a feature map by downsampling, taking each training picture as input, extracting features by adopting a multi-scale neural network, and generating a feature map by downsampling each layer of the multi-scale neural network;

and then, the last layer of deconvolution generates a feature map with the size of the previous layer, and the iteration is carried out in turn until a second layer of size is generated. The characteristic graph is smaller and smaller due to down-sampling of each layer of the multi-scale network, and the number of the wheat spiders in the characteristic graph is smaller and even reaches a few pixels, so that the influence on the detection count of the wheat spiders is large. In order to avoid the problem, deconvolution operation is adopted for each layer of pyramid image, and the feature map is amplified to the size of the upper layer through upsampling, so that the pest features can be effectively extracted, and the size of the wheat spiders in the image is ensured;

and connecting the feature maps of the layers generated by deconvolution through 1-by-1 convolution kernels to generate a multi-scale feature fusion network.

And finally, training the multi-scale feature fusion network. And training by taking the candidate area positioned by the positioning model aiming at the training sample as a characteristic, and taking the output result of each layer as a prediction result. The method comprises the following specific steps:

(1) And inputting the training samples into a positioning model, and positioning the candidate regions of the training samples by the positioning model.

(2) And respectively inputting the candidate regions of the training sample into the layer 1 of the multi-scale neural network, and outputting a layer 1 characteristic diagram by the layer 1 of the multi-scale neural network.

(3) Inputting the layer 1 characteristic diagram into the layer 2 of the multi-scale neural network, and outputting the layer 2 characteristic diagram by the layer 1 of the multi-scale neural network until the n-1 layer characteristic diagram is input into the nth layer of the multi-scale neural network.

(4) And performing deconvolution operation on the n-th layer feature diagram to generate an n-th deconvolution layer, and performing deconvolution operation on the n-1 st layer to generate an n-1-th deconvolution layer, so as to reach the 1 st deconvolution layer.

(5) The layer 1 signature, the layer 2 signature, … through the nth signature are connected to the layer 1 deconvolution layer, the layer 2 deconvolution layer, … through the nth deconvolution layer by a 1 x 1 convolution kernel.

(6) Connecting the layer 1 feature graph with the layer 1 deconvolution layer through a 1 x 1 convolution kernel, extracting first layer features, and generating a first layer prediction result; connecting the 2 nd layer feature graph with the 2 nd layer deconvolution layer through 1 x 1 convolution kernel, extracting second layer features, and generating a second layer prediction result; … until the n-th layer feature graph and the n-th layer deconvolution layer are connected through a 1 x 1 convolution kernel, and after the n-th layer features are extracted, the n-th layer prediction result is generated.

(7) And (3) performing regression processing on the layer 1 prediction result, the layer 2 prediction result, … and the results of the nth layer to generate a final prediction result, wherein the regression function is as follows:

(8) and obtaining a final score through C (lambda) to predict the category and the coordinate of the category in the graph, wherein the coordinate is the position of the wheat spider in the graph.

And thirdly, acquiring an image to be counted. And acquiring a wheat spider image shot in the field, and preprocessing the wheat spider image to obtain an image to be counted.

And fourthly, obtaining the number of the wheat spiders. And inputting the image to be counted into the wheat spider detection counting model to obtain the number of the wheat spiders in the image. The method comprises the following specific steps:

(1) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;

(2) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, counting the number of the wheat spiders, and obtaining the number of the wheat spiders in the image.

As shown in fig. 2a, it is a graph of the detection result of the wheat spider obtained by the SVM algorithm. As can be seen from fig. 2a, the area of the wheat spiders detected by the small boxes is very large, especially in the middle large box area of fig. 2a, which erroneously puts all of a plurality of relatively concentrated wheat spiders within the range of one large box. The reason for this false indication is that: the traditional SVM algorithm does not perform early positioning, and if a candidate area is positioned by adopting a positioning model, the phenomenon can be avoided; in fig. 2a, the reason why the wheat spider area detected by the small box is very large is that: the traditional SVM algorithm does not use regression fusion of multiple prediction results, and in fig. 2a, some small boxes are misidentified.

As shown in fig. 2b, compared with the conventional SVM algorithm, the method of the present invention can accurately locate the number and specific positions of the wheat spiders, and has high robustness and accuracy.

The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined by the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A wheat spider automatic counting method based on combination of a multi-scale feature fusion network and a positioning model is characterized by comprising the following steps:

11) establishing a training sample, namely acquiring more than 2000 images of the wheat spiders in the field natural environment as training images, and marking the wheat spiders in the images to obtain the training sample;

12) constructing a wheat spider detection counting model;

121) constructing a positioning model; the construction positioning model comprises the following steps:

1211) Setting a color space conversion module, wherein the color space conversion module is used for converting the RGB color space into the YcbCr color space and dividing the RGB color space into R (R) and { R ═ R₁,r₂,...r_n} segmented areas;

1212) calculating the similarity of color information, obtaining 25 histograms of each color channel of the image by normalization using an L1 paradigm, and calculating the similarity of color space according to the following calculation formula:

wherein, f_color(r_i,r_j) Represents a divided region r_iAnd r_jThe degree of similarity in the color space of (c),

denotes the a channel, k histogram vector, a is 1,2,3, k is 0,1_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nThe ith area;

represents the b-th channel, the k-th histogram vector, b ═ 1,2, 3;

1213) calculating the similarity of the edge information, calculating the Gaussian differential with the variance sigma being 1 for 8 different directions of each color channel, acquiring 10 histograms for each color of each channel, and normalizing by using an L1 paradigm to calculate the similarity of the edge information, wherein the calculation formula is as follows:

representing the e-th channel, the k-th histogram vector,

represents the f-th channel, the k-th histogram vector, where e is 1,2,3, f is 1,2,3, k is 0,1.., 10; q represents the number of histograms, r _iDenotes a divided region R ═ R { (R)₁,r₂,...r_nThe ith area;

1214) calculating the similarity of the sizes of the regions, wherein the calculation formula is as follows:

wherein f is_area(r_i,r_j) Represents a divided region r_iAnd r_jArea () represents an area of the region; area (img) represents the area of the picture, r_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ { R ═ R₁,r₂,...r_nJth area;

1215) and fusing the color information similarity, the edge information similarity and the region size similarity, wherein the calculation formula is as follows:

f(r_i,r_j)＝w₁f_color(r_i,r_j)+w₂f_edage(r_i,r_j)+w₃f_area(r_i,r_j)，

wherein, f (r)_i,r_j) Represents a scoreCutting region r_iAnd r_jSimilarity after fusion, w₁、w₂、w₃Weights r respectively representing information similarity, edge information similarity, and region size similarity_iDenotes a divided region R ═ { R ═ R₁,r₂,...r_nI-th area, r_jDenotes a divided region R ═ { R ═ R₁,r₂,...r_nJth area;

122) constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure; the construction of the multi-scale feature fusion network comprises the following steps:

1221) setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;

1222) setting the input of the layer 1 as a training sample, outputting a layer 1 feature map, taking the layer 1 feature map as the input of the layer 2 and outputting a layer 2 feature map, taking the layer 2 feature map as the input of the layer 3, and taking … to the layer n-1 feature map as the input of the layer n;

1223) Connecting the feature map of the layer 1 and the feature map of the layer 2 … of the nth layer with the corresponding deconvolution layer of the layer 1, the deconvolution layer of the second layer and the deconvolution layer of the layer … of the nth layer through 1-1 convolution kernels to generate a multi-scale feature fusion network;

123) training a multi-scale feature fusion network, training by taking a candidate region positioned by a positioning model aiming at a training sample as a feature, and taking an output result of each layer as a prediction result;

the training multi-scale feature fusion network comprises the following steps:

1231) inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;

1232) respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;

1233) inputting the feature map of the layer 1 into the layer 2 of the multi-scale neural network, and outputting the feature map of the layer 2 by the layer 2 of the multi-scale neural network until the feature map of the layer n-1 is input into the nth layer of the multi-scale neural network;

1234) performing deconvolution operation on the n-th layer of feature map to generate an n-th deconvolution layer, and performing deconvolution operation on the n-1-th layer to generate an n-1-th deconvolution layer, so as to obtain a 1-st deconvolution layer;

1235) The characteristic diagrams of the 1 st layer, the 2 nd layer, … to the nth layer are connected with the deconvolution layer of the 1 st layer, the deconvolution layer of the 2 nd layer, … to the nth layer by 1 × 1 convolution kernels;

1236) connecting the feature map of the 1 st layer with the deconvolution layer of the 1 st layer through a 1 x 1 convolution kernel, extracting features of the first layer, and generating a prediction result of the first layer; connecting the feature map of the 2 nd layer with the deconvolution layer of the 2 nd layer through a 1 x 1 convolution kernel, extracting features of a second layer, and generating a prediction result of the second layer; …, generating an nth layer of prediction result until the nth layer of feature graph is connected with the nth layer of deconvolution layer through a 1 x 1 convolution kernel and then the nth layer of feature is extracted;

1237) and performing regression processing on the prediction result of the 1 st layer, the prediction result of the 2 nd layer, … till the result of the nth layer to generate a final prediction result, wherein the regression function is as follows:

wherein C (lambda) is the final prediction result, lambda represents the training parameter, n represents the number of network layers, y represents^(j)Representing true classes, p_λ(x^(j)) Representing the result of the prediction of the j-th layer; x is the number of^(j)A feature vector representing a j-th layer;

1238) obtaining a final score through C (lambda) to predict the coordinates of the category and the position of the category in the graph;

13) acquiring an image to be counted, acquiring a wheat spider image shot in the field, and preprocessing the image to be counted;

14) And (3) obtaining the number of the wheat spiders, namely inputting the image to be counted into a wheat spider detection counting model to obtain the number of the wheat spiders in the image.

2. The method for automatically counting the number of the wheat spiders based on the combination of the multi-scale feature fusion network and the positioning model as claimed in claim 1, wherein the obtaining of the number of the wheat spiders comprises the following steps:

21) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;

22) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.