CN109145770B - Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Info

Publication number
CN109145770B
CN109145770B
Authority
CN
China
Prior art keywords
layer
wheat
similarity
deconvolution
spiders
Prior art date
Legal status
Active
Application number
CN201810863041.1A
Other languages
Chinese (zh)
Other versions
CN109145770A (en)
Inventor
李�瑞
王儒敬
谢成军
张洁
陈天娇
陈红波
胡海瀛
Current Assignee
Hefei Institutes of Physical Science of CAS
Original Assignee
Hefei Institutes of Physical Science of CAS
Priority date
Filing date
Publication date
Application filed by Hefei Institutes of Physical Science of CAS filed Critical Hefei Institutes of Physical Science of CAS
Priority to CN201810863041.1A
Publication of CN109145770A
Application granted
Publication of CN109145770B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for automatically counting wheat spiders based on the combination of a multi-scale feature fusion network and a positioning model, which overcomes the prior art's high error rate when detecting small targets in images. The invention comprises the following steps: establishing training samples; constructing a wheat spider detection counting model; acquiring an image to be counted; and obtaining the number of wheat spiders. The invention achieves direct identification and counting of wheat spiders in the natural field environment.

Description

Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model
Technical Field
The invention relates to the technical field of image recognition, in particular to an automatic counting method for wheat spiders based on combination of a multi-scale feature fusion network and a positioning model.
Background
Wheat is one of the main grain crops in China and is easily damaged by various pests during production. The wheat spider is one of the principal pests: it sucks the sap of wheat leaves and can even cause them to dry up, seriously reducing wheat yield. Monitoring pest population size is an important means of pest control and provides the basis for control decisions. Identifying and counting wheat spiders in the field is therefore important for improving wheat yield.
With the rapid development of computer vision and image processing technology, image-based automatic pest identification and counting has become a research focus in recent years. Although such methods are time-saving, labor-saving, and automated, existing ones cannot be applied to identifying and counting wheat spiders in the field, for three reasons: first, an individual wheat spider is only a few millimeters long, and traditional image recognition techniques such as SVM classifiers have difficulty detecting such small targets; second, unstable and uneven illumination of the outdoor environment degrades image quality at acquisition time; third, in practical applications the acquired images are often mixed with other debris against a complex background.
Therefore, how to detect small targets such as wheat spiders in a complex environment has become an urgent technical problem to be solved.
Disclosure of Invention
The invention aims to overcome the prior art's high error rate in detecting small targets in images, and provides an automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model to solve this problem.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a method for automatically counting wheat spiders based on combination of a multi-scale feature fusion network and a positioning model comprises the following steps:
Establishing training samples, namely acquiring more than 2000 images of wheat spiders in the natural field environment as training images, and marking the wheat spiders in the images to obtain the training samples;
constructing a wheat spider detection counting model;
constructing a positioning model;
constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure;
training the multi-scale feature fusion network, namely training on the features of the candidate regions located by the positioning model for the training samples, and taking the output result of each layer as a prediction result;
acquiring an image to be counted, namely acquiring a wheat spider image shot in the field and preprocessing it to obtain the image to be counted;
and obtaining the number of wheat spiders, namely inputting the image to be counted into the wheat spider detection counting model to obtain the number of wheat spiders in the image.
Constructing the positioning model comprises the following steps:
setting a color space conversion module, which converts the image from the RGB color space to the YCbCr color space and segments it into the set of regions $R = \{r_1, r_2, \ldots, r_n\}$;
calculating the color information similarity: a 25-bin histogram, normalized with the L1 norm, is obtained for each color channel of the image, and the color-space similarity is computed as
$$f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{25} \min\left(c_i^{a,k},\, c_j^{a,k}\right),$$
wherein $f_{color}(r_i, r_j)$ denotes the color-space similarity of the segmented regions $r_i$ and $r_j$, and $c_i^{a,k}$ and $c_j^{a,k}$ denote the $k$-th histogram bin of color channel $a$ ($a = 1, 2, 3$) for the regions $r_i$ and $r_j$ of $R = \{r_1, r_2, \ldots, r_n\}$, respectively;
calculating the edge information similarity: Gaussian derivatives with variance σ = 1 are computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is computed as
$$f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left(e_i^{a,k},\, e_j^{a,k}\right),$$
wherein $f_{edge}(r_i, r_j)$ denotes the edge information similarity of the segmented regions $r_i$ and $r_j$, $e_i^{a,k}$ and $e_j^{a,k}$ denote the $k$-th edge-histogram bin of channel $a$ for the regions $r_i$ and $r_j$, and $q$ denotes the number of edge-histogram bins per channel (8 directions × 10 bins);
calculating the region size similarity:
$$f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)},$$
wherein $f_{area}(r_i, r_j)$ denotes the size similarity of the segmented regions $r_i$ and $r_j$, $area(\cdot)$ denotes the area of a region, and $area(img)$ denotes the area of the picture;
and fusing the color information similarity, the edge information similarity, and the region size similarity:
$$f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j),$$
wherein $f(r_i, r_j)$ denotes the fused similarity of the segmented regions $r_i$ and $r_j$, and $w_1$, $w_2$, $w_3$ denote the weights of the color information similarity, the edge information similarity, and the region size similarity, respectively.
The construction of the multi-scale feature fusion network comprises the following steps:
setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;
setting the input of layer 1 as the training sample and outputting the layer-1 feature map, taking the layer-1 feature map as the input of layer 2 and outputting the layer-2 feature map, taking the layer-2 feature map as the input of layer 3, and so on, until the layer-(n-1) feature map is taken as the input of layer n;
and connecting the layer-1, layer-2, ..., layer-n feature maps with the corresponding layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
The training multi-scale feature fusion network comprises the following steps:
Inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;
respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;
inputting the layer-1 feature map into layer 2 of the multi-scale neural network, which outputs the layer-2 feature map, and so on, until the layer-(n-1) feature map is input into layer n of the multi-scale neural network;
performing a deconvolution operation on the layer-n feature map to generate the layer-n deconvolution layer, then on layer n-1 to generate the layer-(n-1) deconvolution layer, and so on down to the layer-1 deconvolution layer;
connecting the layer-1, layer-2, ..., layer-n feature maps with the layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels;
connecting the layer-1 feature map with the layer-1 deconvolution layer through a 1 × 1 convolution kernel, extracting the first-layer features, and generating the first-layer prediction result; connecting the layer-2 feature map with the layer-2 deconvolution layer through a 1 × 1 convolution kernel, extracting the second-layer features, and generating the second-layer prediction result; and so on, until the layer-n feature map is connected with the layer-n deconvolution layer through a 1 × 1 convolution kernel, the layer-n features are extracted, and the layer-n prediction result is generated;
performing regression processing on the layer-1 prediction result, the layer-2 prediction result, and so on through the layer-n prediction result to generate the final prediction result, with the regression function:
$$C(\lambda) = \frac{1}{n}\sum_{j=1}^{n}\left(y^{(j)} - p_{\lambda}\left(x^{(j)}\right)\right)^{2}$$
wherein C(λ) yields the final prediction result, λ denotes the training parameters, n denotes the number of network layers, $y^{(j)}$ denotes the true class, $p_{\lambda}(x^{(j)})$ denotes the prediction result of the j-th layer, and $x^{(j)}$ denotes the feature vector of the j-th layer;
and obtaining a final score through C(λ) to predict the category and the coordinates of its position in the image.
The method for acquiring the number of the wheat spiders comprises the following steps:
inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.
Advantageous effects
Compared with the prior art, the automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model achieves direct identification and counting of wheat spiders in the natural field environment.
The invention eliminates the influence of illumination on detection and counting through preprocessing and simplifies the complex background; a positioning model then locates candidate regions of suspected wheat spiders; the multi-scale feature fusion network extracts features from the candidate regions, and the wheat spider regions are finally determined through regression over the multiple prediction results. Locating the candidate regions greatly reduces feature extraction time and feature dimensionality and improves real-time counting performance; meanwhile, the regression fusion of multiple prediction results ensures that wheat spiders of all scales are accurately detected, improving the robustness and accuracy of automatic detection and counting.
Drawings
FIG. 1 is a sequence diagram of the method of the present invention;
FIG. 2a is a diagram illustrating the detection result of training samples by using the conventional SVM technique in the prior art;
FIG. 2b is a graph showing the results of detection by the method of the present invention;
FIG. 3 is a schematic diagram of a multi-scale feature fusion network structure according to the present invention.
Detailed Description
So that the above-recited features of the present invention can be readily understood, a more particular description of the invention, briefly summarized above, is given by reference to embodiments, some of which are illustrated in the appended drawings:
as shown in fig. 1, the automatic wheat spider counting method based on the combination of the multi-scale feature fusion network and the positioning model includes the following steps:
In the first step, training samples are established. More than 2000 images of wheat spiders in the natural field environment are acquired as training images, and the wheat spiders in the images are marked to obtain the training samples.
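For illustration only, a minimal Python sketch of loading such annotations into memory; the patent specifies no storage format, so the CSV layout, file name, and field order here are assumptions:

```python
import csv

def load_annotations(csv_path):
    """Read hand-marked wheat spider boxes grouped by image.

    Assumes one `image,x1,y1,x2,y2` row per marked spider; this layout is
    an illustrative assumption, not a format given by the patent.
    """
    samples = {}
    with open(csv_path, newline='') as f:
        for image, *coords in csv.reader(f):
            samples.setdefault(image, []).append(tuple(map(int, coords)))
    return samples
```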
In the second step, the wheat spider detection counting model is constructed. The positioning model and the multi-scale feature fusion network are constructed; the positioning model extracts candidate regions from the training samples, the multi-scale fusion network extracts features of the candidate regions, and the candidate regions are then classified, discarding those that are not wheat spider regions.
First, the positioning model is constructed. In order to reduce feature extraction time, reduce the dimensionality of the feature vectors, and enhance the real-time performance of automatic counting, the positioning model is first used to locate candidate regions of wheat spiders, and feature extraction is then performed on these candidate regions.
The method comprises the following steps:
(1) Setting a color space conversion module, which converts the image from the RGB color space to the YCbCr color space and segments it into the set of regions $R = \{r_1, r_2, \ldots, r_n\}$.
(2) Calculating the color information similarity. A 25-bin histogram, normalized with the L1 norm, is obtained for each color channel of the image, and the color-space similarity is computed as
$$f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{25} \min\left(c_i^{a,k},\, c_j^{a,k}\right),$$
wherein $f_{color}(r_i, r_j)$ denotes the color-space similarity of the segmented regions $r_i$ and $r_j$, and $c_i^{a,k}$ and $c_j^{a,k}$ denote the $k$-th histogram bin of color channel $a$ ($a = 1, 2, 3$) for the regions $r_i$ and $r_j$ of $R = \{r_1, r_2, \ldots, r_n\}$, respectively.
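As a concrete illustration of step (2), a minimal NumPy sketch of the per-channel 25-bin L1-normalized histograms and their intersection; the sum-of-minima form follows the selective-search formulation of the cited Uijlings et al. reference, and the function and argument names are illustrative assumptions:

```python
import numpy as np

def color_similarity(region_a, region_b, bins=25):
    """Histogram-intersection color similarity of two segmented regions.

    region_a, region_b: (N, 3) arrays of YCbCr pixel values; an
    illustrative helper, not reference code from the patent.
    """
    def l1_histogram(region):
        # One 25-bin histogram per color channel, L1-normalized.
        hists = [np.histogram(region[:, c], bins=bins, range=(0, 255))[0]
                 for c in range(3)]
        h = np.concatenate(hists).astype(np.float64)
        return h / h.sum()

    ha, hb = l1_histogram(region_a), l1_histogram(region_b)
    return np.minimum(ha, hb).sum()   # sum of per-bin minima
```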
(3) Calculating the edge information similarity. Gaussian derivatives with variance σ = 1 are computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is computed as
$$f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left(e_i^{a,k},\, e_j^{a,k}\right),$$
wherein $f_{edge}(r_i, r_j)$ denotes the edge information similarity of the segmented regions $r_i$ and $r_j$, $e_i^{a,k}$ and $e_j^{a,k}$ denote the $k$-th edge-histogram bin of channel $a$ for the regions $r_i$ and $r_j$ of $R = \{r_1, r_2, \ldots, r_n\}$, and $q$ denotes the number of edge-histogram bins per channel (8 directions × 10 bins).
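Step (3) can be sketched in the same way; below, SciPy's Gaussian filtering stands in for the directional derivatives (σ = 1, 8 directions, 10 bins). This is one plausible reading of the text rather than the patent's exact procedure:

```python
import numpy as np
from scipy import ndimage

def edge_similarity(crop_a, crop_b, directions=8, bins=10, sigma=1.0):
    """Edge-histogram similarity of two regions given as H x W x 3 crops."""
    def edge_histogram(crop):
        feats = []
        for c in range(3):                            # each color channel
            ch = crop[..., c].astype(np.float64)
            gy = ndimage.gaussian_filter(ch, sigma, order=(1, 0))
            gx = ndimage.gaussian_filter(ch, sigma, order=(0, 1))
            for k in range(directions):               # 8 steered directions
                theta = k * np.pi / directions
                resp = np.cos(theta) * gx + np.sin(theta) * gy
                feats.append(np.histogram(resp, bins=bins)[0])
        h = np.concatenate(feats).astype(np.float64)
        return h / h.sum()                            # L1 normalization

    ha, hb = edge_histogram(crop_a), edge_histogram(crop_b)
    return np.minimum(ha, hb).sum()
```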
(4) Calculating the region size similarity:
$$f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)},$$
wherein $f_{area}(r_i, r_j)$ denotes the size similarity of the segmented regions $r_i$ and $r_j$, $area(\cdot)$ denotes the area of a region, and $area(img)$ denotes the area of the picture.
(5) Fusing the color information similarity, the edge information similarity, and the region size similarity:
$$f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j),$$
wherein $f(r_i, r_j)$ denotes the fused similarity of the segmented regions $r_i$ and $r_j$, and $w_1$, $w_2$, $w_3$ denote the weights of the color information similarity, the edge information similarity, and the region size similarity, respectively.
Through the combination of the color information similarity, the edge information similarity, and the region size similarity, the most similar regions $r_i$ and $r_j$ are merged step by step, continuously generating n regions, which serve as the candidate regions of the wheat spider.
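Taken together, steps (1) through (5) drive a greedy, selective-search-style merging loop. The sketch below assumes a `merge` helper that combines two regions' pixels and takes the fused f(r_i, r_j) as the `similarity` callable; all names are illustrative:

```python
import itertools

def propose_candidates(regions, similarity, merge):
    """Greedily merge the most similar region pair until one region remains.

    regions:    initial segmentation R = {r_1, ..., r_n}
    similarity: callable(r_i, r_j) returning the fused f(r_i, r_j)
    merge:      callable(r_i, r_j) returning the merged region (assumed)
    Every region produced along the way is kept as a candidate.
    """
    candidates = list(regions)
    active = list(regions)
    while len(active) > 1:
        i, j = max(itertools.combinations(range(len(active)), 2),
                   key=lambda p: similarity(active[p[0]], active[p[1]]))
        merged = merge(active[i], active[j])
        active = [r for k, r in enumerate(active) if k not in (i, j)]
        active.append(merged)
        candidates.append(merged)
    return candidates
```

Recomputing every pairwise similarity each round is quadratic per step; a production version would cache the similarities of neighbouring pairs, but the loop above shows where the fused measure enters.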
Second, the multi-scale feature fusion network is constructed and its structure is transformed. In order to better extract the features of wheat spiders at every scale and in their various forms, the multi-scale feature fusion network is designed to accurately pick out the true wheat spider regions from among the candidate regions.
As shown in FIG. 3, the multi-scale feature fusion network exploits the inherent multi-scale, pyramidal hierarchy of the feature maps to construct a multi-scale feature network at marginal extra cost, and develops a top-down architecture with lateral connections to build high-level semantic feature maps at all scales. The method comprises the following steps:
(1) and setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer.
(2) Setting the input of layer 1 as the training sample and outputting the layer-1 feature map, taking the layer-1 feature map as the input of layer 2 and outputting the layer-2 feature map, taking the layer-2 feature map as the input of layer 3, and so on, until the layer-(n-1) feature map is taken as the input of layer n.
(3) Connecting the layer-1, layer-2, ..., layer-n feature maps with the corresponding layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
Feature maps are generated by downsampling: each training picture is taken as input, features are extracted with the multi-scale neural network, and each layer of the network produces its feature map by downsampling.
Then the topmost feature map is deconvolved to generate a feature map of the previous layer's size, iterating in turn until a map of the second layer's size is generated. Because each layer of the multi-scale network downsamples, the feature maps become smaller and smaller, and a wheat spider may shrink to only a few pixels, which strongly degrades detection and counting. To avoid this, a deconvolution operation is applied to each pyramid level, upsampling the feature map back to the size of the level above, so that pest features can be extracted effectively and the apparent size of the wheat spiders in the feature maps is preserved.
Finally, the per-layer feature maps generated by deconvolution are connected through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
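A minimal PyTorch sketch of this structure, assuming a 4-level bottom-up backbone, stride-2 deconvolutions for the top-down path, and 1 × 1 lateral connections; the channel counts and the requirement that input dimensions be divisible by 16 are assumptions, not the patent's exact network:

```python
import torch
import torch.nn as nn

class MultiScaleFusionNet(nn.Module):
    """FPN-style sketch of the top-down, laterally connected network."""

    def __init__(self, channels=(64, 128, 256, 512), out_ch=256):
        super().__init__()
        downs, prev = [], 3
        for ch in channels:                       # bottom-up, downsampling
            downs.append(nn.Sequential(
                nn.Conv2d(prev, ch, 3, stride=2, padding=1), nn.ReLU()))
            prev = ch
        self.downs = nn.ModuleList(downs)
        # 1 x 1 lateral connections from each feature map.
        self.laterals = nn.ModuleList(nn.Conv2d(ch, out_ch, 1)
                                      for ch in channels)
        # Deconvolutions double the spatial size back to the layer above.
        self.deconvs = nn.ModuleList(
            nn.ConvTranspose2d(out_ch, out_ch, 2, stride=2)
            for _ in channels[:-1])

    def forward(self, x):
        feats = []
        for down in self.downs:                   # layer-1 ... layer-n maps
            x = down(x)
            feats.append(x)
        fused = [self.laterals[-1](feats[-1])]    # start from topmost layer
        for idx in range(len(feats) - 2, -1, -1): # top-down fusion
            up = self.deconvs[idx](fused[-1])
            fused.append(self.laterals[idx](feats[idx]) + up)
        return fused[::-1]                        # finest-to-coarsest maps

# e.g. four fused maps from a 224 x 224 crop:
# maps = MultiScaleFusionNet()(torch.randn(1, 3, 224, 224))
```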
And finally, the multi-scale feature fusion network is trained. The candidate regions located by the positioning model for the training samples are used as the features for training, and the output result of each layer is taken as a prediction result. The specific steps are as follows:
(1) And inputting the training samples into a positioning model, and positioning the candidate regions of the training samples by the positioning model.
(2) And respectively inputting the candidate regions of the training sample into the layer 1 of the multi-scale neural network, and outputting a layer 1 characteristic diagram by the layer 1 of the multi-scale neural network.
(3) Inputting the layer-1 feature map into layer 2 of the multi-scale neural network, which outputs the layer-2 feature map, and so on, until the layer-(n-1) feature map is input into layer n of the multi-scale neural network.
(4) Performing a deconvolution operation on the layer-n feature map to generate the layer-n deconvolution layer, then on layer n-1 to generate the layer-(n-1) deconvolution layer, and so on down to the layer-1 deconvolution layer.
(5) Connecting the layer-1, layer-2, ..., layer-n feature maps with the layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels.
(6) Connecting the layer-1 feature map with the layer-1 deconvolution layer through a 1 × 1 convolution kernel, extracting the first-layer features, and generating the first-layer prediction result; connecting the layer-2 feature map with the layer-2 deconvolution layer through a 1 × 1 convolution kernel, extracting the second-layer features, and generating the second-layer prediction result; and so on, until the layer-n feature map is connected with the layer-n deconvolution layer through a 1 × 1 convolution kernel, the layer-n features are extracted, and the layer-n prediction result is generated.
(7) Performing regression processing on the layer-1 prediction result, the layer-2 prediction result, and so on through the layer-n prediction result to generate the final prediction result, with the regression function:
$$C(\lambda) = \frac{1}{n}\sum_{j=1}^{n}\left(y^{(j)} - p_{\lambda}\left(x^{(j)}\right)\right)^{2}$$
wherein C(λ) yields the final prediction result, λ denotes the training parameters, n denotes the number of network layers, $y^{(j)}$ denotes the true class, $p_{\lambda}(x^{(j)})$ denotes the prediction result of the j-th layer, and $x^{(j)}$ denotes the feature vector of the j-th layer.
(8) The final score obtained through C(λ) predicts the category and the coordinates of its position in the image; the coordinates give the position of the wheat spider in the image.
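The per-layer scores can then be fused by an ordinary regression. Since the patent states no closed form for C(λ), the least-squares fit below is a hedged stand-in that only matches the quantities named above (y^(j) true labels, p_λ(x^(j)) per-layer predictions):

```python
import numpy as np

def fit_fusion_weights(layer_preds, labels):
    """Fit lambda so the weighted sum of layer predictions matches labels.

    layer_preds: (n_layers, n_samples) per-layer scores p(x^(j))
    labels:      (n_samples,) ground-truth y
    Ordinary least squares is an illustrative choice of 'regression
    processing', not the patent's stated method.
    """
    P = np.asarray(layer_preds, dtype=np.float64)
    lam, *_ = np.linalg.lstsq(P.T, np.asarray(labels, float), rcond=None)
    return lam

def fuse_predictions(layer_preds, lam):
    """Final fused score for each sample under the fitted weights."""
    return lam @ np.asarray(layer_preds, dtype=np.float64)
```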
In the third step, the image to be counted is acquired. A wheat spider image shot in the field is acquired and preprocessed to obtain the image to be counted.
In the fourth step, the number of wheat spiders is obtained. The image to be counted is input into the wheat spider detection counting model to obtain the number of wheat spiders in the image. The specific steps are as follows:
(1) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
(2) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, counting the number of the wheat spiders, and obtaining the number of the wheat spiders in the image.
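Steps (1) and (2) amount to the short pipeline below; `positioning_model`, `fusion_net`, `crop`, and the 0.5 threshold are illustrative stand-ins, since the patent names the components but not an API:

```python
def count_wheat_spiders(image, positioning_model, fusion_net,
                        crop, threshold=0.5):
    """Locate candidate regions, classify each, and count the positives."""
    count = 0
    for box in positioning_model(image):        # step (1): candidate boxes
        score = fusion_net(crop(image, box))    # step (2): classify region
        if score >= threshold:                  # accepted as a wheat spider
            count += 1
    return count
```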
As shown in fig. 2a, which gives the wheat spider detection results of the SVM algorithm, the regions detected by the small boxes are far larger than the spiders themselves, and the large box in the middle of fig. 2a wrongly encloses several closely spaced wheat spiders in a single box. The cause of this false detection is that the traditional SVM algorithm performs no prior localization; locating candidate regions with a positioning model avoids this phenomenon. The oversized small-box detections arise because the traditional SVM algorithm does not use regression fusion of multiple prediction results; some of the small boxes in fig. 2a are also misidentifications.
As shown in fig. 2b, compared with the conventional SVM algorithm, the method of the present invention can accurately locate the number and specific positions of the wheat spiders, and has high robustness and accuracy.
The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which merely illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (2)

1. A wheat spider automatic counting method based on combination of a multi-scale feature fusion network and a positioning model is characterized by comprising the following steps:
11) establishing training samples, namely acquiring more than 2000 images of wheat spiders in the natural field environment as training images, and marking the wheat spiders in the images to obtain the training samples;
12) constructing a wheat spider detection counting model;
121) constructing a positioning model; constructing the positioning model comprises the following steps:
1211) setting a color space conversion module, which converts the image from the RGB color space to the YCbCr color space and segments it into the set of regions $R = \{r_1, r_2, \ldots, r_n\}$;
1212) calculating the color information similarity: a 25-bin histogram, normalized with the L1 norm, is obtained for each color channel of the image, and the color-space similarity is computed as
$$f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{25} \min\left(c_i^{a,k},\, c_j^{a,k}\right),$$
wherein $f_{color}(r_i, r_j)$ denotes the color-space similarity of the segmented regions $r_i$ and $r_j$, and $c_i^{a,k}$ and $c_j^{a,k}$ denote the $k$-th histogram bin of color channel $a$, $a = 1, 2, 3$, for the regions $r_i$ and $r_j$ of $R = \{r_1, r_2, \ldots, r_n\}$, respectively;
1213) calculating the edge information similarity: Gaussian derivatives with variance σ = 1 are computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is computed as
$$f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left(e_i^{a,k},\, e_j^{a,k}\right),$$
wherein $f_{edge}(r_i, r_j)$ denotes the edge information similarity of the segmented regions $r_i$ and $r_j$, $e_i^{a,k}$ and $e_j^{a,k}$ denote the $k$-th edge-histogram bin of channel $a$ for the regions $r_i$ and $r_j$, and $q$ denotes the number of edge-histogram bins per channel;
1214) calculating the region size similarity:
$$f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)},$$
wherein $f_{area}(r_i, r_j)$ denotes the size similarity of the segmented regions $r_i$ and $r_j$, $area(\cdot)$ denotes the area of a region, and $area(img)$ denotes the area of the picture;
1215) fusing the color information similarity, the edge information similarity, and the region size similarity:
$$f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j),$$
wherein $f(r_i, r_j)$ denotes the fused similarity of the segmented regions $r_i$ and $r_j$, and $w_1$, $w_2$, $w_3$ denote the weights of the color information similarity, the edge information similarity, and the region size similarity, respectively;
122) constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure; the construction of the multi-scale feature fusion network comprises the following steps:
1221) setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;
1222) setting the input of layer 1 as the training sample and outputting the layer-1 feature map, taking the layer-1 feature map as the input of layer 2 and outputting the layer-2 feature map, taking the layer-2 feature map as the input of layer 3, and so on, until the layer-(n-1) feature map is taken as the input of layer n;
1223) connecting the layer-1, layer-2, ..., layer-n feature maps with the corresponding layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network;
123) training the multi-scale feature fusion network, namely training on the features of the candidate regions located by the positioning model for the training samples, and taking the output result of each layer as a prediction result;
the training multi-scale feature fusion network comprises the following steps:
1231) inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;
1232) respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;
1233) inputting the feature map of the layer 1 into the layer 2 of the multi-scale neural network, and outputting the feature map of the layer 2 by the layer 2 of the multi-scale neural network until the feature map of the layer n-1 is input into the nth layer of the multi-scale neural network;
1234) performing a deconvolution operation on the layer-n feature map to generate the layer-n deconvolution layer, then on layer n-1 to generate the layer-(n-1) deconvolution layer, and so on down to the layer-1 deconvolution layer;
1235) connecting the layer-1, layer-2, ..., layer-n feature maps with the layer-1, layer-2, ..., layer-n deconvolution layers through 1 × 1 convolution kernels;
1236) connecting the layer-1 feature map with the layer-1 deconvolution layer through a 1 × 1 convolution kernel, extracting the first-layer features, and generating the first-layer prediction result; connecting the layer-2 feature map with the layer-2 deconvolution layer through a 1 × 1 convolution kernel, extracting the second-layer features, and generating the second-layer prediction result; and so on, until the layer-n feature map is connected with the layer-n deconvolution layer through a 1 × 1 convolution kernel, the layer-n features are extracted, and the layer-n prediction result is generated;
1237) performing regression processing on the layer-1 prediction result, the layer-2 prediction result, and so on through the layer-n prediction result to generate the final prediction result, with the regression function
$$C(\lambda) = \frac{1}{n}\sum_{j=1}^{n}\left(y^{(j)} - p_{\lambda}\left(x^{(j)}\right)\right)^{2},$$
wherein C(λ) yields the final prediction result, λ denotes the training parameters, n denotes the number of network layers, $y^{(j)}$ denotes the true class, $p_{\lambda}(x^{(j)})$ denotes the prediction result of the j-th layer, and $x^{(j)}$ denotes the feature vector of the j-th layer;
1238) obtaining a final score through C(λ) to predict the category and the coordinates of its position in the image;
13) acquiring an image to be counted, namely acquiring a wheat spider image shot in the field and preprocessing it to obtain the image to be counted;
14) obtaining the number of wheat spiders, namely inputting the image to be counted into the wheat spider detection counting model to obtain the number of wheat spiders in the image.
2. The method for automatically counting the number of the wheat spiders based on the combination of the multi-scale feature fusion network and the positioning model as claimed in claim 1, wherein the obtaining of the number of the wheat spiders comprises the following steps:
21) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
22) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.
CN201810863041.1A 2018-08-01 2018-08-01 Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model Active CN109145770B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810863041.1A CN109145770B (en) 2018-08-01 2018-08-01 Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810863041.1A CN109145770B (en) 2018-08-01 2018-08-01 Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Publications (2)

Publication Number Publication Date
CN109145770A CN109145770A (en) 2019-01-04
CN109145770B (en) 2022-07-15

Family

ID=64798885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810863041.1A Active CN109145770B (en) 2018-08-01 2018-08-01 Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model

Country Status (1)

Country Link
CN (1) CN109145770B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428413B (en) * 2019-08-02 2021-09-28 中国科学院合肥物质科学研究院 Spodoptera frugiperda imago image detection method used under lamp-induced device
CN110689081B (en) * 2019-09-30 2020-08-21 中国科学院大学 Weak supervision target classification and positioning method based on bifurcation learning
CN112651462A (en) * 2021-01-04 2021-04-13 楚科云(武汉)科技发展有限公司 Spider classification method and device and classification model construction method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850836A (en) * 2015-05-15 2015-08-19 浙江大学 Automatic insect image identification method based on depth convolutional neural network
CN106845401A (en) * 2017-01-20 2017-06-13 中国科学院合肥物质科学研究院 A kind of insect image-recognizing method based on many spatial convoluted neutral nets
CN107016680A (en) * 2017-02-24 2017-08-04 中国科学院合肥物质科学研究院 A kind of insect image background minimizing technology detected based on conspicuousness
CN107133943A (en) * 2017-04-26 2017-09-05 贵州电网有限责任公司输电运行检修分公司 A kind of visible detection method of stockbridge damper defects detection
CN107292314A (en) * 2016-03-30 2017-10-24 浙江工商大学 A kind of lepidopterous insects species automatic identification method based on CNN
CN107346424A (en) * 2017-06-30 2017-11-14 成都东谷利农农业科技有限公司 Lamp lures insect identification method of counting and system
CN107808116A (en) * 2017-09-28 2018-03-16 中国科学院合肥物质科学研究院 A kind of wheat spider detection method based on the fusion study of depth multilayer feature
KR20180053003A (en) * 2016-11-11 2018-05-21 전북대학교산학협력단 Method and apparatus for detection and diagnosis of plant diseases and insects using deep learning

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016197303A1 (en) * 2015-06-08 2016-12-15 Microsoft Technology Licensing, Llc. Image semantic segmentation
US10354159B2 (en) * 2016-09-06 2019-07-16 Carnegie Mellon University Methods and software for detecting objects in an image using a contextual multiscale fast region-based convolutional neural network
US10262237B2 (en) * 2016-12-08 2019-04-16 Intel Corporation Technologies for improved object detection accuracy with multi-scale representation and training
CN107016405B (en) * 2017-02-24 2019-08-30 中国科学院合肥物质科学研究院 A kind of pest image classification method based on classification prediction convolutional neural networks
CN107368787B (en) * 2017-06-16 2020-11-10 长安大学 Traffic sign identification method for deep intelligent driving application
CN108062531B (en) * 2017-12-25 2021-10-19 南京信息工程大学 Video target detection method based on cascade regression convolutional neural network
CN108256481A (en) * 2018-01-18 2018-07-06 中科视拓(北京)科技有限公司 A kind of pedestrian head detection method using body context

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Robust object tracking via multi-scale patch based sparse coding histogram; Zhong W. et al.; Proc. IEEE Conf. Comput. Vision Pattern Recognit.; 2012-12-31; pp. 1838-1845 *
Selective Search for Object Recognition; J.R.R. Uijlings et al.; International Journal of Computer Vision; 2013-09-30; section 3 *
Farmland pest image recognition based on a sparse-coding pyramid model; Xie Chengjun et al.; Transactions of the Chinese Society of Agricultural Engineering; 2016-09-30; vol. 32, no. 7, pp. 144-151 *
Pest image recognition by multi-feature fusion based on sparse representation; Hu Yongqiang et al.; Pattern Recognition and Artificial Intelligence; 2014-11-30; vol. 27, no. 11, pp. 985-992 *
Detection models in deep learning: FPN; leo_whz; https://blog.csdn.net/whz1861/article/details/79042283; 2018-01-12; pp. 1-3 *

Also Published As

Publication number Publication date
CN109145770A (en) 2019-01-04

Similar Documents

Publication Title
CN110770752A (en) Automatic pest counting method combining multi-scale feature fusion network with positioning model
CN109344701B (en) Kinect-based dynamic gesture recognition method
Li et al. SAR image change detection using PCANet guided by saliency detection
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN109154978B (en) System and method for detecting plant diseases
CN108009559B (en) Hyperspectral data classification method based on space-spectrum combined information
US8908919B2 (en) Tactical object finder
CN111753828B (en) Natural scene horizontal character detection method based on deep convolutional neural network
Li et al. A coarse-to-fine network for aphid recognition and detection in the field
CN109684922B (en) Multi-model finished dish identification method based on convolutional neural network
CN110766041B (en) Deep learning-based pest detection method
CN108960404B (en) Image-based crowd counting method and device
WO2022028031A1 (en) Contour shape recognition method
CN112200121B (en) Hyperspectral unknown target detection method based on EVM and deep learning
CN109145770B (en) Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model
CN112733614B (en) Pest image detection method with similar size enhanced identification
CN109801305B (en) SAR image change detection method based on deep capsule network
CN114897816A (en) Mask R-CNN mineral particle identification and particle size detection method based on improved Mask
CN113221956B (en) Target identification method and device based on improved multi-scale depth model
Trivedi et al. Automatic segmentation of plant leaves disease using min-max hue histogram and k-mean clustering
CN112464983A (en) Small sample learning method for apple tree leaf disease image classification
CN111882554B (en) SK-YOLOv 3-based intelligent power line fault detection method
CN111639697B (en) Hyperspectral image classification method based on non-repeated sampling and prototype network
CN116596875A (en) Wafer defect detection method and device, electronic equipment and storage medium
CN108960005B (en) Method and system for establishing and displaying object visual label in intelligent visual Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant