CN113379738A - Method and system for detecting and positioning epidemic trees based on images - Google Patents
- Publication number: CN113379738A
- Application number: CN202110821397.0A
- Authority: CN (China)
- Prior art keywords: point, image, target, rpn, network
- Legal status: Pending (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06T7/0002 — Image analysis; inspection of images, e.g. flaw detection
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G06T7/70 — Image analysis; determining position or orientation of objects or cameras
- G06T2207/10016 — Image acquisition modality: video; image sequence
- G06T2207/20081 — Special algorithmic details: training; learning
Abstract
The invention provides an image-based method and system for detecting and positioning epidemic trees (pine trees infected by pine wilt disease). A multi-scale candidate region fusion network is designed, in which the feature map produced by ResNet18 feeds the RPN that detects small targets (S-RPN), the feature map produced by ResNet32 feeds the RPN that detects medium targets (M-RPN), and the feature map produced by ResNet50 feeds the RPN that detects large targets (L-RPN). Non-maximum suppression (NMS) is then applied to remove redundant candidate regions. This network addresses the difficulty that existing deep-neural-network target detection models have with multi-scale detection of epidemic trees in pine forests. The invention further constructs a geometric model for three-dimensional positioning of epidemic trees from the camera imaging principle of the pan-tilt monitoring tower and digital elevation model (DEM) data, which accurately locates an epidemic tree in three dimensions. The method is simple and fast, and improves the monitoring and early-warning capability for epidemic trees in pine forests.
Description
Technical Field
The invention belongs to the technical field of image detection and positioning, and particularly relates to a deep-learning-based epidemic tree detection network with multi-scale candidate region fusion and a geometric model for three-dimensional positioning of epidemic trees.
Background
Pine wilt disease (pine wood nematode disease) is one of the pine diseases most damaging to forestry in China and is known as the "cancer of pine trees". Only 3-5 years may pass from the infection of a single pine tree to the death of an entire pine forest; an infected pine is called an epidemic tree. Detecting epidemic trees is therefore the first task of pine forest protection. The most notable feature distinguishing an infected pine from a healthy one is that the needles of the entire crown turn yellow-brown or reddish-brown, whereas the crown needles of a healthy pine are green. Existing methods for detecting epidemic trees in pine forests mainly comprise: 1) ground survey, in which forest protection personnel visit the site, record the coordinate positions of individual epidemic trees and collect samples; this has high accuracy and reliability but high labor and time costs and low efficiency; 2) satellite remote sensing, through which current forestry monitoring mostly covers large pine forest areas; because the satellite revisit period is long, imaging is affected by cloud and fog, and the spatial resolution of remote sensing images is limited, monitoring individual epidemic trees is currently difficult; 3) unmanned aerial vehicle (UAV) detection, which is limited by the UAV's endurance and cannot monitor a large pine forest area continuously in real time; 4) pan-tilt observation tower monitoring, in which a camera mounted at a high point of a pan-tilt observation tower shoots the pine forest at low altitude near the ground; the captured video has high spatial resolution, and single RGB frames extracted from it can be detected and analyzed, making this currently an effective method for detecting epidemic trees in pine forests.
With the great progress of deep learning target detection, image-based methods can detect targets of interest through deep neural network models. Convolutional neural networks (CNNs) can automatically extract features, especially high-level semantic features, and have achieved good results in remote sensing target detection. However, when the camera of the pan-tilt observation tower photographs a pine forest, the scene mixes distant and near views, and the size and position of a pine crown in the high-resolution image vary greatly with its distance from the pan-tilt. A conventional CNN-based detection network uses a single candidate box scale, so its results degrade when epidemic trees of different sizes must be detected. Moreover, pixel coordinates of epidemic trees detected only at the image level are of little use for monitoring; the spatial geographic coordinates of the epidemic trees relative to the mountain terrain must be determined accurately before detection can actually support the discovery and protection of epidemic trees.
Disclosure of Invention
The invention aims to provide an image-based epidemic tree detection and positioning method for pan-tilt observation tower imagery, solving two problems: existing deep-neural-network target detection models struggle with multi-scale detection of epidemic trees in pine forests, and they cannot accurately obtain the geographic position of an epidemic tree.
The technical scheme of the invention is as follows:
An image-based epidemic tree detection and positioning method comprises the following steps:
step 1: video preprocessing: and extracting a single-frame image from a video shot by the holder observation tower, and then carrying out image preprocessing and labeling to manufacture a data set for network model training, verification and testing.
Step 2: and (3) constructing a multi-scale candidate region fusion network, and training the model by the data set obtained in the step (1).
Step 3: input the image under test into the trained multi-scale candidate region fusion detection network to obtain the detection result, then calculate the specific geographic three-dimensional coordinates of the epidemic tree from the three-dimensional positioning geometric model.
Further, step 2 comprises the following substeps:
Step 2-1: first preprocess the input image of the multi-scale candidate region fusion network, then scale it to a fixed 224 × 224 pixels, and finally feed the 224 × 224 image together with its enlarged and reduced versions as the inputs of the network.
Step 2-2: the 224 × 224 image is input to a convolutional neural network (ResNet32), the 2× enlarged image to a ResNet18, and the 2× reduced image to a ResNet50, yielding the corresponding feature maps.
Step 2-3: obtain epidemic tree candidate regions from the feature maps. The region proposal network (RPN) that detects small targets uses the feature map from ResNet18 and is called S-RPN; the RPN that detects medium targets uses the feature map from ResNet32 (M-RPN); the RPN that detects large targets uses the feature map from ResNet50 (L-RPN). Since the regions generated by the three RPNs may overlap, non-maximum suppression (NMS) is applied to remove redundant candidate regions.
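The NMS step described above can be sketched as follows. This is only an illustration of greedy non-maximum suppression over the pooled S-RPN/M-RPN/L-RPN proposals, not the patent's code; the function names are mine.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(proposals, thresh=0.5):
    """Greedy NMS over (box, score) proposals pooled from the three RPNs:
    keep the highest-scoring box, drop every remaining box that overlaps
    it by more than `thresh`, and repeat."""
    order = sorted(proposals, key=lambda p: p[1], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [p for p in order if iou(best[0], p[0]) < thresh]
    return keep
```

Because the three RPNs propose boxes for the same trees at different scales, pooling their outputs before a single NMS pass is what removes the cross-scale duplicates.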
Step 2-4: train the multi-scale candidate region fusion detection network on a deep learning machine with the data set obtained in step 1, adjusting parameters repeatedly until the network converges, then save the trained network parameters.
Further, step 3 comprises the following substeps:
Step 3-1: first obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm of step 2, and introduce a regular-grid digital elevation model (DEM) to obtain the terrain, i.e. the three-dimensional coordinates of the pine forest.
Step 3-2: establish a geometric relation model from the digital elevation model (DEM) and the imaging principle of the camera. Its core is the pixel coordinates of the target on the image, determined by the detection algorithm; from the proportional relation between the distances of a point to the boundaries of the reference window, solve for the corresponding point A(x_A, y_A, z_A) of the target on the reference window.
Step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) on the line of sight from viewpoint C toward point P′.
Step 3-4: from the division-point coordinates obtained in step 3-3, interpolate the terrain elevation at the k-th division point from the elevations of the four surrounding grid nodes by distance weighting:

Z_Nk = ( Σ_{i=1}^{n} Z_i / d_i ) / ( Σ_{i=1}^{n} 1 / d_i )

where n = 4, Z_i is the elevation of the i-th grid node surrounding the k-th division point, and d_i is the distance from that grid node to the interpolation point.
Step 3-5: scan all division points along the camera's line of sight CP′ and compare the DEM elevation with the corresponding line-of-sight elevation: if the DEM elevation at a point is below the elevation of the corresponding division point on the sight line, move to the next point; stop at the first point whose DEM elevation exceeds the division-point elevation and return the coordinates of that obstacle point, which are the three-dimensional coordinates (x_B, y_B, z_B) of the epidemic tree target B.
In the invention, detecting and positioning epidemic trees in pan-tilt observation tower video proceeds in three stages: first, the pan-tilt observation tower video data set is used to train the constructed multi-scale candidate region fusion network, applying a learning-rate fine-tuning strategy during training to improve detection accuracy; then the test image is fed to the trained network to obtain the two-dimensional coordinates of the epidemic tree target in the image; finally, the specific three-dimensional geographic coordinates of the target are obtained from its two-dimensional image coordinates and the three-dimensional positioning geometric model.
The invention has the beneficial effects that:
Aiming at video shot by a pan-tilt observation tower over a pine forest, the invention provides a multi-scale candidate region fusion network for epidemic tree detection. By analysing the relation between the camera imaging principle and the relevant camera parameters, and combining them with the characteristics of a digital elevation model (DEM), a three-dimensional positioning method for epidemic trees in pine forests is given. The invention overcomes the difficulty that existing deep-neural-network detection models have with multi-scale detection of epidemic trees, and locates the geographic position of the epidemic tree. The method is simple and fast, and improves the monitoring and early-warning capability for epidemic trees in pine forests.
Drawings
FIG. 1 is a framework diagram of the multi-scale candidate region fusion network of the present invention;
FIG. 2 is the geometric model for three-dimensional positioning of the epidemic tree.
Detailed Description
The following describes in further detail a specific embodiment of the present invention with reference to fig. 1 and 2.
As shown in fig. 1, the image-based epidemic tree detection and positioning method proceeds as follows:
Step 1: video preprocessing: extract single frames from video shot by the pan-tilt observation tower, then preprocess and label the images to produce a data set for network model training, validation and testing.
Step 1-1: extract single frames from the video shot by the pan-tilt observation tower, then clean the data manually, mainly by removing duplicate, corrupted and useless images.
Step 1-2: following the guidance of forestry pest and disease experts, annotate the cleaned images with an image annotation tool for deep learning target detection. The shuffled data set is divided into a training set, a validation set and a test set at a chosen ratio (e.g. 7:2:1).
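The shuffle-and-split step can be sketched as follows. This is a minimal illustration (the function name and the fixed seed are my own choices for reproducibility, not from the patent):

```python
import random

def split_dataset(samples, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle the labelled samples and split them into
    training / validation / test subsets at the given ratio (e.g. 7:2:1)."""
    items = list(samples)
    random.Random(seed).shuffle(items)   # seeded shuffle for reproducibility
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]       # remainder goes to the test set
    return train, val, test
```

Splitting after a single shuffle keeps the three subsets disjoint while preserving the overall class mix on average.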
Step 2: construct the multi-scale candidate region fusion network and train the model on the data set obtained in step 1.
The step 2 comprises the following substeps:
Step 2-1: first preprocess the input image of the multi-scale candidate region fusion network with data enhancement such as flipping, rotation, colour transformation and added Gaussian noise. Then scale the enhanced image to a fixed 224 × 224 pixels, and finally feed the 224 × 224 image together with its enlarged and reduced versions as the inputs of the network.
Step 2-2: input the 224 × 224 image to a convolutional neural network (ResNet32), the 2× enlarged image to a ResNet18, and the 2× reduced image to a ResNet50, obtaining the corresponding feature maps;
Step 2-3: derive the candidate regions from the feature maps. The region proposal network (RPN) that detects small targets uses the feature map from ResNet18 and is called S-RPN; the RPN for medium targets uses the feature map from ResNet32 (M-RPN); the RPN for large targets uses the feature map from ResNet50 (L-RPN). Since the regions generated by the three RPNs may overlap, non-maximum suppression (NMS) is applied to remove redundant candidate regions. For each RPN model, every anchor is first assigned a positive or negative label. Positive labels follow two rules: (1) the anchor with the highest IoU with a ground-truth box; or (2) any anchor whose IoU with a ground-truth box exceeds 0.7. Negative labels are anchors whose IoU with every ground-truth box is below 0.3. Anchors with IoU in the range 0.3-0.7 are neither positive nor negative and play no part in RPN training. A multi-task loss is then adopted as the loss function for model training, defined as

L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)
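The anchor-labelling rules above can be sketched as follows. This is an illustration rather than the patent's code; boxes are corner tuples (x1, y1, x2, y2) and all names are mine:

```python
def iou(a, b):
    """Intersection-over-union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gts, pos=0.7, neg=0.3):
    """1 = positive, 0 = negative, -1 = ignored (IoU between 0.3 and 0.7)."""
    ious = [[iou(a, g) for g in gts] for a in anchors]
    # rule (1): for every ground-truth box, the anchor with the highest IoU
    forced = {max(range(len(anchors)), key=lambda i: ious[i][j])
              for j in range(len(gts))}
    labels = []
    for i, row in enumerate(ious):
        best = max(row)
        if i in forced or best > pos:   # rule (2): IoU above 0.7
            labels.append(1)
        elif best < neg:
            labels.append(0)
        else:
            labels.append(-1)           # neither label; excluded from training
        
    return labels
```

Rule (1) guarantees every ground-truth tree has at least one positive anchor even when no anchor reaches the 0.7 threshold.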
where i is the index of an anchor within a mini-batch, p_i is the predicted probability that the candidate region generated by anchor i is a target region, and p_i* is the ground-truth label of that region (0 for negative, 1 for positive). The classification loss L_cls and the regression loss L_reg are normalized by N_cls and N_reg respectively and balanced by the weight λ. A log loss is used for L_cls and a smooth L1 loss for L_reg. In the bounding box regression, the centre coordinates and the width and height of the predicted box are regressed separately; the predicted box t_i and the ground-truth box t_i* are parameterized as

t_x = (x − x_a)/w_a,  t_y = (y − y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)
t_x* = (x* − x_a)/w_a,  t_y* = (y* − y_a)/h_a,  t_w* = log(w*/w_a),  t_h* = log(h*/h_a)

where x and y are the centre coordinates of the predicted box and w and h its width and height; x_a, y_a are the centre coordinates of the anchor and w_a, h_a its width and height; x*, y* are the centre coordinates of the ground-truth box and w*, h* its width and height.
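This parameterization matches the standard Faster R-CNN box encoding; a minimal sketch (the function names are mine, not from the patent):

```python
import math

def encode_box(box, anchor):
    """Regression targets (t_x, t_y, t_w, t_h) for a box and an anchor,
    both given as centre/size tuples (x, y, w, h)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha))

def decode_box(t, anchor):
    """Inverse mapping: recover the box from its regression targets."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha,
            wa * math.exp(tw), ha * math.exp(th))
```

Normalizing offsets by the anchor size and regressing log-scale factors keeps the targets comparable across the small, medium and large anchors of the three RPNs.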
Step 2-4: train the multi-scale candidate region fusion detection network on a deep learning machine with the data set obtained in step 1, adjusting parameters repeatedly until the network converges, then save the trained network parameters.
Step 3: input the image under test into the trained multi-scale candidate region fusion detection network to obtain the detection result, then calculate the specific geographic three-dimensional coordinates of the epidemic tree from the three-dimensional positioning geometric model. As shown in fig. 2, this comprises the following substeps:
Step 3-1: first obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm of step 2, and introduce a regular-grid digital elevation model (DEM) to obtain the terrain, i.e. the three-dimensional coordinates of the pine forest.
Step 3-2: establish a geometric relation model from the digital elevation model (DEM) and the imaging principle of the camera, as shown in fig. 2. Its core is the pixel coordinates of the target on the image, determined by the detection algorithm; from the proportional relation between the distances of a point in the reference window to its boundaries, solve for the corresponding point of the target on the reference window.
First, let the distance between the camera C and the reference window HEFG be R, the spatial position of the camera be C(x_C, y_C, z_C), the pitch angle α, the azimuth angle β, and let the lens target-surface size, focal length f, field angle θ and the other camera parameters be known. Using spatial analytic geometry, first solve the model for the centre of the reference window to obtain its midpoint S(x_S, y_S, z_S); then solve the model for the boundary points to obtain the four corner points H(x_H, y_H, z_H), E(x_E, y_E, z_E), F(x_F, y_F, z_F) and G(x_G, y_G, z_G). Then, from the pixel coordinates of the target determined by the detection algorithm and the proportional relation between the distances of the target's corresponding point on the reference window to the four boundaries, solve for the corresponding point A(x_A, y_A, z_A) of the target on the reference window.
Step 3-3: from the two spatial points C(x_C, y_C, z_C) and A(x_A, y_A, z_A), the equation of the line CA is

(x − x_C)/(x_A − x_C) = (y − y_C)/(y_A − y_C) = (z − z_C)/(z_A − z_C)

from which the intersection point P′(x_P′, y_P′, z_P′) of the line of sight CP′ with the plane xOy (where z_P′ = 0) is obtained.
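The intersection with the plane xOy follows directly by parameterizing the line; a minimal sketch (the function name is mine; it assumes the sight line is not parallel to the plane, i.e. z_A ≠ z_C):

```python
def intersect_xoy(C, A):
    """Point P' where the line through camera C and window point A meets
    the plane z = 0: parameterize X = C + t*(A - C) and solve
    z_C + t*(z_A - z_C) = 0 for t."""
    t = C[2] / (C[2] - A[2])      # undefined if z_A == z_C (horizontal line)
    return (C[0] + t * (A[0] - C[0]),
            C[1] + t * (A[1] - C[1]),
            0.0)
```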
Following the idea of the Janus visibility algorithm, compute the translation amounts Δx and Δy of the x and y coordinates from viewpoint C to point P′, take Δmax = max{Δx, Δy}, and divide the sight line CP′ into n = int(Δmax/m) equal segments according to the resolution m of the regular-grid DEM. Let N_k denote the k-th division point (k = 1, 2, …). By the proportionality theorem for parallel lines dividing segments, the per-segment increments in the x, y and z directions between C and P′ are (x_P′ − x_C)/n, (y_P′ − y_C)/n and (z_P′ − z_C)/n respectively, so the three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) are

x_Nk = x_C + k(x_P′ − x_C)/n,  y_Nk = y_C + k(y_P′ − y_C)/n,  z_Nk = z_C + k(z_P′ − z_C)/n.
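The equal division of the sight line can be sketched as follows (names are mine; it assumes the DEM resolution m and the two endpoints are known):

```python
def divide_sight_line(C, P, m):
    """Divide the sight line C -> P' into n = int(max(dx, dy)/m) equal
    segments (m: DEM grid resolution) and return the division points
    N_1 .. N_n, where N_k = C + (k/n) * (P - C)."""
    dx, dy = abs(P[0] - C[0]), abs(P[1] - C[1])
    n = max(1, int(max(dx, dy) / m))            # at least one segment
    step = tuple((P[i] - C[i]) / n for i in range(3))
    points = [tuple(C[i] + k * step[i] for i in range(3))
              for k in range(1, n + 1)]
    return points, n
```

Tying n to the DEM resolution ensures roughly one division point per grid cell along the line of sight.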
Step 3-4: from the division-point coordinates obtained in step 3-3, interpolate the terrain elevation at the k-th division point from the elevations of the four surrounding grid nodes by distance weighting:

Z_Nk = ( Σ_{i=1}^{n} Z_i / d_i ) / ( Σ_{i=1}^{n} 1 / d_i )

where n = 4, Z_i is the elevation of the i-th grid node surrounding the k-th division point, and d_i is the distance from that grid node to the interpolation point.
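The distance-weighted interpolation of step 3-4 can be sketched as follows. Weights of 1/d_i are one common choice for inverse-distance weighting; the function name is mine:

```python
import math

def idw_elevation(nodes, point):
    """Distance-weighted elevation at `point` = (x, y), interpolated from
    the surrounding DEM grid nodes given as [(x, y, z), ...]."""
    num = den = 0.0
    for x, y, z in nodes:
        d = math.hypot(x - point[0], y - point[1])
        if d == 0.0:          # the point coincides with a grid node
            return z
        num += z / d
        den += 1.0 / d
    return num / den
```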
Step 3-5: scan all division points along the camera's line of sight CP′ and compare the DEM elevation with the corresponding line-of-sight elevation: if the DEM elevation at a point is below the elevation of the corresponding division point on the sight line, move to the next point; stop at the first point whose DEM elevation exceeds the division-point elevation and return the coordinates of that obstacle point, which are the three-dimensional coordinates (x_B, y_B, z_B) of the epidemic tree target B.
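The scan in step 3-5 can be sketched as follows (illustrative only; `dem_elevation` stands in for the interpolation of step 3-4, and all names are mine):

```python
def first_obstacle(points, dem_elevation):
    """Walk the division points along the sight line CP'; at the first
    point whose DEM elevation exceeds the line-of-sight elevation,
    return (x, y, z_dem) as the target position B; None if unobstructed."""
    for x, y, z_line in points:
        z_dem = dem_elevation(x, y)
        if z_dem > z_line:        # terrain rises above the sight line here
            return (x, y, z_dem)
    return None
```

The first point where the terrain surface crosses above the descending sight line is exactly where the ray from the camera strikes the ground, i.e. the located tree.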
A further embodiment is an image-based epidemic tree detection and positioning system, comprising:
and the preprocessing module is used for extracting a single-frame image from the video shot by the holder observation tower, preprocessing and labeling the image, and making a data set for network model training, verification and testing.
And the network construction training module is used for constructing a multi-scale candidate region fusion network and training the model on the data set obtained by preprocessing.
And the detection positioning module is used for inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain an output result, and calculating the specific geographical three-dimensional coordinates of the epidemic wood according to the epidemic wood three-dimensional positioning geometric model.
In particular, the network construction training module is configured to perform the following steps:
step 2-1: preprocess the input image of the multi-scale candidate region fusion network, then scale it to a fixed 224 × 224 pixels, and finally feed the 224 × 224 image together with its enlarged and reduced versions as the inputs of the network;
step 2-2: input the 224 × 224 image to a convolutional neural network (ResNet32), the 2× enlarged image to a ResNet18, and the 2× reduced image to a ResNet50, obtaining the corresponding feature maps;
step 2-3: derive candidate regions from the feature maps: the region proposal network (RPN) for small targets uses the ResNet18 feature map (S-RPN); the RPN for medium targets uses the ResNet32 feature map (M-RPN); the RPN for large targets uses the ResNet50 feature map (L-RPN); remove redundant candidate regions with non-maximum suppression (NMS);
step 2-4: train the multi-scale candidate region fusion network on a deep learning machine with the preprocessed data set, adjusting parameters repeatedly until the network converges, then save the trained network parameters.
Specifically, the detection positioning module is configured to perform the following steps:
step 3-1: obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm, and introduce a regular-grid digital elevation model (DEM) to obtain the terrain, i.e. the three-dimensional coordinates of the pine forest;
step 3-2: establish a geometric relation model from the digital elevation model (DEM) and the imaging principle of the camera; from the pixel coordinates of the target determined by the detection algorithm and the proportional relation between the distances of a point to the boundaries of the reference window, solve for the corresponding point A(x_A, y_A, z_A) of the target on the reference window;
step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) on the line of sight from viewpoint C toward point P′;
step 3-4: from the division-point coordinates obtained in step 3-3, interpolate the terrain elevation at the k-th division point from the elevations of the four surrounding grid nodes by distance weighting:

Z_Nk = ( Σ_{i=1}^{n} Z_i / d_i ) / ( Σ_{i=1}^{n} 1 / d_i )

where n = 4, Z_i is the elevation of the i-th grid node surrounding the k-th division point, and d_i is the distance from that grid node to the interpolation point;
step 3-5: scan all division points along the camera's line of sight CP′ and compare the DEM elevation with the corresponding line-of-sight elevation: if the DEM elevation at a point is below the elevation of the corresponding division point, move to the next point; stop at the first point whose DEM elevation exceeds the division-point elevation and return the coordinates of that obstacle point, which are the three-dimensional coordinates (x_B, y_B, z_B) of the epidemic tree target B.
Claims (10)
1. An image-based epidemic tree detection and positioning method, characterized by comprising the following steps:
step 1: video preprocessing: extract single frames from video shot by the pan-tilt observation tower, preprocess and label the images, and produce a data set for network model training, validation and testing;
step 2: construct a multi-scale candidate region fusion network and train the model on the data set obtained in step 1;
step 3: input the image under test into the trained multi-scale candidate region fusion detection network to obtain the detection result, and calculate the specific geographic three-dimensional coordinates of the epidemic tree from the epidemic tree three-dimensional positioning geometric model.
2. The image-based epidemic wood detection and positioning method according to claim 1, characterized in that step 2 comprises the following specific steps:
Step 2-1: preprocess the input image of the multi-scale candidate region fusion network, then scale the image to a fixed size of 224 × 224 pixels, and finally take the 224 × 224 pixel image together with its enlarged and reduced versions as the inputs of the multi-scale candidate region fusion network;
Step 2-2: input the 224 × 224 pixel image into a convolutional neural network (ResNet32), the image enlarged by a factor of two into a convolutional neural network (ResNet18), and the image reduced by half into a convolutional neural network (ResNet50), obtaining the corresponding feature maps;
Step 2-3: derive different candidate regions from the feature maps: a region proposal network (RPN) on the ResNet18 feature map detects candidate regions of small targets (S-RPN), an RPN on the ResNet32 feature map detects medium targets (M-RPN), and an RPN on the ResNet50 feature map detects large targets (L-RPN); redundant candidate regions are reduced with non-maximum suppression (NMS);
Step 2-4: train the multi-scale candidate region fusion network on a deep-learning machine with the data set obtained in step 1, repeatedly adjusting the parameters until the network converges, and save the trained network parameters.
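The non-maximum suppression named in step 2-3 can be sketched as the standard greedy NMS over scored boxes; the numpy version below is an illustration, not code from the patent.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.7):
    """Greedy NMS over [x1, y1, x2, y2] boxes.
    Returns indices of kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]      # candidates sorted by descending score
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # intersection of the top-scoring box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]   # drop boxes overlapping too much
    return keep
```

Each of the three RPN branches (S-RPN, M-RPN, L-RPN) would apply such a pass to its own proposals before fusion.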
3. The image-based epidemic wood detection and positioning method according to claim 2, characterized in that the preprocessing of the input image of the multi-scale candidate region fusion network in step 2-1 includes data-enhancement processing such as image flipping, rotation, color transformation and the addition of Gaussian noise.
4. The image-based epidemic wood detection and positioning method according to claim 2, characterized in that reducing the redundant candidate regions using non-maximum suppression (NMS) in step 2-3 specifically comprises the following steps:
For each RPN model, each anchor box is first assigned a positive or negative label. Positive labels follow two principles: (1) the anchor with the highest IoU (intersection over union) with the ground-truth region; or (2) any anchor whose IoU with the ground-truth region is higher than 0.7. Negative labels are anchors whose IoU with the ground-truth region is below 0.3. Anchors with IoU between 0.3 and 0.7 belong to neither label and take no part in the training of the RPN model;
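The labeling rule above can be sketched as follows, assuming an anchors-by-ground-truth IoU matrix has already been computed; `label_anchors` is a hypothetical helper for illustration, not part of the patent.

```python
import numpy as np

def label_anchors(ious, hi=0.7, lo=0.3):
    """Assign each anchor a label: 1 (positive), 0 (negative), -1 (ignored).
    `ious` is an (num_anchors x num_ground_truths) IoU matrix."""
    ious = np.asarray(ious, dtype=float)
    best = ious.max(axis=1)             # best IoU of each anchor with any ground truth
    labels = np.full(len(ious), -1)     # default: ignored (IoU between lo and hi)
    labels[best > hi] = 1               # principle (2): IoU above 0.7
    labels[best < lo] = 0               # negative: IoU below 0.3
    labels[ious.argmax(axis=0)] = 1     # principle (1): highest-IoU anchor per ground truth
    return labels
```

Principle (1) is applied last so that every ground-truth region keeps at least one positive anchor even when no IoU exceeds 0.7.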
Then a multitask loss function is adopted as the loss function of model training, defined as:

L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)

where i is the index of an anchor in a mini-batch, p_i is the predicted probability that the candidate region generated by the i-th anchor is a target region, and p_i* is the ground-truth label of that region, with 0 for negative and 1 for positive. The classification loss L_cls and the regression loss L_reg above are normalized by N_cls and N_reg respectively and balanced with the weight λ; a log loss is used for the classification loss L_cls and a smooth L1 loss for the regression loss L_reg. In the bounding-box regression, the center coordinates and the width and height of the prediction box are regressed separately; the prediction box t_i and the ground-truth box t_i* are each defined as follows.
t_x = (x − x_a)/w_a, t_y = (y − y_a)/h_a, t_w = log(w/w_a), t_h = log(h/h_a)
t_x* = (x* − x_a)/w_a, t_y* = (y* − y_a)/h_a, t_w* = log(w*/w_a), t_h* = log(h*/h_a)

where x and y denote the center coordinates of the prediction box, and w and h its width and height; x_a and y_a denote the center coordinates of the anchor, and w_a and h_a its width and height; x* and y* denote the center coordinates of the ground-truth box, and w* and h* its width and height.
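This parameterization matches the standard Faster R-CNN box encoding and can be sketched directly; the helper names below are illustrative.

```python
import math

def encode_box(box, anchor):
    """Encode a box (center-form x, y, w, h) relative to an anchor,
    following the t_x, t_y, t_w, t_h parameterization above."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha))

def decode_box(t, anchor):
    """Invert the encoding to recover the predicted box."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (xa + tx * wa, ya + ty * ha,
            wa * math.exp(tw), ha * math.exp(th))
```

The log-space width and height keep the regression targets scale-invariant, which is why the same encoding serves the S-RPN, M-RPN and L-RPN branches.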
5. The image-based epidemic wood detection and positioning method according to claim 1, characterized in that step 3 comprises the following specific steps:
Step 3-1: obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm of step 2, and introduce a regular-grid Digital Elevation Model (DEM) to obtain the three-dimensional terrain coordinates of the pine forest;
Step 3-2: establish a geometric relation model from the Digital Elevation Model (DEM) and the imaging principle of the camera; the core is that the target detection algorithm determines the pixel coordinates of the target on the image, after which the corresponding coordinate A(x_A, y_A, z_A) of the target point on the reference window is obtained by solving, using the proportional relation between the distances from any point in the reference window to its boundaries;
Step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th bisection point N_k(x_Nk, y_Nk, z_Nk) between the viewpoint C and the point P';
Step 3-4: from the bisection-point coordinates obtained in step 3-3 and the elevation data of the four surrounding grid nodes, compute the terrain elevation corresponding to the k-th bisection point by distance weighting, as follows:

Z_Nk = ( Σ_{i=1}^{n} Z_i/d_i ) / ( Σ_{i=1}^{n} 1/d_i )

where n = 4, Z_i is the elevation of the i-th grid node surrounding the grid cell of the k-th bisection point, and d_i is the distance from that grid node to the interpolation point;
Step 3-5: scan all bisection points along the sight line CP' from the camera to the target point, comparing the elevation obtained from the DEM with the corresponding sight-line elevation: if the DEM elevation at a point is smaller than the elevation of the corresponding bisection point on the sight line, proceed to the next point, until the first point whose DEM elevation is larger than that of its corresponding bisection point is reached; the coordinates of this obstacle point, i.e. the three-dimensional coordinates (x_B, y_B, z_B) of the epidemic-wood target B, are returned.
6. The image-based epidemic wood detection and positioning method according to claim 5, characterized in that step 3-2 is specifically as follows:
First, the distance between the camera C and the reference window HEFG is set to R, the spatial position of the camera to C(x_c, y_c, z_c), the pitch angle to α and the azimuth angle to β; with lens parameters such as the target-surface size, the focal length f and the field angle θ, spatial analytic geometry is used first to solve the model for the coordinates of the center point of the reference window, giving S(x_s, y_s, z_s); the model for the coordinates of each boundary point of the established reference window is then solved to obtain the four boundary points H(x_H, y_H, z_H), E(x_E, y_E, z_E), F(x_F, y_F, z_F) and G(x_G, y_G, z_G). Then, from the pixel coordinates of the target point on the image determined by the target detection algorithm and the proportional relation between the distances from the corresponding point on the reference window to the four boundaries, the model for the coordinates of the corresponding point of the target point on the reference window is solved to obtain A(x_A, y_A, z_A).
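The proportional-relation step can be sketched as a bilinear interpolation of the four boundary points of the reference window. The corner ordering assumed below (H top-left, E top-right, G bottom-left, F bottom-right) is an assumption made for illustration, since the claim does not fix it.

```python
import numpy as np

def pixel_to_window_point(u, v, img_w, img_h, H, E, F, G):
    """Map pixel (u, v) of an img_w x img_h image to the 3-D point A on the
    reference window, by bilinear interpolation of the boundary points.
    Assumed corner order: H top-left, E top-right, G bottom-left, F bottom-right."""
    s, t = u / img_w, v / img_h        # proportional distances to the boundaries
    H, E, F, G = (np.asarray(p, dtype=float) for p in (H, E, F, G))
    top = (1 - s) * H + s * E          # point on the top edge
    bottom = (1 - s) * G + s * F       # point on the bottom edge
    return (1 - t) * top + t * bottom  # A(x_A, y_A, z_A)
```

For a target detected at the image center, this returns the window center S, consistent with the center-point model solved first in the claim.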
7. The image-based epidemic wood detection and positioning method according to claim 5, characterized in that step 3-3 specifically comprises: from the positions of the two spatial coordinates C(x_c, y_c, z_c) and A(x_A, y_A, z_A), the spatial equation of the line CA is obtained as follows:

(x − x_c)/(x_A − x_c) = (y − y_c)/(y_A − y_c) = (z − z_c)/(z_A − z_c)

Further, the coordinates (x_P', y_P', z_P') of the intersection point P' of the sight line CP' with the plane xOy are obtained.
Following the idea of the Janus visibility algorithm, the translations Δx and Δy of the x and y coordinates from the viewpoint C to the point P' are first computed and maxΔ = max{Δx, Δy} is taken; according to the resolution m of the adopted regular-grid DEM, the sight line CP' is divided into n = int(maxΔ/m) equal parts, with the k-th bisection point denoted N_k (k = 1, 2, …, n). After this n-fold bisection, the single translation increments between point C and point P' in the x, y and z coordinate directions are (x_P' − x_C)/n, (y_P' − y_C)/n and z_C/n respectively, so the three-dimensional coordinates of the k-th bisection point N_k(x_Nk, y_Nk, z_Nk) are:

x_Nk = x_C + k(x_P' − x_C)/n, y_Nk = y_C + k(y_P' − y_C)/n, z_Nk = z_C − k·z_C/n
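The bisection of the sight line CP' can be sketched as follows; this is a minimal illustration in which the patent's n = int(maxΔ/m) is assumed to be precomputed and passed in as an argument.

```python
def bisection_points(C, P, n):
    """Three-dimensional coordinates of the n bisection points N_k
    (k = 1..n) along the sight line from viewpoint C to ground point P'.
    With P' on the plane xOy (z = 0), the z increment reduces to -z_C/n."""
    xC, yC, zC = C
    xP, yP, zP = P
    return [(xC + k * (xP - xC) / n,
             yC + k * (yP - yC) / n,
             zC + k * (zP - zC) / n) for k in range(1, n + 1)]
```

The DEM elevation at each N_k would then be interpolated from the four surrounding grid nodes (step 3-4) and compared against z_Nk (step 3-5).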
8. An image-based epidemic wood detection and positioning system, characterized by comprising:
a preprocessing module for extracting single-frame images from video shot by a pan-tilt observation tower, preprocessing and labeling the images, and producing a data set for network model training, verification and testing;
a network construction and training module for constructing a multi-scale candidate region fusion network and training the model on the data set obtained by preprocessing;
a detection and positioning module for inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the output result, and for calculating the specific geographical three-dimensional coordinates of the epidemic wood according to the epidemic-wood three-dimensional positioning geometric model.
9. The image-based epidemic wood detection and positioning system of claim 8, wherein the network construction and training module is configured to perform the following steps:
Step 2-1: preprocess the input image of the multi-scale candidate region fusion network, then scale the image to a fixed size of 224 × 224 pixels, and finally take the 224 × 224 pixel image together with its enlarged and reduced versions as the inputs of the multi-scale candidate region fusion network;
Step 2-2: input the 224 × 224 pixel image into a convolutional neural network (ResNet32), the image enlarged by a factor of two into a convolutional neural network (ResNet18), and the image reduced by half into a convolutional neural network (ResNet50), obtaining the corresponding feature maps;
Step 2-3: derive different candidate regions from the feature maps: a region proposal network (RPN) on the ResNet18 feature map detects candidate regions of small targets (S-RPN), an RPN on the ResNet32 feature map detects medium targets (M-RPN), and an RPN on the ResNet50 feature map detects large targets (L-RPN); redundant candidate regions are reduced with non-maximum suppression (NMS);
Step 2-4: train the multi-scale candidate region fusion network on a deep-learning machine with the data set obtained by preprocessing, repeatedly adjusting the parameters until the network converges, and save the trained network parameters.
10. The image-based epidemic wood detection and positioning system of claim 8, wherein the detection and positioning module is configured to perform the following steps:
Step 3-1: obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm, and introduce a regular-grid Digital Elevation Model (DEM) to obtain the three-dimensional terrain coordinates of the pine forest;
Step 3-2: establish a geometric relation model from the Digital Elevation Model (DEM) and the imaging principle of the camera; the core is that the target detection algorithm determines the pixel coordinates of the target on the image, after which the corresponding coordinate A(x_A, y_A, z_A) of the target point on the reference window is obtained by solving, using the proportional relation between the distances from any point in the reference window to its boundaries;
Step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th bisection point N_k(x_Nk, y_Nk, z_Nk) between the viewpoint C and the point P';
Step 3-4: from the bisection-point coordinates obtained in step 3-3 and the elevation data of the four surrounding grid nodes, compute the terrain elevation corresponding to the k-th bisection point by distance weighting, as follows:

Z_Nk = ( Σ_{i=1}^{n} Z_i/d_i ) / ( Σ_{i=1}^{n} 1/d_i )

where n = 4, Z_i is the elevation of the i-th grid node surrounding the grid cell of the k-th bisection point, and d_i is the distance from that grid node to the interpolation point;
Step 3-5: scan all bisection points along the sight line CP' from the camera to the target point, comparing the elevation obtained from the DEM with the corresponding sight-line elevation: if the DEM elevation at a point is smaller than the elevation of the corresponding bisection point on the sight line, proceed to the next point, until the first point whose DEM elevation is larger than that of its corresponding bisection point is reached; the coordinates of this obstacle point, i.e. the three-dimensional coordinates (x_B, y_B, z_B) of the epidemic-wood target B, are returned.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110821397.0A CN113379738A (en) | 2021-07-20 | 2021-07-20 | Method and system for detecting and positioning epidemic trees based on images |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113379738A true CN113379738A (en) | 2021-09-10 |
Family
ID=77582501
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110821397.0A Pending CN113379738A (en) | 2021-07-20 | 2021-07-20 | Method and system for detecting and positioning epidemic trees based on images |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113379738A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113870254A (en) * | 2021-11-30 | 2021-12-31 | 中国科学院自动化研究所 | Target object detection method and device, electronic equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109376637A (en) * | 2018-10-15 | 2019-02-22 | 齐鲁工业大学 | Passenger number statistical system based on video monitoring image processing |
CN109685067A (en) * | 2018-12-26 | 2019-04-26 | 江西理工大学 | A kind of image, semantic dividing method based on region and depth residual error network |
CN109977963A (en) * | 2019-04-10 | 2019-07-05 | 京东方科技集团股份有限公司 | Image processing method, unit and computer-readable medium |
CN110909615A (en) * | 2019-10-28 | 2020-03-24 | 西安交通大学 | Target detection method based on multi-scale input mixed perception neural network |
CN111027547A (en) * | 2019-12-06 | 2020-04-17 | 南京大学 | Automatic detection method for multi-scale polymorphic target in two-dimensional image |
CN111091105A (en) * | 2019-12-23 | 2020-05-01 | 郑州轻工业大学 | Remote sensing image target detection method based on new frame regression loss function |
CN111160249A (en) * | 2019-12-30 | 2020-05-15 | 西北工业大学深圳研究院 | Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion |
Non-Patent Citations (3)
Title |
---|
JUNLING HOU 等: "Multi-Scale Proposal Regions Fusion Network for Detection and 3D Localization of the Infected Trees", 《2021 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC)》, pages 93 - 98 * |
ZHU Lichao et al.: "Analysis of the conversion accuracy and efficiency from planar images to DQG pixels", Surveying and Mapping & Spatial Geographic Information, vol. 30, no. 1, pages 173 - 176 * |
QIN Feifei et al.: "A synchronous tracking algorithm for remote video of forest fires based on a digital elevation model", Journal of Zhejiang A&F University, vol. 29, no. 5, pages 917 - 922 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103337052B (en) | Automatic geometric correcting method towards wide cut remote sensing image | |
CN110956187A (en) | Unmanned aerial vehicle image plant canopy information extraction method based on ensemble learning | |
Wang et al. | Bottle detection in the wild using low-altitude unmanned aerial vehicles | |
CN112347895A (en) | Ship remote sensing target detection method based on boundary optimization neural network | |
CN115240093B (en) | Automatic power transmission channel inspection method based on visible light and laser radar point cloud fusion | |
CN113610905B (en) | Deep learning remote sensing image registration method based on sub-image matching and application | |
Hu et al. | Research on a single-tree point cloud segmentation method based on UAV tilt photography and deep learning algorithm | |
CN111967337A (en) | Pipeline line change detection method based on deep learning and unmanned aerial vehicle images | |
CN112907520A (en) | Single tree crown detection method based on end-to-end deep learning method | |
Zhang et al. | Deep learning based object distance measurement method for binocular stereo vision blind area | |
CN117409339A (en) | Unmanned aerial vehicle crop state visual identification method for air-ground coordination | |
JP2023530449A (en) | Systems and methods for air and ground alignment | |
CN115240089A (en) | Vehicle detection method of aerial remote sensing image | |
CN104751451B (en) | Point off density cloud extracting method based on unmanned plane low latitude high resolution image | |
Jiang et al. | Learned local features for structure from motion of uav images: A comparative evaluation | |
Xinmei et al. | Passive measurement method of tree height and crown diameter using a smartphone | |
CN115115954A (en) | Intelligent identification method for pine nematode disease area color-changing standing trees based on unmanned aerial vehicle remote sensing | |
CN114581307A (en) | Multi-image stitching method, system, device and medium for target tracking identification | |
CN113379738A (en) | Method and system for detecting and positioning epidemic trees based on images | |
CN111476167B (en) | One-stage direction remote sensing image target detection method based on student-T distribution assistance | |
CN110580468B (en) | Single wood structure parameter extraction method based on image matching point cloud | |
CN117392382A (en) | Single tree fruit tree segmentation method and system based on multi-scale dense instance detection | |
CN116994029A (en) | Fusion classification method and system for multi-source data | |
CN116385477A (en) | Tower image registration method based on image segmentation | |
CN114694022A (en) | Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210910 |