CN113379738A - Method and system for detecting and positioning epidemic trees based on images - Google Patents


Info

Publication number
CN113379738A
CN113379738A
Authority
CN
China
Prior art keywords
point
image
target
rpn
network
Prior art date
Legal status
Pending
Application number
CN202110821397.0A
Other languages
Chinese (zh)
Inventor
侯俊岭
李伟红
杨利平
张超
王欣然
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202110821397.0A priority Critical patent/CN113379738A/en
Publication of CN113379738A publication Critical patent/CN113379738A/en
Pending legal-status Critical Current

Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods
    • G06T 7/70 — Image analysis; determining position or orientation of objects or cameras
    • G06T 2207/10016 — Image acquisition modality; video, image sequence
    • G06T 2207/20081 — Special algorithmic details; training, learning


Abstract

The invention provides an image-based epidemic wood detection and positioning method and system. The invention designs a multi-scale candidate region fusion network, wherein the feature map obtained by ResNet18 feeds the RPN for detecting small targets (S-RPN), the feature map obtained by ResNet32 feeds the RPN for detecting medium targets (M-RPN), and the feature map obtained by ResNet50 feeds the RPN for detecting large targets (L-RPN). Non-maximum suppression (NMS) is then employed to reduce redundant candidate regions. The network can solve the problem that existing target detection network models based on deep neural networks have difficulty coping with multi-scale pine forest epidemic wood detection. The invention further provides a method for constructing an epidemic wood three-dimensional positioning geometric model according to the camera imaging principle of the pan-tilt monitoring tower and Digital Elevation Model (DEM) data, which can accurately realize three-dimensional positioning of the epidemic wood outbreak point. The method is simple and fast, and can improve the monitoring and early-warning capability for pine forest epidemic trees.

Description

Method and system for detecting and positioning epidemic trees based on images
Technical Field
The invention belongs to the technical field of image detection and positioning, and particularly relates to a multi-scale candidate region fusion epidemic wood detection network and an epidemic wood three-dimensional positioning geometric model based on deep learning.
Background
Pine wood nematode disease is one of the pine diseases most damaging to forestry in China and is called the cancer of pine trees. It takes only 3-5 years from the infection of a single pine tree to the death of the entire pine forest, and an infected pine is called epidemic wood. Therefore, epidemic wood detection is the first priority of pine forest protection. The most notable feature of infected pine trees compared with normal healthy pine trees is that the entire crown of conifer needles turns yellow-brown or reddish-brown, while the crown needles of normal healthy pine trees are green. Existing pine forest epidemic wood detection methods mainly comprise the following: 1) the ground investigation method, in which forest protection personnel visit the site to collect coordinate position information of individual epidemic trees and gather samples; it has high detection accuracy and reliability, but high labor and time cost and low efficiency; 2) the satellite remote sensing detection method: area monitoring of large pine forests in current forestry monitoring is mostly realized through remote sensing images, but because the satellite revisit period is long, the images are affected by cloud and fog, and the spatial resolution of remote sensing images is not high, monitoring of individual epidemic trees is currently difficult to realize; 3) the unmanned aerial vehicle (UAV) detection method, which is limited by the endurance time of the UAV and cannot detect pine forests over a large area in real time for long periods; 4) the pan-tilt observation tower monitoring method, in which a monitoring camera is arranged at a high point of a pan-tilt observation tower to shoot the pine forest at low altitude near the ground; the captured video has high spatial resolution, and single-frame RGB images extracted from the video are used for detection and analysis, making this an effective monitoring method for pine forest epidemic wood detection at present.
With the great progress of deep learning target detection methods, image-based target detection can find targets of interest in an image through a deep neural network model. Features, especially high-level semantic features, can be automatically extracted through a convolutional neural network (CNN), and good results have been achieved in the field of remote sensing image target detection. However, when the camera of the pan-tilt observation tower shoots images of the pine forest, there is a difference between long shots and close shots, and the size and position of a pine crown in the high-resolution image vary greatly with its distance from the pan-tilt. A conventional CNN-based target detection network has a single candidate box size, so its detection results are poor when epidemic trees of different sizes are detected. Meanwhile, epidemic tree pixel coordinates detected only at the image level have no monitoring significance; the spatial geographic coordinates of the epidemic trees relative to the mountain terrain must be accurately determined before detection can bring actual utility to epidemic wood discovery and protection work.
Disclosure of Invention
The invention aims to provide an image-based epidemic wood detection and positioning method for pan-tilt observation tower images, to solve the problems that existing target detection network models based on deep neural networks have difficulty coping with multi-scale pine forest epidemic wood detection and cannot accurately acquire the geographic position of the epidemic wood.
The technical scheme of the invention is as follows:
an image-based epidemic wood detection and positioning method comprises the following steps:
step 1: video preprocessing: extracting single-frame images from the video shot by the pan-tilt observation tower, then carrying out image preprocessing and labeling, and making a data set for network model training, validation and testing.
Step 2: constructing the multi-scale candidate region fusion network, and training the model on the data set obtained in step 1.
Step 3: inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the output result, and calculating the specific geographic three-dimensional coordinates of the epidemic trees according to the epidemic tree three-dimensional positioning geometric model.
Further, step 2 comprises the following substeps:
step 2-1: firstly, preprocessing the input image of the multi-scale candidate region fusion network, then scaling the image to a fixed 224 × 224 pixels, and finally taking the 224 × 224 pixel image together with its enlarged and reduced versions as the inputs of the multi-scale candidate region fusion network.
Step 2-2: the 224 × 224 pixel image is input to a convolutional neural network (ResNet32), the twofold-enlarged image is input to a convolutional neural network (ResNet18), and the twofold-reduced image is input to a convolutional neural network (ResNet50), so that the corresponding feature maps are obtained.
Step 2-3: epidemic wood candidate regions are acquired from the feature maps. The RPN (region proposal network) that uses the feature map obtained by ResNet18 to detect small targets is called S-RPN; the RPN that uses the feature map obtained by ResNet32 to detect medium targets is called M-RPN; the RPN that uses the feature map obtained by ResNet50 to detect large targets is called L-RPN. Since the regions generated by each RPN may overlap each other, non-maximum suppression (NMS) is employed to reduce redundant candidate regions.
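The non-maximum suppression step referred to above can be sketched as follows; the corner-coordinate box format and the 0.5 overlap threshold are illustrative assumptions, not values fixed by this disclosure.

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Keep the highest-scoring boxes, suppressing any box whose IoU
    with an already-kept box exceeds thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep
```

Candidate regions produced by the S-RPN, M-RPN and L-RPN would be pooled into one list before this suppression step.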
Step 2-4: training the multi-scale candidate region fused epidemic wood detection network on a deep learning machine through the data set obtained in the step 1, repeatedly adjusting parameters until the network is converged, and storing the trained network parameters.
Further, step 3 comprises the following substeps:
step 3-1: firstly, the two-dimensional pixel coordinates of the target on the image are obtained by the target detection algorithm of step 2, and meanwhile a regular-grid Digital Elevation Model (DEM) is introduced to obtain the three-dimensional terrain coordinates of the pine forest.
Step 3-2: a geometric relation model is established according to the Digital Elevation Model (DEM) and the imaging principle of the camera; its core is that the target detection algorithm determines the pixel coordinates of the target on the image, and then, according to the proportional relation of the distances from any point in the reference window to its boundaries, the corresponding coordinates A(x_A, y_A, z_A) of the target point on the reference window are obtained by solving.
Step 3-3: according to the idea of the Janus visibility algorithm, the three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) from the viewpoint C to the point P′ are calculated.
Step 3-4: according to the coordinates of the division points obtained in step 3-3, the terrain elevation corresponding to the k-th division point is calculated from the elevation data of the four surrounding grid nodes by the distance weighting method, as shown below:

Z_k = (Σ_{i=1}^{n} Z_i / d_i) / (Σ_{i=1}^{n} 1 / d_i)

wherein n = 4, Z_i is the elevation of the i-th grid node surrounding the point corresponding to the k-th division point, and d_i is the distance from that grid node to the interpolation point.
Step 3-5: all division points are scanned along the sight line CP′ from the camera to the target point, and the elevation obtained on the DEM is compared with the corresponding sight-line elevation: if the elevation of the point on the DEM is smaller than the elevation of the corresponding division point on the sight line, the next point is examined, until the first point whose DEM elevation is larger than the corresponding division-point elevation is found; the coordinates of this obstacle point, namely the three-dimensional coordinates B(x_B, y_B, z_B) of the epidemic wood target, are returned.
In the invention, when the epidemic trees of the video of the pan-tilt observation tower are detected and positioned, three stages are mainly adopted, firstly, the video data set of the pan-tilt observation tower is trained in the constructed multi-scale candidate area fusion network, and the strategy of fine tuning the learning rate is applied in the training process, so that the detection accuracy of the model is improved; then inputting the test image into a trained multi-scale candidate area fusion network to obtain two-dimensional coordinates of the epidemic wood target in the image; and finally, obtaining the specific three-dimensional geographic coordinates of the epidemic wood target according to the two-dimensional coordinates of the epidemic wood target in the image and the three-dimensional positioning geometric model of the epidemic wood.
The invention has the beneficial effects that:
the invention provides a multi-scale candidate region fusion network for epidemic wood detection in videos shot by a pine forest pan-tilt observation tower. By analyzing the relation between the camera imaging principle and the relevant camera parameters and combining the characteristics of the Digital Elevation Model (DEM), a three-dimensional positioning method for pine forest epidemic trees is provided. The invention can solve the problem that existing target detection network models based on deep neural networks have difficulty coping with multi-scale pine forest epidemic wood detection, and can realize positioning of the geographic position of the epidemic wood. The method is simple and fast, and can improve the monitoring and early-warning capability for pine forest epidemic trees.
Drawings
FIG. 1 is a multi-scale candidate area convergence network framework diagram of the present invention;
FIG. 2 is the epidemic wood three-dimensional positioning geometric model.
Detailed Description
The following describes in further detail a specific embodiment of the present invention with reference to fig. 1 and 2.
As shown in fig. 1, an image-based epidemic wood detection and positioning method is specifically performed according to the following steps:
step 1: video preprocessing: extracting single-frame images from the video shot by the pan-tilt observation tower, then carrying out image preprocessing and labeling, and making a data set for network model training, validation and testing.
Step 1 comprises the following substeps:
step 1-1: single-frame images are extracted from the video shot by the pan-tilt observation tower, and then subjected to manual data cleaning, which mainly comprises removing repeated images, incomplete images, useless images and the like.
Step 1-2: and according to the guidance suggestion of forestry disease and pest experts, carrying out data annotation on the cleaned image by adopting an image annotation tool for deep learning target detection. The shuffled data set is divided into a training set, a validation set, and a test set according to different ratios (e.g., 7: 2: 1).
Step 2: and (3) constructing a multi-scale candidate region fusion network, and training the model by the data set obtained in the step (1).
The step 2 comprises the following substeps:
step 2-1: firstly, the input image of the multi-scale candidate region fusion network is preprocessed, including data enhancement such as image flipping, rotation, color transformation and Gaussian noise addition. The enhanced image is then scaled to a fixed 224 × 224 pixels, and finally the 224 × 224 pixel image together with its enlarged and reduced versions are used as the inputs of the multi-scale candidate region fusion network.
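Part of the data-enhancement step (flipping, rotation, Gaussian-noise addition) can be sketched as follows; the 0.5 flip probability, the noise standard deviation and the 0-255 pixel range are illustrative assumptions, and the color transformation mentioned above is omitted.

```python
import numpy as np

def augment(img, rng):
    """Apply random horizontal flip, a random 90-degree-step rotation,
    and additive Gaussian noise to an HxWxC image array."""
    if rng.random() < 0.5:
        img = np.fliplr(img)                       # horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0/90/180/270 degree rotation
    noise = rng.normal(0.0, 5.0, img.shape)        # sigma = 5 is an assumption
    return np.clip(img + noise, 0, 255)            # keep values in pixel range
```

Each training image would typically pass through this routine once per epoch with a fresh random state.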
Step 2-2: inputting 224 × 224 pixel images into a convolutional neural network (ResNet32), inputting 224 × 224 pixel images with one time amplification into a convolutional neural network (ResNet18), and inputting 224 × 224 pixel images with one time reduction into a convolutional neural network (ResNet50) to obtain corresponding feature maps;
step 2-3: different candidate regions are derived from the feature maps. The RPN (region proposal network) that uses the feature map obtained by ResNet18 to detect small targets is called S-RPN; the RPN that uses the feature map obtained by ResNet32 to detect medium targets is called M-RPN; the RPN that uses the feature map obtained by ResNet50 to detect large targets is called L-RPN. Since the regions generated by each RPN may overlap each other, non-maximum suppression (NMS) is employed to reduce redundant candidate regions. For each RPN model, first, each anchor is assigned a positive or negative label. Positive labels follow two principles: (1) the anchor with the highest IoU (intersection over union) with the ground-truth region; or (2) an anchor whose IoU with the ground-truth region is higher than 0.7. Negative labels are anchors whose IoU with the ground-truth region is below 0.3. Anchors with IoU in the range 0.3-0.7 belong to neither the positive nor the negative label and have no effect on RPN model training. Then, a multitask loss function is adopted as the loss function for model training, defined as follows:
L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)

where i is the index of an anchor within a batch, p_i represents the predicted probability that the candidate region generated by the i-th anchor is the target region, and p_i* represents the ground-truth label of this region, where 0 represents negative and 1 represents positive. The classification loss function L_cls and the regression loss function L_reg in the above formula are normalized by N_cls and N_reg respectively and balanced with the weight λ. For the classification loss function L_cls a log loss is used, and for the regression loss function L_reg a smooth-L1 loss is used. In the bounding-box regression, the center-point coordinates and the width and height of the prediction box are regressed separately; the prediction vector t_i and the ground-truth vector t_i* are each defined as follows:

t_x = (x - x_a)/w_a,  t_y = (y - y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)
t_x* = (x* - x_a)/w_a,  t_y* = (y* - y_a)/h_a,  t_w* = log(w*/w_a),  t_h* = log(h*/h_a)

wherein x and y represent the center coordinates of the prediction box, and w and h represent its width and height, respectively; x_a, y_a represent the center coordinates of the anchor, and w_a, h_a its width and height; x*, y* represent the center coordinates of the ground-truth box, and w*, h* its width and height.
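The anchor-relative box parameterization described above can be sketched as follows; the (center x, center y, width, height) tuple layout is an illustrative assumption.

```python
import math

def encode(box, anchor):
    """Encode a box relative to an anchor:
    t_x = (x - x_a)/w_a, t_y = (y - y_a)/h_a,
    t_w = log(w/w_a),    t_h = log(h/h_a).
    Both inputs are (center_x, center_y, width, height) tuples."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha))
```

An anchor that coincides with the box encodes to all zeros, which is what makes the regression targets well-scaled across box sizes.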
Step 2-4: training the multi-scale candidate region fused epidemic wood detection network on a deep learning machine through the data set obtained in the step 1, repeatedly adjusting parameters until the network is converged, and storing the trained network parameters.
Step 3: inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the output result, and calculating the specific geographic three-dimensional coordinates of the epidemic wood according to the epidemic wood three-dimensional positioning geometric model. As shown in fig. 2, this comprises the following substeps:
step 3-1: firstly, the two-dimensional pixel coordinates of the target on the image are obtained by the target detection algorithm of step 2, and meanwhile a regular-grid Digital Elevation Model (DEM) is introduced to obtain the three-dimensional terrain coordinates of the pine forest.
Step 3-2: a geometric relation model is established according to a Digital Elevation Model (DEM) and the imaging principle of a camera, as shown in figure 2, the core is the pixel coordinates of a target on an image determined by a target detection algorithm, and then the corresponding coordinates of a target point on a reference window are obtained by solving according to the corresponding proportional relation of the distance from any point in the reference window to a boundary.
First, the distance between the camera C and the reference window HEFG is set to R, the spatial position of the camera is set to C(x_C, y_C, z_C), and the camera parameters are the pitch angle α, the azimuth angle β, the lens target-surface size, the focal length f and the field angle θ. Using spatial analytic geometry, the solution model for the center-point coordinates of the reference window is solved first, giving the coordinates S(x_S, y_S, z_S) of the midpoint of the reference window. The solution model for the coordinates of each boundary point of the reference window is then solved, giving the four boundary-point coordinates H(x_H, y_H, z_H), E(x_E, y_E, z_E), F(x_F, y_F, z_F), G(x_G, y_G, z_G). Then, according to the proportional relation between the distances from the corresponding point of the target point on the reference window to the four boundaries, determined by the pixel coordinates of the target point on the image given by the target detection algorithm, the solution model for the corresponding-point coordinates is solved, giving the coordinates A(x_A, y_A, z_A) of the corresponding point of the target point on the reference window.
Step 3-3: according to two coordinates C (x) in spacec,yc,zc)、A(xA,yA,zA) The equation for the CA space line is obtained as follows.
Figure BDA0003172076220000061
Further, the coordinates (x) of the intersection point P 'of the sight line CP' and the plane xoy can be obtainedP′,yP′,zP′)。
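Finding the intersection P′ of the line through C and A with the plane xoy (z = 0) is a single linear-parameter computation; the tuple point representation below is an illustrative assumption.

```python
def sight_line_ground_point(C, A):
    """Intersection P' of the line through camera C(x_C, y_C, z_C) and
    window point A(x_A, y_A, z_A) with the plane z = 0.
    On the parametric line C + t*(A - C), z(t) = z_C + t*(z_A - z_C),
    so z(t) = 0 at t = z_C / (z_C - z_A)."""
    xc, yc, zc = C
    xa, ya, za = A
    t = zc / (zc - za)
    return (xc + t * (xa - xc), yc + t * (ya - yc), 0.0)
```

For a camera 100 m up looking through a window point 10 m out and 10 m down, the sight line reaches the ground 100 m away, as the test below checks.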
According to the idea of the Janus visibility algorithm, the translation amounts Δx and Δy of the x and y coordinates from the viewpoint C to the point P′ are calculated; maxΔ = max{Δx, Δy} is taken, and the sight line CP′ is divided into n = int(maxΔ/m) equal segments according to the resolution m of the adopted regular-grid DEM. Let the k-th division point be denoted N_k (k = 1, 2, …, n). According to the proportionality theorem for line segments divided equally by parallel lines, the single translation increments in the x, y and z coordinate directions after dividing the segment between C and P′ into n parts are (x_P′ - x_C)/n, (y_P′ - y_C)/n and z_C/n, respectively. The three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) are then

x_Nk = x_C + k(x_P′ - x_C)/n
y_Nk = y_C + k(y_P′ - y_C)/n
z_Nk = z_C - k·z_C/n
Step 3-4: and 3, according to the coordinates of the bisector obtained in the step 3-3, utilizing the elevation data of the four surrounding points, wherein the terrain elevation corresponding to the k bisector is calculated according to a distance weighting method, and the following steps are shown.
Figure BDA0003172076220000063
Wherein n is 4, ZiElevation, d, of grid nodes around the grid node corresponding to the kth aliquot pointiThe distance from the mesh nodes to the interpolation points.
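The distance-weighted interpolation over the four surrounding grid nodes can be sketched as follows; the tuple layouts for the query point and the grid nodes are illustrative assumptions.

```python
def idw_elevation(point, grid_pts):
    """Inverse-distance-weighted elevation at (x, y) from surrounding DEM
    nodes (x_i, y_i, Z_i):  Z = (sum Z_i/d_i) / (sum 1/d_i)."""
    x, y = point
    num = den = 0.0
    for (xi, yi, zi) in grid_pts:
        d = ((x - xi) ** 2 + (y - yi) ** 2) ** 0.5
        if d == 0.0:
            return zi            # query point sits exactly on a grid node
        num += zi / d
        den += 1.0 / d
    return num / den
</imports>```

For a point equidistant from its four surrounding nodes, the result reduces to the plain average of the four node elevations.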
Step 3-5: scanning all the equant points along the sight line CP' of the target point of the camera, and comparing the obtained elevation on the DEM with the corresponding sight line elevation: if the elevation of the point on the DEM is smaller than the elevation of the corresponding equal division point of the sight line, the next point is judged until the elevation of the first point on the DEM is larger than the elevation of the corresponding equal division point, and the coordinate of the obstacle point, namely the three-dimensional coordinate (x) of the epidemic wood target B is returnedB,yB,zB)。
A further embodiment is an image-based epidemic wood detection and positioning system, comprising:
a preprocessing module for extracting single-frame images from the video shot by the pan-tilt observation tower, preprocessing and labeling the images, and making a data set for network model training, validation and testing;
a network construction and training module for constructing the multi-scale candidate region fusion network and training the model on the data set obtained by preprocessing; and
a detection and positioning module for inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the output result, and calculating the specific geographic three-dimensional coordinates of the epidemic wood according to the epidemic wood three-dimensional positioning geometric model.
In particular, the network construction training module is configured to perform the following steps:
step 2-1: preprocessing the input image of the multi-scale candidate region fusion network, then scaling the image to a fixed 224 × 224 pixels, and finally taking the 224 × 224 pixel image together with its enlarged and reduced versions as the inputs of the multi-scale candidate region fusion network;
step 2-2: inputting the 224 × 224 pixel image into a convolutional neural network (ResNet32), the twofold-enlarged image into a convolutional neural network (ResNet18), and the twofold-reduced image into a convolutional neural network (ResNet50), to obtain the corresponding feature maps;
step 2-3: deriving different candidate regions from the feature maps: the region proposal network (RPN) fed by the ResNet18 feature map detects candidate regions for small targets and is the S-RPN; the RPN fed by the ResNet32 feature map detects medium targets and is the M-RPN; the RPN fed by the ResNet50 feature map detects large targets and is the L-RPN; redundant candidate regions are reduced using non-maximum suppression (NMS);
step 2-4: and (3) training the multi-scale candidate area fusion network on a deep learning machine through the data set obtained in the step (1), repeatedly adjusting parameters until the network is converged, and storing the trained network parameters.
Specifically, the detection positioning module is configured to perform the following steps:
step 3-1: obtaining the two-dimensional pixel coordinates of the target on the image by the target detection algorithm of step 2, and introducing a regular-grid Digital Elevation Model (DEM) to obtain the three-dimensional terrain coordinates of the pine forest;
step 3-2: establishing a geometric relation model according to the Digital Elevation Model (DEM) and the imaging principle of the camera, whose core is that the target detection algorithm determines the pixel coordinates of the target on the image, and then, according to the proportional relation of the distances from any point in the reference window to its boundaries, the corresponding coordinates A(x_A, y_A, z_A) of the target point on the reference window are obtained by solving;
step 3-3: according to the idea of the Janus visibility algorithm, calculating the three-dimensional coordinates of the k-th division point N_k(x_Nk, y_Nk, z_Nk) from the viewpoint C to the point P′;
step 3-4: according to the coordinates of the division points obtained in step 3-3, calculating the terrain elevation corresponding to the k-th division point from the elevation data of the four surrounding grid nodes by the distance weighting method, as shown below:

Z_k = (Σ_{i=1}^{n} Z_i / d_i) / (Σ_{i=1}^{n} 1 / d_i)

wherein n = 4, Z_i is the elevation of the i-th grid node surrounding the point corresponding to the k-th division point, and d_i is the distance from that grid node to the interpolation point;
step 3-5: scanning all division points along the sight line CP′ from the camera to the target point, and comparing the elevation obtained on the DEM with the corresponding sight-line elevation: if the elevation of the point on the DEM is smaller than the elevation of the corresponding division point on the sight line, the next point is examined, until the first point whose DEM elevation is larger than the corresponding division-point elevation is found; the coordinates of this obstacle point, namely the three-dimensional coordinates B(x_B, y_B, z_B) of the epidemic wood target, are returned.

Claims (10)

1. An image-based epidemic wood detection and positioning method is characterized by comprising the following steps:
step 1: video preprocessing: extracting single-frame images from the video shot by the pan-tilt observation tower, carrying out image preprocessing and labeling, and making a data set for network model training, validation and testing;
step 2: constructing a multi-scale candidate region fusion network, and training the model on the data set obtained in the step 1;
step 3: inputting the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the output result, and calculating the specific geographic three-dimensional coordinates of the epidemic wood according to the epidemic wood three-dimensional positioning geometric model.
2. The image-based epidemic wood detection and positioning method according to claim 1, characterized in that step 2 comprises the following specific steps:
step 2-1: preprocessing the input image of the multi-scale candidate region fusion network, then scaling the image to a fixed 224 × 224 pixels, and finally taking the 224 × 224 pixel image together with its enlarged and reduced versions as the inputs of the multi-scale candidate region fusion network;
step 2-2: inputting the 224 × 224 pixel image into a convolutional neural network (ResNet32), the twofold-enlarged image into a convolutional neural network (ResNet18), and the twofold-reduced image into a convolutional neural network (ResNet50), to obtain the corresponding feature maps;
step 2-3: deriving different candidate regions from the feature maps: the region proposal network (RPN) fed by the ResNet18 feature map detects candidate regions for small targets and is the S-RPN; the RPN fed by the ResNet32 feature map detects medium targets and is the M-RPN; the RPN fed by the ResNet50 feature map detects large targets and is the L-RPN; redundant candidate regions are reduced using non-maximum suppression (NMS);
step 2-4: and (3) training the multi-scale candidate area fusion network on a deep learning machine through the data set obtained in the step (1), repeatedly adjusting parameters until the network is converged, and storing the trained network parameters.
3. The image-based epidemic tree detection and positioning method according to claim 2, characterized in that the preprocessing of the input image of the multi-scale candidate region fusion network in step 2-1 includes data enhancement such as image flipping, rotation, color transformation and addition of Gaussian noise.
4. The image-based epidemic tree detection and positioning method according to claim 2, characterized in that reducing redundant candidate regions using non-maximum suppression (NMS) in step 2-3 specifically comprises the following steps:
for each RPN model, first assign each anchor box a positive or a negative label. Positive labels follow two principles: (1) the anchor with the highest IoU (intersection over union) with a ground-truth region; or (2) an anchor whose IoU with a ground-truth region is higher than 0.7. Negative labels are assigned to anchors whose IoU with every ground-truth region is below 0.3. Anchors with IoU scores between 0.3 and 0.7 are neither positive nor negative and play no role in the training of the RPN model;
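The anchor-labeling rules above can be sketched in a few lines of Python (hypothetical helper names; the 0.7/0.3 thresholds are those stated in the claim):

```python
import numpy as np

def iou(a, b):
    """IoU between two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchors(anchors, gt_boxes, hi=0.7, lo=0.3):
    """Return +1 (positive), -1 (negative) or 0 (ignored) for each anchor."""
    ious = np.array([[iou(a, g) for g in gt_boxes] for a in anchors])
    best = ious.max(axis=1)
    labels = np.zeros(len(anchors), dtype=int)   # 0 = ignored in RPN training
    labels[best > hi] = 1                        # rule (2): IoU above 0.7
    labels[best < lo] = -1                       # negatives: IoU below 0.3
    labels[ious.argmax(axis=0)] = 1              # rule (1): best anchor per ground truth
    return labels

anchors = [(0, 0, 10, 10), (5, 5, 15, 15), (40, 40, 50, 50)]
gt = [(4, 4, 14, 14)]
print(label_anchors(anchors, gt))  # anchor 1 is positive (best match), the others negative
```

Rule (1) is applied last so that every ground-truth box keeps at least one positive anchor even when no IoU exceeds 0.7.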
then, a multi-task loss function is adopted as the loss function for model training, defined as follows:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

where $i$ is the index of an anchor in a mini-batch, $p_i$ is the predicted probability that the candidate region generated by the $i$-th anchor is a target region, and $p_i^*$ is the ground-truth label of that region, with 0 representing negative and 1 representing positive. The classification loss $L_{cls}$ and the regression loss $L_{reg}$ are normalized by $N_{cls}$ and $N_{reg}$ respectively and balanced by the weight $\lambda$. A log loss is used for the classification loss $L_{cls}$, and a smooth-L1 loss is used for the regression loss $L_{reg}$. In bounding-box regression, the center coordinates and the width and height of the prediction box are regressed separately; the prediction box $t_i$ and the ground-truth box $t_i^*$ are defined as follows:

$$t_x = (x - x_a)/w_a, \quad t_y = (y - y_a)/h_a, \quad t_w = \log(w/w_a), \quad t_h = \log(h/h_a)$$

$$t_x^* = (x^* - x_a)/w_a, \quad t_y^* = (y^* - y_a)/h_a, \quad t_w^* = \log(w^*/w_a), \quad t_h^* = \log(h^*/h_a)$$

where $x$ and $y$ denote the center coordinates of the prediction box, and $w$ and $h$ its width and height; $x_a$, $y_a$ denote the center coordinates of the anchor, and $w_a$, $h_a$ its width and height; $x^*$, $y^*$ denote the center coordinates of the ground-truth box, and $w^*$, $h^*$ its width and height.
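The box parameterization above is the standard Faster R-CNN encoding; a short sketch (hypothetical helper names) shows the transform and verifies that decoding inverts encoding:

```python
import math

def encode_box(box, anchor):
    """Parameterize a box (cx, cy, w, h) relative to an anchor, as in the claim."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))

def decode_box(t, anchor):
    """Inverse transform: recover (cx, cy, w, h) from the regression targets."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (tx * wa + xa, ty * ha + ya, wa * math.exp(tw), ha * math.exp(th))

anchor = (50.0, 50.0, 20.0, 40.0)        # (cx, cy, w, h) of an anchor
gt = (54.0, 46.0, 30.0, 36.0)            # a ground-truth box
t_star = encode_box(gt, anchor)          # targets t* for the smooth-L1 loss
recovered = decode_box(t_star, anchor)   # should reproduce the ground-truth box
```

The log parameterization of width and height keeps the targets scale-invariant, which is why the same regression head can serve anchors of different sizes.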
5. The image-based epidemic tree detection and positioning method according to claim 1, characterized in that step 3 comprises the following specific steps:
step 3-1: obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm of step 2, and introduce a regular-grid digital elevation model (DEM) to obtain the terrain, i.e. the three-dimensional coordinates of the pine forest;
step 3-2: establish a geometric relation model from the digital elevation model (DEM) and the imaging principle of the camera; the core is that the target detection algorithm determines the pixel coordinates of the target on the image, and the coordinates $A(x_A, y_A, z_A)$ of the point on the reference window corresponding to the target point are then solved from the proportional relation between the distances from any point in the reference window to its boundaries;
step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th bisection point $N_k(x_{Nk}, y_{Nk}, z_{Nk})$ on the line from the viewpoint C to the point P';
step 3-4: according to the bisection point coordinates obtained in step 3-3, calculate the terrain elevation corresponding to the k-th bisection point from the elevation data of the four surrounding grid nodes by the distance-weighting method, as follows:

$$Z_k = \frac{\sum_{i=1}^{n} Z_i / d_i}{\sum_{i=1}^{n} 1 / d_i}$$

where n = 4, $Z_i$ is the elevation of a grid node surrounding the grid cell that contains the k-th bisection point, and $d_i$ is the distance from that grid node to the interpolation point;
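A minimal sketch of the distance-weighted interpolation of step 3-4, under the assumption that "distance weighting" means inverse-distance weights $1/d_i$ over the n = 4 surrounding grid nodes:

```python
def idw_elevation(nodes, point, eps=1e-9):
    """Inverse-distance-weighted elevation at `point` from (x, y, z) grid nodes."""
    num = den = 0.0
    for x, y, z in nodes:
        d = ((x - point[0]) ** 2 + (y - point[1]) ** 2) ** 0.5
        if d < eps:          # the point coincides with a grid node
            return z
        num += z / d
        den += 1.0 / d
    return num / den

# Four grid nodes of a 10 m DEM cell around the bisection point (5, 5):
nodes = [(0, 0, 100.0), (10, 0, 104.0), (0, 10, 108.0), (10, 10, 112.0)]
print(idw_elevation(nodes, (5, 5)))  # equidistant nodes -> plain average, 106.0
```

When the bisection point is equidistant from all four nodes the weights cancel and the result is the arithmetic mean, which is a quick sanity check on the formula.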
step 3-5: scan all bisection points along the line of sight CP' from the camera to the target point, and compare the elevation obtained on the DEM with the corresponding line-of-sight elevation: if the elevation of a point on the DEM is smaller than the elevation of the corresponding bisection point of the line of sight, proceed to the next point, until the first point whose elevation on the DEM is larger than the elevation of the corresponding bisection point is found; return the coordinates of this obstacle point, i.e. the three-dimensional coordinates $(x_B, y_B, z_B)$ of the epidemic tree target B.
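The line-of-sight scan of step 3-5 can be sketched as a linear search over the bisection points (hypothetical helper names; `dem_elevation` stands for the distance-weighted interpolation of step 3-4):

```python
def first_obstacle(bisection_points, dem_elevation):
    """Scan bisection points along the sight line C -> P' and return the first
    point whose terrain (DEM) elevation exceeds the sight-line elevation.

    `bisection_points` are (x, y, z_sight) tuples in order of increasing
    distance from the camera; `dem_elevation` is a callable (x, y) -> terrain
    elevation. Returns (x, y, terrain_z) of the obstacle, or None.
    """
    for x, y, z_sight in bisection_points:
        z_terrain = dem_elevation(x, y)
        if z_terrain > z_sight:      # terrain blocks the line of sight here
            return (x, y, z_terrain)
    return None                      # line of sight reaches P' unobstructed

# Toy example: sight line descending from 120 m, terrain step rising to 105 m.
points = [(i, 0.0, 120.0 - 5.0 * i) for i in range(1, 7)]
terrain = lambda x, y: 95.0 if x < 4 else 105.0
print(first_obstacle(points, terrain))
```

The first intersection of the sight line with the terrain is exactly the point where the detected tree stands, which is why its coordinates are returned as the target location B.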
6. The image-based epidemic tree detection and positioning method according to claim 5, characterized in that step 3-2 is specifically:
first, let the distance between the camera C and the reference window HEFG be R, the spatial position of the camera be $C(x_c, y_c, z_c)$, the pitch angle be α, the azimuth angle be β, and let the target-surface size of the lens, the focal length f, the field angle θ and other parameters be given. Using analytic geometry in space, first solve the model for the center point of the reference window to obtain its coordinates $S(x_S, y_S, z_S)$; then solve the model for the boundary points of the reference window to obtain the coordinates of its four boundary points $H(x_H, y_H, z_H)$, $E(x_E, y_E, z_E)$, $F(x_F, y_F, z_F)$, $G(x_G, y_G, z_G)$; finally, according to the proportional relation between the distances from the point on the reference window corresponding to the target point to the four boundaries, determined by the pixel coordinates of the target point found by the target detection algorithm on the image, solve the model for the corresponding point to obtain the coordinates $A(x_A, y_A, z_A)$ of the point on the reference window corresponding to the target point.
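Under the assumption that the proportional relation in step 3-2 amounts to bilinear interpolation between the four boundary points (the claim does not fix the corner layout, so the roles of H, E, G, F below are hypothetical), the mapping from a detected pixel to the reference window can be sketched as:

```python
def target_on_window(H, E, F, G, u, v, img_w, img_h):
    """Map the detected pixel (u, v) in an img_w x img_h image to a point A on
    the reference window, assuming H and E are the two top corners and G and F
    the two bottom corners."""
    s, t = u / img_w, v / img_h                 # fractional position in the image
    top = [h + s * (e - h) for h, e in zip(H, E)]   # point on the top edge
    bot = [g + s * (f - g) for g, f in zip(G, F)]   # point on the bottom edge
    return tuple(tp + t * (bp - tp) for tp, bp in zip(top, bot))

# Unit-square window at z = 0 (hypothetical corner coordinates):
H, E, G, F = (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)
A = target_on_window(H, E, F, G, 320, 240, 640, 480)
print(A)  # center pixel -> center of the window
```

The same two ratios u/img_w and v/img_h are the "corresponding proportional relation of the distances to the four boundaries" that the claim derives from the detection output.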
7. The image-based epidemic tree detection and positioning method according to claim 5, characterized in that step 3-3 is specifically: from the positions of the two spatial points $C(x_C, y_C, z_C)$ and $A(x_A, y_A, z_A)$, the equation of the spatial line CA is obtained as follows:

$$\frac{x - x_C}{x_A - x_C} = \frac{y - y_C}{y_A - y_C} = \frac{z - z_C}{z_A - z_C}$$

further, the coordinates $(x_{P'}, y_{P'}, z_{P'})$ of the intersection point P' of the line of sight CP' and the plane xOy are obtained. Following the idea of the Janus visibility algorithm, first calculate the translations Δx and Δy of the x and y coordinates from the viewpoint C to the point P', and take maxΔ = max{Δx, Δy}. According to the resolution m of the adopted regular-grid DEM, divide the line of sight CP' into n = int(maxΔ/m) equal parts, and denote the k-th bisection point by $N_k$ (k = 1, 2, ..., n). After this division, the single translation increments between point C and point P' in the x, y and z coordinate directions are $(x_{P'} - x_C)/n$, $(y_{P'} - y_C)/n$ and $-z_C/n$ respectively, so the three-dimensional coordinates of the k-th bisection point $N_k(x_{Nk}, y_{Nk}, z_{Nk})$ are:

$$x_{Nk} = x_C + k(x_{P'} - x_C)/n, \quad y_{Nk} = y_C + k(y_{P'} - y_C)/n, \quad z_{Nk} = z_C - k z_C/n$$
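The equal division of the sight line in step 3-3 can be sketched directly from the increments above (hypothetical helper name; assumes P' lies on the plane xOy, so its z coordinate is 0):

```python
def bisection_points(C, P_prime, m):
    """Equally divide the sight line C -> P' per the Janus visibility idea.

    C = (xC, yC, zC) is the camera, P_prime = (xP', yP', 0) the intersection
    with the plane xOy, and m the DEM grid resolution. Returns the n bisection
    points N_k, k = 1..n, where n = int(max(dx, dy) / m).
    """
    dx, dy = abs(P_prime[0] - C[0]), abs(P_prime[1] - C[1])
    n = int(max(dx, dy) / m)
    step = [(P_prime[i] - C[i]) / n for i in range(3)]   # z step is -zC/n
    return [tuple(C[i] + k * step[i] for i in range(3)) for k in range(1, n + 1)]

C = (0.0, 0.0, 30.0)      # camera 30 m above the plane
P = (100.0, 40.0, 0.0)    # intersection of the sight line with xOy
pts = bisection_points(C, P, 25.0)   # n = int(100 / 25) = 4 bisection points
print(pts[0], pts[-1])
```

Choosing n from the DEM resolution m guarantees that consecutive bisection points are at most one grid cell apart, so the scan in step 3-5 cannot step over a terrain obstacle.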
8. An image-based epidemic tree detection and positioning system, characterized by comprising:
a preprocessing module, configured to extract single-frame images from the video captured by a pan-tilt observation tower, preprocess and label the images, and construct a data set for network model training, validation and testing;
a network construction and training module, configured to construct a multi-scale candidate region fusion network and train the model on the data set obtained by preprocessing;
and a detection and positioning module, configured to input the image to be tested into the trained multi-scale candidate region fusion detection network to obtain the detection result, and to calculate the specific geographic three-dimensional coordinates of the epidemic tree according to the epidemic tree three-dimensional positioning geometric model.
9. The image-based epidemic tree detection and positioning system of claim 8, wherein the network construction and training module is configured to perform the following steps:
step 2-1: preprocess the input image of the multi-scale candidate region fusion network, then scale the image to a fixed size of 224 × 224 pixels, and finally take the 224 × 224 pixel image together with its enlarged and reduced copies as the inputs of the multi-scale candidate region fusion network;
step 2-2: input the 224 × 224 pixel image into a convolutional neural network (ResNet32), input the image enlarged to twice the size into a convolutional neural network (ResNet18), and input the image reduced to half the size into a convolutional neural network (ResNet50) to obtain the corresponding feature maps;
step 2-3: derive different candidate regions from the feature maps: the region proposal network (RPN) that detects candidate regions of small targets uses the ResNet18 feature map and is denoted S-RPN; the RPN that detects medium targets uses the ResNet32 feature map and is denoted M-RPN; the RPN that detects large targets uses the ResNet50 feature map and is denoted L-RPN; redundant candidate regions are reduced using non-maximum suppression (NMS);
step 2-4: train the multi-scale candidate region fusion network on a deep learning machine with the data set obtained by preprocessing, repeatedly adjusting the parameters until the network converges, and save the trained network parameters.
10. The image-based epidemic tree detection and positioning system of claim 8, wherein the detection and positioning module is configured to perform the following steps:
step 3-1: obtain the two-dimensional pixel coordinates of the target on the image with the target detection algorithm of step 2, and introduce a regular-grid digital elevation model (DEM) to obtain the terrain, i.e. the three-dimensional coordinates of the pine forest;
step 3-2: establish a geometric relation model from the digital elevation model (DEM) and the imaging principle of the camera; the core is that the target detection algorithm determines the pixel coordinates of the target on the image, and the coordinates $A(x_A, y_A, z_A)$ of the point on the reference window corresponding to the target point are then solved from the proportional relation between the distances from any point in the reference window to its boundaries;
step 3-3: following the idea of the Janus visibility algorithm, calculate the three-dimensional coordinates of the k-th bisection point $N_k(x_{Nk}, y_{Nk}, z_{Nk})$ on the line from the viewpoint C to the point P';
step 3-4: according to the bisection point coordinates obtained in step 3-3, calculate the terrain elevation corresponding to the k-th bisection point from the elevation data of the four surrounding grid nodes by the distance-weighting method, as follows:

$$Z_k = \frac{\sum_{i=1}^{n} Z_i / d_i}{\sum_{i=1}^{n} 1 / d_i}$$

where n = 4, $Z_i$ is the elevation of a grid node surrounding the grid cell that contains the k-th bisection point, and $d_i$ is the distance from that grid node to the interpolation point;
step 3-5: scan all bisection points along the line of sight CP' from the camera to the target point, and compare the elevation obtained on the DEM with the corresponding line-of-sight elevation: if the elevation of a point on the DEM is smaller than the elevation of the corresponding bisection point of the line of sight, proceed to the next point, until the first point whose elevation on the DEM is larger than the elevation of the corresponding bisection point is found; return the coordinates of this obstacle point, i.e. the three-dimensional coordinates $(x_B, y_B, z_B)$ of the epidemic tree target B.
CN202110821397.0A 2021-07-20 2021-07-20 Method and system for detecting and positioning epidemic trees based on images Pending CN113379738A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110821397.0A CN113379738A (en) 2021-07-20 2021-07-20 Method and system for detecting and positioning epidemic trees based on images


Publications (1)

Publication Number Publication Date
CN113379738A true CN113379738A (en) 2021-09-10

Family

ID=77582501


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870254A (en) * 2021-11-30 2021-12-31 中国科学院自动化研究所 Target object detection method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376637A (en) * 2018-10-15 2019-02-22 齐鲁工业大学 Passenger number statistical system based on video monitoring image processing
CN109685067A (en) * 2018-12-26 2019-04-26 江西理工大学 A kind of image, semantic dividing method based on region and depth residual error network
CN109977963A (en) * 2019-04-10 2019-07-05 京东方科技集团股份有限公司 Image processing method, unit and computer-readable medium
CN110909615A (en) * 2019-10-28 2020-03-24 西安交通大学 Target detection method based on multi-scale input mixed perception neural network
CN111027547A (en) * 2019-12-06 2020-04-17 南京大学 Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function
CN111160249A (en) * 2019-12-30 2020-05-15 西北工业大学深圳研究院 Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUNLING HOU et al.: "Multi-Scale Proposal Regions Fusion Network for Detection and 3D Localization of the Infected Trees", 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), pages 93-98 *
ZHU Lichao et al.: "Analysis of the conversion accuracy and efficiency from planar images to DQG pixels", Surveying and Mapping and Spatial Geographic Information, vol. 30, no. 1, pages 173-176 *
QIN Feifei et al.: "A synchronous remote-video tracking algorithm for forest fires based on a digital elevation model", Journal of Zhejiang A&F University, vol. 29, no. 5, pages 917-922 *


Similar Documents

Publication Publication Date Title
CN103337052B (en) Automatic geometric correcting method towards wide cut remote sensing image
CN110956187A (en) Unmanned aerial vehicle image plant canopy information extraction method based on ensemble learning
Wang et al. Bottle detection in the wild using low-altitude unmanned aerial vehicles
CN112347895A (en) Ship remote sensing target detection method based on boundary optimization neural network
CN115240093B (en) Automatic power transmission channel inspection method based on visible light and laser radar point cloud fusion
CN113610905B (en) Deep learning remote sensing image registration method based on sub-image matching and application
Hu et al. Research on a single-tree point cloud segmentation method based on UAV tilt photography and deep learning algorithm
CN111967337A (en) Pipeline line change detection method based on deep learning and unmanned aerial vehicle images
CN112907520A (en) Single tree crown detection method based on end-to-end deep learning method
Zhang et al. Deep learning based object distance measurement method for binocular stereo vision blind area
CN117409339A (en) Unmanned aerial vehicle crop state visual identification method for air-ground coordination
JP2023530449A (en) Systems and methods for air and ground alignment
CN115240089A (en) Vehicle detection method of aerial remote sensing image
CN104751451B (en) Point off density cloud extracting method based on unmanned plane low latitude high resolution image
Jiang et al. Learned local features for structure from motion of uav images: A comparative evaluation
Xinmei et al. Passive measurement method of tree height and crown diameter using a smartphone
CN115115954A (en) Intelligent identification method for pine nematode disease area color-changing standing trees based on unmanned aerial vehicle remote sensing
CN114581307A (en) Multi-image stitching method, system, device and medium for target tracking identification
CN113379738A (en) Method and system for detecting and positioning epidemic trees based on images
CN111476167B (en) One-stage direction remote sensing image target detection method based on student-T distribution assistance
CN110580468B (en) Single wood structure parameter extraction method based on image matching point cloud
CN117392382A (en) Single tree fruit tree segmentation method and system based on multi-scale dense instance detection
CN116994029A (en) Fusion classification method and system for multi-source data
CN116385477A (en) Tower image registration method based on image segmentation
CN114694022A (en) Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210910