CN115409691A

CN115409691A - Bimodal learning slope risk detection method integrating laser ranging and monitoring image

Info

Publication number: CN115409691A
Application number: CN202210809378.0A
Authority: CN
Inventors: 林耿; 陈开志; 董正山
Original assignee: Minjiang University
Current assignee: Minjiang University
Priority date: 2022-07-11
Filing date: 2022-07-11
Publication date: 2022-11-29

Abstract

The invention discloses a bimodal learning slope risk detection method that fuses laser ranging and monitoring images, and relates to the technical field of slope safety monitoring. The method first combines the image data of a monitoring camera with the three-dimensional position data acquired by a single-point laser range finder, fusing the image features with the laser ranging features. Laser ranging supplies the third dimension, which improves three-dimensional perception of the slope, increases the amount of available information and strengthens recognition. A bimodal network is then constructed and trained by multimodal learning to detect the types and regions of slope risks, in particular local risk changes, providing accurate, real-time early warning for a slope detection system and reducing the false alarm probability.

Description

Bimodal learning slope risk detection method integrating laser ranging and monitoring image

Technical Field

The invention relates to the technical field of slope safety monitoring, in particular to a bimodal learning slope risk detection method fusing laser ranging and monitoring images.

Background

Slopes are found at large construction sites throughout China, for example mountain slopes beside mines, roads and bridges, and slopes at hydropower stations. Slope safety monitoring tracks safety incidents and potential hazards on a slope and issues timely early warnings, which is important for production safety and environmental protection. As application scenarios multiply, it is no longer enough to judge whether a landslide or rockfall has occurred; surface cracks, ground bulging, settlement, collapse, building deformation and the like must also be detected. Many tasks require not only deciding whether a given risk exists but also locating the area at risk. The former corresponds to a classification task in machine learning, and the latter to a target detection task.

The means of monitoring a slope are also increasingly varied. Surveillance cameras are commonly used, with slope risk judged by image recognition. Laser ranging scanners are also commonly used, with slope risk judged by analyzing single-point or multi-point displacement data. In addition, there are various specialized devices such as displacement sensors, tilt sensors and strain sensors. Used alone, each of these means has its own shortcomings, such as poor resistance to noise interference. The specific shortcomings are as follows:

(1) Manual observation

The cost is high, continuous observation cannot be carried out, the measurement work is time-consuming and labor-consuming, and more importantly, the personal safety of field measurement personnel cannot be guaranteed.

(2) Surface sensor observation

Sensors such as displacement sensors, inclination sensors and stress sensors can be installed on the ground surface to collect data in real time. Their drawbacks are high cost, inconvenient installation, and the fact that the sensors are easily damaged when a slope risk event occurs.

(3) Remote observation device

1) Camera monitoring

Remote observation requires a human observer to assist in judgment. Risk factors are difficult to identify automatically with high precision, and small changes of detail such as cracks and scouring of the surface soil layer are especially difficult to identify accurately. In addition, recognition from a monitoring camera is strongly affected by illumination, weather and similar conditions.

2) Laser ranging device monitoring

Laser ranging equipment judges slope displacement by periodically polling and scanning the positions of measurement points. Because points are used to represent a surface, local changes of detail are difficult to detect. Moreover, under complex conditions such as slope rock, soil and vegetation, every ranging measurement contains error and the measured point itself may shift. Together, these two factors mean that directly comparing two successive scans of the same point produces a large number of false alarms. Even with techniques such as multi-point judgment and three-dimensional slope reconstruction, local changes of detail such as cracks, local ground bulging and settlement remain difficult to detect.

Disclosure of Invention

The invention aims to solve the technical problem of providing a bimodal learning slope risk detection method fusing laser ranging and monitoring images, which combines image data of a monitoring camera and three-dimensional position data acquired by a single-point laser range finder to perform multimodal learning, detects the types and regions of slope risks, particularly some local risk changes, provides accurate and real-time early warning service for a slope detection system and reduces the probability of false alarm.

In order to solve the technical problem, the invention is realized as follows:

a bimodal learning slope risk detection method fusing laser ranging and monitoring images comprises the following steps:

four corner datum point markers are arranged in a designated area of the slope, so that the four corner datum point markers can be clearly identified on a camera and can be used as four corner datum points for laser ranging; simultaneously carrying out data acquisition and image acquisition on the designated area through laser ranging equipment and a camera;

taking the interval point data acquired for the first time by the laser ranging equipment as reference data, eliminating outliers by using four corner reference points, and constructing a reference three-dimensional slope by using interpolation algorithm on the position data of the rest effective points; taking the interval point data acquired after the first time as monitoring data, obtaining a monitored three-dimensional slope surface by the same method, and then solving the difference value between the monitored three-dimensional slope surface and the reference three-dimensional slope surface to obtain a three-dimensional difference slope surface map;

establishing an affine transformation relation according to coordinates of four reference points on an acquired image and three-dimensional positions of the four reference points of the ranging equipment, projecting points on a three-dimensional difference slope map onto an acquired image plane by utilizing perspective transformation, and constructing three-dimensional slope difference values corresponding to all pixel points on the acquired image by using an interpolation algorithm to obtain a fusion aligned slope difference map;

labeling the slope risk category and the risk area on the image acquired by the camera for training; constructing a bimodal network, wherein the bimodal network comprises a landslide perception network, a first neural network, a first modal fusion network, a second neural network, a second modal fusion network and a perception fusion network; the inputs of the landslide sensing network and the first neural network are both fusion-aligned slope difference maps, and the input of the second neural network is a slope collected image; the perception fusion network is used for carrying out multi-mode fusion on the output ends of the first mode fusion network, the second mode fusion network and the landslide perception network; then training the training image and the slope difference image which is fused and aligned as the input of the bimodal network, and obtaining a bimodal learning slope risk detection model after training is completed;

and respectively acquiring images acquired by a camera and data acquired by laser ranging equipment during detection deployment, removing points outside the region and outliers according to the same data processing method during training to obtain images to be detected and a slope difference map in fusion alignment, inputting the images to be detected and the slope difference map into a bimodal learning slope risk detection model, and outputting the slope risk category and the region after detection and identification.
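The region-based exclusion used in the steps above can be sketched as follows. This is a minimal illustration only, assuming that the four corner reference points form a convex quadrilateral on the XY plane and are listed in order around its boundary; the function names and the sample coordinates are hypothetical and are not taken from the patent. It covers only the removal of points outside the region, not any further statistical outlier test.

```python
def inside_convex_quad(point, corners):
    """Return True if point (x, y) lies inside or on a convex quadrilateral.

    corners: four (x, y) tuples listed in order around the boundary
    (assumption: the reference markers form a convex region).
    """
    px, py = point
    signs = []
    for i in range(4):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % 4]
        # Cross product of edge AB with AP: its sign tells which side P is on.
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross != 0:
            signs.append(cross > 0)
    # The point is inside when it is on the same side of every edge.
    return all(signs) or not any(signs)


def keep_points_in_region(points, corners):
    """Keep only (x, y, z) scan points whose XY projection is in the region."""
    return [p for p in points if inside_convex_quad((p[0], p[1]), corners)]


# Hypothetical corner markers (meters) and scan points (x, y, z).
corners = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
scan = [(1.0, 1.0, 5.2), (12.0, 3.0, 4.9), (5.0, 7.5, 6.1), (-0.5, 4.0, 5.0)]
valid = keep_points_in_region(scan, corners)
print(valid)
```

The same filter would be applied to the reference scan and to every later monitoring scan, so that both surfaces are built from points in the same region.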

Further, establishing an affine transformation relation according to coordinates of the four reference points on the training image and three-dimensional positions of the four reference points of the ranging device, specifically comprising:

taking as input the XY-plane positions (X[i], Y[i]) of the four reference points in the laser ranging data and the corresponding positions (x[i], y[i]) on the image, and solving a perspective matrix H; the matrix is 3 × 3 with 8 unknowns in total, so a standard solution of the perspective matrix requires 8 equations:

the equation constructed for a pair of corresponding points is:

the four pairs of points have 8 equations in total, so that 8 parameters in the matrix H are solved;

and transforming the points on the XY plane of the laser ranging to the XY plane of the image through the transformation formula, wherein the points corresponding to the pixels on the image have difference data of a three-dimensional slope.
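The equations referred to above are not reproduced in this text. As a hedged reconstruction, the standard form of a planar perspective transformation is given below. It assumes the common normalization h_33 = 1, which leaves 8 unknowns, and writes (X_i, Y_i) for a reference point on the laser-ranging XY plane and (x_i, y_i) for the same point on the image. The symbols h_jk and w_i are notation introduced here and may differ from the notation of the original filing.

```latex
w_i \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}
= H \begin{bmatrix} X_i \\ Y_i \\ 1 \end{bmatrix},
\qquad
H = \begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & 1
\end{bmatrix}

x_i = \frac{h_{11} X_i + h_{12} Y_i + h_{13}}{h_{31} X_i + h_{32} Y_i + 1},
\qquad
y_i = \frac{h_{21} X_i + h_{22} Y_i + h_{23}}{h_{31} X_i + h_{32} Y_i + 1}
```

Multiplying each fraction through by its denominator gives two equations per point pair that are linear in the entries of H, so four pairs yield the 8 equations needed for the 8 unknowns.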

Further, the construction method of the three-dimensional difference slope map comprises the following steps:

step 1, on the XY plane, constructing an interpolation point network Tnet over the X-axis interval [Xmin, Xmax] and the Y-axis interval [Ymin, Ymax], with one point every t meters in the X and Y directions, wherein t is a set value and is greater than 0;

step 2, inputting the slope scanning point data from which out-of-region points and outliers have been removed, projecting the data points onto the XY plane, and constructing a Delaunay triangulation network by using a Delaunay triangulation algorithm;

step 3, determining which triangle of the Delaunay triangulation network each Tnet point falls in, computing the plane equation Z = f(x, y) from the xyz coordinates of the three vertices of that triangle, substituting the xy value of the Tnet interpolation point into Z = f(x, y) to obtain the corresponding Z value, and traversing all points in Tnet to compute their Z values;

step 4, replacing the xy value of each Tnet point with its index coordinate to obtain a new data point set (m, n, z) of interpolation points.
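Steps 3 and 4 can be illustrated with the following sketch. It assumes that the triangle list is already available (a library routine such as scipy.spatial.Delaunay could supply it) and evaluates the plane through the three vertices by barycentric weights, which gives the same value as substituting (x, y) into the plane equation Z = f(x, y). All names and sample values are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of XY point p in triangle a, b, c (XY parts only)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        return None  # degenerate (zero-area) triangle
    l1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return l1, l2, 1.0 - l1 - l2


def interpolate_z(p, points, triangles, eps=1e-9):
    """Find the triangle containing p and evaluate its plane at p.

    points: list of (x, y, z); triangles: list of vertex index triples.
    Returns None when p lies outside every triangle.
    """
    for i, j, k in triangles:
        a, b, c = points[i], points[j], points[k]
        w = barycentric(p, a, b, c)
        if w is not None and min(w) >= -eps:
            # Weighted vertex heights equal the plane Z = f(x, y) at p.
            return w[0] * a[2] + w[1] * b[2] + w[2] * c[2]
    return None


# Hypothetical scan points on the plane z = 1 + x + 2y, and a triangulation.
points = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0), (0.0, 2.0, 5.0), (2.0, 2.0, 7.0)]
triangles = [(0, 1, 2), (1, 3, 2)]

# Steps 1 and 4: a grid with spacing t, stored by index coordinates (m, n).
t, xmin, ymin = 1.0, 0.0, 0.0
grid = [(m, n, interpolate_z((xmin + m * t, ymin + n * t), points, triangles))
        for m in range(3) for n in range(3)]
print(grid)
```

The linear search over triangles keeps the sketch short; a practical implementation would normally use the point-location routine of the triangulation library instead.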

Further, the bimodal network further comprises: the landslide sensing network comprises a slicing operation module, a convolution module and three full-connection modules which are connected in sequence, and the output of the landslide sensing network is a sensing coefficient; the perception fusion network is a full-connection module, and the first neural network and the second neural network are yolov5 networks.

The technical scheme of the embodiment of the invention at least has the following technical effects or advantages:

1. The image features and the laser ranging features are fused, which is equivalent to having three-dimensional information, so the amount of information is larger and the recognition capability is stronger:

Landslides, cracks, deformation and other slope changes are three-dimensional processes. When they are mapped onto the two-dimensional image plane captured by a camera, much information is lost, which makes detection difficult. Laser ranging supplies the third dimension and improves stereoscopic perception of the slope, and with a suitable algorithm the recognition of the various slope risks can be improved.

2. Automatic feature extraction replaces manually designed features; the optimal features are extracted automatically in a data-driven way, which is more accurate and reliable:

A major advantage of deep learning is that the features most effective for classification and detection are found automatically from the data. Data-driven automatic feature extraction largely avoids the blindness and arbitrariness of traditional manually designed feature extraction. In particular, as the number of slope risk types grows, hand-crafted features become increasingly inadequate, whereas the data-driven approach only requires additional training samples.

3. Complex noise processing is avoided, empirically tuned parameters such as thresholds are avoided, and the algorithm adapts better:

When traditional machine learning methods are used for slope risk classification or region localization, noise interference can seriously degrade the final performance, and various complex noise filters usually have to be designed before feature extraction. Such designs are tied to some degree to empirical knowledge and preset scenarios, and they tend to fail when that knowledge or those scenarios do not hold in practice.

In addition, various parameters of the feature extractor and the final classifier have to be set, and to keep the algorithm simple, experimental or even empirical values are usually used as presets. The soundness, effectiveness and robustness of such parameters remain to be tested.

By contrast, a data-driven deep learning method can automatically focus on the features most central to classification, automatically reduces the influence of noise, and has few preset parameter values, which improves the effectiveness of the algorithm.

4. For a new target detection task in a new scenario, the requirement can be met simply by adding labeled samples and retraining, which improves the extensibility of the model:

A major advantage of deep learning is that the neural network is trained automatically from data. For a new scenario, the model can be updated iteratively simply by adding training samples from that scenario, which greatly improves the extensibility of the model.

The above description is only an overview of the technical solutions of the present invention, and the present invention can be implemented in accordance with the content of the description so as to make the technical means of the present invention more clearly understood, and the above and other objects, features, and advantages of the present invention will be more clearly understood.

Drawings

The invention will be further described with reference to the following examples and figures.

FIG. 1 is a flow chart of a method of an embodiment of the present invention;

FIG. 2 is a flow chart of a multi-modal detection training process according to an embodiment of the present invention;

FIG. 3 is a flowchart of an interpolation algorithm to reconstruct a three-dimensional map of a slope surface according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating a bimodal network architecture according to an embodiment of the present invention;

FIG. 5 is a flow chart of multi-modal detection classification according to an embodiment of the present invention.

Detailed Description

The embodiment of the invention provides a bimodal learning slope risk detection method integrating laser ranging and monitoring images, and the multimode learning is carried out by combining image data of a monitoring camera and three-dimensional position data acquired by a single-point laser range finder, so that the types and regions of slope risks, particularly some local risk changes, are detected, accurate and real-time early warning service is provided for a slope detection system, and the false alarm probability is reduced.

The technical scheme in the embodiment of the invention has the following general idea:

The algorithm mainly comprises two processes: a training process and a classification (detection) process. First, labeled training samples are collected, the image and ranging multimodal data are aligned, and a model is trained as shown in FIG. 2; the trained model is then provided to the detection process.

The main key modules in the training process are a three-dimensional slope map construction method (including a three-dimensional interpolation method), perspective transformation of image reference points and ranging reference points, and a multi-mode learning method, which are respectively described as follows.

1. A three-dimensional slope map construction method (including a three-dimensional interpolation method) is shown in FIG. 3 and Table 1.

TABLE 1 coordinate matrix of interpolation points of XY plane interpolation point network Tnet

Note: the corresponding index coordinate of the matrix is (m, n), m is an integer between [0, (Xmax-Xmin)/t ], and n is an integer between [0, (Ymax-Ymin)/t ].
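The relation between the index coordinates (m, n) and the plane coordinates of Tnet can be sketched as follows. The function name build_tnet and the sample extents are hypothetical; the small tolerance added before truncation is an assumption made here to guard against floating-point rounding.

```python
def build_tnet(xmin, xmax, ymin, ymax, t):
    """Return {(m, n): (x, y)} for grid nodes spaced t meters apart.

    m runs over the integers in [0, (Xmax - Xmin) / t] and
    n over the integers in [0, (Ymax - Ymin) / t].
    """
    if t <= 0:
        raise ValueError("t must be greater than 0")
    m_max = int((xmax - xmin) / t + 1e-9)
    n_max = int((ymax - ymin) / t + 1e-9)
    return {(m, n): (xmin + m * t, ymin + n * t)
            for m in range(m_max + 1)
            for n in range(n_max + 1)}


# Hypothetical region of 2 m by 1 m sampled every 0.5 m.
tnet = build_tnet(0.0, 2.0, 0.0, 1.0, 0.5)
print(len(tnet), tnet[(3, 1)])
```

Storing the nodes by (m, n) means the interpolated heights can later be treated as a regular two-dimensional array, independent of the metric coordinates.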

2. Perspective transformation of image reference points and ranging reference points:

The four reference corner points are calibrated in both sets of data. Laser ranging yields a three-dimensional position in space, in which the x and y coordinates correspond to the ground plane and the z axis corresponds to the height of a point relative to the ranging equipment. The xy plane defined by the four corner points in the ranging space is therefore put into correspondence with the plane of the monitoring picture, and the slope difference value for each pixel of the picture is then computed by interpolation, which aligns and fuses the image with the slope difference map. A region where the slope image changes is also a region where the laser ranging information changes.

The image plane and the ranging plane are aligned by perspective transformation, given the known positions of the four reference points. The specific method is as follows:

the equation constructed for a pair of corresponding points is:

the four pairs of points have 8 equations in total, so that 8 parameters in a matrix H are solved;

The points on the XY plane of the laser ranging are transformed to the XY plane of the image by this transformation formula, so that the points corresponding to pixels on the image carry difference data of the three-dimensional slope. The difference data of the three-dimensional slope for all pixel points are then interpolated using the Delaunay-triangle-based interpolation technique. At this point, the alignment of the two modalities of data is complete.
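Solving the perspective matrix from four point pairs can be sketched as below. This is an illustration under stated assumptions, not the implementation of the filing: the entry h33 is fixed to 1, the 8 remaining parameters are found by plain Gaussian elimination, and the function names and coordinates are hypothetical. Library routines for the same task exist (for example OpenCV's getPerspectiveTransform).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def solve_perspective(src, dst):
    """Solve the 8 parameters of H (h33 = 1) from four point pairs.

    src: ranging-plane points (X, Y); dst: image points (x, y).
    Each pair contributes two linear equations.
    """
    a, b = [], []
    for (X, Y), (x, y) in zip(src, dst):
        a.append([X, Y, 1.0, 0.0, 0.0, 0.0, -x * X, -x * Y])
        b.append(x)
        a.append([0.0, 0.0, 0.0, X, Y, 1.0, -y * X, -y * Y])
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, X, Y):
    """Map a ranging-plane point to image coordinates (perspective division)."""
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)


# Hypothetical reference points: meters on the ranging plane, pixels on the image.
src = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
dst = [(100.0, 400.0), (540.0, 420.0), (500.0, 120.0), (130.0, 100.0)]
H = solve_perspective(src, dst)
print(project(H, 5.0, 4.0))
```

Once H is known, every node of the difference slope map can be projected with project, and the per-pixel values follow from interpolating between the projected nodes.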

3. Model training

1) The processing above yields fusion-aligned data (images and the slope difference values of the points corresponding to image pixels). A large number of paired data samples are collected, risk types and regions are labeled on the images, and the training sample data are prepared.

2) Because the data positions of the two modalities are only approximately aligned, a late-fusion strategy for multimodal data is used: one network layer is added after each of the two target detection branches as a modal fusion network, and the perception network uses features extracted from the reconstructed difference slope map, which carries the more accurate detection information, as perception information for adjustment. The trained model is obtained from a large amount of labeled data using the bimodal network designed as shown in FIG. 4, in which Focus, Conv and Linear are respectively the slicing operation module, the convolution module and the fully connected module of yolov5.

4. Model detection

The trained model is then deployed. At deployment, data are processed by the same method as during training, except that the data labeling step is omitted. As shown in FIG. 5, the preprocessed data are input, and after detection and identification the slope risk category and region are output.

The implementation of a specific embodiment of the present invention is shown in fig. 1:

a laser ranging and monitoring image fused bimodal learning slope risk detection method comprises the following steps:

four corner reference point markers are arranged in a slope designated area, so that the four corner reference point markers can be clearly identified on a camera and can be used as four corner reference points for laser ranging; and simultaneously carrying out data acquisition and image acquisition on the designated area through laser ranging equipment and a camera.

Taking the interval point data acquired for the first time by the laser ranging equipment as reference data, eliminating outliers by using four corner reference points, and constructing a reference three-dimensional slope by using interpolation algorithm on the position data of the rest effective points; and taking the interval point data acquired after the first time as monitoring data, obtaining a monitored three-dimensional slope surface by the same method, and then solving the difference value between the monitored three-dimensional slope surface and the reference three-dimensional slope surface to obtain a three-dimensional difference slope surface map.
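The differencing of the monitored surface against the reference surface can be sketched as follows, assuming both surfaces have been interpolated onto the same index grid (m, n). The sign convention (monitored minus reference), the handling of missing nodes and all names are assumptions made for this illustration.

```python
def difference_surface(reference, monitored):
    """Per-node height difference on a shared (m, n) grid.

    reference, monitored: dicts mapping (m, n) to a height z, or to None
    where no height could be interpolated. The result holds
    monitored - reference, or None where either value is missing.
    """
    diff = {}
    for key, z_ref in reference.items():
        z_mon = monitored.get(key)
        if z_ref is None or z_mon is None:
            diff[key] = None
        else:
            diff[key] = z_mon - z_ref
    return diff


# Hypothetical 2 x 2 surfaces (heights in meters).
reference = {(0, 0): 5.0, (0, 1): 5.5, (1, 0): 6.0, (1, 1): None}
monitored = {(0, 0): 5.0, (0, 1): 5.25, (1, 0): 6.5, (1, 1): 6.2}
diff_map = difference_surface(reference, monitored)
print(diff_map)
```

Nodes that stay at zero indicate no measured change, while nonzero nodes mark candidate areas of settlement or bulging that the network can weigh against the image evidence.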

Establishing an affine transformation relation according to coordinates of the four reference points on the acquired image and three-dimensional positions of the four reference points of the ranging device, projecting points on the three-dimensional difference slope map onto an acquired image plane by utilizing perspective transformation, and constructing three-dimensional slope difference values corresponding to all pixel points on the acquired image by using an interpolation algorithm to obtain a fusion aligned slope difference map.

Labeling the slope risk category and the risk area on the image acquired by the camera for training; as shown in fig. 4, a bimodal network is constructed, which includes a landslide perception network, a first neural network, a first modal fusion network, a second neural network, a second modal fusion network, and a perception fusion network; the inputs of the landslide sensing network and the first neural network are both fusion-aligned slope difference maps, and the input of the second neural network is a slope collected image; the perception fusion network is used for performing multi-mode fusion on the output ends of the first mode fusion network, the second mode fusion network and the landslide perception network; and then training the training image and the slope difference image which is fused and aligned as the input of the bimodal network, and obtaining a bimodal learning slope risk detection model after training is completed.

The bimodal network further comprises: the landslide sensing network comprises a slicing operation module (Focus), a convolution module (Conv) and three full-connection modules (Linear) which are sequentially connected, and the output of the landslide sensing network is a sensing coefficient; the perception fusion network is a full-connection module, and the first neural network and the second neural network are yolov5 networks.
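The slicing operation at the input of the landslide sensing network can be illustrated as follows, assuming it behaves like the Focus layer of yolov5, which samples every second pixel at four offsets and stacks the four sub-maps as channels. The sketch uses nested lists for a single-channel map; the function name is hypothetical, and the convolution and fully connected modules are not shown.

```python
def focus_slice(image):
    """Split an H x W map (H and W even) into four H/2 x W/2 sub-maps.

    The sub-maps take every second pixel starting at (row, column)
    offsets (0, 0), (1, 0), (0, 1) and (1, 1), so no value is discarded:
    spatial resolution is halved and the channel count is multiplied by 4.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row offset, column offset)
    return [[row[dc::2] for row in image[dr::2]] for dr, dc in offsets]


# Hypothetical 4 x 4 slope difference map with values 0..15.
diff_map = [[r * 4 + c for c in range(4)] for r in range(4)]
channels = focus_slice(diff_map)
print(channels)
```

In the network described above, these stacked channels would be passed to the convolution module and then to the three fully connected modules that produce the sensing coefficient.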

And acquiring images acquired by a camera and data acquired by laser ranging equipment during detection deployment, excluding points outside the region and outliers according to the same data processing method during training, acquiring images to be detected and a slope difference map fused and aligned, inputting the images to be detected and the slope difference map into a bimodal learning slope risk detection model, and outputting slope risk categories and regions after detection and identification.

In a possible implementation manner, establishing an affine transformation relationship according to coordinates of four reference points on a training image and three-dimensional positions of the four reference points of a ranging device specifically includes:

taking as input the XY-plane positions (X[i], Y[i]) of the four reference points in the laser ranging data and the corresponding positions (x[i], y[i]) on the image, and solving a perspective matrix H; the matrix is 3 × 3 with 8 unknowns in total, so a standard solution of the perspective matrix requires 8 equations:

the equation constructed for a pair of corresponding points is:

As shown in fig. 3, the method for constructing the three-dimensional difference slope map includes:

step 2, inputting the slope scanning point data from which out-of-region points and outliers have been removed, projecting the data points onto the XY plane, and constructing a Delaunay triangulation network by using a Delaunay triangulation algorithm;

step 4, replacing the xy value of each Tnet point with its index coordinate to obtain a new data point set (m, n, z) of interpolation points.

In summary, the bimodal learning slope risk detection method fusing laser ranging and monitoring images first combines the image data of a monitoring camera with the three-dimensional position data acquired by a single-point laser range finder, fusing the image features with the laser ranging features. Laser ranging supplies the third dimension, which improves three-dimensional perception of the slope, increases the amount of available information and strengthens recognition. A bimodal network is then constructed and trained by multimodal learning to detect the types and regions of slope risks, in particular local risk changes, providing accurate, real-time early warning for a slope detection system and reducing the false alarm probability.

Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims

1. A bimodal learning slope risk detection method fusing laser ranging and monitoring images is characterized by comprising the following steps:

taking the interval point data acquired for the first time by the laser ranging equipment as reference data, excluding outliers from the area by using coordinates of four corner reference points, and constructing a reference three-dimensional slope by using interpolation algorithm on the position data of the rest effective points; taking the interval point data acquired after the first time as monitoring data, obtaining a monitored three-dimensional slope surface by the same method, and then solving the difference value between the monitored three-dimensional slope surface and the reference three-dimensional slope surface to obtain a three-dimensional difference slope surface map;

2. The method of claim 1, wherein: establishing an affine transformation relation according to coordinates of the four reference points on the training image and three-dimensional positions of the four reference points of the ranging device, and specifically comprising the following steps of:

the equation constructed for a pair of corresponding points is:

and transforming the point on the XY plane of the laser ranging to the XY plane of the image through the transformation formula, so that the corresponding point of the pixel on the image has difference data of the three-dimensional slope.

3. The method according to claim 1 or 2, characterized in that: the construction method of the three-dimensional difference slope map comprises the following steps:

step 1, on the XY plane, constructing an interpolation point network Tnet over the X-axis interval [Xmin, Xmax] and the Y-axis interval [Ymin, Ymax], with one point every t meters in the X and Y directions, wherein t is a set value and is greater than 0;

4. The method of claim 1, wherein: the bimodal network further comprises: the landslide sensing network comprises a slicing operation module, a convolution module and three full-connection modules which are connected in sequence, and the output of the landslide sensing network is a sensing coefficient; the perception fusion network is a full-connection module, and the first neural network and the second neural network are yolov5 networks.