CN111582017B

CN111582017B - Video monitoring-oriented end-to-end geological disaster automatic identification method, system and application

Info

Publication number: CN111582017B
Application number: CN202010211669.0A
Authority: CN
Inventors: 刘敦龙; 吴非; 张少杰; 胡凯衡; 何磊; 唐聃
Original assignee: Institute of Optics and Electronics of CAS; Chengdu University of Information Technology; Institute of Mountain Hazards and Environment IMHE of CAS
Current assignee: Institute of Optics and Electronics of CAS; Chengdu University of Information Technology; Institute of Mountain Hazards and Environment IMHE of CAS
Priority date: 2020-03-24
Filing date: 2020-03-24
Publication date: 2021-03-16
Anticipated expiration: 2040-03-24
Also published as: CN111582017A

Abstract

The invention belongs to the technical field of geological disaster prevention and disaster reduction, and discloses a video monitoring-oriented end-to-end geological disaster automatic identification method, a system and application, wherein a video is input into an H-C3D network to extract features, a 4096-dimensional 3D feature is extracted from each 16 frames, and the features are combined into 32 x 4096-dimensional features according to a time sequence; classifying videos in a weak supervision mode, wherein geological disaster videos are classified into one type, and non-geological disaster videos are classified into one type; building and training a geological disaster classification recognition network R-Net to obtain a training model; judging the early warning level of the geological disaster by converting the characteristics into abnormal scores; extracting 3D features from the video through an H-C3D network, combining the features into 32 x 4096 dimensions according to a time sequence, sending the combined features into a trained R-Net network, and acquiring abnormal scores in real time along with video frames; and dividing the abnormal score into five intervals to obtain the early warning level of the geological disaster. The invention realizes the automatic judgment of the early warning level of the geological disaster.

Description

Video monitoring-oriented end-to-end geological disaster automatic identification method, system and application

Technical Field

The invention belongs to the technical field of geological disaster prevention and reduction, and particularly relates to an end-to-end geological disaster automatic identification method and system for video monitoring and application.

Background

Geological disasters are disastrous geological events caused by various geological processes during the development and evolution of the earth. Their distribution in time and space is governed not only by the natural environment but also by human activities, and is often the result of interaction between humans and the natural world. They are natural disasters caused mainly by geological dynamic activity or an abnormal geological environment: under the action of internal, external or man-made geological forces, the earth undergoes abnormal energy release, material movement, deformation and displacement of rock and soil bodies, and abnormal environmental change, which harm human life and property, disrupt living and economic activities, or destroy the resources and environment on which humans depend. Such adverse geological phenomena, whether formed by natural geological processes or by human activities, degrade the geological environment, reduce environmental quality, directly or indirectly endanger human safety, and cause losses to social and economic construction. Examples include collapse, landslide, debris flow, ground fissure, ground subsidence, rock burst, water inrush in underground tunnels, mud inrush, gas outburst, spontaneous combustion of coal seams, loess collapse, rock-soil expansion, sandy soil liquefaction, land freeze-thaw, soil erosion, land desertification and swamping, soil salinization, earthquakes, volcanoes, and geothermal damage.

Currently, the closest prior art is as follows: monitoring and early warning of geological disasters such as landslides and debris flows are realized by technical means such as rainfall, micro-seismic, ground sound, infrasound, surface displacement, mud level and mechanical monitoring. Existing video monitoring of geological disasters only records the process of the disaster; it cannot automatically analyze the video to provide on-site real-time monitoring and early warning.

In summary, the problems of the prior art are as follows:

(1) At present, geological disaster video monitoring technology only collects video data of the field scene; relevant personnel can review the video by playback when necessary to see how the site developed before and after the geological disaster.

(2) The C3D network can be used to extract 3D features from video, but it is mainly intended for videos of human behavior and is not directly suitable for the field of geological disasters.

(3) To apply the C3D network to the extraction of 3D features from geological disaster videos, the existing C3D network must be modified and optimized, an appropriate number of geological disaster videos must be added, and the network must be retrained to obtain one suitable for extracting the 3D features of geological disaster scenes.

The difficulty of solving the technical problems is as follows:

(1) The existing C3D network must be modified and optimized into a network suitable for extracting the 3D features of geological disaster scenes.

(2) Because each video yields multiple segment-level feature predictions, the loss function is redesigned using the idea of multi-example (multiple-instance) learning, so that during training the geological disaster classification recognition network R-Net can be optimized to convergence by traversing all the feature predictions in each round.

(3) As the number of training iterations grows, the loss function may cause the parameters of the geological disaster classification recognition network R-Net to over-fit, leading to poor test performance. The method therefore adds regularization terms to the loss function to constrain the network parameters and prevent over-fitting; specifically, a temporal smoothness constraint and a spatial smoothness constraint are added to the loss function.

The significance of solving these technical problems is as follows: the invention makes full use of video monitoring data in the field of geological disasters, provides real-time automatic monitoring and early warning, and thereby serves geological disaster prevention and reduction, in contrast to the traditional practice of calling up the video after a geological disaster has occurred to review how the situation developed.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a video monitoring-oriented end-to-end geological disaster automatic identification method, a system and application.

The invention is realized in such a way that the method for automatically identifying the video monitoring end-to-end geological disaster comprises the following steps:

firstly, transforming and optimizing a C3D model to obtain a network (H-C3D) suitable for extracting 3D features of geological disaster scenes, inputting videos into the H-C3D network to extract the features, extracting one 4096-dimensional 3D feature every 16 frames, and combining all the features into 32 x 4096-dimensional features according to a time sequence;

secondly, classifying the videos in a weakly supervised manner, wherein geological disaster videos form one class and non-geological disaster videos form another; defining a multi-example learning method and its loss function, building a geological disaster classification and identification network R-Net, and training R-Net with the AdaGrad gradient descent algorithm to obtain a trained model, which determines the early warning level of the geological disaster by converting 3D features into abnormal scores;

thirdly, when the method is applied, 3D features are extracted from the video through an H-C3D network, the features are combined into 32 x 4096 dimensions according to time series, the combined features are sent to a trained R-Net, and abnormal scores can be obtained in real time along with video frames;

and fourthly, dividing the abnormal score into five intervals according to observation statistics of a large amount of video data, respectively corresponding to five early warning grades (table 1) of the geological disaster, and obtaining the early warning grade of the geological disaster according to the interval where the abnormal score is located.

Further, the video monitoring-oriented end-to-end geological disaster automatic identification method extracts the 3D characteristics of geological disaster/non-geological disaster videos: based on a C3D network trained by a large-scale data set, a proper amount of geological disaster videos are added by using the idea of transfer learning, and the C3D network is modified and optimized to obtain a network (H-C3D) suitable for extracting the 3D characteristics of geological disaster scenes.

Further comprising: taking a 16-frame image sequence as a unit, the images are cut to a size of 128 × 117, the length and width are adjusted internally to 112 × 112, and 50% of the total data volume (that is, half of the images) is horizontally flipped to adapt to temporal and spatial jitter; the H-C3D network consists of 5 convolutional layers, 5 pooling layers and 1 fully connected layer; the numbers of filters in convolutional layers 1 to 5 are 64, 128, 256, 512 and 512, respectively; each convolutional layer uses 3D convolution kernels of 3 x 3 with a stride of 1 and is followed by a pooling layer, the first pooling layer having a size of 1 x 2 and each subsequent pooling layer a size of 2 x 2; the output dimension of the fully connected layer is 4096.

Further, the video monitoring-oriented end-to-end geological disaster automatic identification method uses the 3D features extracted by the H-C3D network to train the geological disaster classification identification network R-Net: after video features are extracted in units of 16 frames, 32 features are used for training the R-Net network; if the video has fewer than 16 × 32 frames, the earlier features are repeated until there are 32; if the video has more than 16 × 32 frames, the features are arranged in time order and 32 features are taken at equal intervals. These 32 x 4096-dimensional 3D features are fed into the R-Net for training.
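The rule for obtaining exactly 32 segment features per video can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the text does not say how "repeating the earlier features" is ordered (cycling from the first feature is assumed here), and the equal-interval rule is taken to mean evenly spaced indices.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

NUM_SEGMENTS = 32  # number of segment features per video, as stated in the text


def select_segment_indices(num_clips: int, num_segments: int = NUM_SEGMENTS) -> List[int]:
    """Return the indices of the 16-frame clip features to keep.

    num_clips is the number of 16-frame clips extracted from the video.
    """
    if num_clips <= 0:
        raise ValueError("the video must contain at least one 16-frame clip")
    if num_clips < num_segments:
        # Assumption: short videos are padded by cycling through the
        # existing features from the start until 32 are available.
        return [i % num_clips for i in range(num_segments)]
    # Long videos: take evenly spaced indices in time order.
    return [(i * num_clips) // num_segments for i in range(num_segments)]


def to_fixed_length(features: Sequence[T], num_segments: int = NUM_SEGMENTS) -> List[T]:
    """Pick or repeat clip features so that exactly num_segments remain."""
    indices = select_segment_indices(len(features), num_segments)
    return [features[i] for i in indices]
```

For a video with 64 clips this keeps every second clip feature; for a video with 10 clips it repeats the 10 features in order until 32 are collected.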

Further, the multi-example learning loss function for training the geological disaster classification recognition network R-Net is as follows:

wherein k is the total number of samples, w is the network parameter of R-Net, and the score is the value output by R-Net for a feature segment; A is the set of geological disaster video feature segments and N is the set of non-geological disaster video feature segments. A temporal smoothness constraint and a spatial smoothness constraint are then added:

wherein λ₁ and λ₂ are hyper-parameters and n is the total number of feature segments in one video; R-Net is trained with this loss function in combination with the AdaGrad gradient descent algorithm to obtain a trained model and its parameters.
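For readers unfamiliar with AdaGrad, the update rule it applies can be sketched on a toy problem. The text gives no learning rate or other optimizer settings, so the values below (lr, eps, steps) and the function name adagrad_minimize are illustrative assumptions, and the quadratic objective merely stands in for the R-Net loss.

```python
import math
from typing import Callable, List


def adagrad_minimize(
    grad_fn: Callable[[List[float]], List[float]],
    w0: List[float],
    lr: float = 0.5,      # hypothetical learning rate
    eps: float = 1e-8,    # small constant to avoid division by zero
    steps: int = 500,     # hypothetical number of iterations
) -> List[float]:
    """Plain AdaGrad: each parameter gets its own step size, scaled by the
    inverse square root of its accumulated squared gradients."""
    w = list(w0)
    accum = [0.0] * len(w)
    for _ in range(steps):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            accum[i] += g * g
            w[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return w


# Toy objective (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_star = adagrad_minimize(lambda w: [2.0 * (w[0] - 3.0)], [0.0])
```

In practice a deep-learning framework's built-in AdaGrad optimizer would be used rather than a hand-written loop; the sketch only shows why frequently updated parameters receive progressively smaller steps.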

Another object of the present invention is to provide a video monitoring-oriented end-to-end geological disaster automatic identification system for implementing the above method, the system comprising:

the characteristic extraction module is used for obtaining a network (H-C3D) suitable for extracting the 3D characteristics of the geological disaster scene by modifying and optimizing the C3D network, extracting one 4096-dimensional 3D characteristic every 16 frames, and combining all the characteristics into 32 x 4096-dimensional characteristics according to a time sequence;

the video frame labeling module is used for classifying videos under weak supervision, wherein geological disaster videos are classified into one type, and non-geological disaster videos are classified into one type;

the video frame abnormal score acquisition module is used for constructing a geological disaster classification and identification network R-Net by defining a multi-example learning method and a loss function thereof, training the R-Net by adopting an AdaGrad gradient descent algorithm to obtain a training model, and judging the possibility of geological disaster occurrence by converting 3D characteristics into abnormal scores;

when the method is used for application, 3D features are extracted from the video through an H-C3D network, the features are combined into 32 x 4096 dimensions according to time series, and the combined features are sent to a trained R-Net, so that abnormal scores can be obtained in real time along with video frames.

According to observation statistics of a large amount of video data, dividing the abnormal score into five intervals, respectively corresponding to five early warning grades (table 1), and obtaining the early warning grade of the geological disaster according to the interval where the abnormal score is located.

The invention also aims to provide an information data processing terminal for realizing the video monitoring-oriented end-to-end geological disaster automatic identification method.

Another object of the present invention is to provide a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to execute the video-monitoring-oriented end-to-end geological disaster automatic identification method.

The invention also aims to provide application of the video monitoring-oriented end-to-end geological disaster automatic identification method in geological disaster prevention and reduction.

The invention also aims to provide a geological disaster information data processing platform applying the video monitoring-oriented end-to-end geological disaster automatic identification method.

In summary, the advantages and positive effects of the invention are as follows. Training the geological disaster classification and identification network with the multi-example learning loss function allows the network to converge and brings its output closer to the expected value. Unlike existing loss functions, the loss function of the invention is redesigned for the specific problem at hand using the idea of multi-example learning, so as to optimize the geological disaster classification and identification network. Existing geological disaster video monitoring only collects video, which can be played back to view a specific scene when necessary, so the video monitoring data is not fully utilized. The invention aims to make full use of video monitoring data for real-time monitoring and early warning.

At present, there is no technology that uses video monitoring to automatically judge in real time whether a geological disaster is occurring. The invention is the first in the field of geological disasters to provide a method for automatically identifying geological disasters in real time from camera monitoring. Building on the idea of human abnormal behavior detection, the invention improves and optimizes existing technology (the C3D network) and applies it to the field of geological disasters to realize real-time automatic judgment of the early warning level of geological disasters.

Drawings

Fig. 1 is a flow chart of an end-to-end geological disaster automatic identification method for video monitoring according to an embodiment of the present invention.

Fig. 2 is a diagram of an H-C3D model network architecture according to an embodiment of the present invention.

FIG. 3 is a block diagram of a geological disaster classification and identification network R-Net according to an embodiment of the present invention.

Fig. 4 is a schematic diagram illustrating automatic discrimination of landslide geological disasters according to an embodiment of the present invention.

Fig. 5 is a schematic diagram illustrating automatic discrimination of a debris flow geological disaster according to an embodiment of the present invention.

Fig. 6 is a schematic diagram illustrating automatic discrimination of a non-geological disaster according to an embodiment of the present invention.

FIG. 7 is a graph of ROC provided by an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Aiming at the problems in the prior art, the invention provides a video monitoring-oriented end-to-end geological disaster automatic identification method, a system and application thereof, and the invention is described in detail below by combining with the attached drawings.

As shown in fig. 1, the method for automatically identifying an end-to-end geological disaster facing video monitoring provided by the embodiment of the invention comprises the following steps:

s101: transforming and optimizing a C3D model to obtain a network (H-C3D) suitable for extracting 3D features of geological disaster scenes, inputting a video into the H-C3D network to extract the features, extracting one 4096-dimensional 3D feature every 16 frames, and combining all the features into 32 x 4096-dimensional features according to a time sequence;

s102: classifying videos in a weak supervision mode, wherein geological disaster videos are classified into one type, and non-geological disaster videos are classified into one type; defining a multi-example learning method and a loss function thereof, training a geological disaster classification recognition network R-Net by adopting an AdaGrad gradient descent algorithm to obtain a training model, and judging the early warning level of the geological disaster by converting 3D characteristics into abnormal values;

s103: when the method is applied, 3D features are extracted from the video through an H-C3D network, the features are combined into 32 x 4096 dimensions according to a time sequence, and the combined features are sent to a trained R-Net network, so that abnormal scores can be obtained in real time along with video frames.

S104: according to observation statistics of a large amount of video data, dividing the abnormal score into five intervals, respectively corresponding to five early warning grades (table 1), and obtaining the early warning grade of the geological disaster according to the interval where the abnormal score is located.

Table 1. Early warning levels corresponding to abnormal score intervals

Anomaly score f	f<0.2	0.2≤f<0.4	0.4≤f<0.6	0.6≤f<0.8	f≥0.8
Early warning level	1	2	3	4	5
Warning color	Green	Blue	Yellow	Orange	Red
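The mapping in Table 1 from an abnormal score to an early warning level can be sketched as a simple threshold lookup. The interval boundaries and colors come from the table; the function name warning_level and the lowercase color strings are illustrative choices.

```python
from bisect import bisect_right
from typing import Tuple

# Interval boundaries from Table 1; each interval is closed on the left,
# so a score of exactly 0.2 already belongs to level 2.
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]
COLORS = ["green", "blue", "yellow", "orange", "red"]


def warning_level(score: float) -> Tuple[int, str]:
    """Return (early warning level, warning color) for an abnormal score."""
    level = bisect_right(THRESHOLDS, score) + 1
    return level, COLORS[level - 1]
```

For example, warning_level(0.65) falls in the interval 0.6≤f<0.8 and therefore yields level 4 with the orange warning.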

The technical solution of the present invention is further described below with reference to the accompanying drawings.

The video monitoring-oriented end-to-end automatic geological disaster identification method provided by the embodiment of the invention specifically comprises the following steps:

(1) extracting 3D characteristics of geological disaster/non-geological disaster videos: based on a C3D network trained by a large-scale data set, a proper amount of geological disaster videos are added by using the idea of transfer learning, the C3D network is modified and optimized, a network H-C3D which is suitable for geological disaster scenes and has good 3D feature extraction capability is obtained, and the H-C3D network structure for extracting 3D features is shown in FIG. 2:

Specifically: first, taking a 16-frame image sequence as a unit, the images are cut to a size of 128 × 117, the length and width are adjusted internally to 112 × 112, and half of the images are horizontally flipped to adapt to temporal and spatial jitter. The H-C3D network consists of 5 convolutional layers, 5 pooling layers and 1 fully connected layer; the numbers of filters in convolutional layers 1 to 5 are 64, 128, 256, 512 and 512, respectively; each convolutional layer uses 3D convolution kernels of 3 x 3 with a stride of 1 and is followed by a pooling layer, the first pooling layer having a size of 1 x 2 and each subsequent pooling layer a size of 2 x 2; the output dimension of the fully connected layer is 4096.
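To make the layer description concrete, the tensor shapes through the five convolution and pooling stages can be traced with simple arithmetic. This is a sketch under stated assumptions, not the network of FIG. 2: following the standard C3D design, the kernels are assumed to be 3 x 3 x 3 with padding that preserves size, the first pooling layer is assumed to be 1 x 2 x 2 (no temporal pooling) and the remaining four 2 x 2 x 2, and pooling is assumed to round down. The text itself gives only two dimensions for each size.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int, int]  # (channels, frames, height, width)

# Filter counts of convolutional layers 1 to 5, as stated in the text.
CONV_FILTERS = [64, 128, 256, 512, 512]
# Assumption: (time, height, width) pooling sizes as in standard C3D.
POOL_SIZES = [(1, 2, 2)] + [(2, 2, 2)] * 4


def trace_shapes(frames: int = 16, height: int = 112, width: int = 112,
                 in_channels: int = 3) -> List[Shape]:
    """Return the shape of the input followed by the shape after each
    convolution + pooling stage."""
    shapes: List[Shape] = [(in_channels, frames, height, width)]
    t, h, w = frames, height, width
    for filters, (pt, ph, pw) in zip(CONV_FILTERS, POOL_SIZES):
        # Stride-1 convolution with size-preserving padding changes only the
        # channel count; pooling divides each pooled dimension (rounding down).
        t, h, w = t // pt, h // ph, w // pw
        shapes.append((filters, t, h, w))
    return shapes


shapes = trace_shapes()
channels, t, h, w = shapes[-1]
flattened = channels * t * h * w  # input size of the fully connected layer
```

Under these assumptions a 16-frame 112 x 112 clip is reduced to a 512 x 1 x 3 x 3 volume, which the fully connected layer maps to the 4096-dimensional feature.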

(2) Training the geological disaster classification recognition network R-Net using the 3D features extracted by the H-C3D network: after video features are extracted in units of 16 frames, 32 such features are used to train R-Net. If the video has fewer than 16 × 32 frames, the earlier features are repeated until there are 32; if the video has more than 16 × 32 frames, the features are arranged in time order and 32 features are taken at equal intervals. These 32 x 4096-dimensional 3D features are fed into the classification recognition network R-Net for training; the block diagram of R-Net is shown in FIG. 3.

Because both the geological disaster video and the non-geological disaster video are divided into 32 segment features, the maximum score assigned by R-Net is taken over the 32 segment features of each video, and the maximum abnormal score of the geological disaster video segments is required to always exceed the maximum abnormal score of the non-geological disaster video segments. The loss function is therefore expressed in the following form:

wherein k is the total number of samples and w is the network parameter of R-Net;

wherein λ₁ and λ₂ are hyper-parameters and n is the total number of feature segments in a video. Formulas (2) and (3) together constitute the multi-instance learning loss function of the invention. R-Net is trained with this loss function in combination with the AdaGrad gradient descent algorithm to obtain a trained model and its parameters.
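The formulas themselves are not reproduced in this text, so the following is only a sketch of one loss that matches the description: a hinge term requiring the maximum score of a geological disaster video to exceed the maximum score of a non-geological disaster video, plus two regularizers weighted by λ₁ and λ₂. The margin of 1, the use of the sum of scores as the second regularizer (the sparsity term of the commonly used multiple-instance ranking loss, assumed here to correspond to what the text calls the spatial smoothness constraint), and the function name are all assumptions. Averaging over the k samples and any penalty on w are omitted.

```python
from typing import Sequence


def mil_ranking_loss(
    disaster_scores: Sequence[float],   # R-Net scores of the 32 segments of a disaster video
    normal_scores: Sequence[float],     # R-Net scores of the 32 segments of a non-disaster video
    lambda1: float,                     # weight of the temporal smoothness term
    lambda2: float,                     # weight of the second regularizer
) -> float:
    """Loss for one (disaster video, non-disaster video) pair."""
    # Ranking term: zero once the top disaster score beats the top
    # non-disaster score by the assumed margin of 1.
    hinge = max(0.0, 1.0 - max(disaster_scores) + max(normal_scores))
    # Temporal smoothness: scores of adjacent segments should change gradually.
    smooth = sum(
        (a - b) ** 2 for a, b in zip(disaster_scores[:-1], disaster_scores[1:])
    )
    # Assumed second regularizer: keep the total score of the video small,
    # since only a few segments are expected to contain the disaster.
    sparse = sum(disaster_scores)
    return hinge + lambda1 * smooth + lambda2 * sparse
```

With both weights set to zero the function reduces to the ranking term alone, which makes the effect of each regularizer easy to inspect in isolation.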

The technical effects of the present invention will be described in detail with reference to the tests below.

When testing a new video, the invention firstly extracts 3D features from the video through an H-C3D network, and then combines the features into 32 x 4096 dimensions according to a time sequence. And inputting the combined 3D features into a trained geological disaster classification and identification network R-Net for classification, outputting abnormal value scores in real time, drawing a dynamic abnormal value curve, and realizing automatic judgment of the early warning level of the geological disaster. As shown in fig. 4 (landslide) and 5 (debris flow):

The invention uses the ROC curve to evaluate the effectiveness of the algorithm. Because there is currently no existing case of computer vision technology being used to automatically identify geological disasters, the method is compared with random guessing, in which a video is judged to be a geological disaster or a non-geological disaster with a probability of 50% each. The test data comprise 200 videos, 100 geological disaster videos and 100 non-geological disaster videos, none of which took part in training. The positions of the disaster occurrence frames are marked in time order in the geological disaster videos, and the resulting ROC curve is shown in FIG. 7. The application test on this video data shows that the method achieves an accuracy of up to 94%.
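An ROC evaluation of this kind can be sketched as follows. The scores and labels below are toy values, not the test data of the invention, and the function names are illustrative; labels are 1 for disaster and 0 for non-disaster, and could be assigned per frame or per video.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points obtained by
    lowering the decision threshold over the sorted scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process tied scores together so they produce a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
```

The random-guessing baseline corresponds to the diagonal from (0, 0) to (1, 1), whose area is 0.5; a useful detector should lie above that line.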

It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD-or DVD-ROM, programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier, for example. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., or by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. An end-to-end geological disaster automatic identification method facing video monitoring is characterized by comprising the following steps:

firstly, transforming and optimizing a C3D network to obtain an H-C3D network suitable for extracting geological disaster features, inputting a video into the H-C3D network to extract the features, extracting a 4096-dimensional 3D feature from each 16 frames, and combining all the features into 32 x 4096-dimensional features according to a time sequence;

secondly, classifying the videos in a weak supervision mode, wherein the videos of the geological disasters are classified into one type, and the videos of the non-geological disasters are classified into one type; defining a multi-example learning method and a loss function thereof, building a classification recognition network R-Net, training the R-Net by adopting AdaGrad to obtain a training model, and judging the early warning level of the geological disaster by converting the 3D characteristics into abnormal values by the model;

thirdly, when the method is applied, 3D features are extracted from the video through an H-C3D network, the 3D features are recombined into 32 x 4096 dimensions according to time sequence, the combined features are sent to a trained classification recognition network R-Net, and abnormal scores are obtained in real time along with video frames;

fourthly, dividing the abnormal score into five intervals based on a large number of observation statistics, respectively corresponding to five early warning grades, and obtaining an early warning level of the geological disaster according to the interval where the abnormal score is located;

wherein the 3D features of geological disaster/non-geological disaster videos are extracted as follows: starting from a C3D network trained on a large-scale data set, geological disaster videos are added using the idea of transfer learning, and the C3D network is modified and optimized to obtain a network H-C3D suitable for extracting the 3D features of geological disaster scenes;

further comprising: taking a 16-frame image sequence as a unit, cutting the images to a size of 128 × 117, adjusting the length and width internally to 112 × 112, and horizontally flipping 50% of the total data volume to adapt to temporal and spatial jitter; the H-C3D network consists of 5 convolutional layers, 5 pooling layers and 1 fully connected layer; the numbers of filters in convolutional layers 1 to 5 are 64, 128, 256, 512 and 512, respectively; each convolutional layer uses 3D convolution kernels of 3 x 3 with a stride of 1 and is followed by a pooling layer, the first pooling layer having a size of 1 x 2 and each subsequent pooling layer a size of 2 x 2; the output dimension of the fully connected layer is 4096.

2. The video-monitoring-oriented end-to-end geological disaster automatic identification method as claimed in claim 1, characterized in that the method uses the 3D features extracted by the H-C3D network to train the geological disaster classification identification network R-Net: after the video is input into the H-C3D network in units of 16 frames to extract 3D features, 32 such features are used for training the network; if the video has fewer than 16 x 32 frames, the earlier features are repeated until there are 32; if the video has more than 16 × 32 frames, the features are arranged in time order and 32 features are taken at equal intervals; and the 32 x 4096-dimensional 3D features are fed into R-Net for training.

3. The video-monitoring-oriented end-to-end geological disaster automatic identification method as claimed in claim 1, characterized in that the multi-example learning loss function for training the geological disaster classification and identification network R-Net is:

wherein k is the total number of samples and w is the network parameter of R-Net;

wherein λ₁ and λ₂ are hyper-parameters and n is the total number of feature segments in one video; and R-Net is trained with the loss function in combination with the AdaGrad gradient descent algorithm to obtain a trained model and its parameters.

4. A video monitoring-oriented end-to-end geological disaster automatic identification system implementing the video monitoring-oriented end-to-end geological disaster automatic identification method according to any one of claims 1 to 3, comprising:

the characteristic extraction module is used for obtaining a network H-C3D suitable for extracting 3D characteristics of geological disaster scenes by modifying and optimizing a C3D network, extracting one 4096-dimensional 3D characteristic every 16 frames, and combining all the characteristics into 32 x 4096-dimensional characteristics according to a time sequence;

the video labeling module is used for classifying videos under weak supervision, wherein geological disaster videos are classified into one type, and non-geological disaster videos are classified into one type;

the video frame classification and identification module builds a geological disaster classification and identification network R-Net by defining a multi-example learning method and a loss function thereof, trains the R-Net by adopting an AdaGrad gradient descent algorithm to obtain a training model, and judges the early warning level of the geological disaster by converting 3D characteristics into abnormal values;

when the method is applied, 3D features are extracted from a video through an H-C3D network, the features are combined into 32 x 4096 dimensions according to a time sequence, the combined features are sent to a trained R-Net, and abnormal scores can be obtained in real time along with video frames;

according to observation statistics of a large amount of video data, dividing the abnormal score into five intervals, respectively corresponding to five early warning levels, and obtaining the early warning level of the geological disaster according to the interval where the abnormal score is located.

5. An information data processing terminal for realizing the video monitoring-oriented end-to-end geological disaster automatic identification method according to any one of claims 1 to 3.

6. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the video surveillance-oriented end-to-end geological disaster automatic identification method according to any of claims 1 to 3.

7. An application of the video monitoring oriented end-to-end geological disaster automatic identification method according to any one of claims 1 to 3 in geological disaster prevention and disaster reduction.

8. A geological disaster information data processing platform applying the video monitoring end-to-end geological disaster automatic identification method according to any one of claims 1 to 3.