CN116152678A - Marine disaster-bearing body identification method based on twin neural network under small sample condition - Google Patents


Info

Publication number
CN116152678A
Authority
CN
China
Prior art keywords
network
disaster
neural network
bearing body
samples
Prior art date
Legal status (assumption, not a legal conclusion)
Pending
Application number
CN202211531447.2A
Other languages
Chinese (zh)
Inventor
文莉莉
邬满
柯友刚
李宛怡
赖俊翔
许贵林
严小敏
Current Assignee
Hunan Institute of Science and Technology
Guangxi Academy of Sciences
Original Assignee
Hunan Institute of Science and Technology
Guangxi Academy of Sciences
Priority date
Filing date
Publication date
Application filed by Hunan Institute of Science and Technology, Guangxi Academy of Sciences filed Critical Hunan Institute of Science and Technology
Priority to CN202211531447.2A priority Critical patent/CN116152678A/en
Publication of CN116152678A publication Critical patent/CN116152678A/en
Pending legal-status Critical Current

Classifications

    • G06V20/17 — Terrestrial scenes taken from planes or by drones
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06V10/242 — Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G06V10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/40 — Extraction of image or video features
    • G06V10/806 — Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V10/82 — Image or video recognition or understanding using neural networks
    • Y02A90/10 — Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

A marine disaster-bearing-body identification method based on a twin neural network under small-sample conditions: establish an image sample library; expand the sample library by data-enhancement methods; label the samples; introduce the SKNet attention network and combine it with a ResNet101 network to construct a trunk feature extraction network; based on the trunk feature extraction network, construct a twin neural network with two identical paths and shared weights; extract features from the input data with the twin neural network; compute the distance between the two feature vectors through a loss function; and output the category information of the marine disaster-bearing body. Addressing the multi-scale, diverse and few-sample characteristics of marine disaster-bearing bodies, the invention combines a convolutional network with an improved three-channel SKNet network to strengthen the algorithm's feature extraction capability and feature effectiveness, improves its adaptive capability for small samples and multi-scale targets, and is therefore better suited to identifying and classifying marine disaster-bearing bodies under small-sample conditions.

Description

Marine disaster-bearing body identification method based on twin neural network under small sample condition
Technical Field
The invention belongs to the technical field of marine emergency disaster prevention, and relates to a marine disaster-bearing body identification method.
Background
Typhoon disasters are among the most damaging natural disasters in the world. As China's comprehensive national strength has grown, its marine disaster early-warning and forecasting capability has steadily improved: coastal provinces and regions have built comprehensive marine observation and forecasting service platforms for marine environment and disaster numerical forecasting. Drawing on shore stations, buoys, satellites, radar and other observation means, these platforms provide early-warning services for storm surge, waves, wind fields and other marine disasters. Their service, however, remains coarse-grained, dominated by large-area, general warnings; fine-grained early warning for small areas is insufficient, and differentiated warning services for different areas during severe marine disasters are not yet possible, leaving a thin basis for disaster prevention and mitigation deployment. It is therefore of great significance to study in fine detail the disaster mechanisms of fragile disaster-bearing bodies, construct an accurate disaster-damage evaluation model, realize the evaluation of fragile disaster-bearing bodies and of disaster damage, provide a basis for fine-grained disaster prevention management when strong typhoons approach, assist government command and decision-making in marine disaster prevention and mitigation, and reduce disaster losses to the greatest extent.
Because of the complexity of typhoon disasters (which include different disaster types such as strong winds, high waves and seawater backflow) and the uncertainty of vulnerability evaluation for marine disaster-bearing bodies (rigid, non-rigid, point-like, linear and planar bodies, each with uncertain disaster damage), conventional meteorological simulation can only roughly predict a disaster's formation time, path and the like; it cannot predict the losses and social impact a disaster may cause, and thus falls short of the accurate, intelligent response that current social development requires. How to scientifically evaluate the vulnerability of marine disaster-bearing bodies, identify their disaster mechanisms, establish a disaster-damage evaluation model, realize disaster-damage evaluation and early warning across the whole process before, during and after a typhoon storm-surge disaster, and improve marine disaster-prevention emergency management is therefore an important subject bearing on public safety and on the stable, rapid development of the national economy.
Traditional methods for evaluating disaster-bearing-body vulnerability are mostly built on the common entity and relation elements of that vulnerability: they construct an evaluation index system and a physical model for weight calculation, thereby expressing, integrating and storing vulnerability knowledge. Such methods depend heavily on researchers' experience and on the choice of weight parameters, while the relation between disaster-causing factors and disaster response is complex and nonlinear, highly uncertain and hard to characterize; relying on a traditional physical model alone therefore makes accurate vulnerability evaluation and disaster-damage evaluation of disaster-bearing bodies difficult.
Disclosure of Invention
Taking typical fragile marine disaster-bearing bodies as its research object, the invention provides a marine disaster-bearing-body identification method based on a twin neural network under small-sample conditions. A small sample library is constructed for various typical point, line and surface marine disaster-bearing bodies; a twin neural network structure is designed and trained on that library, yielding a marine disaster-bearing-body identification model for small-sample conditions that serves disaster-bearing-body evaluation and disaster early warning and supports construction of a fragile disaster-bearing-body disaster-response model.
In order to achieve the above purpose, the present invention adopts the following technical scheme.
The marine disaster-bearing-body identification method based on a twin neural network under small-sample conditions mainly comprises 3 parts: (1) data enhancement: the small-sample dataset is enhanced by rotation through angles, random cropping, adjustment of picture brightness and similar methods; (2) design of the backbone feature extraction network: its function is feature extraction, and various neural networks can be applied; (3) construction of the twin neural network: the network has two sub-networks of identical structure with shared weights, is trained via supervised metric learning, and the features extracted by that network are then used for small-sample learning. The method specifically comprises the following steps:
Step S1: carry out refined data acquisition of typical marine disaster-bearing bodies with an unmanned aerial vehicle and, combining satellite remote-sensing images, establish an image dataset of marine disaster-bearing bodies comprising target images, labeling information and the like.
Step S2: perform data enhancement on the sample library by random-angle rotation, random cropping, contrast adjustment, grey balance and similar methods.
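As an illustration only, the kind of enhancement named in step S2 can be sketched in NumPy. This is a minimal sketch under simplifying assumptions: rotation is restricted to 90° multiples, crop size and contrast range are invented for this example, and the function name is not taken from the patent:

```python
import numpy as np

def augment(img, rng):
    """One illustrative enhancement pass: rotation, random crop, contrast change."""
    # Rotation (limited here to 90-degree multiples for simplicity).
    img = np.rot90(img, k=int(rng.integers(0, 4)), axes=(0, 1))
    # Random crop to 7/8 of each spatial dimension.
    h, w = img.shape[:2]
    ch, cw = h * 7 // 8, w * 7 // 8
    y0 = int(rng.integers(0, h - ch + 1))
    x0 = int(rng.integers(0, w - cw + 1))
    img = img[y0:y0 + ch, x0:x0 + cw]
    # Contrast adjustment: scale deviations from the mean, keep the 0-255 range.
    factor = rng.uniform(0.8, 1.2)
    return np.clip((img - img.mean()) * factor + img.mean(), 0, 255)

rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
augmented = [augment(sample, rng) for _ in range(5)]
```

Each pass yields a differently rotated, cropped and contrast-shifted copy of the same source image, which is how a small sample library is expanded without new acquisitions.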
Step S3: introduce an attention mechanism and design a trunk feature extraction network based on a convolutional neural network to extract image features. Image features are extracted through multiple combinations of SKNet and a convolutional neural network to obtain feature maps; among candidate backbone algorithms such as ResNet101, VGG16, ResNet and IncRes V2, the ResNet101 network is adopted here.
Further, the implementation steps of the backbone feature extraction network for introducing the attention mechanism are as follows:
(1) Constructing a feature extraction network of an improved three-channel SKNet attention mechanism, and carrying out feature extraction on an input image to generate a preliminary feature map;
(2) Taking the feature map generated in step (1) as the input image, perform convolution processing with the ResNet101 network to further extract features, and finally output the result through a fully connected layer.
Further, the improved three-channel SKNet attention mechanism feature extraction network comprises the following specific implementation steps:
1) Convolve the input data with convolution kernels of 3×3, 5×5 and 7×7 to obtain outputs U1, U2 and U3;
2) Fuse the results of the 3 branches by element-wise summation: U = U1 + U2 + U3, where U is a feature map of size C×H×W (C denotes channels, H height, W width) that fuses information from multiple receptive fields. Then, averaging over the H and W dimensions yields a vector of size C×1×1 that represents the importance of each channel.
Let F_gp denote the channel-wise global average pooling operation; its output statistics form the vector s (s ∈ R^C), whose c-th element s_c is given by equation (1):

s_c = F_gp(U_c) = (1/(H×W)) Σ_{i=1}^{H} Σ_{j=1}^{W} U_c(i, j)    (1)
3) Apply a fully connected layer to the C×1×1 vector to obtain a compact descriptor z of size d×1 (equation (2)); then apply three separate linear transformations that restore the vector from dimension d back to dimension C, extracting the information of each channel dimension.
z = F_fc(s) = δ(B(Ws))    (2)
where z ∈ R^{d×1}; δ is the ReLU function; B denotes batch normalization; W ∈ R^{d×C}, with d = max(C/r, L), where r is the reduction ratio and L is the minimum value of d.
4) Normalize with softmax to obtain scores representing the importance of each channel, then multiply the scores with the corresponding U1, U2 and U3 to obtain A1, A2 and A3. Sum and fuse the 3 results to obtain Y, which, relative to U, has undergone information selection and fuses information from multiple receptive fields.
Let a, b, c be the 3 Select weight vectors, produced by matrices A, B, C ∈ R^{C×d}; A_i denotes the i-th row of A and a_i the i-th element of a (B_i, b_i and C_i, c_i are defined analogously), with a_i + b_i + c_i = 1. The final feature map Y is given by equations (3) and (4), where e is the natural constant.
Y_i = a_i × A1 + b_i × A2 + c_i × A3    (3)
a_i = e^{A_i·z} / (e^{A_i·z} + e^{B_i·z} + e^{C_i·z})    (4), with b_i and c_i defined analogously.
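The Select computation in steps 2)–4) can be sketched numerically. In this illustrative NumPy sketch the three convolution branches are replaced by random arrays, the batch normalization of equation (2) is omitted, and all variable names are invented for this example:

```python
import numpy as np

rng = np.random.default_rng(42)
C, H, W, r, L_min = 8, 16, 16, 2, 4

# U1, U2, U3 stand in for the outputs of the 3x3, 5x5 and 7x7 convolution branches.
U1, U2, U3 = (rng.standard_normal((C, H, W)) for _ in range(3))

# Fuse by element-wise summation, then channel-wise global average pooling (eq. 1).
U = U1 + U2 + U3
s = U.mean(axis=(1, 2))                       # s in R^C

# Compact descriptor z = delta(B(W s)) (eq. 2); batch normalization omitted here.
d = max(C // r, L_min)
W_fc = rng.standard_normal((d, C))
z = np.maximum(W_fc @ s, 0)                   # ReLU

# Three linear maps restore dimension d back to C; softmax across branches (eq. 4).
A, B, C_mat = (rng.standard_normal((C, d)) for _ in range(3))
logits = np.stack([A @ z, B @ z, C_mat @ z])  # shape (3, C)
weights = np.exp(logits) / np.exp(logits).sum(axis=0)

# Select: weight each branch per channel and sum (eq. 3).
a, b, c = weights
Y = a[:, None, None] * U1 + b[:, None, None] * U2 + c[:, None, None] * U3
```

Per channel the three branch weights sum to 1, so Y is a convex, channel-wise mixture of the three receptive fields, which is the selection behaviour the equations describe.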
Step S4: based on the trunk feature extraction network, construct a twin neural network structure. The network has two inputs and uses the neural network to map each input to a new space, forming a representation of the inputs in that space.
Further, the implementation steps of the twin neural network are as follows:
(1) Based on the trunk feature network constructed in the step S3, constructing two networks with the same structure and shared weight;
(2) After the two inputs pass through the trunk feature extraction network, each yields a multidimensional feature, which is flattened along one dimension to give a one-dimensional vector per input;
(3) During training, different paired samples are constructed combinatorially and fed to the network. The loss is computed at the top layer via a distance-based cross entropy: whether a pair belongs to the same class is judged from the distance between its samples, and a corresponding probability distribution is generated. In the prediction stage, the twin network processes every sample pair formed between the test sample and the support set, and the final prediction is the class with the highest probability over the support set. This approach constrains the structure of the input and automatically discovers features that generalize to new samples. Training proceeds by supervised metric learning with the twin network, after which the features extracted by that network are reused for single/small-sample learning.
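The pairwise prediction over a support set can be sketched as follows. This is a toy NumPy stand-in: a single random linear layer replaces the trunk feature network, the support set holds one sample per class, and the class names echo the patent's examples while the data are random:

```python
import numpy as np

def embed(x, W):
    # Shared-weight sub-network stand-in: one linear layer followed by ReLU.
    return np.maximum(W @ x, 0)

def predict(query, support, W):
    # Pair the query with every support sample; the predicted class is the one
    # whose embedding lies closest (smallest distance = highest similarity).
    q = embed(query, W)
    dists = {label: np.linalg.norm(q - embed(x, W)) for label, x in support.items()}
    return min(dists, key=dists.get)

rng = np.random.default_rng(7)
W = rng.standard_normal((16, 64))
support = {"house": rng.standard_normal(64),
           "embankment": rng.standard_normal(64),
           "oyster_row": rng.standard_normal(64),
           "ship": rng.standard_normal(64)}
pred = predict(support["ship"], support, W)
```

Because both branches share the weights W, a query identical to a support sample embeds identically and is recovered exactly; unseen queries fall to whichever class embedding is nearest.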
Further, the loss calculation principle (see Fig. 4) is as follows:
1) The network owns two identical sub-neural networks that share weights W;
2) The method belongs to weakly supervised learning; each sample is a sample pair ((X_1, X_2), y), where y = 1 if X_1 and X_2 are similar and y = 0 otherwise;
3) The sub-networks accept the two inputs and convert them into vectors via the neural network; the distance between the two vectors is then computed (the distance measure is selected as required). The calculation formulas are:
L_n = y_n × E_w² + (1 − y_n) × max(m − E_w, 0)²    (5)

L = (1/(2N)) Σ_{n=1}^{N} L_n    (6)
where N represents the number of samples, i.e., the number of sample pairs; y represents the label, i.e., y = 0 or y = 1; E_w represents the Euclidean distance, E_w = ||X_1 − X_2||_2; and m represents the distance margin for dissimilar samples: the distance between two dissimilar samples contributes over [0, m], and beyond m the loss of a dissimilar pair is taken as 0.
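The margin behaviour of equations (5)–(6) can be checked with a small NumPy sketch (function names are illustrative, not from the patent):

```python
import numpy as np

def contrastive_loss(x1, x2, y, m=1.0):
    # Per-pair loss (eq. 5): similar pairs (y=1) are pulled together,
    # dissimilar pairs (y=0) are pushed apart up to the margin m.
    e_w = np.linalg.norm(np.asarray(x1) - np.asarray(x2))  # Euclidean distance E_w
    return y * e_w**2 + (1 - y) * max(m - e_w, 0.0)**2

def batch_loss(pairs, labels, m=1.0):
    # Batch loss (eq. 6): sum over N pairs with the 1/(2N) factor.
    n = len(pairs)
    return sum(contrastive_loss(a, b, y, m) for (a, b), y in zip(pairs, labels)) / (2 * n)

v = np.ones(4)
w = np.zeros(4)   # ||v - w|| = 2, which lies beyond the margin m = 1
```

An identical similar pair contributes zero loss, and a dissimilar pair farther apart than m also contributes zero, exactly the behaviour the margin rule above describes.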
4) The distance passes through two fully connected layers; the second connects to a single neuron whose sigmoid output, a value between 0 and 1, represents the degree of similarity of the two input pictures.
Step S5: train the network parameters using the sample library to obtain the marine disaster-bearing-body recognition model.
Step S6: using the model, input the data to be identified and known-category data in pairs and calculate their similarity, taking the category with the highest similarity as the result.
The invention takes the related scientific problems of vulnerability evaluation, risk evaluation and catastrophe response of marine disaster-bearing bodies as its main research line, aiming to establish an analysis and evaluation model that combines a traditional physical model with an artificial-intelligence model. In the physical model for weight calculation, fuzzy theory is combined with the analytic hierarchy process, completing the qualitative-to-quantitative determination of index weights according to expert opinion. For the deep-learning evaluation model, a sample library of marine disaster-bearing-body catastrophe damage is established; a deep-learning method automatically analyzes the influence weight of each disaster-causing factor on the damage and automatically judges the damage degree of the disaster-bearing body and the risk level of the disaster, providing a universal framework for vulnerability analysis and disaster-damage evaluation, reducing the limitation of vulnerability knowledge across different types of disaster-bearing bodies, and improving the scientific soundness and reliability of the evaluation results. Finally, the accuracy of the two models is verified by comparative experiments, and a more reliable marine disaster-damage assessment method combining the two is established, realizing more accurate, scientific and intelligent marine disaster emergency management. In theory this helps perfect the discipline of emergency management; in practice, vulnerability research provides a scientific basis for formulating emergency measures, guiding emergency-management practice, minimizing the harm of disaster events, and improving the effectiveness and scientific grounding of emergency management.
Compared with the prior art, the invention has the following beneficial technical effects:
1) By introducing the SKNet network model, the invention improves the structure of the original convolutional neural network. Compared with the original ResNet101 network, the improved network has stronger feature extraction for complex-scene images and is better suited to extracting and identifying complex ocean targets.
2) The network uses convolution kernels of multiple sizes and therefore has scale-adaptive capability; it adapts well to detecting multi-scale ocean targets, so a single model can accurately detect multiple targets whose scales span a large range.
3) The two-path twin neural network structure strengthens the network's ability to learn from small samples. Compared with a single-path feature extraction network, it achieves higher recognition accuracy and stability under small-sample conditions and supports rapid modeling and identification of typical marine disaster-bearing bodies.
Drawings
Fig. 1 is a network configuration diagram of the present invention.
Fig. 2 is a diagram of a backbone feature extraction network architecture for an attention-drawing mechanism.
FIG. 3 is a block diagram of a two-way twin neural network.
Fig. 4 is a schematic diagram of the cross entropy-based loss calculation of the present invention.
FIG. 5 is a graph showing the test effect of the present invention.
Detailed Description
The invention will be further illustrated by the following examples, with reference to the accompanying drawings.
The marine disaster-bearing body identification method based on the twin neural network under the condition of a small sample comprises the following steps:
s1, collecting remote sensing image data of high-resolution satellites and unmanned aerial vehicles, and establishing an image sample library aiming at typical marine disaster-bearing bodies (houses, dykes, oyster rows, ships and the like).
S2, capture images from satellite remote-sensing imagery and unmanned-aerial-vehicle high-definition imagery to obtain ocean target samples, each image at a resolution of 600 × 600 pixels. The sample library is expanded by arbitrary-angle rotation, random cropping, noise addition and similar methods, building a training library of 160 samples in total, 40 for each target class (houses, embankments, oyster rows, ships), plus 80 test samples. The data distribution is shown in Table 1.
TABLE 1 statistics of different classes of marine disaster recovery volume data
S3, label the samples with the LabelImg tool and establish data labels corresponding to the sample types; the label file content comprises: sample path, labeled target coordinate range, and target category.
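LabelImg typically saves annotations in Pascal VOC XML format; a label file of the kind described (path, target coordinate range, category) could be read as sketched below. The file contents here are invented for illustration, not taken from the patent's dataset:

```python
import xml.etree.ElementTree as ET

# A minimal Pascal VOC-style annotation of the kind LabelImg produces; path,
# class name and coordinates are illustrative only.
XML = """<annotation>
  <path>samples/oyster_row_001.jpg</path>
  <object>
    <name>oyster_row</name>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>212</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>"""

def parse_annotation(xml_text):
    """Return (sample path, [(category, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    path = root.findtext("path")
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), coords))
    return path, objects

path, objects = parse_annotation(XML)
```

The three fields listed in S3 map directly onto `path`, the coordinate tuple, and the object name.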
S4, extracting the first path of image features of the twin network:
(1) First normalize the image data of the sample pair's first path to 256×256, then preprocess the image with the three-channel SKNet attention module.
a. Convolve the input data with convolution kernels of 3×3, 5×5 and 7×7 respectively to obtain three feature maps;
b. Fuse the three feature maps by element-wise summation to obtain a feature map of size C×H×W (C denotes channels, H height, W width); then, averaging over the H and W dimensions yields a vector of size C×1×1 representing the importance of each channel, as in equation (1).
c. Apply a fully connected layer to the C×1×1 vector to obtain a compact descriptor z of size d×1 (equation (2)); then apply three separate linear transformations to restore the vector from dimension d back to dimension C, extracting the information of each channel dimension.
d. Normalize with softmax to obtain scores representing the importance of each channel; multiply the scores with the three feature maps from step a to obtain 3 new modules; sum and fuse the 3 modules into a feature map that fuses multiple receptive fields, which is output as the result of the three-channel SKNet image preprocessing.
(2) The feature map with attention information output by (1) is further processed by the ResNet101 network, which computes as follows:
a. first through a 7 x 64 convolutional layer.
b. Then pass through 3 + 4 + 23 + 3 = 33 residual blocks. Each residual block contains 3 convolutional layers, of sizes 1×1, 3×3 and 1×1 respectively; the two 1×1 convolution layers reduce and then restore the feature dimension, mainly to cut the number of parameters and hence the computation, so that training and feature extraction proceed more effectively after dimension reduction. This part therefore performs 33 × 3 = 99 convolutional layers in total.
The calculation principle of the residual block is as follows:
Let x be the input of the block and H(x) the desired underlying mapping. Instead of making the stacked layers fit H(x) directly, the block lets them learn the residual F(x) = H(x) − x; assuming H(x) has the same dimension as x, fitting H(x) is then equivalent to fitting this residual function. A cross-layer identity connection is added on top of the original network, simply passing x through unchanged, so the block's output becomes F(x) + x. Learning the residual F(x) = H(x) − x is easier than learning H(x) itself, while the forward path still produces F(x) + x = H(x).
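The residual computation just described can be sketched with linear stand-ins for the convolutional layers (an illustrative NumPy sketch under the stated same-dimension assumption, not the patent's implementation):

```python
import numpy as np

def residual_block(x, W1, W2):
    # The stacked layers fit the residual F(x) = H(x) - x ...
    fx = W2 @ np.maximum(W1 @ x, 0)   # F(x): two linear layers with ReLU between
    # ... and the identity shortcut restores H(x) = F(x) + x.
    return fx + x

rng = np.random.default_rng(1)
x = rng.standard_normal(8)
out = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```

With all-zero weights the block computes the zero residual, so the output equals the input; an identity mapping thus costs the stacked layers nothing, which is what makes very deep networks like ResNet101 trainable.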
c. The last full connection layer is used for classification.
In total, 1 + 99 + 1 = 101 layers of convolution and fully connected computation are performed (see Table 1), finally yielding the feature vector v1 of the first-path input image.
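The layer count can be verified directly:

```python
# ResNet101 layer count as described above: 1 initial convolution, 33 residual
# blocks of 3 convolutional layers each, and 1 fully connected layer.
blocks = 3 + 4 + 23 + 3        # residual blocks in the four stages
conv_layers = blocks * 3       # 1x1, 3x3, 1x1 per block
total_layers = 1 + conv_layers + 1
```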
Table 1. ResNet101 network architecture
S5, extract the second-path image features of the twin network: apply the same steps as in S4 to the second-path input of the sample pair to obtain its feature vector v2. The second-path network has the same structure as the first and shares its weights.
S6, loss calculation: flatten the two multidimensional vectors into two one-dimensional vectors and compute their Euclidean distance E_w (equation (7)); whether the samples belong to the same class is judged from their distance, and a corresponding probability distribution is generated.
E_w = ||v1 − v2||_2 = sqrt(Σ_i (v1_i − v2_i)²)    (7)
After a batch of samples is trained, the overall loss is calculated according to formulas (5) and (6).
Step S7: randomly test the recognition model with samples that did not participate in training.
The training results show that the algorithm is stable overall under small-sample conditions. Because sample data are scarce, training with a traditional convolutional neural network such as Faster R-CNN or YOLOv5 makes the model difficult to converge and yields lower recognition accuracy on the test samples. The overall test results are shown in Tables 2 and 3.
Table 2 comparative test results
Table 3 Classification test results of the proposed algorithm
According to the test result, under the same training and test conditions, the recognition accuracy of the improved algorithm of the invention for the small sample target is obviously improved.
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the embodiments and scope of the present invention, and it should be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the teachings of the present invention, which are intended to be included within the scope of the present invention.

Claims (1)

1. A marine disaster-bearing body identification method based on a twin neural network under a small sample condition is characterized by comprising the following steps:
step S1, carrying out refined data acquisition on a typical marine disaster-bearing body by using an unmanned aerial vehicle, and establishing an image dataset of the marine disaster-bearing body by combining a satellite remote sensing image, wherein the image dataset comprises a target image and labeling information;
s2, carrying out data enhancement on the sample library by a random angle rotation, random cutting, contrast adjustment and gray balance method;
s3, introducing an attention mechanism, designing a trunk feature extraction network based on a convolutional neural network, and realizing extraction of image features; image feature extraction is carried out through multiple combinations of SKNet and a convolutional neural network to obtain a feature map, wherein the convolutional neural network algorithm comprises ResNet101, VGG16, resNet and IncRes V2;
S4, based on the trunk feature extraction network, a twin neural network structure is constructed; the network has two inputs and uses the neural network to map the inputs to a new space, forming a representation of the inputs in that space;
s5, training network parameters by using a sample library to obtain an ocean disaster-bearing body recognition model;
s6, inputting the data to be identified and the known category data in pairs by using the model, and calculating the similarity, so as to obtain a category result with the highest similarity;
the backbone feature extraction network of the attention-introducing mechanism described in step S3 includes the following steps:
(1) Construct the feature extraction network of the improved three-channel SKNet attention mechanism, perform feature extraction on an input image, and generate a preliminary feature map; the steps are as follows:
1) Convolve the input data with convolution kernels of 3×3, 5×5 and 7×7 to obtain outputs U1, U2 and U3;
2) Fuse the results of the 3 branches by element-wise summation: U = U1 + U2 + U3, where U is a feature map of size C×H×W that fuses multiple receptive fields, C representing the channel count, H the height and W the width; then average over the H and W dimensions to obtain a vector of size C×1×1 representing the importance of each channel;
Let F_gp denote the channel-wise global average pooling operation computing these statistics, collected in a vector s (s ∈ R^C), whose c-th element s_c is given by formula (1):

s_c = F_gp(U_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} U_c(i, j)   (1)
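Formula (1) is ordinary channel-wise global average pooling; a minimal numerical check in numpy, with an arbitrary toy feature map:

```python
import numpy as np

C, H, W = 4, 8, 8
U = np.arange(C * H * W, dtype=float).reshape(C, H, W)  # toy C x H x W feature map

# s_c = (1/(H*W)) * sum_{i,j} U_c(i, j): one scalar statistic per channel
s = U.sum(axis=(1, 2)) / (H * W)
```

The resulting vector s has one entry per channel, matching s ∈ R^C in the claim.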
3) Apply a fully connected layer to perform a linear transformation on the C×1×1 vector, obtaining a compact feature z of size d×1, as in formula (2); then use three separate linear transformations to restore the vector from the d dimension back to the C dimension, thereby extracting information for each channel dimension;

z = F_fc(s) = δ(B(Ws))   (2)

wherein z ∈ R^{d×1}; δ is the ReLU function; B is batch normalization; W ∈ R^{d×C}, where d = max(C/r, L), r represents the reduction ratio and L is the minimum allowed value of d;
4) Normalize with softmax to obtain scores representing the importance of each channel, then multiply the scores by the corresponding U1, U2 and U3 to obtain A1, A2 and A3; add and fuse these 3 results to obtain Y, which, relative to U, has undergone information extraction and fuses multiple receptive fields;
Let A, B, C be the 3 weight matrices of the Select step, with A, B, C ∈ R^{C×d}; A_i denotes the i-th row of A and a_i the i-th element of the weight vector a; B_i, b_i and C_i, c_i are defined similarly, with a_i + b_i + c_i = 1; the final feature map Y is given by formulas (3) and (4), wherein e is the base of the natural logarithm:

Y_i = a_i × A1 + b_i × A2 + c_i × A3   (3)

a_i = e^{A_i z} / (e^{A_i z} + e^{B_i z} + e^{C_i z})   (4)
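The Select step of formulas (2)–(4) can be sketched in numpy as follows. This is a simplified illustration, assuming batch normalization B acts as the identity at inference and using randomly initialized matrices; it weights the branch outputs U1–U3 directly, which is equivalent to combining the score-weighted A1–A3:

```python
import numpy as np

rng = np.random.default_rng(0)
C, d, H, Wd = 8, 4, 5, 5
U1, U2, U3 = (rng.standard_normal((C, H, Wd)) for _ in range(3))

s = (U1 + U2 + U3).mean(axis=(1, 2))             # formula (1): GAP of the fused U
W = rng.standard_normal((d, C))
z = np.maximum(W @ s, 0.0)                       # formula (2): z = delta(B(Ws)), delta = ReLU

A, B, Cm = (rng.standard_normal((C, d)) for _ in range(3))
logits = np.stack([A @ z, B @ z, Cm @ z])        # one logit per branch and channel
w = np.exp(logits) / np.exp(logits).sum(axis=0)  # formula (4): softmax over the 3 branches
a, b, c = w                                      # per channel, a_i + b_i + c_i = 1

# formula (3): channel-wise weighted fusion of the three branch outputs
Y = a[:, None, None] * U1 + b[:, None, None] * U2 + c[:, None, None] * U3
```

The softmax guarantees the three per-channel weights sum to 1, so Y stays on the same scale as the individual branches.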
(2) Take the feature map generated in step (1) as the input image, perform convolution processing with a ResNet101 network to further extract features, and finally output the result through a fully connected layer;
the step of twinning the neural network in the step S4 is as follows:
(1) Based on the trunk feature network constructed in the step S3, constructing two networks with the same structure and shared weight;
(2) After the two inputs pass through the trunk feature extraction network, a multidimensional feature is obtained for each and flattened into one dimension, yielding a one-dimensional vector for each of the two inputs;
(3) During training, different paired samples are constructed combinatorially and input to the network for training; at the top layer, the loss is computed via a cross entropy over the distance, whether a pair belongs to the same class is judged from the distance between the paired samples, and a corresponding probability distribution is generated; in the prediction stage, the twin network processes every sample pair formed between the test sample and the support set, and the final prediction is the category with the highest probability over the support set; the method constrains the input structure and automatically discovers features that generalize to new samples: new samples are first trained by supervised metric learning based on the twin network, and the features extracted by the network are then reused for one-shot/few-shot learning;
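The weight-sharing twin structure of steps (1)–(3) can be sketched minimally as below: a single embedding function, applied to both inputs, so the two branches are literally the same network. The tiny linear-plus-tanh embedding is a hypothetical stand-in for the trunk network of step S3:

```python
import numpy as np

rng = np.random.default_rng(0)
W_shared = rng.standard_normal((16, 64))   # one weight matrix shared by both branches

def embed(x):
    """Map an input to the new space and flatten it to a 1-D vector."""
    return np.tanh(W_shared @ np.asarray(x).ravel())

x1 = rng.standard_normal((8, 8))
x2 = rng.standard_normal((8, 8))
v1, v2 = embed(x1), embed(x2)              # same network, two inputs
distance = np.linalg.norm(v1 - v2)         # distance in the embedding space
```

Because both inputs go through the identical function, any gradient update to `W_shared` affects both branches at once, which is the defining property of the twin (Siamese) architecture.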
the loss calculation comprises the following steps:
1) The network comprises two identical sub-neural networks that share the weights W;
2) The method belongs to weakly supervised learning, and each sample is a sample pair ((X_1, X_2), y), wherein X_1 and X_2 are the two inputs and y = 1 for similar samples, otherwise y = 0;
3) The sub-networks receive the two inputs, convert them into vectors via the neural network, and compute the distance between the two vectors (the distance metric is chosen as required); the calculation formulas are as follows:
L = (1 / (2N)) Σ_{n=1}^{N} [ y E_w² + (1 − y) max(m − E_w, 0)² ]   (5)

E_w = ||X_1 − X_2||_2   (6)

wherein N represents the number of samples, i.e. the number of sample pairs; y represents the label (y = 0 or y = 1); E_w represents the Euclidean distance between the two output vectors; m represents the distance threshold for dissimilar samples, i.e. the distance between two dissimilar samples lies in [0, m], and beyond m the loss of two dissimilar samples is taken to be 0;
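A direct numpy transcription of this contrastive loss: similar pairs (y = 1) are penalized by squared distance, dissimilar pairs (y = 0) only while their distance falls below the margin m.

```python
import numpy as np

def contrastive_loss(X1, X2, y, m=1.0):
    """X1, X2: (N, D) arrays of paired embeddings; y: (N,) labels (1 = similar).
    Implements L = (1/2N) * sum[ y*E_w^2 + (1-y)*max(m - E_w, 0)^2 ]."""
    E_w = np.linalg.norm(X1 - X2, axis=1)                 # Euclidean distance per pair
    per_pair = y * E_w**2 + (1 - y) * np.maximum(m - E_w, 0.0)**2
    return per_pair.sum() / (2 * len(y))

X1 = np.array([[0.0, 0.0], [0.0, 0.0]])
X2 = np.array([[0.0, 0.0], [3.0, 4.0]])   # pair distances: 0 and 5
y = np.array([1.0, 0.0])                  # first pair similar, second dissimilar
```

With these toy pairs the loss is exactly 0: the similar pair has zero distance, and the dissimilar pair is already farther apart than the margin m = 1.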
4) The distance vector is passed through two fully connected layers, the second of which has a single neuron; a sigmoid is applied to this neuron's output, giving a value between 0 and 1 that represents the degree of similarity of the two input pictures.
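The final similarity head of step 4) can be sketched as two fully connected layers over the element-wise distance, the second reduced to one neuron, followed by a sigmoid. The weights here are random placeholders purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 16)), np.zeros(8)   # first fully connected layer
W2, b2 = rng.standard_normal((1, 8)), np.zeros(1)    # second layer: a single neuron

def similarity_score(diff):
    """diff: element-wise |v1 - v2| of the two one-dimensional embeddings."""
    h = np.maximum(W1 @ diff + b1, 0.0)              # FC + ReLU
    logit = (W2 @ h + b2)[0]                         # single output neuron
    return 1.0 / (1.0 + np.exp(-logit))              # sigmoid -> value in (0, 1)

score = similarity_score(np.abs(rng.standard_normal(16)))
```

The sigmoid squashes the neuron's output into (0, 1), so the score can be read directly as the degree of similarity of the two input pictures.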
CN202211531447.2A 2022-12-01 2022-12-01 Marine disaster-bearing body identification method based on twin neural network under small sample condition Pending CN116152678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211531447.2A CN116152678A (en) 2022-12-01 2022-12-01 Marine disaster-bearing body identification method based on twin neural network under small sample condition

Publications (1)

Publication Number Publication Date
CN116152678A true CN116152678A (en) 2023-05-23

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117809203A (en) * 2024-02-28 2024-04-02 南京信息工程大学 Multi-task continuous learning cross-sea area tropical cyclone strength estimation method
CN117809203B (en) * 2024-02-28 2024-05-14 南京信息工程大学 Multi-task continuous learning cross-sea area tropical cyclone strength estimation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination