CN115146534A

CN115146534A - Subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure

Info

Publication number: CN115146534A
Application number: CN202210758320.8A
Authority: CN
Inventors: 阳程星; 王傲; 许平; 姚曙光
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2022-06-29
Filing date: 2022-06-29
Publication date: 2022-10-04
Anticipated expiration: 2042-06-29
Also published as: CN115146534B

Abstract

The invention provides a subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure, comprising the following steps: acquiring strain field data of a subway sleeper beam and reorganizing it; inputting the reorganized strain field data into a constructed neural network model and obtaining preliminarily extracted features through convolution; repeatedly applying grouped convolution together with a plane-space attention mechanism and a channel attention mechanism to extract progressively higher-level features while correspondingly reducing the data size; applying transposed convolution to the highest-level features, concatenating the result in the channel direction with the earlier output of the same size, and then applying separable convolution and transposed convolution in sequence, to obtain a feature matrix that fuses high-level and low-level features and has the same size as the preliminarily extracted features; and finally applying a transposed convolution and softmax activation to output a damage identification probability matrix, from which the damage result is output.

Description

Subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure

Technical Field

The invention relates to the technical field of rail transit damage identification, in particular to a subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure.

Background

In railway vehicles, the body bolster (sleeper beam) is an important component connecting the underframe of the vehicle body to the bogie. It bears large forces and the weight of the entire vehicle, and transmits that weight to the running gear via the upper and lower center plates. It is also subjected to alternating vertical, longitudinal and torsional loads, so its stress state is complex. During production of the sleeper beam, factors such as processing conditions, material defects and manual operation make the product prone to defects such as internal holes, depressions and surface scratches; in daily service, the sleeper beam is inevitably subjected to alternating heavy loads, so fatigue damage is unavoidable. Such damage increases the cost of producing and maintaining the bolster and, in severe cases, threatens the lives and property of passengers. Product quality and operational maintenance are therefore very important in rail transit, and achieving efficient, automatic, real-time damage identification of the subway sleeper beam has great research value.

Strain-based structural health monitoring is widely applied in the transportation field, but existing strain-based damage identification methods suffer from complex procedures, lagging state evaluation and a low degree of automation, so an advanced intelligent identification technology is urgently needed.

In research on bolster damage detection, many scholars focus on inspection of bolster manufacturing and welding, and some use finite element simulation and fatigue damage laws to predict the fatigue life of bolster structures. However, there is little research on identifying structural damage of the sleeper beam during service, so real-time damage identification of the sleeper beam structure remains an open problem.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure. The method takes the subway sleeper beam as its research object and its strain field information as input. To address the damage identification problem, it adopts a deep learning method, exploits the strong nonlinear fitting capability of deep learning, and introduces an attention mechanism and an advanced convolution structure to construct a neural network model whose accuracy and speed meet the requirements of real-time damage identification, thereby automating the identification process from strain field to damage.

To achieve this purpose, the invention provides a subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure, which comprises the following steps:

acquiring strain field data covering each unit of a subway sleeper beam, the strain field data being a two-dimensional feature matrix;

reorganizing the strain field data to obtain reorganized strain field data, the reorganized strain field data being a three-dimensional strain field feature matrix in which the data for each single direction has a height and a width, and the number of strain field directions equals the number of channels in the network model;

inputting the reorganized strain field data into a constructed neural network model, preliminarily extracting strain field features through a convolution operation, and then applying normalization and ReLU activation in sequence to obtain the preliminarily extracted feature data;

repeatedly applying grouped convolution to the preliminarily extracted feature data, together with processing by a plane-space attention mechanism and a channel attention mechanism, so that each repetition yields higher-level data than the previous one while the data size is correspondingly reduced;

applying transposed convolution, normalization and ReLU activation in sequence to the data from the last feature extraction, concatenating the result in the channel direction with the previously output data of the same size so as to fuse low-level features, and applying separable convolution and transposed convolution in sequence to the fused data; repeating these steps to obtain a feature matrix that fuses all high-level and low-level features and has the same size as the preliminarily extracted feature data;

applying transposed convolution and softmax activation in sequence to that feature matrix, and outputting a damage identification probability matrix;

obtaining, from the damage identification probability matrix, the probability that each unit belongs to each damage degree; determining from these probabilities which damage degree each unit belongs to, and outputting the damage result.

Further, when the material of the subway sleeper beam is undamaged, D = 0 and the elastic modulus of the material is unchanged; when the material is completely damaged, D = 1 and the elastic modulus becomes 0. Damage is therefore measured by the change in elastic modulus before and after damage, and the weakening of the elastic modulus of each unit of the subway sleeper beam is discretized into three grades, namely 20%, 40% and 60%, corresponding to first-level, second-level and third-level damage respectively.

Further, reorganizing the strain field data to obtain the reorganized strain field data specifically includes:

assuming that the acquired strain field data of the subway sleeper beam has size c × n, meaning that the strain field contains n strain values in each of c directions;

after reorganization, the data becomes an array of size c × h × w;

where h and w are the height and width of the reorganized single-direction strain field data and satisfy h × w = n, and c is the number of strain field directions contained in one input sample, which corresponds to the channel direction in the network model.

Further, the reorganized strain field data is input into the constructed neural network model and its size is gradually reduced through convolution operations. Specifically, 2-D convolution is adopted: during the operation, the convolution kernel slides along the two planar directions of the feature data without moving in the channel direction, thereby traversing the whole data plane.

Further, the normalization layer used is an instance normalization layer, which omits normalization across the batch and channel directions and normalizes only in the H and W directions.

Further, the grouped convolution processing specifically includes:

dividing the input data into n groups;

sending each group into its own convolution path for convolution, obtaining the corresponding features of each group;

and combining the outputs of the groups to obtain the output data.

Further, the processing with the plane-space attention mechanism and the channel attention mechanism is specifically as follows: the input feature data flows along two paths. On one path, the data first passes through the channel attention module and the resulting weight matrix is multiplied by the original input data; the result then passes through the spatial attention module and is multiplied by the corresponding weights. The other path is a direct (identity) mapping. Finally, the data from the two paths are added and output through an activation function.

Further, the processing by the channel attention module specifically includes:

superimposing on the plane of the input data a weight matrix that is differentiable and can receive the back-propagated model gradient; the input feature data undergoes maximum pooling and average pooling respectively, each giving a preliminary weight matrix with the same planar size as the feature data and a channel number of 1; the two matrices are stacked in the channel direction and a 1 × 1 convolution reduces the channel dimension to 1, finally giving a weight matrix with the same planar size as the input feature data and a channel number of 1.

Further, the processing by the spatial attention module specifically includes:

weighting the feature data along the channel dimension: the input feature data undergoes global maximum pooling and global average pooling respectively, giving 2 groups of weights whose length equals the number of input channels; each group is expanded with 2 planar dimensions and passed through a 1 × 1 convolution that compresses the channel direction by a compression ratio, and then through a further 1 × 1 convolution that restores the channel direction; finally, the two resulting matrices of size 1 × 1 × (number of input channels) are added and activated by a sigmoid to obtain the final weight matrix.

The invention has the following beneficial effects:

1. In the subway sleeper beam damage identification method based on the attention mechanism and the advanced convolution structure, a residual module (Bol_Res module) is built following the idea of the ResNeXt network. Plain convolution-layer stacking is replaced by grouped convolution, which lets each group learn its own features, deepens the network model, extracts more features, improves the accuracy of the network model and strengthens its damage identification capability. In addition, because the network model has many layers, separable convolution is used in place of ordinary convolution, which significantly reduces the amount of computation and the number of network parameters, shortens network training time and improves identification efficiency.

2. Because the damaged area of a subway sleeper beam is generally small relative to the whole structure, the invention introduces an attention mechanism module. After features are preliminarily extracted, the damage identification network can automatically focus on the neighborhood of the damaged area, so that limited computing resources are spent on identifying the damaged area rather than on giving equal attention to all regions of the data. Specifically, the attention mechanism module addresses two aspects, the plane space of the feature data and the channels of the feature data. The plane-space attention mechanism superimposes on the plane of the input data a weight matrix that is differentiable and can receive the back-propagated model gradient, so that different regions of the data plane receive different weights and their contribution to the next layer is increased or reduced accordingly. Similarly, the channel attention mechanism weights the feature data along the channel dimension, so that different channels receive different weights, which in turn changes their contribution to the next layer.

3. Unlike other nondestructive testing approaches, the method uses deep learning to establish a damage identification network model that maps strain field information directly to damage information. It requires no manual judgment or manual data processing, is highly automated, processes data efficiently in real time, and improves damage identification efficiency.

4. Real-time damage identification of the sleeper beam faces two problems: the background data dominate (damaged units are a small proportion of all units) and the stress field distribution is complex; and the network model may need to be deep, which can lead to degradation of the neural network. To address these, the invention introduces an attention mechanism and an advanced convolution structure into the deep learning method, which further improves damage identification capability. Specifically, the advanced convolution structure follows the idea of the ResNeXt network: a Bol_Res module is built in which grouped convolution replaces plain convolution-layer stacking, deepening the network model while avoiding vanishing gradients, so that richer features can be extracted and identification capability improves. Separable convolution also replaces ordinary convolution, which effectively reduces the number of network parameters, shortens training time and improves identification efficiency. The attention mechanism lets the damage identification network automatically focus on the neighborhood of the damaged area, so that limited computing resources are spent on identifying the damaged area rather than on giving equal attention to all regions of the data, which improves damage identification efficiency.

In addition to the above-described objects, features and advantages, the present invention has other objects, features and advantages. The present invention will be described in further detail below with reference to the drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a schematic diagram of the computational principle of convolution kernel;

FIG. 2 is a schematic diagram of a subway sleeper beam damage identification network model according to the present invention;

FIG. 3 is a schematic diagram of the structure of the residual-attention module (BolRes_Att module);

FIG. 4 is a schematic diagram of the structure of the attention mechanism module; in FIG. 4, (a) is the spatial attention mechanism module and (b) is the channel attention mechanism module;

FIG. 5 shows the strain field of a damaged bolster in a preferred embodiment of the present invention; in FIG. 5, (a) is the strain field of the bolster in the X direction, (b) is the strain field in the Y direction, and (c) is the strain field in the XY direction;

FIG. 6 is a schematic visualization of the damage result according to the preferred embodiment of the present invention.

Detailed Description

Embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways, which are defined and covered by the claims.

The method is based on the strain equivalence hypothesis, under which damage is measured by the change in the elastic modulus of the material before and after damage. From the strain equivalence hypothesis it follows that:

$$D = 1 - \frac{E'}{E}$$

where D is the damage factor, E is the elastic modulus before the material is damaged, and E' is the elastic modulus after the material is damaged. The formula shows that D = 0 when the material is undamaged and its elastic modulus is unchanged, and that D = 1 when the material is completely damaged and its elastic modulus becomes 0. Damage can therefore be measured by the change in elastic modulus before and after damage. Because mapping to a continuous value of unit elastic modulus weakening is difficult to realize, the invention divides the weakening of the elastic modulus of each unit of the subway sleeper beam into three grades, 20%, 40% and 60%, corresponding to first-level, second-level and third-level damage respectively. A unit whose elastic modulus is not weakened is undamaged.
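As an illustration of the damage factor and its discretization, the following minimal Python sketch computes D from the modulus before and after damage and snaps it to the nearest of the four grades. The modulus value, the function names and the nearest-grade rule are assumptions made for illustration; the text itself only fixes the grades at 0%, 20%, 40% and 60%.

```python
def damage_factor(e_before: float, e_after: float) -> float:
    """Damage factor D = 1 - E'/E under the strain equivalence hypothesis."""
    if e_before <= 0:
        raise ValueError("undamaged modulus must be positive")
    return 1.0 - e_after / e_before


# Discrete weakening grades named in the text (0 means no weakening).
GRADES = {0.0: "undamaged", 0.2: "first-level", 0.4: "second-level", 0.6: "third-level"}


def nearest_grade(d: float) -> str:
    """Snap a continuous damage factor to the closest grade (illustrative rule)."""
    level = min(GRADES, key=lambda g: abs(g - d))
    return GRADES[level]


E = 206e9  # hypothetical undamaged elastic modulus in Pa
for remaining in (1.0, 0.8, 0.6, 0.4):
    d = damage_factor(E, E * remaining)
    print(f"E'/E = {remaining:.1f} -> D = {d:.1f} ({nearest_grade(d)})")
```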

A subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure comprises the following steps:

reorganizing the strain field data to obtain reorganized strain field data, the reorganized strain field data being a three-dimensional strain field feature matrix in which the data for each single direction has a height and a width, and the number of strain field directions equals the number of channels in the network model;

inputting the reorganized strain field data into the constructed neural network model, preliminarily extracting strain field features through a convolution operation, and applying normalization and ReLU activation in sequence to obtain the preliminarily extracted feature data;

applying transposed convolution, normalization and ReLU activation in sequence to the data from the last feature extraction; then concatenating the result in the channel direction with the previously output data of the same size so as to fuse low-level features, applying separable convolution and transposed convolution in sequence to the fused data, and repeating these steps to obtain a feature matrix that fuses all high-level and low-level features and has the same size as the preliminarily extracted feature data;

To facilitate the convolution operation, the data must be reorganized before it is input into the neural network model. Assume that a given set of acquired strain field data of the subway sleeper beam has size c × n, meaning that the strain field contains n strain values in each of c directions; the data is reorganized into an array of size c × h × w. Here h and w are the height and width of the reorganized single-direction strain field data and satisfy h × w = n, and c is the number of strain field directions contained in one input sample, which is called the channel direction in the network model.
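The reorganization from c × n to c × h × w can be sketched in a few lines of Python. This is a minimal sketch that assumes row-major ordering of the n values within each direction; the text does not state how units are ordered on the h × w grid, and the function name is hypothetical.

```python
def reshape_strain_field(flat, h, w):
    """Reorganize c rows of n strain values into a c x h x w nested list.

    Assumes row-major order: the first w values of a direction form the
    first row of its plane, the next w values the second row, and so on.
    """
    cube = []
    for direction in flat:
        if len(direction) != h * w:
            raise ValueError("h * w must equal n")
        cube.append([direction[r * w:(r + 1) * w] for r in range(h)])
    return cube


# Toy example: c = 3 directions (x, y, xy) with n = 6 values each, reshaped to 2 x 3.
flat = [[10 * ch + k for k in range(6)] for ch in range(3)]
cube = reshape_strain_field(flat, h=2, w=3)
print(len(cube), len(cube[0]), len(cube[0][0]))  # 3 2 3
```

For the embodiment described later, the same call with h = 70 and w = 264 would turn 3 × 18480 values into a 3 × 70 × 264 array.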

Specifically, the convolution operation is as follows. A convolution layer is composed of a plurality of convolution kernels; the input feature data is convolved with the kernels, and the result serves as the input features of the next layer. The invention uses 2-D convolution: during the operation, the convolution kernel slides along the two planar directions of the feature data without moving in the depth (channel) direction, thereby traversing the entire data plane. The convolution layer is calculated as follows:

$$Y(i, j) = \sum_{m}\sum_{n} X(i+m,\, j+n)\, W(m, n)$$

In the formula, i and j are the horizontal and vertical coordinates of the calculation point, m and n are the offsets in the horizontal and vertical directions, X is the feature data matrix, W is the weight matrix of the convolution kernel, and Y(i, j) is the output value at the calculation point.

The above formula can be understood from FIG. 1. When the input data is convolved with the kernel, each output value is obtained by multiplying a local submatrix of the input data element by element with the kernel matrix and summing the products. In FIG. 1 the input is a two-dimensional 3 × 4 matrix, the convolution kernel is a 2 × 2 matrix, and the stride is assumed to be 1. First, the 2 × 2 local matrix at the top-left corner of the input (the data in the solid box in the input part of the figure) is multiplied element by element with the kernel (the data in the solid box in the kernel part of the figure) and summed, giving the first element of the output matrix, whose value is aw + bx + ey + fz. The solid box is then shifted by one unit in the x direction and the operation is repeated. When the box reaches the border, it returns to the origin in the x direction and shifts by one unit in the y direction, and this continues until it reaches the dashed box in the figure. This yields the output data shown in the figure, and the matrix size is reduced from 3 × 4 to 2 × 3.
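The sliding-window calculation described for FIG. 1 can be reproduced with a short Python sketch. The numeric input and kernel values below are placeholders standing in for the symbols a, b, ... and w, x, y, z of the figure; no padding is applied, matching the 3 × 4 to 2 × 3 size change.

```python
def conv2d_valid(x, kernel, stride=1):
    """2-D convolution without padding, as used in deep learning (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # Multiply the local submatrix by the kernel element-wise and sum.
            for m in range(kh):
                for n in range(kw):
                    acc += x[i * stride + m][j * stride + n] * kernel[m][n]
            row.append(acc)
        out.append(row)
    return out


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]      # 3 x 4 input
k = [[1, 0],
     [0, 1]]               # 2 x 2 kernel
y = conv2d_valid(x, k)
print(y)  # [[7, 9, 11], [15, 17, 19]], a 2 x 3 output
```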

Normalization: the normalization layer used in the method of the present invention is an instance normalization layer (IN layer). In contrast to conventional batch normalization (BN), the IN layer omits normalization across the batch and channel directions and normalizes only in the H and W directions, that is, it subtracts the mean and divides by the standard deviation. Normalization is needed because, as the network model deepens, forward propagation means that any change in the parameters of an earlier convolution layer changes the output of that layer; this output is the input of the next layer and the change is passed on, affecting all subsequent layers. If the distribution of the input data of each layer keeps changing drastically during training, deep learning training is seriously hindered and slowed down. A normalization layer addresses this instability in the distribution of intermediate-layer data during training.
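A minimal Python sketch of instance normalization follows: each (sample, channel) plane is normalized using only its own H × W values. The small constant eps for numerical stability and the absence of learnable scale and shift parameters are simplifying assumptions, not details taken from the text.

```python
import math


def instance_norm(x, eps=1e-5):
    """Normalize each plane of x[sample][channel][row][col] over H and W only."""
    out = []
    for sample in x:
        normalized = []
        for plane in sample:
            values = [v for row in plane for v in row]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            scale = 1.0 / math.sqrt(var + eps)
            normalized.append([[(v - mean) * scale for v in row] for row in plane])
        out.append(normalized)
    return out


# One sample with two channels whose magnitudes differ by a factor of ten.
x = [[[[1.0, 2.0], [3.0, 4.0]],
      [[10.0, 20.0], [30.0, 40.0]]]]
y = instance_norm(x)
for plane in y[0]:
    print([[round(v, 3) for v in row] for row in plane])
```

Because statistics are computed per plane, both channels end up on nearly the same scale even though their raw magnitudes differ.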

ReLU activation: the purpose of the activation function is to map the original, linearly inseparable multidimensional features into a space in which they become more nearly linearly separable. A neural network needs to be a nonlinear model; if the output of each layer is not processed by an activation function, the superposition of any number of units is still a linear transformation, so a nonlinear activation function must be added for the neural network to become a high-dimensional nonlinear function. This holds for any choice of nonlinear activation function.
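The point that stacked linear units remain linear can be checked with a small Python sketch using scalar units. The weights and biases are arbitrary values chosen for illustration.

```python
def linear(w, b):
    """Return a scalar linear unit x -> w * x + b."""
    return lambda x: w * x + b


def relu(x):
    return max(0.0, x)


f = linear(2.0, -1.0)
g = linear(-3.0, 0.5)

# Without an activation, g(f(x)) = -3 * (2x - 1) + 0.5 = -6x + 3.5: one linear unit.
stacked = lambda x: g(f(x))
collapsed = linear(-6.0, 3.5)

# With ReLU in between, the composition is no longer a single linear map.
with_relu = lambda x: g(relu(f(x)))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, stacked(x), collapsed(x), with_relu(x))
```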

The grouped convolution processing and the processing with the plane-space attention mechanism and the channel attention mechanism together form one processing module, called the residual-attention module (BolRes_Att module) for short. It introduces the idea of the ResNeXt network and an attention mechanism to improve on conventional plain convolution-layer stacking, and consists mainly of a residual module (Bol_Res module) and an attention mechanism module. The specific structure is shown in FIG. 3.

The operations in the whole module are as follows:

1) The input feature data (Input) passes through a 1 × 1 convolution and is divided into N groups;

2) A separable convolution is applied to each group of data;

3) The groups are concatenated in the channel direction and then pass through an instance normalization layer (IN layer) and an activation layer;

4) An intermediate matrix A is obtained through a convolution layer and an instance normalization layer (IN layer);

5) A passes through the channel attention mechanism module, and the resulting weights are multiplied by the original A to obtain the weighted matrix B;

6) B passes through the spatial attention mechanism module, and the resulting weights are multiplied by the original B to obtain the weighted matrix C;

7) C is added to the original A to obtain the matrix D;

8) D is added to the original input feature data, and the sum is output through an activation layer.
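Steps 5) to 8) above can be sketched as a data flow in Python. This is only a sketch under stated assumptions: feature data are flattened to plain lists of equal length, the two attention modules are passed in as placeholder functions that return one weight per element, and ReLU is assumed for the final activation layer. Steps 1) to 4), which produce A, are not modeled.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def scale(values, weights):
    """Element-wise multiplication of feature values by attention weights."""
    return [v * w for v, w in zip(values, weights)]


def add(a, b):
    return [p + q for p, q in zip(a, b)]


def residual_attention_tail(inp, a, channel_weights, spatial_weights):
    """Steps 5-8: weight A twice, then apply the two skip additions."""
    b = scale(a, channel_weights(a))   # step 5: B = weights(A) * A
    c = scale(b, spatial_weights(b))   # step 6: C = weights(B) * B
    d = add(c, a)                      # step 7: D = C + A
    return relu(add(d, inp))           # step 8: activation(D + Input)


# Placeholder attention functions (hypothetical constant weights).
half_everywhere = lambda v: [0.5] * len(v)
keep_first_only = lambda v: [1.0] + [0.0] * (len(v) - 1)

out = residual_attention_tail([1.0, -4.0], [2.0, 1.0], half_everywhere, keep_first_only)
print(out)  # [4.0, 0.0]
```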

The grouped convolution processing is specifically as follows: the input data is divided into n groups; each group is sent into its own convolution path for convolution, obtaining the corresponding features of each group; and the outputs of the groups are combined to obtain the output data.

The processing with the plane-space attention mechanism and the channel attention mechanism is specifically as follows: the input feature data flows along two paths. On one path, the data first passes through the channel attention module and the resulting weight matrix is multiplied by the original input data; the result then passes through the spatial attention module and is multiplied by the corresponding weights. The other path is a direct (identity) mapping. Finally, the data from the two paths are added and output through an activation function.
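To make the parameter savings of grouped and separable convolution concrete, the following Python sketch counts weights for a 3 × 3 layer with 64 input and 64 output channels. The group count of 8 is a hypothetical value, and "separable convolution" is assumed here to mean a depthwise convolution followed by a 1 × 1 pointwise convolution; biases are ignored.

```python
def conv_params(c_in, c_out, k_h, k_w, groups=1):
    """Weight count of a (grouped) convolution, ignoring biases."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees the c_in / groups channels of its own group.
    return (c_in // groups) * c_out * k_h * k_w


def separable_params(c_in, c_out, k_h, k_w):
    """Depthwise k_h x k_w convolution plus 1 x 1 pointwise convolution."""
    return c_in * k_h * k_w + c_in * c_out


standard = conv_params(64, 64, 3, 3)
grouped = conv_params(64, 64, 3, 3, groups=8)
separable = separable_params(64, 64, 3, 3)
print(standard, grouped, separable)  # 36864 4608 4672
```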

As shown in FIG. 4 (a), the spatial attention module processes the data as follows: the feature data is weighted along the channel dimension. The input feature data undergoes global maximum pooling and global average pooling respectively, giving 2 groups of weights whose length equals the number of input channels; each group is expanded with 2 planar dimensions and passed through a 1 × 1 convolution that compresses the channel direction by a compression ratio, and then through a further 1 × 1 convolution that restores the channel direction; finally, the two resulting matrices of size 1 × 1 × (number of input channels) are added and activated by a sigmoid to obtain the final weight matrix.

As shown in FIG. 4 (b), the channel attention module processes the data as follows: a weight matrix that is differentiable and can receive the back-propagated model gradient is superimposed on the plane of the input data. The input feature data undergoes maximum pooling and average pooling respectively, each giving a preliminary weight matrix with the same planar size as the feature data but a channel number of 1; the two matrices are stacked in the channel direction and a 1 × 1 convolution reduces the channel dimension to 1, finally giving a weight matrix with the same planar size as the input feature data and a channel number of 1.
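The two weight computations described above can be sketched in Python as follows. The function names describe what each computation does rather than which module the text assigns it to. The mixing weights of the 1 × 1 convolutions are hypothetical fixed values (in a trained network they are learned), and the compress-then-restore pair of 1 × 1 convolutions is simplified to a single shared channel-mixing matrix.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def plane_weight_map(x, w_max=0.5, w_avg=0.5):
    """x[c][i][j] -> H x W map: pool across channels, then mix the two maps.

    The two-input 1 x 1 convolution is reduced to the scalars w_max and w_avg.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            column = [x[k][i][j] for k in range(c)]
            row.append(w_max * max(column) + w_avg * sum(column) / c)
        out.append(row)
    return out


def per_channel_weights(x, mix=None):
    """x[c][i][j] -> one weight per channel from global max and average pooling."""
    c = len(x)
    flat = [[v for row in plane for v in row] for plane in x]
    g_max = [max(f) for f in flat]
    g_avg = [sum(f) / len(f) for f in flat]
    if mix is None:  # identity mixing as a stand-in for learned 1 x 1 convolutions
        mix = [[1.0 if r == k else 0.0 for k in range(c)] for r in range(c)]

    def apply(vec):
        return [sum(mix[r][k] * vec[k] for k in range(c)) for r in range(c)]

    # The same mixing is applied to both pooled vectors before adding them.
    return [sigmoid(p + q) for p, q in zip(apply(g_max), apply(g_avg))]


x = [[[0.0, 1.0], [2.0, 3.0]],
     [[4.0, 0.0], [0.0, 0.0]]]       # 2 channels, 2 x 2 plane
print(plane_weight_map(x))           # [[3.0, 0.75], [1.5, 2.25]]
print(per_channel_weights(x))
```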

The method comprises the steps of firstly, through data collection and arrangement, recombining strain field information of the subway sleeper beam according to certain requirements to obtain strain field data with the size of 70 x 264 x 3, on the basis, inputting the strain field data of the subway sleeper beam into a constructed neural network model, carrying out convolution calculation according to the step length (stride) of 2 by using 64 convolution kernels with the size of 7 x 9 through a first CIR module, extracting characteristics, wherein the data size is changed into 32 x 128 x 64, carrying out LU normalization after calculation, then sending the data into a reactivation function for activation, then carrying out LU (kernel) activation through a residual error-Attention module (BolRes _ Att module) with the size of 3 x 3 and the number of convolution kernels of 64, using grouping convolution to replace a convolution layer stack to deepen the network model, enhancing the characteristic extraction capability, and adopting a plane space Attention mechanism (Spatial Attention) and a Channel Attention mechanism (Channel Attention mechanism), extracting the efficiency, increasing the residual error efficiency, extracting the residual error efficiency and outputting the characteristic extraction efficiency, wherein the residual error extraction efficiency is increased by the number of the module (Spatial Attention 1-Attention module, and the residual error extraction efficiency is increased by the number of the module (iteration module) of the Output stage of the block) of the excitation module, and the Output stage of the number of the excitation module (Spatial Attention 1-128 x 128); then, the Output matrix 1 (Output 1) is sent to a residual error-attention module (BolRes _ Att module) with the step length of 2 and the number of convolution kernels of 128, and the residual error-attention module (BolRes _ Att module) with the step length of 1 is repeatedly 
executed for 3 times to obtain an Output matrix 2 (Output 2), the data size is 16 × 64 × 128, and the stage 2 is finished; in this way, stage 3, stage 4, and stage 5 shown in fig. 2 are performed in sequence, resulting in Output matrix 3 (Output 3), output matrix 4 (Output 4), and Output matrix 5 (Output 5) from which features of different degrees are extracted, the sizes of which are 8 × 32 × 256, 4 × 16 × 512, and 2 × 8 × 728, respectively. And then, the data size is reduced to 4 multiplied by 16 multiplied by 512 through the CTIR module while the characteristics are extracted, and the data size is spliced with the Output matrix 4 (Output 4) obtained in the stage 4 to obtain new data fused with low-level characteristics. Then, the data is sent to an SIR module, separable convolution with the convolution kernel number of 512 and the size of 3 x 3 is carried out, features are further extracted, normalization and ReLU activation are carried out after the convolution is finished, and the data size is not changed for 2 times. On the basis, the data is further processed by a CTIR module, the characteristics are further extracted by adopting the transposition convolution with the step length of 2, the number of convolution kernels of 256 and the size of 3 multiplied by 3, and the data size is reduced to 8 multiplied by 32 multiplied by 256. Then, the data is spliced with an Output matrix 3 (Output 3) with the same size, the features extracted originally are further fused, then the data is sent to the next SIR module, and the operations are sequentially executed according to the diagram shown in fig. 2, so that a feature matrix which is fused with all high-level and low-level features and has the same size as the Output matrix 1 (Output 1) is obtained. 
Finally, the data are restored by a transposed convolution with 4 convolution kernels of size 7 × 9 and no padding, and are activated by the Softmax function, outputting a damage identification matrix with a size of 70 × 264 × 4 that gives, for each unit, the probability of each damage degree. For each unit, the damage degree with the maximum probability is selected to determine which damage degree the unit belongs to, and the damage result is output.
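
The data sizes quoted above can be checked with the usual output-size formulas for convolution and transposed convolution. The sketch below is illustrative only: the helper names are hypothetical, and the stride of 2 and the extra output row and column (`output_padding=1`) used for the final transposed convolution are assumptions, since the description only states the kernel count, the kernel size and the absence of padding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# First CIR module: 7 x 9 kernels, stride 2, no padding, on a 70 x 264 input.
h1 = conv_out(70, 7, stride=2)
w1 = conv_out(264, 9, stride=2)

# Stages 2 to 5 each halve the plane size (stride 2 in their first module).
stage_sizes = [(h1, w1)]
for _ in range(4):
    h, w = stage_sizes[-1]
    stage_sizes.append((h // 2, w // 2))

# Final transposed convolution: 7 x 9 kernels, no padding.
# ASSUMPTION: stride 2 and one extra output row/column (output_padding=1).
h_out = conv_transpose_out(h1, 7, stride=2, output_padding=1)
w_out = conv_transpose_out(w1, 9, stride=2, output_padding=1)
```

Under these assumptions the first CIR module yields a 32 × 128 plane, the five stages yield planes of 32 × 128, 16 × 64, 8 × 32, 4 × 16 and 2 × 8, and the final transposed convolution returns to 70 × 264.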

The present invention is explained below with reference to specific examples.

Specifically, the method is illustrated by identifying the damage of a certain group of subway sleeper beams. The set damage is divided into four grades: intact (no damage), 20% elastic modulus weakening damage, 40% elastic modulus weakening damage and 60% elastic modulus weakening damage.
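
The four grades can be written as a small lookup from damage class to weakening ratio. The sketch below is illustrative: it assumes the weakened modulus follows E_d = (1 − D)·E_0, where D is the weakening ratio, and the undamaged modulus `E0` is a hypothetical value that does not come from the source.

```python
# Elastic modulus weakening ratio D for each damage class (the four grades above).
DAMAGE_GRADES = {0: 0.0, 1: 0.2, 2: 0.4, 3: 0.6}


def damaged_modulus(e0, grade):
    """Return the weakened elastic modulus, ASSUMING E_d = (1 - D) * E_0."""
    return (1.0 - DAMAGE_GRADES[grade]) * e0


E0 = 200_000.0  # MPa, hypothetical undamaged modulus used only for illustration
moduli = {g: damaged_modulus(E0, g) for g in DAMAGE_GRADES}
```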

(1) Acquiring and arranging the sleeper beam strain field information: the acquired data comprise the unit number, the unit position information and the strain field information, the latter covering the x, y and xy directions. Part of the data is shown in Table 1, and the corresponding strain field diagram is shown in Fig. 5.

Table 1 Examples of subway sleeper beam strain field data

| Unit number | X coordinate | Y coordinate | Z coordinate | Strain in X direction | Strain in Y direction | Strain in XY direction |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | -343.633 | -1.31E+03 | 117 | 2.76E-06 | -2.31E-06 | 2.06E-06 |
| 2 | -343.632 | -1.30E+03 | 117 | 3.63E-06 | -5.34E-06 | 2.02E-06 |
| 3 | -343.632 | -1.29E+03 | 117 | 4.44E-06 | -7.31E-06 | 6.12E-08 |
| 4 | -343.632 | -1.28E+03 | 117 | 4.24E-06 | -7.17E-06 | -1.10E-06 |
| 5 | -343.632 | -1.27E+03 | 117 | 3.96E-06 | -6.42E-06 | -1.35E-06 |
| 6 | -343.632 | -1.26E+03 | 117 | 3.58E-06 | -5.28E-06 | -1.48E-06 |
| 7 | -343.632 | -1.25E+03 | 117 | 3.20E-06 | -4.21E-06 | -1.09E-06 |
| 8 | -343.633 | -1.24E+03 | 117 | 2.98E-06 | -3.28E-06 | -8.44E-07 |
| 9 | -343.633 | -1.23E+03 | 117 | 2.70E-06 | -2.52E-06 | -4.73E-07 |
| 10 | -343.633 | -1.22E+03 | 117 | 2.61E-06 | -1.97E-06 | -2.49E-07 |
| … | … | … | … | … | … | … |
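
As a minimal sketch of the recombination step, the per-unit records of Table 1 can be arranged into a rows × cols × 3 array whose last axis holds the x, y and xy strains. The function name and the small 2 × 3 grid are hypothetical, and numbering the units row by row from the top-left corner is an assumption made for illustration; the real data have 70 × 264 units.

```python
def recombine(records, rows, cols):
    """Arrange per-unit strain records into a rows x cols x 3 nested list.

    records: list of (unit_number, strain_x, strain_y, strain_xy), with unit
    numbers starting at 1. ASSUMPTION: units are numbered row by row from the
    top-left corner.
    """
    grid = [[[0.0, 0.0, 0.0] for _ in range(cols)] for _ in range(rows)]
    for unit, ex, ey, exy in records:
        r, c = divmod(unit - 1, cols)
        grid[r][c] = [ex, ey, exy]
    return grid


# First three units of Table 1, placed on a small hypothetical 2 x 3 grid.
sample = [
    (1, 2.76e-06, -2.31e-06, 2.06e-06),
    (2, 3.63e-06, -5.34e-06, 2.02e-06),
    (3, 4.44e-06, -7.31e-06, 6.12e-08),
]
field = recombine(sample, rows=2, cols=3)
```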

(2) Damage identification: the strain field information is input into the trained neural network model constructed by the invention, and the damage identification network model outputs, for each unit, the probability that the unit belongs to each damage degree. The output results of some units are shown in Table 2. Since the units are numbered starting from the top-left corner of the sleeper beam, where the units are undamaged, the probability of being intact is much greater than the other three probabilities for the lowest-numbered units.

Table 2 Output results of the damage identification network

(3) Determining the damaged units: based on the probabilities that each unit belongs to the different damage degrees, the damage degree with the maximum probability is selected for each unit (one of intact, 20% damage, 40% damage and 60% damage). The probabilities of each unit are then reassigned: the probability at the position of the selected damage degree is set to 1 and the rest are set to 0. This gives the damage result of the subway sleeper beam, part of which is shown in Table 3:
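
The maximum-probability selection and the reassignment to 1 and 0 can be sketched as follows. The function name and the probability values are hypothetical and are not taken from Table 2.

```python
LABELS = ["intact", "20% damage", "40% damage", "60% damage"]


def determine_damage(probabilities):
    """Pick the most probable class and return (class index, one-hot vector)."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    one_hot = [1 if i == best else 0 for i in range(len(probabilities))]
    return best, one_hot


# Hypothetical softmax outputs for two units (not taken from Table 2).
unit_probs = [
    [0.97, 0.01, 0.01, 0.01],
    [0.05, 0.15, 0.70, 0.10],
]
results = [determine_damage(p) for p in unit_probs]
labels = [LABELS[idx] for idx, _ in results]
```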

Table 3 Damaged unit determination results

(4) Visualization of the damage results:

Different damage degrees are represented by different colors: the damage degrees from intact to 60% stiffness damage are represented by colors changing gradually from black to white, with whiter colors indicating greater damage. The visualized damage result is shown in Fig. 6. As can be seen from Fig. 6, the damage to this group of sleeper beams is concentrated slightly to the right of the middle part, and the other parts are undamaged. All three kinds of damage are present: the 20% and 40% stiffness damage are distributed more discretely, whereas the 60% stiffness damage is concentrated, with lower-degree damage nearby, which is consistent with the actual damage condition.
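
One possible way to realize the black-to-white coding is to map each damage class to an 8-bit gray value. The sketch below is only an illustration: the even spacing of the gray values, the function names and the small damage map are assumptions, not details given in the source.

```python
def gray_level(grade, n_grades=4):
    """Map a damage class (0 = intact) to an 8-bit gray value, black to white.

    ASSUMPTION: the gray values are evenly spaced between 0 and 255.
    """
    return round(255 * grade / (n_grades - 1))


def to_image(damage_map):
    """Convert a 2-D list of damage classes into a 2-D list of gray values."""
    return [[gray_level(g) for g in row] for row in damage_map]


# Hypothetical 2 x 4 damage map (classes 0 to 3).
image = to_image([[0, 0, 1, 2], [0, 1, 3, 3]])
```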

In addition, the method of the invention is compared with a model whose network is built from simple convolution stacking without an attention mechanism; apart from lacking the attention mechanism and using a different convolution structure, the compared model is the same as the network model of the invention. The damage set in the experiment is divided into four grades: intact, 20% stiffness weakening damage, 40% stiffness weakening damage and 60% stiffness weakening damage. The input strain field information covers the x, y and xy directions. The experimental data are obtained by finite element simulation with ABAQUS software and contain both damage information and strain field information. Because the force applied to the sleeper beam affects the strain field and may therefore strongly influence the result, the data are organized into two data sets: a sleeper beam static data set (body_static_dataset) and a sleeper beam dynamic data set (body_dynamic_dataset). White Gaussian noise of 20 dB and 30 dB is also injected into the data sets to increase their complexity. The specific contents of the two data sets are shown in Table 4. Each group of data contains strain field data with a size of 70 × 264 × 3 together with damage data; that is, each group contains 70 × 264 units, and each unit has strain field information in the x, y and xy directions.
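
As a minimal sketch of the noise injection, white Gaussian noise can be scaled so that the noisy signal has a chosen signal-to-noise ratio. Reading the 20 dB and 30 dB figures as signal-to-noise ratios is an assumption, and the function names and the synthetic strain values are hypothetical.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so the signal-to-noise ratio equals snr_db.

    ASSUMPTION: the 20 dB / 30 dB figures are signal-to-noise ratios.
    """
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)
# Hypothetical strain values of the same order of magnitude as Table 1.
clean = [3e-06 * math.sin(0.01 * i) for i in range(20000)]
noisy = add_white_gaussian_noise(clean, snr_db=20, rng=rng)
snr = measured_snr_db(clean, noisy)
```

The measured value `snr` should lie close to the requested 20 dB, with a small deviation caused by the finite number of samples.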

Table 4 Specific information of the data sets used in the experiments

The evaluation indexes adopted by the experiment are as follows:

(1) Loss function: the objective function to be minimized, which measures the difference between the value predicted by the neural network model and the actual value. The larger its value, the lower the reliability of the model output; the smaller its value, the higher the reliability. The experiment uses the cross-entropy loss function, expressed as follows:

$$L = -\sum_{i=1}^{k} y_i \log(p_i)$$

In the formula, $k$ represents the number of categories; $y_i$ is an indicator variable that is 1 if category $i$ is the same as the category of the sample and 0 otherwise; $p_i$ is the output of the model, i.e. the predicted probability that the category is $i$.
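
As a minimal sketch, assuming the standard form of the cross-entropy loss (the negative sum of $y_i \log p_i$ over the $k$ categories), the loss for a single unit may be computed as follows. The function name, the clipping constant `eps` and the probability values are hypothetical.

```python
import math


def cross_entropy(y, p, eps=1e-12):
    """Cross-entropy loss for one unit: -sum_i y_i * log(p_i).

    y: one-hot indicator list; p: predicted class probabilities.
    eps guards against log(0) and is an implementation detail added here.
    """
    return -sum(yi * math.log(max(pi, eps)) for yi, pi in zip(y, p))


# Hypothetical example with k = 4 damage classes; the true class is the third.
y_true = [0, 0, 1, 0]
confident = cross_entropy(y_true, [0.05, 0.05, 0.85, 0.05])
uncertain = cross_entropy(y_true, [0.25, 0.25, 0.25, 0.25])
```

A confident correct prediction gives a smaller loss than a uniform one, in line with the interpretation that a smaller value indicates a more reliable output.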

(2) Recall: the proportion of damaged units that are correctly predicted, used to measure the accuracy of the prediction result. The larger the value, the more accurate the prediction; the smaller the value, the less accurate the prediction. The formula is as follows:

$$R = \frac{TP}{TP + FN}$$

In the formula, $R$ is the recall, $TP$ is the number of positive samples predicted as positive by the model, and $FN$ is the number of positive samples predicted as negative by the model. In this experiment, a positive sample is a damaged unit and a negative sample is an intact unit.
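
The recall can be sketched for per-unit class labels as follows. The sketch assumes that a damaged unit counts as a true positive whenever it is predicted as damaged, regardless of the predicted grade; the source does not state how a wrong grade is counted. The function name and labels are hypothetical.

```python
def recall(true_classes, predicted_classes):
    """Recall R = TP / (TP + FN); class 0 is intact, classes 1-3 are damaged.

    ASSUMPTION: a damaged unit counts as a true positive when it is predicted
    as damaged, regardless of the predicted damage grade.
    """
    pairs = list(zip(true_classes, predicted_classes))
    tp = sum(1 for t, p in pairs if t > 0 and p > 0)
    fn = sum(1 for t, p in pairs if t > 0 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels for eight units.
truth = [0, 0, 1, 2, 3, 3, 0, 1]
pred = [0, 1, 1, 0, 3, 2, 0, 1]
r = recall(truth, pred)
```

In this example five units are truly damaged and four of them are predicted as damaged, so the recall is 4/5.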

(3) Average number of error units: the average number of misclassified units per group of data, which directly reflects the average identification performance of the damage identification model over the whole data set. The smaller the value, the better the damage identification performance of the model. The mathematical expression is as follows:

$$\bar{n} = \frac{1}{K} \sum_{i=1}^{K} n_i$$

In the formula, $\bar{n}$ represents the average number of misclassified units; $n_i$ represents the number of misclassified units in the $i$-th group of data; $K$ represents the number of groups of data participating in the calculation.
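
The average number of error units is the mean of the per-group counts of misclassified units. A minimal sketch with hypothetical function names and hypothetical 2 × 3 damage maps:

```python
def error_units(true_map, predicted_map):
    """Count units whose predicted damage class differs from the true class."""
    return sum(
        1
        for true_row, pred_row in zip(true_map, predicted_map)
        for t, p in zip(true_row, pred_row)
        if t != p
    )


def average_error_units(true_maps, predicted_maps):
    """Mean number of misclassified units over K groups of data."""
    counts = [error_units(t, p) for t, p in zip(true_maps, predicted_maps)]
    return sum(counts) / len(counts)


# Two hypothetical 2 x 3 groups: the first has 1 error, the second has 2.
truths = [[[0, 0, 1], [0, 2, 3]], [[0, 0, 0], [1, 1, 0]]]
preds = [[[0, 0, 1], [0, 2, 2]], [[0, 1, 0], [1, 0, 0]]]
avg = average_error_units(truths, preds)
```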

The method of the invention was tested against the simple convolution stacking method; the test results are shown in Table 5:

Table 5 Comparison of the test results of the two methods on the test set

Comparing the performance indexes of the simple convolution stacking method and the method of the invention shows that the network model established by the method of the invention outperforms the former on both the sleeper beam static data set and the sleeper beam dynamic data set. In terms of the most intuitive evaluation index, the error rate, the method of the invention improves by 44%-53% over the simple convolution stacking method, a substantial gain in performance. At the same time, the time needed to process 1000 groups of data increases greatly, by more than 400%, because introducing the attention mechanism and the advanced convolution structure greatly increases the amount of calculation. Even so, the processing efficiency still reaches 70 groups/s. Compared with existing nondestructive detection methods, the degree of intelligence is greatly improved and the processing efficiency is raised by one grade, which is sufficient to support real-time damage identification.

In a convolutional neural network, the more layers the network has for extracting low-, medium- and high-level features of the detected object, the richer the features of different levels that can be extracted and the stronger the identification capability. However, once the network is deepened beyond a certain degree, gradient dispersion (vanishing gradients) readily occurs, which degrades the neural network and reduces its identification capability. The invention adopts an advanced convolution structure: following the idea of the ResNeXt network, it establishes a residual module (Bol_Res module) in which grouping convolution replaces simple convolution layer stacking. This deepens the network model while avoiding gradient dispersion, so that richer features can be extracted and the identification capability is improved. Separable convolution is also adopted in place of ordinary convolution, which effectively reduces the number of network parameters, shortens the training time of the network and improves the identification efficiency. In addition, the invention introduces an attention mechanism so that the damage identification network automatically focuses on the area around the damage and, instead of giving equal attention to the data of all areas, uses limited computing resources to better identify the damaged area, thereby improving the damage identification efficiency.
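
The parameter savings of grouping convolution and separable convolution can be illustrated by counting weights for a single layer. The sketch below is not the network of the invention: the layer with 512 input and 512 output channels is hypothetical, the group count of 32 is an assumption (a cardinality commonly used in ResNeXt; the source gives no value), and separable convolution is assumed to mean a depthwise convolution followed by a 1 × 1 pointwise convolution.

```python
def conv_params(k_h, k_w, c_in, c_out, groups=1):
    """Weight count of a (grouped) convolution, bias terms ignored."""
    assert c_in % groups == 0 and c_out % groups == 0
    return k_h * k_w * (c_in // groups) * c_out


def separable_conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a depthwise convolution followed by a 1 x 1 pointwise one."""
    return k_h * k_w * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernels, 512 input channels, 512 output channels.
# ASSUMPTION: 32 groups for the grouped case; the source gives no value.
standard = conv_params(3, 3, 512, 512)
grouped = conv_params(3, 3, 512, 512, groups=32)
separable = separable_conv_params(3, 3, 512, 512)
```

For this hypothetical layer the grouped variant needs 1/32 of the weights of the standard convolution, and the separable variant needs roughly one ninth.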

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A subway sleeper beam damage identification method based on an attention mechanism and an advanced convolution structure is characterized by comprising the following steps:

acquiring strain field data of each unit of a subway sleeper beam; the strain field data is a two-dimensional feature matrix;

inputting the reconstructed strain field data into a constructed neural network model, preliminarily extracting strain field data characteristics through convolution operation, and sequentially carrying out normalization and ReLU function activation processing to obtain data with the characteristics extracted for the first time;

repeatedly carrying out grouping convolution on the data with the characteristics extracted for the first time and adopting a plane space attention mechanism and a channel attention mechanism for processing to sequentially obtain higher-level data than the data extracted for the last time, and simultaneously correspondingly reducing the data size;

a feature matrix which integrates all high-level and low-level features and has the same size as the data after the features are extracted for the first time is activated through the transposition convolution processing and the softmax function in sequence, outputting a damage identification probability matrix;

2. A subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure as claimed in claim 1, characterized in that, when the material of the subway sleeper beam is undamaged, D = 0 and the elastic modulus of the material is unchanged; when complete damage occurs, D = 1 and the elastic modulus of the material becomes 0; the damage is therefore measured by the change in the elastic modulus of the material before and after damage, and the elastic modulus weakening of each unit of the subway sleeper beam is discretized into three weakening grades of 20%, 40% and 60%, which correspond to first-level damage, second-level damage and third-level damage, respectively.

3. The method for identifying the damage to the sleeper beam of the subway based on the attention mechanism and the advanced convolution structure as claimed in claim 1, wherein the strain field data are reconstructed to obtain reconstructed strain field data, and specifically:

4. The method for identifying the damage to the subway sleeper beam based on the attention mechanism and the advanced convolution structure as claimed in claim 1, wherein the reconstructed strain field data is input into the constructed neural network model, the size of the strain field data is gradually reduced through convolution operation, and specifically, 2-D convolution is adopted, that is, in the operation process, a convolution kernel slides in two plane directions of the feature data without moving in a channel direction, so that the whole data plane is traversed.

5. The method for identifying subway sleeper beam damage based on attention mechanism and advanced convolution structure as claimed in claim 1, wherein, during normalization processing, the normalization layer used is an instance normalization layer, which omits the normalization operation in the batch direction and the channel direction and normalizes only in the H and W directions.
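
Normalizing only in the H and W directions means that each channel plane of each sample is normalized with its own mean and variance. A minimal sketch, assuming zero-mean unit-variance normalization without learnable scale and shift; the function names, the constant `eps` and the sample values are hypothetical.

```python
import math


def instance_norm(channel_plane, eps=1e-5):
    """Normalize one H x W plane (one sample, one channel) to zero mean, unit variance."""
    values = [v for row in channel_plane for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel_plane]


def instance_norm_sample(sample):
    """Apply instance normalization to a C x H x W sample, channel by channel."""
    return [instance_norm(plane) for plane in sample]


# Hypothetical sample with 2 channels of size 2 x 2.
sample = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 10.0], [20.0, 20.0]]]
normalized = instance_norm_sample(sample)
```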

6. A subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure as claimed in claim 1, characterized by performing grouping convolution processing, specifically:

dividing input data into n groups;

7. A subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure as claimed in claim 1, characterized in that, the planar space attention mechanism and the channel attention mechanism are adopted for processing, specifically: dividing the flow direction of the input feature data into two parts, wherein one part firstly passes through a channel attention module, multiplies the obtained weight matrix by the original input data, and then passes through a space attention module and multiplies the weight matrix by a corresponding weight; the other is directly mapped, and finally, the data of the two flow directions are added and output through an activation function.

8. The method for identifying the damage to the subway sleeper beam based on the attention mechanism and the advanced convolution structure as claimed in claim 7, wherein the method is processed by a channel attention module, and specifically comprises the following steps:

superposing a weight matrix which can be subjected to differential calculation and can accept the backward propagation of the model gradient on a plane of input data; the input feature data are respectively subjected to maximum value pooling and average value pooling to obtain a preliminary weight matrix which has the same plane size as the feature data but the number of channels is 1; stacking the two obtained matrixes in the channel direction, and performing convolution operation of 1 × 1 to change the dimension of the channel direction into 1, so as to finally obtain the weight matrix with the same plane size as the input characteristic data and the channel number of 1.
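
The pooling, stacking and 1 × 1 convolution steps described above can be sketched as follows. This is an illustration only: the sigmoid applied to the convolution result is an assumption (the text does not name an activation), and the function names, convolution weights and feature values are hypothetical.

```python
import math


def attention_weight_plane(features, w_max, w_avg, bias=0.0):
    """Build an H x W weight matrix (one channel) from C x H x W features.

    Max and mean are taken over the channel direction, the two planes are
    stacked, and a 1 x 1 convolution (weights w_max, w_avg, bias) merges them.
    ASSUMPTION: a sigmoid squashes the result into (0, 1).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    weights = []
    for r in range(height):
        row = []
        for c in range(width):
            column = [features[ch][r][c] for ch in range(channels)]
            pooled_max = max(column)
            pooled_avg = sum(column) / channels
            z = w_max * pooled_max + w_avg * pooled_avg + bias
            row.append(1.0 / (1.0 + math.exp(-z)))
        weights.append(row)
    return weights


def apply_weights(features, weights):
    """Multiply every channel of the features by the weight matrix, element-wise."""
    return [
        [[v * weights[r][c] for c, v in enumerate(row)] for r, row in enumerate(plane)]
        for plane in features
    ]


# Hypothetical 2-channel 2 x 2 features and hypothetical convolution weights.
feats = [[[1.0, 0.0], [2.0, -1.0]], [[3.0, 0.0], [0.0, -3.0]]]
w = attention_weight_plane(feats, w_max=0.5, w_avg=0.5)
out = apply_weights(feats, w)
```

Because every operation in the sketch is differentiable almost everywhere, the resulting weight matrix can take part in gradient back-propagation, as the text requires.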

9. A subway sleeper beam damage identification method based on attention mechanism and advanced convolution structure as claimed in claim 8, characterized by that, after the spatial attention module processing, it is specifically: