CN115903022B - Deep learning chip suitable for real-time seismic data processing - Google Patents
Deep learning chip suitable for real-time seismic data processing
- Publication number
- CN115903022B CN115903022B CN202211556940.XA CN202211556940A CN115903022B CN 115903022 B CN115903022 B CN 115903022B CN 202211556940 A CN202211556940 A CN 202211556940A CN 115903022 B CN115903022 B CN 115903022B
- Authority
- CN
- China
- Prior art keywords
- wave
- unit
- input end
- layer
- subsystem
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/30—Assessment of water resources
Abstract
The invention discloses a deep learning chip suitable for real-time seismic data processing, comprising a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem, and a microseism estimation subsystem. The feature extraction subsystem extracts microseism detection data to obtain microseism feature data; the P-wave first-arrival induction subsystem extracts the P-wave first-arrival time from the microseism feature data; the S-wave first-arrival induction subsystem extracts the S-wave first-arrival time from the microseism feature data; and the microseism estimation subsystem estimates the microseism source from the P-wave and S-wave first-arrival times. This solves the prior-art problem that extracting only the P wave or the P-wave first arrival cannot determine the distance to a microseism source.
Description
Technical Field
The invention relates to the technical field of seismic induction, in particular to a deep learning chip suitable for real-time seismic data processing.
Background
A microseism is a small-scale earthquake. Rock cracking and seismic activity during deep mining in an underground mine are often unavoidable. Microseisms are generally defined as earthquakes caused by rock failure due to changes in the stress fields within the rock mass near the production gallery.
The prior literature, "A method and system for picking up the first arrival of a microseism P wave based on a capsule neural network" and "A method and system for identifying the microseism P wave based on a deep convolutional neural network", identifies the P wave or extracts the P-wave first arrival; however, extracting only the P wave or its first arrival cannot determine the distance to a microseism source.
Disclosure of Invention
Aiming at the above defects in the prior art, the deep learning chip suitable for real-time seismic data processing provided by the invention solves the problem that extracting only the P wave or the P-wave first arrival cannot determine the distance to a microseism source.
In order to achieve the aim of the invention, the invention adopts the following technical scheme: a deep learning chip suitable for real-time seismic data processing, comprising: the system comprises a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem and a microseism estimation subsystem;
the feature extraction subsystem is used for extracting microseism detection data to obtain microseism feature data; the P-wave first arrival induction subsystem is used for extracting P-wave first arrival time in the microseism characteristic data; the S-wave first arrival induction subsystem is used for extracting S-wave first arrival time in the microseism characteristic data; the micro-seismic estimation subsystem is used for estimating a micro-seismic source according to the P-wave first arrival time and the S-wave first arrival time.
Further, the feature extraction subsystem includes: a CNN unit, a first BiLSTM unit, a second BiLSTM unit, and a first global attention unit;
the input end of the CNN unit is used as the input end of the feature extraction subsystem and is used for inputting microseism detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit is used as the output end of the feature extraction subsystem.
Further, the CNN unit includes: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer, and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolution layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolution layer is connected with the input end of the second maximum pooling layer; the input end of the third convolution layer is connected with the output end of the second maximum pooling layer, and the output end of the third convolution layer is connected with the input end of the third maximum pooling layer; the input end of the fourth convolution layer is connected with the output end of the third maximum pooling layer, and the output end of the fourth convolution layer is connected with the input end of the fourth maximum pooling layer; the input end of the fifth convolution layer is connected with the output end of the fourth maximum pooling layer, and the output end of the fifth convolution layer is connected with the input end of the fifth maximum pooling layer; the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
Further, the P-wave first arrival induction subsystem includes: a third BiLSTM unit, a second global attention unit, and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; the output end of the first full-connection layer unit is used as the output end of the P-wave first arrival induction subsystem.
Further, the S-wave first arrival induction subsystem includes: a fourth BiLSTM unit, a third global attention unit, and a second full connection layer unit;
the input end of the fourth BiLSTM unit is connected with the input end of the S-wave first arrival induction subsystem; the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full connection layer unit; and the output end of the second full-connection layer unit is used as the output end of the S-wave first arrival induction subsystem.
Further, the LSTM module of each BiLSTM unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem, or the S-wave first-arrival induction subsystem has the following input-output relationship:

f_t = σ[W_f · (y_{t-1}, x_t, C_{t-1}) + b_f]
i_t = tanh[W_i · (y_{t-1}, x_t, C_{t-1}) + b_i]
h_t = σ[W_h · (y_{t-1}, x_t, C_{t-1}) + b_h]
C_t = (C_{t-1} ⊙ f_t + (1 − f_t) ⊙ i_t) ⊙ ((1 − i_t) ⊙ h_t)
y_t = σ[W_o · (y_{t-1}, x_t, C_{t-1}, C_t) + b_o] ⊙ tanh[C_t]

where f_t is the output of the forget gate at time t, σ[·] is the sigmoid activation function, W_f and b_f are the weight and bias of the forget gate, y_{t-1} is the output of the cell at time t−1, x_t is the input of the cell at time t, C_{t-1} is the state of the cell at time t−1, i_t is the output of the input gate at time t, tanh[·] is the hyperbolic tangent activation function, W_i and b_i are the weight and bias of the input gate, h_t is the output of the candidate gate at time t, W_h and b_h are the weight and bias of the candidate gate, C_t is the state of the cell at time t, ⊙ is the Hadamard product, y_t is the output of the output gate at time t, and W_o and b_o are the weight and bias of the output gate.
The beneficial effects of the above further scheme are: the LSTM module takes into account the state C_{t-1} at the previous time, the input x_t at the current time, and the output y_{t-1} at the previous time, so that the cell fully considers the relationship among state, input, and output during calculation.
Further, the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem or the S-wave first-arrival induction subsystem comprises: a sixth convolution layer, a Softmax layer, a multiplier, a seventh convolution layer, a ReLU layer, an eighth convolution layer, and an adder;
the input end of the sixth convolution layer is connected with the first input end of the multiplier and the first input end of the adder respectively and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolution layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder acts as the output of the global attention unit.
Further, microseism detection data and data labels form a training data set; the training data set is used to train the feature extraction subsystem, the P-wave first-arrival induction subsystem, and the S-wave first-arrival induction subsystem; and the trained subsystems are arranged in the processor.
Further, the weight update formula in the training process is:
where w_{i+1} is the weight at iteration i+1, w_i is the weight at iteration i, η_i is the learning rate at iteration i, η_{i-1} is the learning rate at iteration i−1, J_i is the loss function at iteration i, J_{i-1} is the loss function at iteration i−1, γ is a proportionality coefficient, and ζ is an adjustment constant.
The beneficial effects of the above further scheme are: the update is designed as a weighted combination of past and current second derivatives of the loss function. A larger second derivative indicates a larger rate of change of the loss gradient; the weighted accumulation of the current and past gradient change rates, after smoothing, regulates the weights so that the iteration step length is tied to the gradient's rate of change, preventing overshoot of the weight iteration while avoiding slow convergence. The learning-rate design considers how much the loss has fallen: when the loss decreases sharply, J_{i-1} − J_i is large, so the learning rate η_i changes faster and the step-length adjustment of the weight update grows; when the loss decreases little, J_{i-1} − J_i is small, so η_i changes more slowly and the adjustment shrinks. The weights can thus iterate adaptively, quickly, and stably, reaching the optimal value rapidly.
In summary, the invention has the following beneficial effects: the feature extraction subsystem extracts microseism feature data from the microseism detection data, which reduces the data volume on one hand and retains the data features on the other; the P-wave first-arrival time and the S-wave first-arrival time are extracted by the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem respectively, and the microseism focus position is estimated by the microseism estimation subsystem.
Drawings
FIG. 1 is a system block diagram of a deep learning chip suitable for real-time seismic data processing;
FIG. 2 is a system block diagram of a CNN unit;
fig. 3 is a system block diagram of a global attention unit.
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments; all inventions that make use of the inventive concept fall within the spirit and scope of the invention as defined in the appended claims.
As shown in fig. 1, a deep learning chip suitable for real-time seismic data processing includes: the system comprises a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem and a microseism estimation subsystem;
the feature extraction subsystem is used for extracting microseism detection data to obtain microseism feature data; the P-wave first arrival induction subsystem is used for extracting P-wave first arrival time in the microseism characteristic data; the S-wave first arrival induction subsystem is used for extracting S-wave first arrival time in the microseism characteristic data; the micro-seismic estimation subsystem is used for estimating a micro-seismic source according to the P-wave first arrival time and the S-wave first arrival time.
The feature extraction subsystem includes: a CNN unit, a first BiLSTM unit, a second BiLSTM unit, and a first global attention unit;
the input end of the CNN unit is used as the input end of the feature extraction subsystem and is used for inputting microseism detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit is used as the output end of the feature extraction subsystem.
As shown in fig. 2, the CNN unit includes: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer, and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolution layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolution layer is connected with the input end of the second maximum pooling layer; the input end of the third convolution layer is connected with the output end of the second maximum pooling layer, and the output end of the third convolution layer is connected with the input end of the third maximum pooling layer; the input end of the fourth convolution layer is connected with the output end of the third maximum pooling layer, and the output end of the fourth convolution layer is connected with the input end of the fourth maximum pooling layer; the input end of the fifth convolution layer is connected with the output end of the fourth maximum pooling layer, and the output end of the fifth convolution layer is connected with the input end of the fifth maximum pooling layer; the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
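The five conv/max-pool stages can be sketched as a 1-D NumPy pipeline. The kernel length of 3, the ReLU activation after each convolution, and valid-mode convolution are assumptions; the patent specifies only the layer ordering.

```python
import numpy as np

def conv1d_valid(x, w):
    """1-D valid convolution (cross-correlation, as in CNN layers)."""
    k = len(w)
    return np.array([np.dot(x[i:i+k], w) for i in range(len(x) - k + 1)])

def maxpool1d(x, size=2):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n = len(x) // size
    return x[:n*size].reshape(n, size).max(axis=1)

def cnn_unit(x, kernels):
    """Five conv -> max-pool stages, as in the described CNN unit.
    Kernel values/sizes and the ReLU are illustrative assumptions."""
    for w in kernels:
        x = np.maximum(conv1d_valid(x, w), 0.0)  # conv + assumed ReLU
        x = maxpool1d(x, 2)
    return x
```

With a 1024-sample trace and kernels of length 3, each stage roughly halves the sequence (1024 → 511 → 254 → 126 → 62 → 30), which is the data-volume reduction the description attributes to the feature extraction subsystem.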
The P-wave first arrival induction subsystem comprises: a third BiLSTM unit, a second global attention unit, and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; the output end of the first full-connection layer unit is used as the output end of the P-wave first arrival induction subsystem.
The S-wave first arrival induction subsystem comprises: a fourth BiLSTM unit, a third global attention unit, and a second full connection layer unit;
the input end of the fourth BiLSTM unit is connected with the input end of the S-wave first arrival induction subsystem; the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full connection layer unit; and the output end of the second full-connection layer unit is used as the output end of the S-wave first arrival induction subsystem.
The LSTM module of each BiLSTM unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem, or the S-wave first-arrival induction subsystem has the following input-output relationship:

f_t = σ[W_f · (y_{t-1}, x_t, C_{t-1}) + b_f]
i_t = tanh[W_i · (y_{t-1}, x_t, C_{t-1}) + b_i]
h_t = σ[W_h · (y_{t-1}, x_t, C_{t-1}) + b_h]
C_t = (C_{t-1} ⊙ f_t + (1 − f_t) ⊙ i_t) ⊙ ((1 − i_t) ⊙ h_t)
y_t = σ[W_o · (y_{t-1}, x_t, C_{t-1}, C_t) + b_o] ⊙ tanh[C_t]

where f_t is the output of the forget gate at time t, σ[·] is the sigmoid activation function, W_f and b_f are the weight and bias of the forget gate, y_{t-1} is the output of the cell at time t−1, x_t is the input of the cell at time t, C_{t-1} is the state of the cell at time t−1, i_t is the output of the input gate at time t, tanh[·] is the hyperbolic tangent activation function, W_i and b_i are the weight and bias of the input gate, h_t is the output of the candidate gate at time t, W_h and b_h are the weight and bias of the candidate gate, C_t is the state of the cell at time t, ⊙ is the Hadamard product, y_t is the output of the output gate at time t, and W_o and b_o are the weight and bias of the output gate.
The LSTM module takes into account the state C_{t-1} at the previous time, the input x_t at the current time, and the output y_{t-1} at the previous time, so that the cell fully considers the relationship among state, input, and output during calculation.
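The cell update above can be sketched in NumPy. This is a minimal illustration, not the chip implementation: the vector dimensions, the `params` dictionary layout, and the treatment of (y_{t-1}, x_t, C_{t-1}) as a simple concatenation are assumptions not stated in the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_variant_cell(y_prev, x_t, c_prev, params):
    """One step of the patent's LSTM variant. Unlike a standard LSTM,
    every gate also sees the previous cell state C_{t-1}, and the state
    update couples the gates as
    C_t = (C_{t-1}*f + (1-f)*i) * ((1-i)*h)."""
    v = np.concatenate([y_prev, x_t, c_prev])        # (y_{t-1}, x_t, C_{t-1})
    f = sigmoid(params["Wf"] @ v + params["bf"])     # forget gate
    i = np.tanh(params["Wi"] @ v + params["bi"])     # input gate (tanh, per the patent)
    h = sigmoid(params["Wh"] @ v + params["bh"])     # candidate gate
    c = (c_prev * f + (1 - f) * i) * ((1 - i) * h)   # new cell state
    vo = np.concatenate([y_prev, x_t, c_prev, c])    # output gate also sees C_t
    y = sigmoid(params["Wo"] @ vo + params["bo"]) * np.tanh(c)
    return y, c
```

Because the output is a sigmoid times a tanh, each component of y_t is bounded in magnitude by 1, which keeps the recurrent signal numerically stable across time steps.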
As shown in fig. 3, the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem or the S-wave first-arrival induction subsystem includes: a sixth convolution layer, a Softmax layer, a multiplier, a seventh convolution layer, a ReLU layer, an eighth convolution layer, and an adder;
the input end of the sixth convolution layer is connected with the first input end of the multiplier and the first input end of the adder respectively and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolution layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder acts as the output of the global attention unit.
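A minimal NumPy sketch of the global attention unit's data flow, with the input feeding both the multiplier and the adder as described. The convolution layers are modelled as per-position linear maps (effectively 1x1 convolutions) and the softmax is taken over time steps; neither kernel size nor softmax axis is given in the patent, so both are assumptions.

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x, w6, w7, w8):
    """Global attention unit: conv6 -> Softmax -> multiplier with the
    input, then conv7 -> ReLU -> conv8, and an adder providing a
    residual connection back to the input. x has shape (T, C)."""
    a = softmax(x @ w6, axis=0)   # attention weights over the T time steps
    g = x * a                     # multiplier: gate the input features
    h = np.maximum(g @ w7, 0.0)   # conv7 + ReLU
    return x + h @ w8             # conv8 + adder (residual)
```

The adder makes the unit a residual block: if the inner path contributes nothing, the input passes through unchanged, so the attention can only refine, never destroy, the extracted features.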
Microseism detection data and data labels form a training data set; the training data set is used to train the feature extraction subsystem, the P-wave first-arrival induction subsystem, and the S-wave first-arrival induction subsystem; and the trained subsystems are arranged in the processor.
The data labels are P wave first arrival time and S wave first arrival time.
The weight updating formula in the training process is as follows:
where w_{i+1} is the weight at iteration i+1, w_i is the weight at iteration i, η_i is the learning rate at iteration i, η_{i-1} is the learning rate at iteration i−1, J_i is the loss function at iteration i, J_{i-1} is the loss function at iteration i−1, γ is a proportionality coefficient, and ζ is an adjustment constant.
The update is designed as a weighted combination of past and current second derivatives of the loss function. A larger second derivative indicates a larger rate of change of the loss gradient; the weighted accumulation of the current and past gradient change rates, after smoothing, regulates the weights so that the iteration step length is tied to the gradient's rate of change, preventing overshoot of the weight iteration while avoiding slow convergence. The learning-rate design considers how much the loss has fallen: when the loss decreases sharply, J_{i-1} − J_i is large, so the learning rate η_i changes faster and the step-length adjustment of the weight update grows; when the loss decreases little, J_{i-1} − J_i is small, so η_i changes more slowly and the adjustment shrinks. The weights can thus iterate adaptively, quickly, and stably, reaching the optimal value rapidly.
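Since the exact update formula is not reproduced in the text, the following is only a hypothetical sketch of the learning-rate behaviour described above: the rate grows when the recent loss drop J_{i-1} − J_i is large and shrinks when the loss rises. The function name and the specific way γ and ζ enter the expression are illustrative assumptions, not the patent's formula.

```python
def update_learning_rate(eta_prev, j_prev, j_curr, gamma=0.1, zeta=1e-8):
    """Hypothetical rule matching the described behaviour: scale the
    previous learning rate by the relative loss decrease. gamma echoes
    the proportionality coefficient and zeta the adjustment constant
    (here used only to avoid division by zero)."""
    return eta_prev * (1.0 + gamma * (j_prev - j_curr) / (abs(j_prev) + zeta))
```

Under this sketch a large loss drop enlarges the step-length adjustment and an increasing loss shrinks it, which is the qualitative regulation the description attributes to the training scheme.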
According to the invention, the feature extraction subsystem extracts microseism feature data from the microseism detection data, which reduces the data volume while retaining the data features; the P-wave first-arrival time and the S-wave first-arrival time are extracted by the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem respectively; the speeds of the P wave and the S wave can be measured by the sensor, and the microseism focus position can be estimated by combining the time difference between the P-wave and S-wave first-arrival times with the P-wave and S-wave speeds.
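The distance estimate described here follows from constant propagation speeds: both waves travel the same distance d, so d/v_s − d/v_p = t_s − t_p, giving d = (t_s − t_p) · v_p · v_s / (v_p − v_s). A minimal sketch (the velocity values in the test are illustrative, not from the patent):

```python
def source_distance(t_p, t_s, v_p, v_s):
    """Single-station source-distance estimate from the S-P first-arrival
    time difference, assuming straight-ray propagation at constant
    speeds v_p and v_s (v_p > v_s)."""
    return (t_s - t_p) * v_p * v_s / (v_p - v_s)
```

For example, with v_p = 6000 m/s, v_s = 3500 m/s, and an S-P delay of 0.5 s, the estimated distance is 0.5 · 6000 · 3500 / 2500 = 4200 m.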
Claims (4)
1. A deep learning chip suitable for real-time seismic data processing, comprising: the system comprises a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem and a microseism estimation subsystem;
the feature extraction subsystem is used for extracting microseism detection data to obtain microseism feature data; the P-wave first arrival induction subsystem is used for extracting P-wave first arrival time in the microseism characteristic data; the S-wave first arrival induction subsystem is used for extracting S-wave first arrival time in the microseism characteristic data; the micro-seismic estimation subsystem is used for estimating a micro-seismic source according to the P-wave first arrival time and the S-wave first arrival time;
the feature extraction subsystem includes: a CNN unit, a first BiLSTM unit, a second BiLSTM unit, and a first global attention unit;
the input end of the CNN unit is used as the input end of the feature extraction subsystem and is used for inputting microseism detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit is used as the output end of the feature extraction subsystem;
the P-wave first arrival induction subsystem comprises: a third BiLSTM unit, a second global attention unit, and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; the output end of the first full-connection layer unit is used as the output end of the P-wave first arrival induction subsystem;
the S-wave first arrival induction subsystem comprises: a fourth BiLSTM unit, a third global attention unit, and a second full connection layer unit;
the input end of the fourth BiLSTM unit is connected with the input end of the S-wave first arrival induction subsystem;
the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full connection layer unit; the output end of the second full-connection layer unit is used as the output end of the S-wave first arrival induction subsystem;
the characteristic extraction subsystem, the P-wave first-arrival induction subsystem or the LSTM module of the BiLSTM unit in the S-wave first-arrival induction subsystem has the following input-output relationship:
f t =σ[(W f ·(y t-1 ,x t ,C t-1 )+b f ]
i t =tanh[W i ·(y t-1 ,x t ,C t-1 )+b i ]
h t =σ[W h ·(y t-1 ,x t ,C t-1 )+b h ]
C t =(C t-1 ⊙f t +(1-f t )⊙i t )⊙((1-i t )⊙h t )
y t =σ[W o ·(y t-1 ,x t ,C t-1 ,C t )+b o ]⊙tanh[C t ]
wherein f t For the output of the forgetting gate at time t, sigma]To activate a function, W f Weight of forgetting gate b f For forgetting the bias of the door, y t-1 For the output of the cell at time t-1, x t For t-time cell input, C t-1 Is the state of the cell at time t-1, i t For the output of the input gate at time t, tanh [ []For hyperbolic tangent activation function, W i B is the weight of the input gate i For biasing the input gate, h t For the output of the candidate gate at time t, W h Weights for candidate gates, b h Bias for candidate gate, C t As the state of the cell at time t, as the Hadamard product, y t For outputting the output of the gate at time t, W o To output the weight of the door, b o Offset for the output gate;
the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem or the S-wave first-arrival induction subsystem comprises: a sixth convolution layer, a Softmax layer, a multiplier, a seventh convolution layer, a ReLU layer, an eighth convolution layer, and an adder;
the input end of the sixth convolution layer is connected with the first input end of the multiplier and the first input end of the adder respectively and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolution layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder acts as the output of the global attention unit.
2. The deep learning chip adapted for real-time seismic data processing of claim 1, wherein the CNN unit comprises: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer, and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolution layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolution layer is connected with the input end of the second maximum pooling layer; the input end of the third convolution layer is connected with the output end of the second maximum pooling layer, and the output end of the third convolution layer is connected with the input end of the third maximum pooling layer; the input end of the fourth convolution layer is connected with the output end of the third maximum pooling layer, and the output end of the fourth convolution layer is connected with the input end of the fourth maximum pooling layer; the input end of the fifth convolution layer is connected with the output end of the fourth maximum pooling layer, and the output end of the fifth convolution layer is connected with the input end of the fifth maximum pooling layer; the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
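The five conv/max-pool stages of the CNN unit halve the sequence length five times, a 32× reduction overall. A single-channel sketch follows; the claim does not give kernel sizes, channel counts, or whether an activation follows each convolution, so the 'same' padding and per-stage ReLU here are assumptions.

```python
import numpy as np

def conv1d(x, kernel):
    """'Same'-padded 1-D convolution (simplified, single channel)."""
    pad = len(kernel) // 2
    xp = np.pad(x, pad)
    return np.array([xp[i:i + len(kernel)] @ kernel for i in range(len(x))])

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; halves the length for size=2."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

def cnn_unit(x, kernels):
    """Five conv + max-pool stages, as in the claimed CNN unit."""
    for k in kernels:
        x = max_pool1d(np.maximum(0.0, conv1d(x, k)))
    return x
```

A trace of 96 samples therefore comes out with 96 / 2**5 = 3 samples, which is the kind of compression that makes the downstream attention and LSTM units cheap enough for real-time use.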
3. The deep learning chip suitable for real-time seismic data processing of claim 1, wherein the feature extraction subsystem, the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem are obtained by collecting microseismic detection data and data labels to construct a training data set, training the feature extraction subsystem, the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem with the training data set, and arranging the trained feature extraction subsystem, P-wave first-arrival induction subsystem and S-wave first-arrival induction subsystem in the processor.
4. A deep learning chip suitable for real-time seismic data processing according to claim 3, wherein the weight update formula of the training process is:
wherein w_{i+1} is the weight of the (i+1)-th iteration, w_i is the weight of the i-th iteration, η_i is the learning rate of the i-th iteration, η_{i-1} is the learning rate of the (i-1)-th iteration, J_i is the loss function of the i-th iteration, J_{i-1} is the loss function of the (i-1)-th iteration, γ is the proportionality coefficient, and ζ is the adjustment constant.
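The formula itself appears only as an image in the source and is not reproduced here; from the symbol list one can infer that η_i is derived from η_{i-1} using the change in loss between iterations, with γ as a proportionality coefficient and ζ as a stabilizing adjustment constant. The sketch below is purely hypothetical: the specific form `eta_i = eta_prev * (1 + gamma * (J_prev - J_i) / (|J_prev| + zeta))` is an assumption for illustration, not the patented formula.

```python
def adaptive_update(w, grad, eta_prev, J_i, J_prev, gamma=0.1, zeta=1e-8):
    """Hypothetical adaptive-learning-rate weight update.

    Illustrates the role of the listed symbols only: eta_i is adapted
    from eta_{i-1} by the relative loss change (scaled by gamma,
    stabilized by zeta), then applied in a gradient step.
    """
    eta_i = eta_prev * (1.0 + gamma * (J_prev - J_i) / (abs(J_prev) + zeta))
    w_next = w - eta_i * grad   # w_{i+1} = w_i - eta_i * dJ_i/dw_i
    return w_next, eta_i
```

Under this form the learning rate grows while the loss is falling (J_i < J_{i-1}) and shrinks when the loss rises, which matches the adaptive behaviour the claim's symbols suggest.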
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211556940.XA CN115903022B (en) | 2022-12-06 | 2022-12-06 | Deep learning chip suitable for real-time seismic data processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115903022A (en) | 2023-04-04
CN115903022B (en) | 2023-10-31
Family
ID=86489479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211556940.XA Active CN115903022B (en) | 2022-12-06 | 2022-12-06 | Deep learning chip suitable for real-time seismic data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115903022B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112017289A (en) * | 2020-08-31 | 2020-12-01 | 电子科技大学 | Well-seismic combined initial lithology model construction method based on deep learning |
CN112068195A (en) * | 2019-06-10 | 2020-12-11 | 中国石油化工股份有限公司 | Automatic first arrival picking method for microseism P & S wave matching event and computer storage medium |
CN113158792A (en) * | 2021-03-15 | 2021-07-23 | 辽宁大学 | Microseismic event identification method based on improved model transfer learning |
CN114660656A (en) * | 2022-03-17 | 2022-06-24 | 中国科学院地质与地球物理研究所 | Seismic data first arrival picking method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11947061B2 (en) * | 2019-10-18 | 2024-04-02 | Korea University Research And Business Foundation | Earthquake event classification method using attention-based convolutional neural network, recording medium and device for performing the method |
2022-12-06: CN application CN202211556940.XA filed; granted as patent CN115903022B (en), status Active
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596327B (en) | Seismic velocity spectrum artificial intelligence picking method based on deep learning | |
CN111723329B (en) | Seismic phase feature recognition waveform inversion method based on full convolution neural network | |
CN106407649B (en) | Microseismic signals based on time recurrent neural network then automatic pick method | |
CN109308522B (en) | GIS fault prediction method based on recurrent neural network | |
CN107544904B (en) | Software reliability prediction method based on deep CG-LSTM neural network | |
CN110705743A (en) | New energy consumption electric quantity prediction method based on long-term and short-term memory neural network | |
CN111538076A (en) | Earthquake magnitude rapid estimation method based on deep learning feature fusion | |
CN111144542A (en) | Oil well productivity prediction method, device and equipment | |
CN112819136A (en) | Time sequence prediction method and system based on CNN-LSTM neural network model and ARIMA model | |
CN110632662A (en) | Algorithm for automatically identifying microseism signals by using DCNN-inclusion network | |
CN111126132A (en) | Learning target tracking algorithm based on twin network | |
CN112836802A (en) | Semi-supervised learning method, lithology prediction method and storage medium | |
CN114723095A (en) | Missing well logging curve prediction method and device | |
CN112257847A (en) | Method for predicting geomagnetic Kp index based on CNN and LSTM | |
CN115983465A (en) | Rock burst time sequence prediction model construction method based on small sample learning | |
CN114818579A (en) | Analog circuit fault diagnosis method based on one-dimensional convolution long-short term memory network | |
CN112539054A (en) | Production optimization method for ground pipe network and underground oil reservoir complex system | |
CN115903022B (en) | Deep learning chip suitable for real-time seismic data processing | |
CN115640526A (en) | Drilling risk identification model, building method, identification method and computer equipment | |
CN110223342B (en) | Space target size estimation method based on deep neural network | |
CN113156492B (en) | Real-time intelligent early warning method applied to TBM tunnel rockburst disasters | |
CN114862015A (en) | Typhoon wind speed intelligent prediction method based on FEN-ConvLSTM model | |
Xu et al. | An automatic P-wave onset time picking method for mining-induced microseismic data based on long short-term memory deep neural network | |
CN113158792B (en) | Microseism event identification method based on improved model transfer learning | |
CN117877587A (en) | Deep learning algorithm of whole genome prediction model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |