CN115903022A - Deep learning chip suitable for real-time seismic data processing - Google Patents


Info

Publication number
CN115903022A
CN115903022A (application CN202211556940.XA)
Authority
CN
China
Prior art keywords
wave
unit
layer
input end
subsystem
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211556940.XA
Other languages
Chinese (zh)
Other versions
CN115903022B (en)
Inventor
Xue Qingfeng (薛清峰)
Wang Yibo (王一博)
Zheng Yikang (郑忆康)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Geology and Geophysics of CAS
Original Assignee
Institute of Geology and Geophysics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Geology and Geophysics of CAS filed Critical Institute of Geology and Geophysics of CAS
Priority to CN202211556940.XA priority Critical patent/CN115903022B/en
Publication of CN115903022A publication Critical patent/CN115903022A/en
Application granted granted Critical
Publication of CN115903022B publication Critical patent/CN115903022B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30Assessment of water resources


Abstract

The invention discloses a deep learning chip suitable for real-time seismic data processing, comprising a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem, and a microseism estimation subsystem. The feature extraction subsystem extracts micro-seismic detection data to obtain micro-seismic feature data; the P-wave first-arrival induction subsystem extracts the P-wave first-arrival time from the micro-seismic feature data; the S-wave first-arrival induction subsystem extracts the S-wave first-arrival time from the micro-seismic feature data; and the microseism estimation subsystem estimates the micro-seismic source from the P-wave and S-wave first-arrival times. This solves the prior-art problem that the source distance of a microseism cannot be measured by extracting only the P wave or the P-wave first arrival.

Description

Deep learning chip suitable for real-time seismic data processing
Technical Field
The invention relates to the technical field of seismic induction, in particular to a deep learning chip suitable for real-time seismic data processing.
Background
A micro-earthquake is a small earthquake. Rock fracture and seismic activity are often unavoidable during deep mining in underground mines. Among mining-induced seismic activity, micro-earthquakes are generally defined as those earthquakes caused by rock failure due to changes in the stress field within the rock mass near the mining excavation.
Existing documents, "A method and system for picking up microseism P-wave first arrivals based on a capsule neural network" and "A method and system for identifying microseism P waves based on a deep convolutional neural network", identify the P wave or extract the P-wave first arrival; however, extracting only the P wave or the P-wave first arrival cannot determine the source distance of the microseism.
Disclosure of Invention
Aiming at the above defects in the prior art, the deep learning chip suitable for real-time seismic data processing provided by the invention solves the problem that the source distance of a microseism cannot be measured by extracting only the P wave or the P-wave first arrival.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme. A deep learning chip adapted for real-time seismic data processing comprises: a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem, and a microseism estimation subsystem;
the feature extraction subsystem extracts micro-seismic detection data to obtain micro-seismic feature data; the P-wave first-arrival induction subsystem extracts the P-wave first-arrival time from the micro-seismic feature data; the S-wave first-arrival induction subsystem extracts the S-wave first-arrival time from the micro-seismic feature data; and the microseism estimation subsystem estimates the micro-seismic source from the P-wave and S-wave first-arrival times.
Further, the feature extraction subsystem includes: a CNN unit, a first BiLSTM unit, a second BiLSTM unit, and a first global attention unit;
the input end of the CNN unit serves as the input end of the feature extraction subsystem and receives micro-seismic detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit serves as the output end of the feature extraction subsystem.
Further, the CNN unit includes: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolutional layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolutional layer is connected with the input end of the second maximum pooling layer; the input end of the third convolutional layer is connected with the output end of the second largest pooling layer, and the output end of the third convolutional layer is connected with the input end of the third largest pooling layer; the input end of the fourth convolutional layer is connected with the output end of the third largest pooling layer, and the output end of the fourth convolutional layer is connected with the input end of the fourth largest pooling layer; the input end of the fifth convolutional layer is connected with the output end of the fourth largest pooling layer, and the output end of the fifth convolutional layer is connected with the input end of the fifth largest pooling layer; and the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
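As a concrete reading of this five-stage conv/pool chain, the NumPy sketch below runs a single-channel 1-D trace through it. The kernel size (3), channel counts, ReLU activations, and pool size (2) are all illustrative assumptions — the patent does not specify them.

```python
import numpy as np

def conv1d(x, w, b):
    """'Same'-padded 1-D convolution: x is (C_in, T), w is (C_out, C_in, K), b is (C_out,)."""
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    t = x.shape[1]
    y = np.empty((c_out, t))
    for i in range(t):
        window = xp[:, i:i + k]                       # (C_in, K) patch at position i
        y[:, i] = np.tensordot(w, window, axes=([1, 2], [0, 1])) + b
    return np.maximum(y, 0.0)                         # ReLU (an assumption; not stated)

def maxpool1d(x, size=2):
    """Non-overlapping max pooling along time; halves the sequence length."""
    t = x.shape[1] - x.shape[1] % size
    return x[:, :t].reshape(x.shape[0], -1, size).max(axis=2)

def cnn_unit(x, layers):
    """First conv -> first max pool -> ... -> fifth conv -> fifth max pool."""
    for w, b in layers:
        x = maxpool1d(conv1d(x, w, b))
    return x

rng = np.random.default_rng(0)
channels = [1, 8, 16, 32, 64, 64]                     # assumed channel widths
layers = [(0.1 * rng.standard_normal((channels[i + 1], channels[i], 3)),
           np.zeros(channels[i + 1])) for i in range(5)]
trace = rng.standard_normal((1, 320))                 # one micro-seismic trace, 320 samples
features = cnn_unit(trace, layers)
print(features.shape)                                 # time axis shrinks by 2**5
```

With five pooling stages, a 320-sample trace is compressed to 10 feature time steps, which is the "reduce the quantity while preserving the features" effect the description claims for this subsystem.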
Further, the P-wave first arrival induction subsystem comprises: a third BiLSTM unit, a second global attention unit and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; and the output end of the first full connection layer unit is used as the output end of the P-wave first arrival induction subsystem.
Further, the S-wave first-arrival induction subsystem comprises: a fourth BiLSTM unit, a third global attention unit and a second full-connection layer unit;
the input end of the fourth BiLSTM unit serves as the input end of the S-wave first arrival induction subsystem; the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full-connection layer unit; and the output end of the second full connection layer unit is used as the output end of the S-wave first arrival induction subsystem.
Further, the input and output relationship of the cells in the LSTM module of the BiLSTM unit in the feature extraction subsystem, the P-wave first arrival induction subsystem or the S-wave first arrival induction subsystem is as follows:
f_t = σ[W_f · (y_{t−1}, x_t, C_{t−1}) + b_f]
i_t = tanh[W_i · (y_{t−1}, x_t, C_{t−1}) + b_i]
h_t = σ[W_h · (y_{t−1}, x_t, C_{t−1}) + b_h]
C_t = (C_{t−1} ⊙ f_t + (1 − f_t) ⊙ i_t) ⊙ ((1 − i_t) ⊙ h_t)
y_t = σ[W_o · (y_{t−1}, x_t, C_{t−1}, C_t) + b_o] ⊙ tanh[C_t]
where f_t is the output of the forget gate at time t; σ[·] is the sigmoid activation function; W_f and b_f are the weight and bias of the forget gate; y_{t−1} is the output of the cell at time t−1; x_t is the input of the cell at time t; C_{t−1} is the state of the cell at time t−1; i_t is the output of the input gate at time t; tanh[·] is the hyperbolic tangent activation function; W_i and b_i are the weight and bias of the input gate; h_t is the output of the candidate gate at time t; W_h and b_h are the weight and bias of the candidate gate; C_t is the state of the cell at time t; ⊙ is the Hadamard product; y_t is the output of the cell at time t; and W_o and b_o are the weight and bias of the output gate.
The beneficial effect of the above further scheme is: the LSTM module takes into account the state C_{t−1} at the previous time, the input x_t at the current time, and the output y_{t−1} at the previous time, so the relationships among state, input, and output are fully considered when computing the cell.
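The five cell equations can be executed directly. The NumPy sketch below implements one step of this modified cell; the hidden and input sizes, the random initialization, and the reading of W · (y, x, C) as a matrix acting on the concatenated vector are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_params(n, m):
    """Random cell parameters for hidden size n and input size m. The f/i/h gates
    read the concatenation (y_{t-1}, x_t, C_{t-1}); the output gate also reads C_t."""
    p = {}
    for name, width in (("f", 2 * n + m), ("i", 2 * n + m), ("h", 2 * n + m), ("o", 3 * n + m)):
        p["W_" + name] = 0.1 * rng.standard_normal((n, width))
        p["b_" + name] = np.zeros(n)
    return p

def lstm_cell(y_prev, x_t, c_prev, p):
    """One step of the modified LSTM cell, following the equations above."""
    z = np.concatenate([y_prev, x_t, c_prev])
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])                      # forget gate
    i_t = np.tanh(p["W_i"] @ z + p["b_i"])                      # input gate (tanh, per the text)
    h_t = sigmoid(p["W_h"] @ z + p["b_h"])                      # candidate gate
    c_t = (c_prev * f_t + (1 - f_t) * i_t) * ((1 - i_t) * h_t)  # state update
    y_t = sigmoid(p["W_o"] @ np.concatenate([z, c_t]) + p["b_o"]) * np.tanh(c_t)
    return y_t, c_t

n, m = 4, 8
p = make_params(n, m)
y, c = np.zeros(n), np.zeros(n)
for _ in range(5):                                  # run the cell over a short sequence
    y, c = lstm_cell(y, np.ones(m), c, p)
print(y.shape, c.shape)
```

A BiLSTM unit would run two such cells over the sequence, one forward and one backward, and combine their outputs; that wrapper is omitted here.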
Further, the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem, or the S-wave first-arrival induction subsystem includes: a sixth convolutional layer, a Softmax layer, a multiplier, a seventh convolutional layer, a ReLU layer, an eighth convolutional layer, and an adder;
the input end of the sixth convolutional layer is respectively connected with the first input end of the multiplier and the first input end of the adder and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolutional layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder serves as the output of the global attention unit.
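The wiring just described — the input feeding the sixth conv, the first input of the multiplier, and the first input of the adder — is a residual attention block. The NumPy sketch below follows that wiring literally; treating every conv as a 1×1 convolution and taking the Softmax over the time axis are assumptions not stated in the text.

```python
import numpy as np

rng = np.random.default_rng(2)

def conv1x1(x, w, b):
    """1x1 convolution over a (C, T) feature map, i.e. a per-position linear map.
    Kernel sizes are unspecified in the text; 1x1 is an assumption."""
    return w @ x + b[:, None]

def softmax(z, axis):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x, p):
    a = softmax(conv1x1(x, p["w6"], p["b6"]), axis=1)   # sixth conv -> Softmax: attention map
    m = x * a                                           # multiplier: reweight the input features
    h = np.maximum(conv1x1(m, p["w7"], p["b7"]), 0.0)   # seventh conv -> ReLU
    out = conv1x1(h, p["w8"], p["b8"])                  # eighth conv
    return x + out                                      # adder: residual connection to the input

C, T = 16, 10
p = {k: 0.1 * rng.standard_normal((C, C)) for k in ("w6", "w7", "w8")}
p.update({k: np.zeros(C) for k in ("b6", "b7", "b8")})
x = rng.standard_normal((C, T))
y = global_attention(x, p)
print(y.shape)
```

Because the adder receives the raw input, the block can only add a correction on top of the incoming features, which keeps the attention unit easy to train.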
Further, micro-seismic detection data and data labels form a training data set; the training data set is used to train the feature extraction subsystem, the P-wave first-arrival induction subsystem, and the S-wave first-arrival induction subsystem; and the trained subsystems are arranged in the processor.
Further, the weight updating formula of the training process is as follows:
[The two weight-update formulas appear only as images in the original publication (BDA0003983723940000041 and BDA0003983723940000042) and are not reproduced here; their variables are defined below.]
where w_{i+1} is the weight at iteration i+1; w_i is the weight at iteration i; η_i is the learning rate at iteration i; η_{i−1} is the learning rate at iteration i−1; J_i is the loss function at iteration i; J_{i−1} is the loss function at iteration i−1; γ is the proportionality coefficient; and ζ is the tuning constant.
The beneficial effects of the above further scheme are: a weighting formula over the past and current second derivatives of the loss function is designed. The larger the second derivative of the loss function, the larger the rate of change of its gradient; the weight is regulated after the current and past gradient change rates are weighted, accumulated, smoothed, and filtered, so that the step length of the weight iteration is tied to the gradient change rate, preventing overshoot without making the iteration too slow. In the design of the learning-rate parameter, the degree of decrease of the loss function is considered: when the loss decreases strongly, J_{i−1} − J_i is large, so the learning rate η_i increases and the step-length adjustment of the weight update strengthens; when the loss decreases weakly, J_{i−1} − J_i is small, so η_i decreases and the step-length adjustment weakens. In this way the weight iterates quickly yet stably and rapidly reaches its optimum.
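Since the two update formulas themselves appear only as images in the source, the sketch below is a hypothetical reading of the surrounding prose only: the learning rate scales up when the loss decrease J_{i−1} − J_i is large and down when the loss stalls or rises. The function `adaptive_update` and the specific use of `gamma` and `zeta` are illustrative stand-ins, not the published rule.

```python
import numpy as np

def adaptive_update(w, grad, eta_prev, J_prev, J_curr, gamma=0.5, zeta=1.0):
    """Hypothetical reading of the described scheme: eta_i grows with the recent
    loss decrease (J_prev - J_curr); gamma and zeta play the roles of the
    proportionality coefficient and tuning constant named in the text."""
    eta = eta_prev * (1.0 + gamma * np.tanh((J_prev - J_curr) / zeta))
    return w - eta * grad, eta

# Minimizing J(w) = w^2: a loss drop raises eta, a loss rise lowers it.
w, eta = 5.0, 0.05
w, eta_up = adaptive_update(w, 2 * w, eta, J_prev=30.0, J_curr=25.0)    # loss fell
_, eta_down = adaptive_update(w, 2 * w, eta, J_prev=25.0, J_curr=30.0)  # loss rose
print(eta_up > eta > eta_down)
```

The bounded tanh keeps any single learning-rate adjustment finite, which matches the text's concern with overshoot on one side and sluggish iteration on the other.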
In conclusion, the beneficial effects of the invention are as follows: the feature extraction subsystem extracts the micro-seismic feature data from the micro-seismic detection data, which reduces the data volume while preserving the data features; the P-wave and S-wave first-arrival induction subsystems then extract the P-wave and S-wave first-arrival times respectively; and the microseism estimation subsystem estimates the micro-seismic source position from them.
Drawings
FIG. 1 is a system diagram of a deep learning chip suitable for real-time seismic data processing;
FIG. 2 is a system block diagram of a CNN unit;
FIG. 3 is a system block diagram of a global attention unit.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of these embodiments; various changes that do not depart from the spirit and scope of the invention as defined in the appended claims will be apparent to those skilled in the art, and all matter produced using the inventive concept is protected.
As shown in fig. 1, a deep learning chip suitable for real-time seismic data processing includes: a feature extraction subsystem, a P-wave first-arrival induction subsystem, an S-wave first-arrival induction subsystem, and a microseism estimation subsystem;
the feature extraction subsystem extracts micro-seismic detection data to obtain micro-seismic feature data; the P-wave first-arrival induction subsystem extracts the P-wave first-arrival time from the micro-seismic feature data; the S-wave first-arrival induction subsystem extracts the S-wave first-arrival time from the micro-seismic feature data; and the microseism estimation subsystem estimates the micro-seismic source from the P-wave and S-wave first-arrival times.
The feature extraction subsystem includes: a CNN unit, a first BiLSTM unit, a second BiLSTM unit and a first global attention unit;
the input end of the CNN unit is used as the input end of the feature extraction subsystem and used for inputting microseism detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; and the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit is used as the output end of the feature extraction subsystem.
As shown in fig. 2, the CNN unit includes: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolutional layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolutional layer is connected with the input end of the second maximum pooling layer; the input end of the third convolutional layer is connected with the output end of the second largest pooling layer, and the output end of the third convolutional layer is connected with the input end of the third largest pooling layer; the input end of the fourth convolutional layer is connected with the output end of the third largest pooling layer, and the output end of the fourth convolutional layer is connected with the input end of the fourth largest pooling layer; the input end of the fifth convolutional layer is connected with the output end of the fourth largest pooling layer, and the output end of the fifth convolutional layer is connected with the input end of the fifth largest pooling layer; and the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
The P wave first arrival induction subsystem comprises: a third BiLSTM unit, a second global attention unit and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; and the output end of the first full connection layer unit is used as the output end of the P-wave first arrival induction subsystem.
The S-wave first arrival induction subsystem comprises: a fourth BiLSTM unit, a third global attention unit and a second full connection layer unit;
the input end of the fourth BiLSTM unit serves as the input end of the S-wave first arrival induction subsystem; the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full-connection layer unit; and the output end of the second full connection layer unit is used as the output end of the S-wave first arrival induction subsystem.
The input and output relations of the cells in the LSTM module of the BiLSTM unit in the feature extraction subsystem, the P wave first arrival induction subsystem or the S wave first arrival induction subsystem are as follows:
f_t = σ[W_f · (y_{t−1}, x_t, C_{t−1}) + b_f]
i_t = tanh[W_i · (y_{t−1}, x_t, C_{t−1}) + b_i]
h_t = σ[W_h · (y_{t−1}, x_t, C_{t−1}) + b_h]
C_t = (C_{t−1} ⊙ f_t + (1 − f_t) ⊙ i_t) ⊙ ((1 − i_t) ⊙ h_t)
y_t = σ[W_o · (y_{t−1}, x_t, C_{t−1}, C_t) + b_o] ⊙ tanh[C_t]
where f_t is the output of the forget gate at time t; σ[·] is the sigmoid activation function; W_f and b_f are the weight and bias of the forget gate; y_{t−1} is the output of the cell at time t−1; x_t is the input of the cell at time t; C_{t−1} is the state of the cell at time t−1; i_t is the output of the input gate at time t; tanh[·] is the hyperbolic tangent activation function; W_i and b_i are the weight and bias of the input gate; h_t is the output of the candidate gate at time t; W_h and b_h are the weight and bias of the candidate gate; C_t is the state of the cell at time t; ⊙ is the Hadamard product; y_t is the output of the cell at time t; and W_o and b_o are the weight and bias of the output gate.
The LSTM module takes into account the state C_{t−1} at the previous time, the input x_t at the current time, and the output y_{t−1} at the previous time, so the relationships among state, input, and output are fully considered when the cell is computed.
As shown in fig. 3, the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem, or the S-wave first-arrival induction subsystem includes: a sixth convolution layer, a Softmax layer, a multiplier, a seventh convolution layer, a ReLU layer, an eighth convolution layer, and an adder;
the input end of the sixth convolutional layer is respectively connected with the first input end of the multiplier and the first input end of the adder and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolutional layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder serves as the output of the global attention unit.
The characteristic extraction subsystem, the P wave first arrival induction subsystem and the S wave first arrival induction subsystem are arranged in the processor.
The data labels are the P-wave and S-wave first-arrival times.
The weight updating formula of the training process is as follows:
[The two weight-update formulas appear only as images in the original publication (BDA0003983723940000081 and BDA0003983723940000082) and are not reproduced here; their variables are defined below.]
where w_{i+1} is the weight at iteration i+1; w_i is the weight at iteration i; η_i is the learning rate at iteration i; η_{i−1} is the learning rate at iteration i−1; J_i is the loss function at iteration i; J_{i−1} is the loss function at iteration i−1; γ is the proportionality coefficient; and ζ is the tuning constant.
A weighting formula over the past and current second derivatives of the loss function is designed. The larger the second derivative of the loss function, the larger the rate of change of its gradient; the weight is regulated after the current and past gradient change rates are weighted, accumulated, smoothed, and filtered, so that the step length of the weight iteration is tied to the gradient change rate, preventing overshoot without making the iteration too slow. In the design of the learning-rate parameter, the degree of decrease of the loss function is considered: when the loss decreases strongly, J_{i−1} − J_i is large, so the learning rate η_i increases and the step-length adjustment of the weight update strengthens; when the loss decreases weakly, J_{i−1} − J_i is small, so η_i decreases and the step-length adjustment weakens. In this way the weight iterates quickly yet stably and rapidly reaches its optimum.
The invention extracts the micro-seismic feature data from the micro-seismic detection data through the feature extraction subsystem, which reduces the data volume while preserving the data features. The P-wave and S-wave first-arrival times are then extracted by the P-wave and S-wave first-arrival induction subsystems respectively. The P-wave and S-wave velocities can be measured by the sensor, and the micro-seismic source position can be estimated from the time difference between the P-wave and S-wave first arrivals together with the two velocities.
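The estimation step described here rests on the standard S−P travel-time relation: a source at distance d satisfies d/v_s − d/v_p = t_s − t_p. A minimal sketch (the velocity values are illustrative, not taken from the patent):

```python
def source_distance(t_p, t_s, v_p, v_s):
    """Hypocentral distance from the S-P time difference:
    d / v_s - d / v_p = t_s - t_p  =>  d = (t_s - t_p) * v_p * v_s / (v_p - v_s)."""
    return (t_s - t_p) * v_p * v_s / (v_p - v_s)

# Example: v_p = 6.0 km/s, v_s = 3.5 km/s, and an S-P delay of 2 s.
d = source_distance(t_p=10.0, t_s=12.0, v_p=6.0, v_s=3.5)
print(round(d, 2))  # → 16.8 (km)
```

This is exactly why picking both first arrivals matters: with only the P-wave arrival the time difference is unavailable and the distance cannot be solved.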

Claims (9)

1. A deep learning chip adapted for real-time seismic data processing, comprising: a feature extraction subsystem, a P-wave first arrival induction subsystem, an S-wave first arrival induction subsystem and a microseism estimation subsystem;
wherein the feature extraction subsystem is used for extracting micro-seismic detection data to obtain micro-seismic feature data; the P-wave first arrival induction subsystem is used for extracting the P-wave first-arrival time from the micro-seismic feature data; the S-wave first arrival induction subsystem is used for extracting the S-wave first-arrival time from the micro-seismic feature data; and the microseism estimation subsystem is used for estimating the micro-seismic source according to the P-wave and S-wave first-arrival times.
2. The deep learning chip suitable for real-time seismic data processing according to claim 1, wherein the feature extraction subsystem comprises: a CNN unit, a first BiLSTM unit, a second BiLSTM unit and a first global attention unit;
the input end of the CNN unit is used as the input end of the feature extraction subsystem and used for inputting microseism detection data; the input end of the first BiLSTM unit is connected with the output end of the CNN unit, and the output end of the first BiLSTM unit is connected with the input end of the second BiLSTM unit; and the input end of the first global attention unit is connected with the output end of the second BiLSTM unit, and the output end of the first global attention unit is used as the output end of the feature extraction subsystem.
3. The deep learning chip suitable for real-time seismic data processing according to claim 2, wherein the CNN unit comprises: a first convolution layer, a first maximum pooling layer, a second convolution layer, a second maximum pooling layer, a third convolution layer, a third maximum pooling layer, a fourth convolution layer, a fourth maximum pooling layer, a fifth convolution layer and a fifth maximum pooling layer;
the input end of the first convolution layer is used as the input end of the CNN unit, and the output end of the first convolution layer is connected with the input end of the first maximum pooling layer; the input end of the second convolutional layer is connected with the output end of the first maximum pooling layer, and the output end of the second convolutional layer is connected with the input end of the second maximum pooling layer; the input end of the third convolutional layer is connected with the output end of the second largest pooling layer, and the output end of the third convolutional layer is connected with the input end of the third largest pooling layer; the input end of the fourth convolutional layer is connected with the output end of the third largest pooling layer, and the output end of the fourth convolutional layer is connected with the input end of the fourth largest pooling layer; the input end of the fifth convolutional layer is connected with the output end of the fourth largest pooling layer, and the output end of the fifth convolutional layer is connected with the input end of the fifth largest pooling layer; and the output end of the fifth maximum pooling layer is used as the output end of the CNN unit.
4. The deep learning chip for real-time seismic data processing according to claim 1, wherein the P-wave first arrival induction subsystem comprises: a third BiLSTM unit, a second global attention unit and a first full connection layer unit;
the input end of the third BiLSTM unit is used as the input end of the P-wave first arrival induction subsystem; the input end of the second global attention unit is connected with the output end of the third BiLSTM unit, and the output end of the second global attention unit is connected with the input end of the first full-connection layer unit; and the output end of the first full connection layer unit is used as the output end of the P-wave first arrival induction subsystem.
5. The deep learning chip for real-time seismic data processing according to claim 1, wherein the S-wave first arrival induction subsystem comprises: a fourth BiLSTM unit, a third global attention unit and a second full connection layer unit;
the input end of the fourth BiLSTM unit serves as the input end of the S-wave first arrival induction subsystem; the input end of the third global attention unit is connected with the output end of the fourth BiLSTM unit, and the output end of the third global attention unit is connected with the input end of the second full-connection layer unit; and the output end of the second full connection layer unit is used as the output end of the S-wave first arrival induction subsystem.
6. The deep learning chip for real-time seismic data processing according to any one of claims 2, 4, and 5, wherein the input/output relationship of the cells in the LSTM module of the BiLSTM unit in the feature extraction subsystem, the P-wave first arrival induction subsystem, or the S-wave first arrival induction subsystem is:
f_t = σ[W_f · (y_{t−1}, x_t, C_{t−1}) + b_f]
i_t = tanh[W_i · (y_{t−1}, x_t, C_{t−1}) + b_i]
h_t = σ[W_h · (y_{t−1}, x_t, C_{t−1}) + b_h]
C_t = (C_{t−1} ⊙ f_t + (1 − f_t) ⊙ i_t) ⊙ ((1 − i_t) ⊙ h_t)
y_t = σ[W_o · (y_{t−1}, x_t, C_{t−1}, C_t) + b_o] ⊙ tanh[C_t]
where f_t is the output of the forget gate at time t; σ[·] is the sigmoid activation function; W_f and b_f are the weight and bias of the forget gate; y_{t−1} is the output of the cell at time t−1; x_t is the input of the cell at time t; C_{t−1} is the state of the cell at time t−1; i_t is the output of the input gate at time t; tanh[·] is the hyperbolic tangent activation function; W_i and b_i are the weight and bias of the input gate; h_t is the output of the candidate gate at time t; W_h and b_h are the weight and bias of the candidate gate; C_t is the state of the cell at time t; ⊙ is the Hadamard product; y_t is the output of the cell at time t; and W_o and b_o are the weight and bias of the output gate.
7. The deep learning chip suitable for real-time seismic data processing according to any one of claims 2, 4, and 5, wherein the global attention unit in the feature extraction subsystem, the P-wave first-arrival induction subsystem, or the S-wave first-arrival induction subsystem comprises: a sixth convolution layer, a Softmax layer, a multiplier, a seventh convolution layer, a ReLU layer, an eighth convolution layer, and an adder;
the input end of the sixth convolutional layer is respectively connected with the first input end of the multiplier and the first input end of the adder and is used as the input end of the global attention unit; the input end of the Softmax layer is connected with the output end of the sixth convolutional layer, and the output end of the Softmax layer is connected with the second input end of the multiplier; the input end of the seventh convolution layer is connected with the output end of the multiplier, and the output end of the seventh convolution layer is connected with the input end of the ReLU layer; the input end of the eighth convolution layer is connected with the output end of the ReLU layer, and the output end of the eighth convolution layer is connected with the second input end of the adder; the output of the adder serves as the output of the global attention unit.
8. The deep learning chip for real-time seismic data processing according to claim 1, wherein the feature extraction subsystem, the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem use micro-seismic detection data and data labels to form a training data set, the feature extraction subsystem, the P-wave first-arrival induction subsystem and the S-wave first-arrival induction subsystem are trained by using the training data set to obtain a trained feature extraction subsystem, a trained P-wave first-arrival induction subsystem and a trained S-wave first-arrival induction subsystem, and the trained feature extraction subsystem, the trained P-wave first-arrival induction subsystem and the trained S-wave first-arrival induction subsystem are arranged in the processor.
9. The deep learning chip for real-time seismic data processing according to claim 8, wherein the weight update formula of the training process is:
[The two weight-update formulas of claim 9 are rendered as images in the original (FDA0003983723930000041 and FDA0003983723930000042) and are not reproduced in this extract.]
wherein w_{i+1} is the weight at the (i+1)-th iteration, w_i is the weight at the i-th iteration, η_i is the learning rate of the i-th iteration, η_{i−1} is the learning rate of the (i−1)-th iteration, J_i is the loss function of the i-th iteration, J_{i−1} is the loss function of the (i−1)-th iteration, γ is a proportionality coefficient, and ζ is a regulation constant.
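The exact update formulas are images in this extract, so they cannot be reproduced here. As a hedged sketch only, using the variables the claim names (weights w_i, learning rates η_i and η_{i−1}, losses J_i and J_{i−1}, coefficient γ, constant ζ), one plausible instantiation is a gradient step whose learning rate is scaled by the loss ratio; the specific adaptation rule below is an assumption, not the patent's formula:

```python
import numpy as np

def update_step(w, grad, eta_prev, J_curr, J_prev, gamma=0.9, zeta=1e-8):
    """Hypothetical weight update using the variables named in claim 9.

    The patent's exact formulas are rendered as images; this sketch
    assumes the learning rate is rescaled by the loss ratio J_i / J_{i-1}
    (with gamma as the proportionality coefficient and zeta regularizing
    the denominator), followed by a standard gradient-descent step.
    """
    eta = eta_prev * gamma * J_curr / (J_prev + zeta)  # assumed adaptation rule
    w_next = w - eta * grad                            # gradient-descent step
    return w_next, eta
```

If the loss shrinks between iterations, this rule shrinks the learning rate proportionally, which is one common way to stabilize training late in convergence.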
CN202211556940.XA 2022-12-06 2022-12-06 Deep learning chip suitable for real-time seismic data processing Active CN115903022B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211556940.XA CN115903022B (en) 2022-12-06 2022-12-06 Deep learning chip suitable for real-time seismic data processing


Publications (2)

Publication Number Publication Date
CN115903022A true CN115903022A (en) 2023-04-04
CN115903022B CN115903022B (en) 2023-10-31

Family

ID=86489479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211556940.XA Active CN115903022B (en) 2022-12-06 2022-12-06 Deep learning chip suitable for real-time seismic data processing

Country Status (1)

Country Link
CN (1) CN115903022B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112017289A (en) * 2020-08-31 2020-12-01 电子科技大学 Well-seismic combined initial lithology model construction method based on deep learning
CN112068195A (en) * 2019-06-10 2020-12-11 中国石油化工股份有限公司 Automatic first arrival picking method for microseism P & S wave matching event and computer storage medium
US20210117737A1 (en) * 2019-10-18 2021-04-22 Korea University Research And Business Foundation Earthquake event classification method using attention-based convolutional neural network, recording medium and device for performing the method
CN113158792A (en) * 2021-03-15 2021-07-23 辽宁大学 Microseismic event identification method based on improved model transfer learning
CN114660656A (en) * 2022-03-17 2022-06-24 中国科学院地质与地球物理研究所 Seismic data first arrival picking method and system


Also Published As

Publication number Publication date
CN115903022B (en) 2023-10-31

Similar Documents

Publication Publication Date Title
CN111723329B (en) Seismic phase feature recognition waveform inversion method based on full convolution neural network
CN107544904B (en) Software reliability prediction method based on deep CG-LSTM neural network
CN108596327B (en) Seismic velocity spectrum artificial intelligence picking method based on deep learning
CN101794528B (en) Gesture language-voice bidirectional translation system
CN106646587B (en) Object detection and recognition method and system based on acoustic vibration signal
CN111144542A (en) Oil well productivity prediction method, device and equipment
CN108830421B (en) Method and device for predicting gas content of tight sandstone reservoir
CN111580151B (en) SSNet model-based earthquake event time-of-arrival identification method
CN108875836B (en) Simple-complex activity collaborative recognition method based on deep multitask learning
CN110379418A (en) A kind of voice confrontation sample generating method
CN106384122A (en) Device fault mode identification method based on improved CS-LSSVM
CN111695413A (en) Signal first arrival pickup method and device combining U-Net and Temporal encoding
CN112052551B (en) Fan surge operation fault identification method and system
CN110223342B (en) Space target size estimation method based on deep neural network
CN113156492B (en) Real-time intelligent early warning method applied to TBM tunnel rockburst disasters
CN114818579A (en) Analog circuit fault diagnosis method based on one-dimensional convolution long-short term memory network
CN114862015A (en) Typhoon wind speed intelligent prediction method based on FEN-ConvLSTM model
Xu et al. An automatic P-wave onset time picking method for mining-induced microseismic data based on long short-term memory deep neural network
CN115903022A (en) Deep learning chip suitable for real-time seismic data processing
CN112132096B (en) Behavior modal identification method of random configuration network for dynamically updating output weight
CN106842172A (en) A kind of submarine target structural sparse feature extracting method
CN115424351A (en) Dynamic image identification method based on space-time fusion attention impulse neural network
CN112433249A (en) Horizon tracking method and device, computer equipment and computer readable storage medium
CN111487679B (en) Transverse wave velocity prediction method, device and equipment
CN116561528B (en) RUL prediction method of rotary machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant