Disclosure of Invention
The invention provides a Vision Transformer-based distributed optical fiber sensor pattern recognition method that remedies the defects of the prior art, and introduces an attention-mechanism algorithm into the field of distributed optical fiber sensing.
A method for pattern recognition of a distributed optical fiber vibration sensor is characterized by comprising the following steps:
Step 1: prepare the distributed optical fiber sensor system.
Step 2: acquire data signals and construct data sets of different events.
Step 3: perform noise reduction processing on the signal data.
Step 4: convert the signal data into a time domain graph and a time-frequency domain graph of the corresponding event.
Step 5: construct a deep learning network based on the Vision Transformer.
Step 6: perform identification and classification.
Further, the step 1 specifically includes the following steps:
Step 1.1: select a scheme based on the optical time domain reflectometer as the technical scheme of the distributed optical fiber sensor.
Step 1.2: prepare a narrow-linewidth laser, a coupler, an acousto-optic modulator, a first erbium-doped amplifier, a band-pass filter, a circulator, a second erbium-doped amplifier, a tunable optical attenuator, a photoelectric detector, a data acquisition card, a personal computer, and a single-mode optical fiber.
Step 1.3: assemble the distributed optical fiber sensor system in preparation for event data acquisition in various application scenarios.
Further, the step 2 specifically includes the following steps:
Step 2.1: deploy the distributed optical fiber sensing system in the scene to be detected, and acquire the event data corresponding to each scene. When collecting data, set the relevant parameters of the distributed optical fiber sensing system, such as the sampling frequency and the pulse width.
Step 2.2: while collecting the data set, record how the data of the distributed optical fiber sensing channels change under the action of an event.
Step 2.3: store and back up the channel data whose signal intensity changed under the action of the events in the previous step.
Further, the step 3 specifically includes the following steps:
Step 3.1: extract the event data acquired in step 2 from the corresponding channels.
Step 3.2: filter the collected data of the different events through a filter.
Step 3.3: perform noise reduction on the data of step 3.2 through wavelet denoising. A specific wavelet basis function is set and the signal is decomposed into a plurality of scales by wavelet transform; according to the difference between the noise and signal values on these scales, some scale components are removed or corrected and the signal is reconstructed. Noise is judged from the wavelet coefficients obtained after decomposition: the wavelet coefficients of noise are usually small, so the noise can be removed by setting a threshold. A coefficient smaller than the threshold is judged to be noise; otherwise it is judged to be valid signal.
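As an illustration of step 3.3, the following minimal sketch performs wavelet denoising with the PyWavelets library; the wavelet basis ('db4'), the decomposition level, and the universal-threshold rule are illustrative assumptions rather than parameters fixed by the invention.

```python
# Minimal wavelet-denoising sketch, assuming a 1-D NumPy signal as input.
import numpy as np
import pywt

def wavelet_denoise(signal: np.ndarray, wavelet: str = "db4", level: int = 4) -> np.ndarray:
    # Decompose the signal into approximation and detail coefficients (multiple scales).
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Estimate the noise sigma from the finest detail scale (median absolute deviation),
    # then form the universal threshold.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(len(signal)))
    # Coefficients below the threshold are treated as noise and shrunk to zero;
    # larger coefficients are kept as valid signal (soft thresholding).
    denoised = [coeffs[0]] + [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
    # Reconstruct the signal from the corrected coefficients.
    return pywt.waverec(denoised, wavelet)
```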
Further, the step 4 specifically includes the following steps:
Step 4.1: convert the data denoised in step 3 into time domain graphs of the various events in batches, according to the sampling frequency and duration set during event data acquisition. The time domain graph mainly shows how the signal intensity changes over time; the features of the time domain signal are intuitive and obvious, and intrusion events can be distinguished by the regular changes of the signal in the time domain within a certain period.
Step 4.2: convert the data denoised in step 3 into time-frequency domain graphs of the various events through the short-time Fourier transform, according to the sampling frequency and duration set during event data acquisition. The time-frequency domain graph contains not only the spectral features within a certain time but also the variation of each frequency band over time.
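The conversions of steps 4.1 and 4.2 can be sketched as follows; the 6 kHz sampling frequency matches the embodiment below, while the file names, the STFT window length, and the plotting details are illustrative assumptions.

```python
# Sketch: convert one denoised event segment into a time-domain image
# and an STFT time-frequency image.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import stft

fs = 6000                        # sampling frequency in Hz (per the embodiment)
signal = np.load("event.npy")    # hypothetical denoised event segment

# Time domain graph: signal intensity versus time.
t = np.arange(len(signal)) / fs
plt.plot(t, signal)
plt.xlabel("Time (s)"); plt.ylabel("Amplitude")
plt.savefig("time_domain.png"); plt.clf()

# Time-frequency domain graph via short-time Fourier transform.
f, tt, Zxx = stft(signal, fs=fs, nperseg=256)
plt.pcolormesh(tt, f, np.abs(Zxx))
plt.xlabel("Time (s)"); plt.ylabel("Frequency (Hz)")
plt.savefig("time_frequency.png")
```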
Further, the step 5 specifically includes the following steps:
Step 5.1: label the event time domain graphs processed in step 4, and divide them into a training set, a validation set, and a test set in the ratio 8:1:1.
Step 5.2: label the event time-frequency domain graphs processed in step 4, and divide them into a training set, a validation set, and a test set in the ratio 8:1:1.
Step 5.3: construct a Vision Transformer deep learning image classification model based on the time domain graph and time-frequency domain graph data sets of the event data, and set the Vision Transformer network model parameters.
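A minimal sketch of the labeling and 8:1:1 split of steps 5.1 and 5.2 in PyTorch is given below; the directory layout (one subfolder per event class, as ImageFolder expects) and the image size are assumptions.

```python
# Sketch: build a labeled dataset from image folders and split it 8:1:1.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.Resize((224, 224)),
                                transforms.ToTensor()])
# Each subfolder of this hypothetical directory holds one event class.
dataset = datasets.ImageFolder("time_domain_images/", transform=transform)

n = len(dataset)
n_train, n_val = int(0.8 * n), int(0.1 * n)
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(0))  # reproducible split
```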
Further, the network model in step 5.3 specifically includes: an Embedding layer, a Transformer Encoder layer, and an MLP Head layer.
The Embedding layer includes: a convolutional layer, a linear mapping layer, a Class token layer, a Position Embedding layer, and a Dropout layer.
The Transformer Encoder layer includes: a Layer Norm layer, a multi-head attention layer, a DropPath layer, and an MLP Block layer, wherein the MLP Block layer comprises a fully connected layer, a GELU activation function layer, and a Dropout layer.
The MLP Head layer includes: a fully connected layer and a tanh activation function layer.
Further, the step 5.3 specifically comprises the following steps:
Step 5.3.1: initialize the Vision Transformer network model parameters.
Step 5.3.2: use a convolutional layer to flatten the three-dimensional RGB image into a two-dimensional matrix, i.e., a sequence of vectors.
Step 5.3.3: flatten the two-dimensional matrix of step 5.3.2; this corresponds to a Flatten layer. The flattening process does not affect the batch size; its purpose is to turn the high-dimensional array into a one-dimensional vector sequence.
Step 5.3.4: so that the input picture of the neural network can be labeled for supervised learning, a Class token layer is introduced. The Class token is a trainable parameter with the same vector format as the sequence of step 5.3.3, and it is concatenated with the vector sequence of step 5.3.3.
Step 5.3.5: to encode the positional relationships within the vector sequence, a Position Embedding layer is used; it is a trainable parameter and is superimposed on the vector sequence of step 5.3.4.
Step 5.3.6: pass the final vector sequence of step 5.3.5 through a Dropout layer. The Dropout layer reduces the coupling between neurons so that each neuron extracts useful features by itself; it also acts as an ensemble of networks, since the effective network differs in each training pass, which prevents overfitting.
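Steps 5.3.1 through 5.3.6 can be summarized in the following PyTorch sketch of the Embedding layer; the 224x224 input size, patch size 16, and embedding dimension 768 are common ViT-Base defaults assumed here, not values prescribed by the invention.

```python
# Sketch of the Embedding stage: convolutional patch projection, flattening,
# class token, trainable position embedding, and Dropout.
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_ch=3, dim=768, drop=0.1):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Convolution with stride = kernel = patch size: one token per image patch.
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))                # trainable class token
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))  # trainable positions
        self.dropout = nn.Dropout(drop)

    def forward(self, x):                        # x: (B, 3, H, W)
        x = self.proj(x)                         # (B, dim, H/ps, W/ps)
        x = x.flatten(2).transpose(1, 2)         # Flatten: (B, num_patches, dim)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)           # prepend the class token
        return self.dropout(x + self.pos_embed)  # add positions, apply Dropout
```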
Step 5.3.7: pass through the Transformer Encoder layer. This network layer is the core backbone of the algorithm.
Further, step 5.3.7 specifically comprises the following steps:
Step 5.3.7.1: first pass through a Layer Normalization layer. Batch Normalization performs normalization on each channel of a batch of data, whereas Layer Normalization normalizes a specified dimension of a single sample, independent of the batch. The Layer Normalization layer normalizes the data at this level to zero mean and unit variance. Layer Normalization stabilizes the gradients and avoids the vanishing-gradient problem; it shortens the training time of the neural network, tolerates a higher learning rate and a wider range of initial weights, and supports more loss functions.
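The distinction drawn above can be checked with a short snippet: LayerNorm normalizes each token over its embedding dimension, independent of the batch (the shapes are illustrative).

```python
# LayerNorm acts per token over the last dimension, not across the batch.
import torch
import torch.nn as nn

x = torch.randn(8, 197, 768)   # (batch, tokens, embedding dim)
ln = nn.LayerNorm(768)         # normalizes over the last dim of each token
y = ln(x)
print(y.mean(-1).abs().max(), y.std(-1).mean())  # per-token mean ~0, std ~1
```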
Step 5.3.7.2: pass through the Multi-Head Attention layer. The multi-head attention layer is developed from the self-attention layer; the multi-head attention mechanism can combine the information learned by the different heads. The input vector ai is passed through Wq, Wk, and Wv to obtain the corresponding qi, ki, and vi, which are then each divided into h parts according to the number of heads h, giving the parameters Qi, Ki, and Vi of each head.
Each head obtains its result by applying the self-attention mechanism to its Qi, Ki, and Vi, where the corresponding formula of the self-attention mechanism is:
Attention(Q, K, V) = softmax(Q K^T / sqrt(dk)) V
The results obtained by the heads are spliced together, and the spliced result is fused through Wo (a learnable parameter) to obtain the final result. The corresponding formula is:
MultiHead(Q, K, V) = Concat(head1, ..., headh) Wo, where headi = Attention(Qi, Ki, Vi)
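A sketch of this multi-head attention computation in PyTorch, following the Wq/Wk/Wv/Wo description above, is given below; the embedding dimension and head count are assumptions.

```python
# Multi-head self-attention sketch: project to q/k/v, split into heads,
# apply scaled dot-product attention per head, concatenate, fuse with Wo.
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, dim=768, heads=12):
        super().__init__()
        self.heads, self.dk = heads, dim // heads
        self.wq, self.wk, self.wv = (nn.Linear(dim, dim) for _ in range(3))
        self.wo = nn.Linear(dim, dim)   # learnable fusion matrix Wo

    def forward(self, x):               # x: (B, N, dim)
        B, N, _ = x.shape
        # Project the input and split each of q, k, v into h heads.
        q, k, v = (w(x).view(B, N, self.heads, self.dk).transpose(1, 2)
                   for w in (self.wq, self.wk, self.wv))
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(dk)) V, per head.
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.dk ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)  # concatenate heads
        return self.wo(out)             # fuse the spliced result with Wo
```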
step 5.3.7.3: the same Dropout layer is passed as in step 5.36.
Step 5.3.7.4: connect the data from before step 5.3.7.1 and after step 5.3.7.3 through a residual shortcut. Simply increasing the depth of the network does not by itself improve its performance and may even harm the model through gradient divergence; introducing the shortcut solves this degradation problem in deep models.
Step 5.3.7.5: pass through a Layer Normalization layer, the same as in step 5.3.7.1, to stabilize the gradients, avoid the vanishing-gradient problem, and shorten the training time of the neural network.
Step 5.3.7.6: pass through the MLP Block layer. The MLP Block layer mainly comprises a fully connected layer, a GELU activation function layer, and a Dropout layer.
Further, step 5.3.7.6 includes the following steps:
Step 5.3.7.6.1: integrate the features extracted in the previous steps through a fully connected layer, mapping the feature representation learned by the multi-head attention mechanism into the sample label space.
Step 5.3.7.6.2: pass through the GELU activation function layer. The role of the activation function is to add a nonlinear factor where the expressive power of a linear model is insufficient. The formula of the GELU activation function is:
GELU(x) = x × P(X ≤ x) = x × Φ(x), X ~ N(0, 1)
where x is the input value and X is a Gaussian random variable with zero mean and unit variance; P(X ≤ x), the probability that X is less than or equal to the given value x, is the cumulative distribution function Φ of the standard normal distribution evaluated at x.
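The formula can be verified numerically: x × Φ(x), with Φ computed from the error function, matches the exact GELU provided by PyTorch.

```python
# Numerical check of GELU(x) = x * Phi(x), Phi being the standard normal CDF.
import math
import torch

x = torch.linspace(-3, 3, 7)
phi = 0.5 * (1 + torch.erf(x / math.sqrt(2)))   # standard normal CDF
print(torch.allclose(x * phi, torch.nn.functional.gelu(x), atol=1e-6))  # True
```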
Step 5.3.7.6.3: pass through the Dropout layer, which reduces the coupling between neurons and prevents overfitting.
Step 5.3.7.6.4: pass through a fully connected layer, as in step 5.3.7.6.1.
Step 5.3.7.6.5: pass through a Dropout layer, as in step 5.3.7.6.3, to prevent overfitting.
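Steps 5.3.7.1 through 5.3.7.6 assemble into the following Transformer Encoder block sketch, reusing the MultiHeadAttention module sketched after step 5.3.7.2; since DropPath (stochastic depth) is not part of the PyTorch core, a minimal version is written out by hand, and the MLP expansion ratio of 4 is an assumption.

```python
# Encoder block sketch: LayerNorm -> multi-head attention -> DropPath -> residual
# shortcut, then LayerNorm -> MLP Block (Linear -> GELU -> Dropout -> Linear ->
# Dropout) -> DropPath -> second residual shortcut.
import torch
import torch.nn as nn

class DropPath(nn.Module):
    """Stochastic depth: randomly zero the whole residual branch per sample."""
    def __init__(self, p=0.1):
        super().__init__()
        self.p = p

    def forward(self, x):
        if not self.training or self.p == 0.0:
            return x
        keep = 1 - self.p
        mask = (torch.rand(x.shape[0], 1, 1, device=x.device) < keep).float()
        return x * mask / keep

class EncoderBlock(nn.Module):
    def __init__(self, dim=768, heads=12, mlp_ratio=4, drop=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = MultiHeadAttention(dim, heads)  # from the sketch above
        self.drop_path = DropPath(drop)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(), nn.Dropout(drop),
            nn.Linear(dim * mlp_ratio, dim), nn.Dropout(drop))

    def forward(self, x):
        x = x + self.drop_path(self.attn(self.norm1(x)))  # residual shortcut 1
        x = x + self.drop_path(self.mlp(self.norm2(x)))   # residual shortcut 2
        return x
```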
Step 5.3.8: pass through a Layer Normalization layer, the same as in step 5.3.7.1, to avoid vanishing gradients in the features extracted in the previous steps and shorten the training time of the neural network.
Step 5.3.9: extract the Class token introduced in step 5.3.4 to obtain the labeled sample information.
Step 5.3.10: integrate the features extracted in the previous steps through a fully connected layer, mapping the learned feature representation into the sample label space.
Step 5.3.11: pass through the tanh activation function layer. The tanh activation function is centered at the origin and converges faster; the corresponding formula is:
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Step 5.3.12: integrate the features of the preceding steps through a fully connected layer in preparation for the output of the next step.
Step 5.3.13: output the result.
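Steps 5.3.8 through 5.3.13 correspond to the following output-stage sketch; the six-class output matches the embodiment below, while the hidden dimension of the MLP Head is an assumption.

```python
# Output stage sketch: final LayerNorm, extraction of the class token,
# then the MLP Head (Linear -> Tanh -> Linear) producing per-class scores.
import torch.nn as nn

class MLPHead(nn.Module):
    def __init__(self, dim=768, num_classes=6):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                  nn.Linear(dim, num_classes))

    def forward(self, x):           # x: (B, N+1, dim) from the encoder stack
        cls = self.norm(x)[:, 0]    # take the class token
        return self.head(cls)       # classification result
```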
Step 5.4: train the Vision Transformer network model obtained in step 5.3.
Step 5.5: perform network tuning on the trained Vision Transformer network model. If the optimal parameters are found, save the model with the best result as the model for final event recognition; otherwise, return to step 5.4 and continue training the network model.
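Steps 5.4 and 5.5 can be sketched as the following training loop, assembling the modules sketched above into a 12-block model (the ViT-Base depth, an assumption); the optimizer, learning rate, batch size, epoch count, and checkpoint file name are likewise illustrative.

```python
# Training/validation sketch: train, evaluate on the validation set each epoch,
# and keep the best-performing weights as the final recognition model.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

model = nn.Sequential(PatchEmbedding(),
                      *[EncoderBlock() for _ in range(12)],
                      MLPHead())
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
best_acc = 0.0

for epoch in range(100):
    model.train()
    for images, labels in DataLoader(train_set, batch_size=32, shuffle=True):
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    # Validation pass: keep the model with the best result.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in DataLoader(val_set, batch_size=32):
            correct += (model(images).argmax(1) == labels).sum().item()
            total += labels.numel()
    if correct / total > best_acc:
        best_acc = correct / total
        torch.save(model.state_dict(), "best_vit.pth")  # model for final recognition
```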
The invention has the beneficial effects that:
the invention has simple preprocessing steps and does not carry out a large amount of data extraction operation on the data. The originality of the data is guaranteed and the loss of the data is prevented. The invention combines a Vision Transformer algorithm and a distributed optical fiber sensor event image for the first time and is used for event classification under scenes. The Vision Transformer algorithm is an attention-based algorithm, and has the advantages that the Vision Transformer algorithm is different from the conventional convolutional neural network, the identification effect on a large-scale data set is higher, the global features can be extracted at one time, and the like. In the invention, the time domain graph and the time domain graph in the distributed optical fiber sensing are respectively used as data sets of a Vision Transformer algorithm, so that the effect of comparison can be achieved.
Detailed description of the invention
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it should be noted that the terms "middle", "upper", "lower", "horizontal", "inner", "outer", and the like indicate orientations or positional relationships based on orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally laid out when products of the present invention are used, and are only used for convenience in describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
Furthermore, the terms "horizontal", "vertical" and the like do not imply that the components are required to be absolutely horizontal or vertical; they may be slightly inclined. For example, "horizontal" merely means that the direction is more nearly horizontal than "vertical"; the structure need not be perfectly horizontal and may be slightly inclined.
In the description of the present invention, it should be noted that unless otherwise explicitly stated or limited, the terms "disposed", "connected", and "coupled" are to be construed broadly: the connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or internal between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Figs. 1-5 show an embodiment of the present invention, which takes intrusion detection on a highway bridge section as an example of the Vision Transformer-based distributed optical fiber sensor pattern recognition method; the overall flow is shown in Fig. 1, and the specific signal processing and deep learning network structure is shown in Fig. 2.
Step 1: prepare the distributed optical fiber sensing system. In this example, a distributed optical fiber sensing system based on the phase-sensitive optical time domain reflectometry (Φ-OTDR) technique is selected. The main devices used in the system are: a narrow-linewidth laser, a coupler, an acousto-optic modulator, a first erbium-doped amplifier, a band-pass filter, a circulator, a second erbium-doped amplifier, a tunable optical attenuator, a balanced detector, a data acquisition card, a personal computer, and a single-mode optical fiber. The mechanism and working principle of the system are shown in Fig. 3. A narrow-linewidth laser with a linewidth of 5 kHz is used as the light source; its output is split by a coupler into two branches at a ratio of 95:5. On the upper branch, the continuous wave is modulated by the acousto-optic modulator to generate an optical pulse train with a frequency shift of 60 MHz. The pulses are then amplified by the first erbium-doped amplifier to compensate for loss during transmission, denoised by a band-pass filter with a 0.8 nm passband, and launched into the single-mode fiber through the circulator. When events act on the single-mode fiber, the Rayleigh backscattering trace is further amplified by the second erbium-doped amplifier and denoised by another 0.8 nm band-pass filter; the upper and lower branch light paths are combined at the second coupler, the optical signal is converted into an electrical signal by the photoelectric detector, and the event data is then acquired by the data acquisition card. Finally, the event signals are stored on a personal computer or a storage device.
Step 2: collect field data on the highway bridge section. The distributed optical fiber sensing system is deployed on the highway bridge section, with a 2 km single-mode optical fiber laid around the bridge, and six types of events are collected on site, including automobile horn, automobile impact (simulated), spade excavation, pedestrian walking, and rain. The sampling frequency of the distributed optical fiber sensing system is 6 kHz, and the acquisition time for each event is 5-8 min.
Step 3: according to the signal parameters acquired on the highway bridge section and the channels of the single-mode optical fiber in which the events of step 2 occurred, classify the collected signals by event type, extract the time-series data of the corresponding channels of the single-mode optical fiber, and denoise them with the wavelet denoising method.
Step 4: according to the parameters set during data collection, convert the data processed in step 3 in batches into 3-second time domain graphs of the events collected on site; the time domain graphs of the six events are shown in Fig. 4. In the same way, convert the data processed in step 3 in batches into the corresponding 3-second time-frequency domain graphs; the time-frequency domain graphs of the six events are shown in Fig. 5.
Step 5: construct the network according to the Vision Transformer algorithm structure of Fig. 2, and train and iterate it on the collected time domain graph and time-frequency domain graph data sets of the highway bridge section, divided into a training set, a validation set, and a test set in the ratio 8:1:1. The parameters of the best of the multiple iterations are saved as the pre-trained weights for step 6.
Step 6: event identification and classification. After the pre-trained network weights of step 5 are saved, a new batch of data sets is input into the network algorithm to obtain the final event recognition result.
Finally, the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them; any other modifications or equivalent substitutions made to the technical solutions of the present invention by those of ordinary skill in the art, provided that they do not depart from the spirit and scope of the technical solutions of the present invention, shall be covered by the scope of the claims of the present invention.