CN116522771A - Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method - Google Patents

Info

Publication number
CN116522771A
Authority
CN
China
Prior art keywords
parameter
interpretable
attention mechanism
heavy landing
stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310439351.1A
Other languages
Chinese (zh)
Other versions
CN116522771B (en)
Inventor
尚家兴
张锐祥
郑林江
陈逢文
李旭
陈浩东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202310439351.1A priority Critical patent/CN116522771B/en
Publication of CN116522771A publication Critical patent/CN116522771A/en
Application granted granted Critical
Publication of CN116522771B publication Critical patent/CN116522771B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism, which comprises the steps of: acquiring original flight parameters and preprocessing them to obtain training data; constructing an attention-mechanism-based bidirectional two-stage interpretable heavy landing prediction model, which includes a time parameter module and a parameter time module; inputting the training data into the model for training; and acquiring the flight data to be predicted and inputting it into the trained attention-mechanism-based bidirectional two-stage interpretable heavy landing prediction model to obtain a prediction result and the cause of occurrence of a heavy landing.

Description

Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method
Technical Field
The invention belongs to the field of flight safety.
Background
Flight safety is one of the issues of greatest concern to the civil aviation industry. The International Air Transport Association (IATA) 2021 safety report shows that, from 2017 to 2021, the landing phase had the greatest number of incidents among all phases of flight. A heavy landing is a typical safety incident during landing: it gives passengers a poor experience, can cause serious damage to the aircraft structure, and in extreme cases can even lead to fatalities. Therefore, predicting a heavy landing in advance plays a critical role in reducing landing risk and ensuring flight safety. Furthermore, aviation is a high-risk field, so it is of great importance to provide an interpretation of heavy landing predictions. An interpretation of the predicted result strengthens the pilot's trust in the prediction and also helps the pilot review the flight afterwards, thereby improving flight skills.
The Quick Access Recorder (QAR) is a flight data recording device now widely installed in commercial airliners. It records, in real time and across all phases of flight, various parameters covering the flight state, pilot operations, environmental factors, and so on. QAR data is typical multi-dimensional time series data and is now widely used for aircraft safety state monitoring, flight quality monitoring, accident investigation, etc.
Since QAR data is typical time series data, the heavy landing problem can be abstracted as a Time Series Classification (TSC) problem. In recent years, CNNs and RNNs have been widely used in time series classification research. Zhao et al. generate deep features of the input time series using a CNN and then output the prediction results through a fully connected layer. Wang et al. propose a Fully Convolutional Network (FCN) that achieves higher performance than other methods using convolutional layers, batch normalization layers, and ReLU activation layers. Bai et al. use a Temporal Convolutional Network (TCN) that combines dilated convolutions, residual connections, and causal convolutions for sequence modeling tasks. An LSTM may be appended to the FCN to form an LSTM fully convolutional network (LSTM-FCN). Karim et al. propose an attention LSTM fully convolutional network (ALSTM-FCN) that combines an attention mechanism with the LSTM-FCN. Inspired by the Transformer, originally proposed for natural language translation tasks, Zerveas et al. proposed a Transformer-based multivariate time series representation learning framework whose output can be applied to downstream tasks such as regression, classification, and prediction. Ismail et al. combine an Inception module and a residual module to classify time series. Zhang et al. propose a TSC network architecture that can be divided into three modules: random dimension permutation, multivariate time series encoding, and attention prototype learning. However, none of the above methods provides convincing interpretability for its predictions, as they focus only on improving classification performance. Assaf et al. and Fauvel et al. use Grad-CAM to interpret CNN-based time series classification results. Grad-CAM is a technique widely used in computer vision that provides a visual interpretation of the cause of a network's decision.
However, CAM-based methods were originally proposed in the field of computer vision and may not be suitable for time series data. Rojat et al. likewise note that there is a lack of interpretable methods specifically designed for CNNs on time series tasks. There is therefore still much room for exploring how to design interpretation methods specific to time series data.
The invention takes the heavy landing, a typical landing-phase event, as its research object, takes QAR data as support and artificial intelligence and data mining methods as means, and develops research on the prediction and interpretation of civil aircraft heavy landings based on flight data.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the related art to some extent.
Therefore, the invention aims to provide a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism, which is used for providing accurate and reliable heavy landing classification results and helping pilots to prevent heavy landing events in advance.
To achieve the above object, an embodiment of a first aspect of the present invention provides a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism, including:
acquiring original flight parameters, and preprocessing the original flight parameters to obtain training data;
constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a time parameter module and a parameter time module;
inputting the training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism for training;
and acquiring flight data to be predicted, inputting the flight data to be predicted into a training-completed bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
In addition, a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism according to the above embodiment of the present invention may further have the following additional technical features:
further, in one embodiment of the present invention, the time parameter module and the parameter time module each include: a convolutional feature encoder, a first stage attention mechanism, and a second stage attention mechanism;
the time parameter module TPB firstly learns the internal time dependency relationship of each parameter and then learns the interrelationship among a plurality of parameters; the parameter time module PTB learns the relationships between the plurality of parameters at the same time point and learns the correlations between the plurality of time points.
Further, in one embodiment of the invention, the convolutional feature encoder is used to extract useful information from the preprocessed data; wherein the length of the time dimension after convolution is unchanged by adding zero padding on the left side of the input data.
Further, in one embodiment of the present invention, the method further includes:
evaluating importance weights of coded data in the sequence through the first-stage attention mechanism, and further obtaining context vectors through the importance weights;
in the TPB, for the encoded data E_d^1 of the d-th parameter, an importance score for each time point within it is learned by the following formula:

A_1^d = σ_1(W_1 (E_d^1)^T + b_1^d),

where W_1 is a weight matrix, b_1^d is the bias vector of the d-th parameter, σ_1 represents a nonlinear function, and A_1^d is the attention weight vector of the d-th parameter;

the context representation vector g_d^1 of the d-th parameter is the weighted sum of the representation vectors at each time point, calculated as follows:

g_d^1 = A_1^d E_d^1,

where (E_d^1)^T is the transposed matrix of E_d^1; E^1 is obtained by concatenating g_d^1 from d = 1 to D, and A_1 is obtained by combining A_1^d from d = 1 to D into a whole;
in the PTB, for the encoded data E_t^2 at the t-th time point, an importance score for each parameter within it is learned by the following formula:

A_2^t = σ_2(W_2 (E_t^2)^T + b_2^t),

where W_2 is a weight matrix, b_2^t is the bias vector of the t-th time point, σ_2 represents a nonlinear function, and A_2^t evaluates the contribution of each parameter at the t-th time point to the model's final decision; the global vector of the t-th time point is calculated with the following formula:

g_t^2 = A_2^t E_t^2,

where (E_t^2)^T is the transposed matrix of E_t^2; g_t^2 summarizes, by weighting the representation vectors in the parameter domain, their contribution to the model decision. E^2 is obtained by concatenating g_t^2 from t = 1 to T, and A_2 is composed of A_2^t from t = 1 to T, assigning weights to all parameters at all time points.
Further, in one embodiment of the invention, the second stage attention mechanism is used to obtain a higher level of attention score and a final context vector;
in TPB, the attention score of each parameter is calculated as follows:
A 3 =σ 3 (W 3 E 1T +b 3 ),
wherein, the liquid crystal display device comprises a liquid crystal display device,is a model parameter, +.>Quantifying the information contribution of each parameter to the prediction, then, < ->The definition is as follows:
G 1 =A 3 E 1
representing a context vector obtained by assigning attention weights in the time domain first and then in the parameter domain;
in the PTB, the attention score of each time point is calculated as follows:

A_4 = σ_4(W_4 (E^2)^T + b_4),

where W_4 and b_4 are model parameters, and A_4 quantifies the information contribution of each time point to the prediction; G_2 aggregates the information of all time points in E^2 and is defined as follows:

G_2 = A_4 E^2.

G_1 and G_2 are the context vectors obtained by encoding the input data in opposite orders over the time domain and the parameter domain, and together they constitute the input of the prediction module.
Further, in an embodiment of the present invention, inputting the flight data to be predicted into the trained attention-mechanism-based bidirectional two-stage interpretable heavy landing prediction model to obtain a prediction result and the cause of occurrence of a heavy landing includes:
predicting whether a heavy landing event will occur for a flight by:

ŷ = FC([G_1, G_2]),

where [G_1, G_2] is the concatenation of G_1 and G_2, FC(·) represents a fully connected network, and ŷ represents the predicted label; training of the overall model aims to minimize the following cross entropy loss function:

L = −(y log ŷ + (1 − y) log(1 − ŷ)),

where y represents the real label.
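As a minimal sketch (not the patent's implementation), the fully connected prediction head FC and the binary cross-entropy loss can be written as follows; the single sigmoid-activated linear layer standing in for FC is an assumption, since the patent does not specify the network's depth:

```python
import math

def predict(g1, g2, weights, bias):
    """Hypothetical single-layer stand-in for FC([G1, G2]): concatenate
    the two context vectors and apply a sigmoid-activated linear layer."""
    x = list(g1) + list(g2)                # [G1, G2]: concatenation
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))      # predicted label y_hat in (0, 1)

def cross_entropy(y, y_hat):
    """Binary cross entropy between the real label y and prediction y_hat,
    L = -(y log y_hat + (1 - y) log(1 - y_hat))."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
```

With zero weights the head outputs 0.5, and the loss decreases as the prediction approaches the true label, which is what minimizing L during training exploits.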
To achieve the above object, an embodiment of a second aspect of the present invention provides a bidirectional two-stage interpretable heavy landing prediction device based on an attention mechanism, including:
the acquisition module is used for acquiring original flight parameters, and preprocessing the original flight parameters to obtain training data;
the construction module is used for constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a time parameter module and a parameter time module;
the training module is used for inputting the training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism to train;
the prediction module is used for acquiring the flight data to be predicted, inputting the flight data to be predicted into the training-completed bidirectional two-stage interpretable heavy landing prediction model based on the attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
To achieve the above object, an embodiment of a third aspect of the present invention provides a computer device, which is characterized by comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism as described above when executing the computer program.
To achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements a bi-directional two-phase interpretable heavy landing prediction method based on an attention mechanism as described above.
The invention provides a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism. The model is built on a CNN structure; an attention mechanism is added in the forward propagation, and after forward propagation ends the calculated attention weights are visualized to explain the heavy landing flight. Meanwhile, in view of the time series characteristics of QAR data, the data are encoded separately along the two dimensions of time and parameters, attention scores are obtained in stages and finally fused for visualization, and the context vectors are fused to obtain the final prediction result. The method can provide accurate and reliable heavy landing classification results and help pilots prevent heavy landing events in advance. At the same time, it can provide an explanation of heavy landing events, helping pilots or flight experts better find the causes of such events, guiding pilots in improving their flying skills, and better ensuring flight safety.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
fig. 1 is a flow chart of a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a framework structure of a DUTSAI model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a CNN with padding p = 2 and convolution kernel size k_1 = 3 according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the calculation process of SumA_1 and SumA_2 according to an embodiment of the present invention.
Fig. 5 is a schematic flow chart of a bidirectional two-stage interpretable heavy landing prediction device based on an attention mechanism according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The attention-based bi-directional two-phase interpretable heavy landing prediction method of embodiments of the present invention is described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a bidirectional two-stage interpretable heavy landing prediction method based on an attention mechanism according to an embodiment of the present invention.
As shown in fig. 1, the attention-mechanism-based bi-directional two-stage interpretable heavy landing prediction method includes the steps of:
s101: acquiring original flight parameters, and preprocessing the original flight parameters to obtain training data;
s102: constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a temporal parameter module and a parametric temporal module;
s103: inputting training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism for training;
s104: and acquiring the flight data to be predicted, inputting the flight data to be predicted into a training-completed bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
Specifically, the method provides accurate and reliable heavy landing classification results and helps pilots prevent heavy landing events in advance. First, given QAR data X ∈ R^{D×T} that has been aligned and preprocessed in the time dimension, with D parameters and T time points, X_{d,*} represents all values of the d-th parameter over the whole time length T, and X_{*,t} represents all parameter values at time t. A heavy landing may be defined as the maximum vertical load (VRTG) of the aircraft during the landing phase exceeding a certain threshold θ, namely:

max(X_{VRTG,landing}) ≥ θ,

where X_{VRTG,landing} represents the sequence of VRTG values during the landing phase. Given a dataset {(X_i, Y_i)}_{i=1}^N, where N is the number of samples, X_i represents the i-th sample and Y_i its corresponding label, 1 indicates a heavy landing and 0 a normal landing. The task of the invention is to train a model that uses QAR data recorded before the moment of maximum vertical load, t_MV, to predict whether a flight will experience a heavy landing. The invention uses t_MV rather than the touchdown point (the moment when the landing gear first touches the ground) to divide the data because some flights do not reach the maximum vertical load at touchdown; instead, the maximum vertical load occurs a few seconds after ground contact. The period between touchdown and t_MV also contains important information, and utilizing it helps the model make decisions and fully explore the causes of heavy landings. Furthermore, another task of the invention is to provide interpretability, which requires the model to be able to find the cause of a heavy landing, i.e., to provide an interpretation of the heavy landing event.
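The labeling rule and the input-window convention above can be sketched in a few lines; the threshold value 1.8 and the index arithmetic are illustrative assumptions, since the patent leaves θ unspecified:

```python
def label_heavy_landing(vrtg_landing, theta=1.8):
    """Label a flight per max(X_VRTG,landing) >= theta.
    theta = 1.8 is an assumed placeholder, not a value from the patent.
    Returns 1 for a heavy landing, 0 for a normal landing."""
    return 1 if max(vrtg_landing) >= theta else 0

def window_before_mv(series, t_mv, width=30):
    """Extract the t_MV-30 .. t_MV slice of a 1 Hz parameter series that
    forms the model input (indexing convention is illustrative)."""
    return series[max(0, t_mv - width):t_mv]
```

Note that the window is taken relative to t_MV, the maximum-vertical-load time, not the touchdown point, matching the data-division choice described above.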
Each flight record contains values of multiple parameters throughout the flight from takeoff to landing. Based on domain knowledge, 18 parameters were extracted; they are listed in Table 1. The flight parameters are then preprocessed as follows. First, the original data contains some duplicated parameters, such as the left and right radio heights, whose values are almost identical, so only one of each pair is retained. Removing these redundant parameters reduces model complexity and improves training speed. Furthermore, most parameters are continuous values, except for the discrete state variable LDGNOS, which reflects the nose gear state; the continuous parameters are normalized and the discrete parameter is converted to binary values of 0 and 1. In addition, the many QAR parameters recorded by the aircraft have different sampling rates, ranging from 1 Hz to 8 Hz. A higher sampling frequency corresponds to more detailed parameter values, and vice versa. This means that, even over the same period of time, different parameters may have different lengths in the time dimension. However, the feature encoder described later requires the input data to be aligned in the time dimension. The invention therefore uses an averaging method to reduce the sampling rate of all parameters to 1 Hz, for two reasons. First, none of the selected parameters has a sampling rate lower than 1 Hz. Second, for parameters sampled above 1 Hz, the variation of the values within one second is not significant, so averaging the values within each second still retains sufficiently detailed parameter information. Furthermore, since the invention focuses only on heavy landing events, only the landing-phase data is selected rather than the entire flight. The invention takes t_MV as the demarcation point and extracts the data of the 30 seconds before it, i.e., the period from t_{MV-30} to t_MV.
Since the sampling rate is uniformly 1 Hz, T = 30 holds for every X_{d,*}, where d ranges from 1 to D. The invention also uses the data from t_{MV-30} to t_{MV-2} and from t_{MV-30} to t_{MV-4} to provide earlier warning of heavy landing events.
TABLE 1
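The two numeric preprocessing steps, averaging down to 1 Hz and normalizing the continuous parameters, can be sketched as below; min-max scaling is an assumed choice, since the text says only that continuous parameters are "normalized":

```python
def downsample_to_1hz(values, rate_hz):
    """Average consecutive groups of `rate_hz` samples so that a parameter
    recorded at 1-8 Hz yields one value per second (a sketch; real QAR
    frame layouts may differ)."""
    assert rate_hz >= 1 and len(values) % rate_hz == 0
    return [sum(values[i:i + rate_hz]) / rate_hz
            for i in range(0, len(values), rate_hz)]

def min_max_normalize(values):
    """Scale a continuous parameter to [0, 1]; one common normalization
    (an assumption, the patent does not name the scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]
```

After this step every parameter has exactly T = 30 values over the 30-second window, so the encoder's alignment requirement is satisfied.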
Further, in one embodiment of the present invention, the time parameter module and the parameter time module each include: a convolutional feature encoder, a first stage attention mechanism, and a second stage attention mechanism;
the time parameter module TPB firstly learns the internal time dependency relationship of each parameter and then learns the interrelationship among a plurality of parameters; the parameter time module PTB learns the relationships between the plurality of parameters at the same time point and learns the correlations between the plurality of time points.
The invention provides DUTSAI (DUal Two-Stage Attention-based Interpretable model), a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism. The model contains two parallel parts. One is the time parameter module (Temporal Parametric Block, TPB), which first learns the internal temporal dependency of each parameter and then learns the correlations among the parameters. The other is the parameter time module (Parametric Temporal Block, PTB); in the opposite order to the TPB, this module first learns the relationships among the parameters at each time point and then learns the interrelationships among the time points. This design idea is one of the core innovations of the invention. Both the TPB and the PTB consist of a convolutional feature encoder and a two-stage attention mechanism. The two convolutional feature encoders first extract and encode useful information from the data processed in the previous section, but they do so from different directions: the TPB encodes from the time domain and the PTB encodes from the parameter domain. The invention then uses the attention mechanisms to predict heavy landings and provide an explanation of the prediction results. Specifically, each two-stage attention mechanism hierarchically captures, in reverse order, the correlations between time points and between parameters from the two encoded representations. The invention explains the cause of a heavy landing by visualizing the influence of each time point and each parameter on the model's decision. Since there are two calculation paths, the interpretation of heavy landings is also given from two angles, which is an interpretability feature and innovation of the invention. Fig. 2 shows the framework of the DUTSAI model.
Further, in one embodiment of the invention, a convolutional feature encoder is used to extract useful information from the preprocessed data; wherein the length of the time dimension after convolution is unchanged by adding zero padding on the left side of the input data.
For the processed data, the invention first employs a convolutional feature encoder to extract useful information; the encoding is performed separately in the time domain and the parameter domain. Specifically, in the TPB, the given data X ∈ R^{D×T} can be regarded as X = (X_{1,*}, X_{2,*}, …, X_{D,*}).
A one-dimensional convolution with kernel size k_1 is applied to each X_{d,*}. In the invention, k_1 > 1 so as to extract the links between adjacent time points. However, when k_1 is greater than 1, such a convolution layer shortens the output in the time dimension, causing a mismatch for interpretability. Inspired by the TCN design, zero padding is added to the left side of the input data to keep the length of the time dimension unchanged after convolution. Fig. 3 shows a CNN with padding p = 2 and convolution kernel size k_1 = 3. In addition, the input data has 1 channel and the output has C_1 channels, which makes the output for the d-th parameter E_d^1 ∈ R^{T×C_1}. Multiple convolution kernels can extract different aspects of information from the input data to deeply mine and encode the time series information. The weights of the convolution operation are shared along the time dimension, but each parameter has its own unique set of weights to extract that parameter's specific patterns.
As shown in Fig. 3, a CNN with padding p = 2 and kernel size k_1 = 3 is applied to the input data X_{d,*}. The gray blocks below represent the original input data, the blank blocks represent zero padding, and the gray blocks above are the extracted features. The padding blocks allow the original length to be preserved after convolution. In the PTB, on the other hand, the encoder is applied to the parameter domain. A CNN layer with kernel size k_2 and C_2 convolution kernels (i.e., output channels) is applied to each X_{*,t}. Unlike the above case, however, k_2 must be set to 1, because mixing the encoding of different parameters would combine multiple parameters into the same feature space, making it impossible to distinguish the importance of different parameters and compromising interpretability. The other settings are similar to those of the time-domain encoder. The final output of this module is E_t^2 ∈ R^{D×C_2}, representing the encoded data at the t-th time point.
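The left-padded convolution that keeps the time length unchanged can be illustrated for a single output channel; the kernel weights here are toy values, whereas the real encoder learns them and keeps a separate weight set per parameter:

```python
def causal_conv1d(x, kernel):
    """One output channel of the TPB encoder: a 1-D convolution over a
    single parameter's time series with k-1 zeros padded on the LEFT
    (cf. p = 2 for k_1 = 3), so the output keeps the input's length T."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)   # left zero padding only
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]
```

Because only the left side is padded, each output position depends only on the current and earlier time points, and the i-th output still corresponds to the i-th input second, which is what preserves interpretability in the time dimension.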
Further, in one embodiment of the present invention, the method further includes:
evaluating importance weights of the coded data in the sequence through a first-stage attention mechanism, and further obtaining context vectors through the importance weights;
in the TPB, for the encoded data E_d^1 of the d-th parameter, an importance score for each time point within it is learned by the following formula:

A_1^d = σ_1(W_1 (E_d^1)^T + b_1^d),

where W_1 is a weight matrix, b_1^d is the bias vector of the d-th parameter, σ_1 represents a nonlinear function, and A_1^d is the attention weight vector of the d-th parameter;

the context representation vector g_d^1 of the d-th parameter is the weighted sum of the representation vectors at each time point, calculated as follows:

g_d^1 = A_1^d E_d^1,

where (E_d^1)^T is the transposed matrix of E_d^1; E^1 is obtained by concatenating g_d^1 from d = 1 to D, and A_1 is obtained by combining A_1^d from d = 1 to D into a whole;
in PTB, the importance score of each parameter within the encoded data Ẽ^t at the t-th time point is learned by the following formula:

A_2^t = σ_2(W_2(Ẽ^t)^T + b_2^t),

wherein W_2 is a weight matrix, b_2^t is the bias vector of the t-th time point, σ_2 represents a nonlinear function, (Ẽ^t)^T is the transposed matrix of Ẽ^t, and A_2^t evaluates the contribution of each parameter to the model's final decision at the t-th time point; the global vector g^t at the t-th time point is calculated with the following formula:

g^t = A_2^t Ẽ^t,

wherein g^t summarizes, by weighting the representation vectors on the parameter domain, their contribution to the model decision; E_2 is obtained by connecting g^t from t = 1 to T, and A_2 is composed of the weights A_2^t from t = 1 to T, which are assigned to all parameters at all time points.
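The first-stage attention over one parameter's encoded series can be sketched as follows. The shapes, the choice of softmax as the nonlinear function, and the toy weights are illustrative assumptions, not the trained model:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def first_stage_attention(E_d, W, b):
    """First-stage attention for one parameter (illustrative shapes).
    E_d: (T, C) encoded features, W: (C,), b: scalar.
    Returns attention weights a (T,) and the context vector g (C,)."""
    scores = E_d @ W + b                  # one importance score per time point
    a = softmax(scores)                   # nonlinear normalisation (assumed softmax)
    g = a @ E_d                           # weighted sum over the time points
    return a, g

E_d = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])              # T = 3 time points, C = 2 features
a, g = first_stage_attention(E_d, W=np.array([1.0, 1.0]), b=0.0)
# a sums to 1; g is the attention-weighted combination of the rows of E_d
```

The same routine applied along the parameter axis of a time slice gives the PTB counterpart: score each parameter at one time point, then take the weighted sum as the global vector.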
Further, in one embodiment of the present invention, a second-stage attention mechanism is used to obtain higher-level attention scores and the final context vectors;
in TPB, the attention score of each parameter is calculated as follows:

A_3 = σ_3(W_3 E_1^T + b_3),

wherein W_3 and b_3 are model parameters, and A_3 quantifies the information contribution of each parameter to the prediction; then, G_1 is defined as follows:

G_1 = A_3 E_1,

representing a context vector obtained by assigning attention weights in the time domain first and then in the parameter domain;
in PTB, the attention score of each time point is calculated as follows:

A_4 = σ_4(W_4 E_2^T + b_4),

wherein W_4 and b_4 are model parameters, and A_4 quantifies the information contribution of each time point to the prediction; G_2 aggregates the information of all time points in E_2 and is defined as follows:

G_2 = A_4 E_2.

G_1 and G_2 are the context vectors obtained by encoding the input data in opposite orders over the time domain and the parameter domain, and together they constitute the input of the prediction module.
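A sketch of the second-stage attention in the TPB branch is below; the PTB branch is identical with E_2, W_4, and b_4. The softmax nonlinearity and the toy values are assumptions for illustration:

```python
import numpy as np

def second_stage_attention(E1, W3, b3):
    """Second-stage attention (illustrative shapes).
    E1: (D, C) -- one first-stage context row per parameter.
    Scores one weight per parameter, then aggregates the rows into
    the final context vector G1 = A3 @ E1."""
    scores = E1 @ W3 + b3                         # A3: one score per parameter
    e = np.exp(scores - scores.max())
    A3 = e / e.sum()                              # assumed softmax nonlinearity
    G1 = A3 @ E1                                  # weighted fusion of context rows
    return A3, G1

E1 = np.array([[2.0, 0.0],
               [0.0, 2.0]])                       # D = 2 parameters, C = 2 features
A3, G1 = second_stage_attention(E1, W3=np.array([1.0, 0.0]), b3=0.0)
# A3 ranks the parameters; G1 fuses their first-stage context vectors
```

Stacking the two stages this way is what yields the hierarchy: the first stage weights within each parameter (or time slice), the second stage weights across parameters (or time points).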
Further, in one embodiment of the present invention, inputting the flight data to be predicted into the trained attention-mechanism-based bidirectional two-stage interpretable heavy landing prediction model to obtain the prediction result and the cause of occurrence of heavy landing includes:
predicting whether a heavy landing event will occur for a flight by:

ŷ = fc([G_1, G_2]),

wherein [G_1, G_2] is the concatenation of G_1 and G_2, fc(·) represents the fully connected network, and ŷ represents the predicted label; the training of the overall model aims to minimize the cross-entropy loss function as follows:

L = -(y log ŷ + (1 - y) log(1 - ŷ)),

where y represents the real label.
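The prediction head can be sketched as a single fully connected layer with a sigmoid output and binary cross-entropy loss. The single-layer form, the sigmoid, and all weight values here are illustrative assumptions, not the patented model's trained parameters:

```python
import numpy as np

def predict_and_loss(G1, G2, w, b, y):
    """Prediction head sketch: concatenate the two context vectors,
    apply a fully connected map with sigmoid output, and score the
    result with the binary cross-entropy loss."""
    g = np.concatenate([G1, G2])                  # [G1, G2]
    y_hat = 1.0 / (1.0 + np.exp(-(g @ w + b)))    # fc(.) with sigmoid output
    loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return y_hat, loss

G1 = np.array([0.5, 1.0])
G2 = np.array([1.0, 0.5])
y_hat, loss = predict_and_loss(G1, G2, w=np.ones(4), b=0.0, y=1)
# y_hat lies in (0, 1); the loss shrinks as y_hat approaches the true label y
```

During training, minimizing this loss over labeled flights drives the attention weights in both branches toward the time points and parameters that actually discriminate heavy landings.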
The whole model has two calculation paths, so its interpretability can also be demonstrated from two angles. First, in TPB, A_1 gives the importance weight of each time point within each parameter, and A_3 gives the weight of each parameter. The overall importance score of this computation path can be obtained as the product of these two attention scores:

SumA_1 = A_1 ⊙ A_3,

wherein ⊙ represents broadcast element-wise multiplication, so that the parameter weights in A_3 are broadcast over the time dimension of A_1. Similarly, for the computation path that calculates the attention scores of the parameter domain first and then those of the time domain, the overall importance score can be found by:

SumA_2 = A_2 ⊙ A_4.

The calculation processes of SumA_1 and SumA_2 are shown in Fig. 4. These scores reveal, at a fine granularity, the importance of each parameter to the final decision at each time point; specifically, the larger the value, the greater the contribution to the prediction.
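The broadcast product that combines the two stages' scores is a one-liner in NumPy; the (D, T) layout and the toy weights below are assumptions for illustration:

```python
import numpy as np

def overall_importance(A1, A3):
    """Broadcast-multiply the stage-one scores A1 (D, T) with the
    stage-two per-parameter scores A3 (D,) to obtain SumA1 (D, T):
    an importance value for every parameter at every time point."""
    return A1 * A3[:, None]

A1 = np.array([[0.2, 0.8],
               [0.5, 0.5]])                       # per-parameter weights over time
A3 = np.array([0.9, 0.1])                         # weight of each parameter
SumA1 = overall_importance(A1, A3)
# larger entries mark parameter/time-point pairs contributing more to the prediction
```

Visualizing SumA1 (and its PTB counterpart SumA2) as a heat map over parameters and time is what lets a flight expert read off which parameter, at which moment of the approach, drove the heavy-landing prediction.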
The invention provides a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism. The model is built on a CNN structure; an attention mechanism is added during forward propagation, and the calculated attention weights are visualized after forward propagation finishes to explain the heavy-landing flight leg. Meanwhile, in view of the time-series characteristics of QAR data, the data are encoded separately in the two dimensions of time and parameters, attention scores are obtained in stages, and finally the attention scores are fused for visualization while the context vectors are fused to obtain the final prediction result. The method and the device can provide accurate and reliable heavy landing classification results and help pilots prevent heavy landing events in advance. Meanwhile, they can provide explanations of heavy landing events, help pilots or flight experts better find the causes of such events, guide and improve pilots' flight skills, and better guarantee flight safety.
In order to implement the above embodiment, the present invention also proposes a bidirectional two-stage interpretable heavy landing prediction device based on an attention mechanism.
Fig. 5 is a schematic structural diagram of a bidirectional two-stage interpretable heavy landing prediction device based on an attention mechanism according to an embodiment of the present invention.
As shown in fig. 5, the attention-based bi-directional two-stage interpretable heavy landing prediction device includes: an acquisition module 100, a construction module 200, a training module 300, a prediction module 500, wherein,
the acquisition module is used for acquiring the original flight parameters, and preprocessing the original flight parameters to obtain training data;
the construction module is used for constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a temporal parameter module and a parametric temporal module;
the training module is used for inputting training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism to train;
the prediction module is used for acquiring the flight data to be predicted, inputting the flight data to be predicted into the training-completed bidirectional two-stage interpretable heavy landing prediction model based on the attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
To achieve the above object, an embodiment of a third aspect of the present invention provides a computer device, which is characterized by comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the two-way two-stage interpretable heavy landing prediction method based on an attention mechanism as described above when executing the computer program.
To achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements a bi-directional two-phase interpretable heavy landing prediction method based on an attention mechanism as described above.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, those skilled in the art may combine the different embodiments or examples described in this specification, and the features thereof, provided they do not contradict each other.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims (9)

1. A bi-directional two-phase interpretable heavy landing prediction method based on an attention mechanism, comprising the steps of:
acquiring original flight parameters, and preprocessing the original flight parameters to obtain training data;
constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a time parameter module and a parameter time module;
inputting the training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism for training;
and acquiring flight data to be predicted, inputting the flight data to be predicted into a training-completed bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
2. The method according to claim 1, wherein,
the time parameter module and the parameter time module each include: a convolutional feature encoder, a first stage attention mechanism, and a second stage attention mechanism;
the time parameter module TPB firstly learns the internal time dependency relationship of each parameter and then learns the interrelationship among a plurality of parameters; the parameter time module PTB learns the relationships between the plurality of parameters at the same time point and learns the correlations between the plurality of time points.
3. The method according to claim 2, wherein,
the convolution feature encoder is used for extracting useful information from the preprocessed data; wherein the length of the time dimension after convolution is unchanged by adding zero padding on the left side of the input data.
4. The method as recited in claim 2, further comprising:
evaluating importance weights of coded data in the sequence through the first-stage attention mechanism, and further obtaining context vectors through the importance weights;
in TPB, the importance score of each time point within the encoded data Ẽ^d of the d-th parameter is learned by the following formula:

A_1^d = σ_1(W_1(Ẽ^d)^T + b_1^d),

wherein W_1 is a weight matrix, b_1^d is the bias vector of the d-th parameter, σ_1 represents a nonlinear function, (Ẽ^d)^T is the transposed matrix of Ẽ^d, and A_1^d is the attention weight vector of the d-th parameter;

wherein the context representation vector g^d of the d-th parameter is the weighted sum of the vectors at each time point, and is calculated as follows:

g^d = A_1^d Ẽ^d,

wherein E_1 is obtained by connecting g^d from d = 1 to D, and A_1 combines A_1^d from d = 1 to D into a whole;
in PTB, the importance score of each parameter within the encoded data Ẽ^t at the t-th time point is learned by the following formula:

A_2^t = σ_2(W_2(Ẽ^t)^T + b_2^t),

wherein W_2 is a weight matrix, b_2^t is the bias vector of the t-th time point, σ_2 represents a nonlinear function, (Ẽ^t)^T is the transposed matrix of Ẽ^t, and A_2^t evaluates the contribution of each parameter to the model's final decision at the t-th time point; the global vector g^t at the t-th time point is calculated with the following formula:

g^t = A_2^t Ẽ^t,

wherein g^t summarizes, by weighting the representation vectors on the parameter domain, their contribution to the model decision; E_2 is obtained by connecting g^t from t = 1 to T, and A_2 is composed of the weights A_2^t from t = 1 to T, which are assigned to all parameters at all time points.
5. The method according to claim 2 or 4, wherein,
the second stage attention mechanism is used for obtaining higher-level attention scores and final context vectors;
in TPB, the attention score of each parameter is calculated as follows:

A_3 = σ_3(W_3 E_1^T + b_3),

wherein W_3 and b_3 are model parameters, and A_3 quantifies the information contribution of each parameter to the prediction; then, G_1 is defined as follows:

G_1 = A_3 E_1,

representing a context vector obtained by assigning attention weights in the time domain first and then in the parameter domain;
in PTB, the attention score of each time point is calculated as follows:

A_4 = σ_4(W_4 E_2^T + b_4),

wherein W_4 and b_4 are model parameters, and A_4 quantifies the information contribution of each time point to the prediction; G_2 aggregates the information of all time points in E_2 and is defined as follows:

G_2 = A_4 E_2;

G_1 and G_2 are the context vectors obtained by encoding the input data in opposite orders over the time domain and the parameter domain, and together they constitute the input of the prediction module.
6. The method according to claim 1 or 5, wherein inputting the to-be-predicted flight data into the trained attention-based bi-directional two-stage interpretable heavy landing prediction model to obtain a prediction result and a cause of occurrence of heavy landing comprises:
predicting whether a heavy landing event will occur for a flight by:

ŷ = fc([G_1, G_2]),

wherein [G_1, G_2] is the concatenation of G_1 and G_2, fc(·) represents the fully connected network, and ŷ represents the predicted label; the training of the overall model aims to minimize the cross-entropy loss function as follows:

L = -(y log ŷ + (1 - y) log(1 - ŷ)),

where y represents the real label.
7. A bi-directional two-phase interpretable heavy landing prediction device based on an attention mechanism, comprising the following modules:
the acquisition module is used for acquiring original flight parameters, and preprocessing the original flight parameters to obtain training data;
the construction module is used for constructing a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism; wherein the attention mechanism-based bi-directional two-phase interpretable heavy landing prediction model includes a time parameter module and a parameter time module;
the training module is used for inputting the training data into a bidirectional two-stage interpretable heavy landing prediction model based on an attention mechanism to train;
the prediction module is used for acquiring the flight data to be predicted, inputting the flight data to be predicted into the training-completed bidirectional two-stage interpretable heavy landing prediction model based on the attention mechanism, and obtaining a prediction result and an occurrence reason during heavy landing.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the attention-based bi-directional two-phase interpretable heavy landing prediction method as claimed in any one of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements a two-way two-phase interpretable heavy landing prediction method based on an attention mechanism as claimed in any one of claims 1-6.
CN202310439351.1A 2023-04-21 2023-04-21 Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method Active CN116522771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310439351.1A CN116522771B (en) 2023-04-21 2023-04-21 Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method


Publications (2)

Publication Number Publication Date
CN116522771A true CN116522771A (en) 2023-08-01
CN116522771B CN116522771B (en) 2024-01-26

Family

ID=87405829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310439351.1A Active CN116522771B (en) 2023-04-21 2023-04-21 Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method

Country Status (1)

Country Link
CN (1) CN116522771B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9310222B1 (en) * 2014-06-16 2016-04-12 Sean Patrick Suiter Flight assistant with automatic configuration and landing site selection method and apparatus
US20170291715A1 (en) * 2016-04-06 2017-10-12 Honeywell International Inc. Methods and apparatus for providing real-time flight safety advisory data and analytics
CN109977517A (en) * 2019-03-19 2019-07-05 北京瑞斯克企业管理咨询有限公司 A kind of personal landing again and group's offline mode comparative analysis method based on QAR parameter curve
CN111008669A (en) * 2019-12-10 2020-04-14 北京航空航天大学 Deep learning-based heavy landing prediction method
CN113486938A (en) * 2021-06-28 2021-10-08 重庆大学 Multi-branch time convolution network-based re-landing analysis method and device
CN114282792A (en) * 2021-12-20 2022-04-05 中国民航科学技术研究院 Flight landing quality monitoring and evaluating method and system
US20230025527A1 (en) * 2021-07-26 2023-01-26 Chongqing University Quantitative analysis method and system for attention based on line-of-sight estimation neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YEQI LIU et al.: "DSTP-RNN: A dual-stage two-phase attention-based recurrent neural network for long-term and multivariate time series prediction", Expert Systems with Applications, vol. 143, pages 1-12 *
WANG Xuan: "Short-term rolling load forecasting method based on a bidirectional long short-term memory network with attention mechanism", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 1, pages 042-1509 *

Also Published As

Publication number Publication date
CN116522771B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
Brunton et al. Data-driven aerospace engineering: reframing the industry with machine learning
Li et al. Analysis of flight data using clustering techniques for detecting abnormal operations
Mainini et al. Surrogate modeling approach to support real-time structural assessment and decision making
Zhang et al. Technology evolution prediction using Lotka–Volterra equations
CN113486938B (en) Multi-branch time convolution network-based re-landing analysis method and device
CN107133253A (en) Recommendation based on forecast model
CN107103362A (en) The renewal of machine learning system
Liu et al. Information fusion for national airspace system prognostics: A NASA ULI project
Puranik et al. Identification of instantaneous anomalies in general aviation operations using energy metrics
Rose et al. Application of structural topic modeling to aviation safety data
Lee et al. Deep spatio-temporal neural networks for risk prediction and decision support in aviation operations
Ballakur et al. Empirical evaluation of gated recurrent neural network architectures in aviation delay prediction
CN116468186B (en) Flight delay time prediction method, electronic equipment and storage medium
Bleu Laine et al. Multiclass multiple-instance learning for predicting precursors to aviation safety events
Kang et al. A deep sequence‐to‐sequence method for accurate long landing prediction based on flight data
Wang et al. A Bayesian-entropy network for information fusion and reliability assessment of national airspace systems
CN116522771B (en) Attention mechanism-based bidirectional two-stage interpretable heavy landing prediction method
Jiang et al. Reliability analysis of the starting and landing system of UAV by FMECA and FTA
Kong et al. Aircraft landing distance prediction: a multistep long short-term memory approach
Cankaya et al. Business inferences and risk modeling with machine learning; the case of aviation incidents
El Mir et al. Certification approach for physics informed machine learning and its application in landing gear life assessment
Friso et al. Predicting abnormal runway occupancy times and observing related precursors
Haselein et al. Multiple machine learning modeling on near mid-air collisions: An approach towards probabilistic reasoning
Jain et al. Using Deep Learning to Predict Unstable Approaches for General Aviation Aircraft
İnan et al. The analysis of fatal aviation accidents more than 100 dead passengers: an application of machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant