CN110085327A - Attention mechanism-based multi-channel LSTM neural network influenza epidemic situation prediction method


Info

Publication number
CN110085327A
CN110085327A CN201910256008.7A CN201910256008A CN110085327A CN 110085327 A CN110085327 A CN 110085327A CN 201910256008 A CN201910256008 A CN 201910256008A CN 110085327 A CN110085327 A CN 110085327A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910256008.7A
Other languages
Chinese (zh)
Inventor
郝建业
侯韩旭
马钰
付博峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Original Assignee
Dongguan University of Technology

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Abstract

The invention provides an attention mechanism-based multi-channel LSTM neural network method for predicting influenza epidemics, and belongs to the technical field of epidemic disease monitoring. The method first preprocesses and standardizes the data in a data set and performs feature selection, divides the selected data into weather related data and influenza epidemic related data, and generates a training set. It then establishes a multi-channel LSTM neural network model that includes an attention mechanism. The training set data are input into the model for training and evaluated with MAPE to obtain a trained multi-channel LSTM neural network model. The test data are processed to obtain a test set, and the test set data are input into the trained LSTM neural network model for testing. Finally, inverse standardization is applied to the test output to obtain the predicted influenza epidemic values. The invention addresses the low prediction accuracy of existing influenza epidemic prediction techniques and can be used for influenza prediction in different regions.

Description

Attention mechanism-based multi-channel LSTM neural network influenza epidemic situation prediction method
Technical Field
The invention relates to an influenza epidemic situation prediction method, and belongs to the technical field of epidemic disease monitoring.
Background
Influenza is an acute respiratory infection caused by influenza viruses. Infection may aggravate a patient's underlying disease and lead to secondary bacterial pneumonia, chronic heart and lung disease, and similar complications. Influenza outbreaks are seasonal and can trigger social panic, with a major impact on human health and social stability (D.N.T. How, C.K. Loo, and K.S.M. Sahari. Behavior recognition for humanoid robots using long short-term memory. International Journal of Advanced Robotic Systems, 13(6):1729881416663369, 2016.). For example, the H1N1 influenza that broke out in 2009 caused between 151,700 and 575,400 deaths worldwide in the first year of the outbreak (S. Yang, M. Santillana, and S.C. Kou. Accurate estimation of influenza epidemics using Google search data via ARGO. Proceedings of the National Academy of Sciences, 112(47):14473-14478, 2015.). Accurate, real-time monitoring and early warning of influenza epidemics therefore have important practical significance for public health and epidemic prevention departments. An influenza monitoring and early warning system can provide epidemiological information to public health departments, help them carry out epidemic prevention work in advance, and then coordinate the medical institutions in a region to take corresponding countermeasures (J.S. Brownstein and K.D. Mandl. Reengineering real time outbreak detection systems for influenza epidemic monitoring. In AMIA Annual Symposium Proceedings, volume 2006, page 866. American Medical Informatics Association, 2006.).
ILI (influenza-like illness) is a criterion for acute respiratory infection established by the World Health Organization (WHO). The symptoms of an influenza-like case are fever above 38 ℃ within 10 days after infection, accompanied by cough (W.H. Organization et al. WHO interim global epidemiological surveillance standards for influenza. Geneva: World Health Organization, pages 1-61, 2012.). Our prediction target is ILI%, calculated as the ratio of the number of confirmed influenza-like cases to the total number of patients. In influenza epidemic monitoring, ILI% is often used as the indicator of whether an influenza outbreak has occurred. When ILI% exceeds a certain threshold, the flu season has arrived, and the relevant departments are reminded to carry out hygiene and epidemic prevention work in time.
In recent years, a growing number of researchers have focused on accurate real-time monitoring, early detection, and early warning of influenza epidemics. Predicting influenza epidemics from web search and social networking data, such as Twitter and Google Correlate, has become a research direction of great interest in academia (H. Achrekar, A. Gandhe, R. Lazarus, S.-H. Yu, and B. Liu. Predicting flu trends using Twitter data. In Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pages 702-707. IEEE, 2011.; D.A. Broniatowski, M.J. Paul, and M. Dredze. National and local influenza surveillance through Twitter: an analysis of the 2012-2013 influenza epidemic. PLoS ONE, 8(12):e83672, 2013.). Previous research methods are generally based on commonly used linear models such as the least absolute shrinkage and selection operator (LASSO) or penalized regression (D.A. Broniatowski, M.J. Paul, and M. Dredze, 2013, cited above; M. Santillana, E.O. Nsoesie, S.R. Mekaru, D. Scales, and J.S. Brownstein. Using clinicians' search query data to monitor influenza epidemics. Clinical Infectious Diseases, 59(10), 2014.).
Still others have addressed the influenza epidemic prediction problem with deep learning approaches (H. Hu, H. Wang, F. Wang, D. Langley, A. Avram, and M. Liu. Prediction of influenza-like illness based on the improved artificial tree algorithm and artificial neural network. Scientific Reports, 8(1):4895, 2018.; Q. Xu, Y.R. Gel, L.L. Ramirez Ramirez, K. Nezafati, Q. Zhang, and K.-L. Tsui. Forecasting influenza in Hong Kong with Google search queries and statistical model fusion. PLoS ONE, 12(5):e0176690, 2017.). However, these methods do not predict changes in the influenza epidemic (the influenza-like case ratio, ILI%) with sufficient accuracy. First, data collected from the internet are not accurate enough and lack the features needed to predict influenza epidemics accurately, so predictions based on online data cannot accurately reflect the trend of the epidemic. Second, influenza epidemic data are complex in composition, noisy, and diverse, and traditional linear methods cannot make full use of the information in multi-dimensional input data. Third, the deep learning methods proposed previously do not consider the time-series characteristics of influenza epidemic data. A high-accuracy influenza epidemic prediction technique is therefore urgently needed.
Disclosure of Invention
The invention provides a multichannel LSTM neural network influenza epidemic situation prediction method based on an attention mechanism, which aims to solve the problem of low prediction accuracy of the existing influenza epidemic situation prediction technology.
The invention discloses an attention mechanism-based multi-channel LSTM neural network influenza epidemic situation prediction method, which is realized by the following technical scheme:
step one, preprocessing and standardizing data in a data set; then performing feature selection using model-based ranking, and dividing the selected data into weather related data and influenza epidemic situation related data to generate a training set;
step two, establishing a multi-channel LSTM neural network model comprising an attention mechanism; the input of the multichannel LSTM neural network model comprises influenza epidemic related data and weather related data;
step three, inputting the training set data into the multichannel LSTM neural network model for training, and performing MAPE (mean absolute percentage error) evaluation to obtain a trained multichannel LSTM neural network model;
step four, processing the test data in the same way as in step one to obtain a test set;
step five, inputting the test set data into the trained LSTM neural network model for testing;
step six, carrying out inverse standardization processing on the test output result to obtain the influenza epidemic situation prediction value.
The most prominent characteristics and remarkable beneficial effects of the invention are as follows:
The method takes the time-series characteristics of influenza epidemic data into consideration and uses a long short-term memory (LSTM) neural network as its basis. The multi-channel structure extracts the time-series information in the data better: data of different types do not influence each other in the lower layers of the network, while the multi-dimensional data are still fused in the higher layers. In addition, adding an Attention mechanism further improves prediction accuracy, so that the method targets the influenza prediction problem specifically. Simulation experiments show that the accuracy of the method's test results is markedly higher than that of the other methods compared, and that the method can provide accurate and effective real-time prediction of ILI% (the influenza-like case ratio).
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of an LSTM memory cell according to the present invention;
FIG. 3 is a schematic diagram of the Attention mechanism of the present invention;
FIG. 4 is a schematic diagram of an Attention-based Multi-channel LSTM model according to an embodiment of the present invention;
FIG. 5 is a graph comparing the real value and the predicted value of Att-MCLSTM in the embodiment of the present invention;
FIG. 6 is a graph comparing the real and predicted values of MCLSTM in an embodiment of the present invention;
FIG. 7 is a graph comparing the real and predicted values of LSTM in an embodiment of the present invention;
FIG. 8 is a graph comparing the real and predicted values of RNN in an embodiment of the present invention;
In the figures: 1, input gate; 2, output gate; 3, forget gate; 4, self-recurrent neuron.
Detailed Description
The first embodiment is as follows: the embodiment is described with reference to fig. 1, and the method for predicting the influenza epidemic situation in the multichannel LSTM neural network based on the attention mechanism in the embodiment specifically includes the following steps:
step one, preprocessing (Preprocessing) and standardizing (Normalization) data in a data set; then performing feature selection using model-based ranking, and dividing the selected data into weather related data and influenza epidemic situation related data to generate a training set;
step two, establishing a multichannel LSTM neural network model including an Attention mechanism (Attention); this model is named Att-MCLSTM, namely Attention-based Multi-channel LSTM (multi-channel LSTM neural network based on an attention mechanism). Unlike the conventional recurrent neural network (RNN), the multi-channel LSTM neural network model based on the attention mechanism can overcome the vanishing gradient problem (S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997); the memory module of the LSTM memory unit retains the sequence information of the input context and performs better than the traditional RNN on time-series data. The input of the multichannel LSTM neural network model (Att-MCLSTM) comprises influenza epidemic related data and weather related data;
step three, inputting the training set data processed in step one into the multichannel LSTM neural network model for training, and performing MAPE (mean absolute percentage error) evaluation to obtain a trained multichannel LSTM neural network model;
step four, processing the test data in the same way as in step one to obtain a test set;
step five, inputting the test set data into the trained LSTM neural network model for testing;
step six, carrying out inverse standardization processing on the test output result, in order to restore the original data scale, to obtain the influenza epidemic situation predicted value.
The method adopts a deep neural network to solve the influenza epidemic situation prediction problem. In consideration of the time-series characteristics of influenza epidemic data, this embodiment uses a long short-term memory (LSTM) neural network as the basic prediction method. Because input data of different dimensions have different characteristics, a single network structure cannot fully extract the time-series features across multiple data dimensions. A multi-channel (Multi-channel) structure extracts the timing information in the data better: it ensures that different types of data do not influence each other in the lower layers of the network while still fusing the multi-dimensional data in the higher layers. The multi-level LSTM neural network model has strong fitting capability, and adding an Attention mechanism further improves prediction accuracy; in the Attention layer, the probability of each value in the output sequence depends on the values in the input sequence. The Attention structure allows the model to better handle the relationships between the input data of different regions.
The second embodiment is as follows: the difference between the present embodiment and the first embodiment is that the normalization in the first step specifically includes the following steps:
Min-Max standardization (also called dispersion standardization) is applied to the preprocessed data:
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $x$ is a value in the preprocessed data, $x_{\min}$ is the minimum value in the preprocessed data, $x_{\max}$ is the maximum value in the preprocessed data, and $y$ is the value of $x$ after Min-Max standardization; after standardization, the data values lie between 0 and 1.
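The Min-Max standardization described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name, the handling of a constant column, and the sample temperatures are all assumptions.

```python
from typing import List, Tuple


def min_max_normalize(values: List[float]) -> Tuple[List[float], float, float]:
    """Scale values to [0, 1]; also return (x_min, x_max) so the scaling can be undone."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant column carries no signal, so map it to 0.0.
        return [0.0 for _ in values], x_min, x_max
    return [(x - x_min) / span for x in values], x_min, x_max


# Hypothetical weekly average temperatures in degrees Celsius.
temps = [12.0, 18.0, 24.0, 30.0]
scaled, t_min, t_max = min_max_normalize(temps)
```

Returning the minimum and maximum alongside the scaled values keeps the bounds available for the later inverse standardization step.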
Other steps and parameters are the same as those in the first embodiment.
The third embodiment is as follows: this embodiment differs from the second embodiment in that, as shown in FIG. 2, the LSTM memory unit in the multichannel LSTM neural network model in step two includes an input gate 1, an output gate 2, a forget gate 3, and a self-recurrent neuron 4; the gate structure of the LSTM memory unit controls the transfer of data, both between different units and within a unit. The input gate 1 controls the state update of the unit, the output gate 2 controls whether the output sequence of the unit changes the memory state of other units, and the forget gate 3 selectively retains or forgets the previous state.
The LSTM memory unit can be represented by the following system of equations:
$$
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \\
c_t &= f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \\
h_t &= o_t \tanh(c_t)
\end{aligned} \tag{2}
$$
where $\sigma(\cdot)$ is the logistic sigmoid function (mapping variables to between 0 and 1), $\tanh(\cdot)$ is the hyperbolic tangent function, and both are activation functions; $i_t$ is the input gate state at time t, $f_t$ is the forget gate state at time t, $o_t$ is the output gate state at time t, $c_t$ is the unit state (activation vector) at time t, and $h_t$ is the hidden state (hidden vector) at time t; $W_{xi}$ is the weight matrix between the input gate and the input data; $W_{hi}$ is the weight matrix between the input gate and the hidden layer; $W_{ci}$ is the weight matrix between the input gate and the unit; $W_{xf}$ is the weight matrix between the forget gate and the input data; $W_{hf}$ is the weight matrix between the forget gate and the hidden layer; $W_{cf}$ is the weight matrix between the forget gate and the unit; $W_{xc}$ is the weight matrix between the unit and the input data; $W_{hc}$ is the weight matrix between the unit and the hidden layer; $W_{xo}$ is the weight matrix between the output gate and the input data; $W_{ho}$ is the weight matrix between the output gate and the hidden layer; $W_{co}$ is the weight matrix between the output gate and the unit; $x_t$ is the input value at time t; $b_i$ is the input gate bias term; $b_f$ is the forget gate bias term; $b_c$ is the unit state bias term; and $b_o$ is the output gate bias term.
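To make the roles of the gates and weights concrete, the following sketch runs one memory unit step by step. It assumes the standard LSTM formulation with unit-to-gate ("peephole") weights, which is the form that matches the weight names listed above (W_xi, W_hi, W_ci and so on); it is reduced to a single unit so that every weight is a scalar. The function names, dictionary keys, and weight values are hypothetical.

```python
import math
from typing import Dict, Tuple


def sigmoid(v: float) -> float:
    """Logistic sigmoid, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t: float, h_prev: float, c_prev: float,
              W: Dict[str, float], b: Dict[str, float]) -> Tuple[float, float]:
    """One time step of a single-unit LSTM; returns (h_t, c_t)."""
    # Input and forget gates see the input, the previous hidden state,
    # and the previous unit state.
    i_t = sigmoid(W["xi"] * x_t + W["hi"] * h_prev + W["ci"] * c_prev + b["i"])
    f_t = sigmoid(W["xf"] * x_t + W["hf"] * h_prev + W["cf"] * c_prev + b["f"])
    # New unit state: keep part of the old state, add part of the candidate.
    c_t = f_t * c_prev + i_t * math.tanh(W["xc"] * x_t + W["hc"] * h_prev + b["c"])
    # The output gate sees the updated unit state.
    o_t = sigmoid(W["xo"] * x_t + W["ho"] * h_prev + W["co"] * c_t + b["o"])
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Hypothetical parameters: every weight 0.5, every bias 0.
names = ["xi", "hi", "ci", "xf", "hf", "cf", "xc", "hc", "xo", "ho", "co"]
W = {k: 0.5 for k in names}
b = {k: 0.0 for k in "ifco"}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:  # a short, already normalized input sequence
    h, c = lstm_step(x, h, c, W, b)
```

A real layer with 32 units would replace the scalar products with matrix-vector products, but the order of operations stays the same.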
Other steps and parameters are the same as those in the first or second embodiment.
The fourth embodiment is as follows: this embodiment differs from the first, second, or third embodiment in that the attention mechanism in step two can be expressed as:
$$m_i = \tanh(W_{cm} c + W_{ym} y_i) \tag{3}$$
$$a_i \propto \exp(\langle w_m, m_i \rangle) \tag{4}$$
$$\sum_i a_i = 1 \tag{5}$$
$$z = \sum_i a_i y_i \tag{6}$$
where the attention layer takes n inputs $y_1, \dots, y_n$ and a context sequence $c$, and outputs a vector $z$; $i = 1, \dots, n$; $W_{cm}$ is the context sequence weight matrix and $W_{ym}$ is the input vector weight matrix; $z$ is the weighted arithmetic mean of the $y_i$ given the context sequence $c$; $m_i$ is the aggregation of $c$ and $y_i$ computed by the tanh layer; $\propto$ denotes proportionality; $w_m$ holds the weights used by the normalized exponential function softmax (equation (4)); and $a_i$ is the softmax result for $m_i$ given the context sequence $c$.
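The attention computation can be traced with plain Python lists. The sketch below assumes that the proportionality for a_i is resolved by a softmax over the scores ⟨w_m, m_i⟩ and treats w_m as a vector; the helper names and the toy inputs are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(M: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for small dense lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def attention(ys: List[Vector], c: Vector, W_cm: Matrix, W_ym: Matrix,
              w_m: Vector) -> Tuple[List[float], Vector]:
    """Return the weights a_1..a_n and the output z for inputs y_1..y_n and context c."""
    Wc = matvec(W_cm, c)
    scores = []
    for y in ys:
        # m_i = tanh(W_cm c + W_ym y_i)
        m_i = [math.tanh(p + q) for p, q in zip(Wc, matvec(W_ym, y))]
        scores.append(sum(w * m for w, m in zip(w_m, m_i)))  # <w_m, m_i>
    top = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    a = [e / total for e in exps]  # softmax, so the weights sum to 1
    # z = sum_i a_i y_i
    z = [sum(a_i * y[k] for a_i, y in zip(a, ys)) for k in range(len(ys[0]))]
    return a, z


# Toy example: three 2-dimensional inputs and identity weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
ys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [0.5, -0.5]
weights, z = attention(ys, context, identity, identity, [1.0, 1.0])
```

Setting w_m to all zeros makes every score equal, so the weights become uniform and z reduces to the plain average of the inputs, which is a convenient sanity check.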
Conventional encoder-decoder structures typically encode an input sequence as a fixed-length vector. This structure has drawbacks: when the input sequence is long, it is difficult to learn a suitable vector representation. The basic idea of the Attention mechanism is to break with the traditional encoder-decoder structure and to learn selectively from the information in the input sequence by training the model on the intermediate results of the LSTM encoder. The output sequence is therefore correlated with the input sequence, that is, the probability of each value in the output sequence depends on the values in the input sequence.
FIG. 3 is a schematic diagram of the Attention mechanism. The Attention layer computes the weight distribution over $y_1, \dots, y_n$; $S_t$, the input to the LSTM layer at time t, includes the output of the Attention layer. The probability of each value in the LSTM layer output sequence $\{\dots, x_{t-1}, x_t, \dots\}$ depends on the input sequence $\{y_1, \dots, y_n\}$.
Other steps and parameters are the same as those in the first, second or third embodiment.
The fifth embodiment is as follows: this embodiment differs from the fourth embodiment in that, in step three, the MAPE (mean absolute percentage error) evaluation is specifically:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{t_i - p_i}{t_i} \right| \tag{7}$$
where MAPE is the mean absolute percentage error, $n$ is the number of evaluated values, $t_i$ is the i-th true value, and $p_i$ is the i-th predicted value; the lower the MAPE, the higher the accuracy of the model.
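A direct implementation of the MAPE evaluation is short. The sketch below expresses MAPE as a fraction rather than a percentage, which is consistent with the magnitudes reported in the experiments (for example 0.086); the function name and sample values are hypothetical, and true values are assumed to be nonzero.

```python
from typing import Sequence


def mape(true_values: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    if len(true_values) != len(predictions) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: no true value is zero, otherwise the ratio is undefined.
    total = sum(abs((t - p) / t) for t, p in zip(true_values, predictions))
    return total / len(true_values)


actual = [2.0, 4.0, 5.0]      # hypothetical true ILI% values
predicted = [1.8, 4.4, 5.0]   # hypothetical model outputs
score = mape(actual, predicted)
```

Here the three relative errors are 0.1, 0.1, and 0, so the score is their mean, about 0.067.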
Other steps and parameters are the same as those in the fourth embodiment.
The sixth embodiment is as follows: this embodiment differs from the fifth embodiment in that, in step six, the inverse standardization processing of the test output result is specifically:
$$\hat{q} = q\,(q_{\max} - q_{\min}) + q_{\min} \tag{8}$$
where $\hat{q}$ is the influenza epidemic situation predicted value and $q$ is the test output result; $q_{\max}$ is the maximum value in the test output results, and $q_{\min}$ is the minimum value in the test output results.
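Inverse standardization simply reverses the Min-Max scaling. The sketch below assumes that q_min and q_max are the bounds recorded when the ILI% target was standardized; this is an assumption, since the text describes them as extremes of the test output. The function name and sample values are hypothetical.

```python
def inverse_min_max(q: float, q_min: float, q_max: float) -> float:
    """Map a value from the [0, 1] scale back to the original scale."""
    return q * (q_max - q_min) + q_min


# Hypothetical bounds of the ILI% target and three model outputs in [0, 1].
ili_min, ili_max = 1.5, 6.5
model_outputs = [0.0, 0.25, 1.0]
ili_predictions = [inverse_min_max(q, ili_min, ili_max) for q in model_outputs]
```

An output of 0 maps to the lower bound, 1 maps to the upper bound, and values in between are interpolated linearly.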
The other steps and parameters are the same as those in the fifth embodiment.
Examples
The following examples were used to demonstrate the beneficial effects of the present invention:
This embodiment uses influenza epidemic data collected by the Guangzhou Center for Disease Control and Prevention as the data set. The data set comprises 6 modules, each with multiple dimensions. Data are recorded weekly, with 52 weeks of data per year. The data are preprocessed and standardized, and features are then selected by model-based ranking: one dimension at a time is removed from the data set, all remaining dimensions are input into the same prediction model, and the outputs of the models are compared. The lower the accuracy of the prediction after a dimension is removed, the more relevant that dimension is to the prediction target. All prediction results are sorted by accuracy, and the 19 dimensions most relevant to the prediction target are selected. The selected dimensions and their descriptions are listed in Table 1; the basic information module (time information, regions, and demographics) is not listed.
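The model-based ranking procedure can be sketched as a leave-one-dimension-out loop. In the sketch below, `evaluate` stands in for "train the prediction model on these dimensions and return its error"; the toy scoring function, the dimension names, and their importance values are hypothetical and exist only to make the example runnable.

```python
from typing import Callable, Dict, List, Sequence


def rank_by_removal(features: Sequence[str],
                    evaluate: Callable[[List[str]], float]) -> List[str]:
    """Rank features by the error measured when each one is left out (highest first)."""
    error_without: Dict[str, float] = {}
    for name in features:
        remaining = [f for f in features if f != name]
        error_without[name] = evaluate(remaining)
    # A larger error after removal means the removed feature mattered more.
    return sorted(features, key=lambda f: error_without[f], reverse=True)


# Hypothetical stand-in for training a model and measuring its MAPE.
TOY_IMPORTANCE = {"ili_cases": 0.30, "avg_temp": 0.20, "rainfall": 0.05, "noise": 0.0}


def toy_evaluate(subset: List[str]) -> float:
    """Pretend error: each included feature lowers the error by its importance."""
    return 0.6 - sum(TOY_IMPORTANCE[f] for f in subset)


ranking = rank_by_removal(list(TOY_IMPORTANCE), toy_evaluate)
selected = ranking[:2]  # the embodiment keeps the top 19; 2 suffices for the toy data
```

Because every candidate subset is evaluated with the same model, the comparison isolates the contribution of the removed dimension, at the cost of one training run per dimension.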
TABLE 1 Module and selected dimension description
The selected dimensions are divided into two types: weather related data and influenza epidemic related data. The weather related data comprise average air temperature, highest air temperature, lowest air temperature, rainfall, air pressure, and relative humidity; the remaining dimensions are classified as influenza epidemic related data. In any given week, all regions share the same weather related data but have different influenza epidemic related data; the multi-channel LSTM neural network model therefore includes two corresponding channels: a flu-related channel and a weather-related channel.
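Routing a weekly record into the two channels amounts to partitioning its dimensions by name. The sketch below uses the six weather dimensions named in the text; the key spellings and the two influenza-related keys are hypothetical, since the contents of Table 1 are not reproduced here.

```python
from typing import Dict, Tuple

# The six weather dimensions named in the text (key spellings are assumptions).
WEATHER_DIMENSIONS = ("avg_temp", "max_temp", "min_temp",
                      "rainfall", "air_pressure", "relative_humidity")


def split_channels(record: Dict[str, float]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split one weekly record into (weather channel, flu channel) inputs."""
    weather = {k: v for k, v in record.items() if k in WEATHER_DIMENSIONS}
    flu = {k: v for k, v in record.items() if k not in WEATHER_DIMENSIONS}
    return weather, flu


# Hypothetical normalized record for one district and one week.
week = {
    "avg_temp": 0.62, "max_temp": 0.70, "min_temp": 0.55,
    "rainfall": 0.10, "air_pressure": 0.48, "relative_humidity": 0.81,
    "ili_cases": 0.33, "outpatient_visits": 0.41,  # hypothetical flu-related dimensions
}
weather_input, flu_input = split_channels(week)
```

Since the weather values are shared by all regions in a given week, the weather part only needs to be extracted once per week, while the flu part is extracted per region.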
The overall structure of the Attention-based Multi-channel LSTM (Att-MCLSTM) model is shown in FIG. 4. First, a group of LSTM neural networks (LSTM 1, …, LSTM 9) processes the influenza epidemic related data, with the data of each district (District 1st, District 2nd, …, District 9th) input into its own LSTM network, while a separate LSTM neural network (LSTM 10) processes the weather related data. To combine the information extracted from the influenza epidemic related data of the different districts, the outputs of all LSTM neural networks in the first part are fused in the fusion layer above them (Merge 1). Although this group of LSTM neural networks extracts the influenza epidemic information of each district, the intermediate output sequences still need to be weighted, because the influenza epidemic information of different districts influences the overall influenza trend in Guangzhou to different degrees. The intermediate output sequence therefore passes through the Attention layer (Attention) and a fully connected layer (Dense 1) in sequence. The two parts of the data are then fused in a higher fusion layer (Merge 2). Finally, two fully connected layers (Dense 2 and Dense 3) extract the information that fuses the multi-dimensional input data.
(1) Selection of input data length
This experiment determines how many consecutive weeks of data give the best prediction for the following week. To validate the model, the data are split into two parts, a training set and a test set. All experimental results are the average of 10 repeated runs.
The input data length was set to 6 weeks, 8 weeks, 10 weeks, 12 weeks, and 14 weeks, respectively. The parameter settings of each layer of the Attention-based Multi-channel LSTM neural network are shown in Table 2. The activation function is linear, the loss function is MAPE, and the optimizer is Adam.
TABLE 2 Neural network layer parameter settings
Neural network layer | Number of units
LSTM 1, …, LSTM 9 | 32
LSTM 10 | 32
Dense 1 | 16
Dense 2 | 10
Dense 3 | 1
The data from the first 370 weeks were used for training and the remaining data for testing. Each data record includes the weather related data and the influenza epidemic related data of 9 regions. The weather related data have 6 dimensions, and the influenza epidemic related data have 13 dimensions. The weather related data (Climate data) are input into the weather-related channel (Climate-related channel), and the influenza epidemic data of each region are input into the corresponding flu-related channel (Flu-related channel). The prediction results of the model are shown in Table 3:
TABLE 3 MAPE values for each input data length
Input data length (weeks) | MAPE
6 | 0.107
8 | 0.092
10 | 0.086
12 | 0.106
14 | 0.109
As can be seen from table 3, the prediction effect is best when the input data length is 10. This shows that the data of 10 consecutive weeks can sufficiently reflect the time sequence characteristics of the influenza epidemic data. If the length of the input data is too short, the time sequence characteristics of the data cannot be sufficiently reflected; if the input data length is too long, the noise of the data may increase. In the following experiments, we selected data for 10 consecutive weeks as input data.
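Turning a weekly series into fixed-length inputs is a sliding-window operation: each run of consecutive weeks is paired with the value of the week that follows. The sketch below is a minimal illustration on a single series; the function name and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], length: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `length` consecutive weeks with the following week's value."""
    return [(list(series[i:i + length]), series[i + length])
            for i in range(len(series) - length)]


# 15 hypothetical weekly values: 1.0, 2.0, ..., 15.0.
weekly_ili = [float(w) for w in range(1, 16)]
samples = make_windows(weekly_ili, length=10)
first_inputs, first_target = samples[0]
```

With 15 weeks and a window of 10, this yields 5 samples; the first uses weeks 1 to 10 to predict week 11. Changing `length` to 6, 8, 12, or 14 reproduces the other settings compared in Table 3.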
(2) Evaluation of effects
First, the validity of the Attention mechanism is verified by comparing the prediction results of Att-MCLSTM and MCLSTM (multichannel LSTM neural network). The neural network parameter settings and the data input method are as described in step (1).
Second, the effectiveness of the multi-channel structure is verified by comparing the prediction results of the MCLSTM and LSTM neural networks. The neural network parameter settings and the data input method of the MCLSTM are the same as in step (1). The LSTM neural network receives all inputs through a single LSTM layer: the influenza epidemic related data of all regions for the same week are summed dimension by dimension and used as input together with the weather related data, so that each data record comprises 19 dimensions. The output of the LSTM layer passes through one fully connected layer. The numbers of neurons in the LSTM layer and the fully connected layer are 32 and 1, respectively.
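The 19-dimensional input of the single-channel LSTM baseline can be illustrated as follows: the 13 influenza dimensions are summed across the 9 regions, and the 6 shared weather dimensions are appended. The function name and the sample values are hypothetical.

```python
from typing import List


def build_baseline_input(region_flu: List[List[float]], weather: List[float]) -> List[float]:
    """Sum flu dimensions across regions element-wise, then append the weather dimensions."""
    summed = [sum(column) for column in zip(*region_flu)]
    return summed + list(weather)


# 9 regions x 13 hypothetical flu dimensions (region r holds the value r everywhere).
regions = [[float(r)] * 13 for r in range(1, 10)]
weather = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # 6 hypothetical weather dimensions
record = build_baseline_input(regions, weather)
```

The resulting record has 13 + 6 = 19 entries, which matches the dimensionality stated for the baseline; the per-region detail that the multi-channel model keeps is lost in the summation.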
Finally, the LSTM neural network is shown to perform better than an ordinary RNN. The neural network parameter settings and the data input method are as described above.
The MAPE values of the four methods are shown in Table 4. As the table shows, Att-MCLSTM gives the best results. The first two rows of Table 4 show that adding the Attention mechanism reduces the MAPE from 0.105 to 0.086. This illustrates that the Attention structure allows the model to better learn the relationships between the input data of different regions. The second and third rows of Table 4 show that the multi-channel structure reduces the MAPE from 0.118 to 0.105. This demonstrates that the multi-channel structure extracts the time-series characteristics of the different kinds of input data better. The MAPE values of the LSTM neural network and the ordinary RNN are 0.118 and 0.132, respectively. This demonstrates that the LSTM neural network handles the time-series characteristics of the input data better than an ordinary RNN, and also confirms that influenza epidemic data have time-series structure.
TABLE 4 MAPE values for each method
Method | MAPE
Att-MCLSTM (the method of the invention) | 0.086
MCLSTM | 0.105
LSTM | 0.118
RNN | 0.132
The true values and predicted values of the four methods are shown in FIGS. 5, 6, 7, and 8, respectively. As the figures show, the ILI% predicted by the method of the invention (Att-MCLSTM) is closest to the true ILI% (Actual ILI%), while the predicted values of the other three methods differ visibly from the true values. The experiments show that the method can accurately and fully extract the implicit characteristics of time-series data and provide accurate influenza epidemic prediction.
The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it is therefore intended that all such changes and modifications as fall within the true spirit and scope of the invention be considered as within the following claims.

Claims (6)

1. An attention mechanism-based multi-channel LSTM neural network influenza epidemic situation prediction method, characterized by comprising the following steps:
step one, preprocessing and standardizing data in a data set; then performing feature selection using model-based ranking, dividing the selected data into weather related data and influenza epidemic situation related data, and generating a training set;
step two, establishing a multi-channel LSTM neural network model comprising an attention mechanism; the input of the multichannel LSTM neural network model comprises influenza epidemic related data and weather related data;
step three, inputting the training set data into the multichannel LSTM neural network model for training, and performing MAPE (mean absolute percentage error) evaluation to obtain a trained multichannel LSTM neural network model;
step four, processing the test data in the same way as in step one to obtain a test set;
step five, inputting the test set data into the trained LSTM neural network model for testing;
step six, carrying out inverse standardization processing on the test output result to obtain the influenza epidemic situation prediction value.
2. The attention mechanism-based multichannel LSTM neural network influenza epidemic prediction method of claim 1, wherein the standardization in step one specifically comprises the following process:
performing Min-Max standardization on the preprocessed data:
$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$
wherein $x$ is a value in the preprocessed data, $x_{\min}$ is the minimum value in the preprocessed data, $x_{\max}$ is the maximum value in the preprocessed data, and $y$ is the value of $x$ after Min-Max standardization.
3. The attention mechanism-based multichannel LSTM neural network influenza epidemic prediction method of claim 2, wherein in step two the LSTM memory unit in the multichannel LSTM neural network model comprises an input gate, an output gate, a forget gate and a self-recurrent neuron; the LSTM memory unit can be represented by the following system of equations:
where $\sigma(\cdot)$ is the logistic sigmoid function (mapping variables to values between 0 and 1) and $\tanh(\cdot)$ is the hyperbolic tangent function; $i_t$ is the input gate state at time $t$, $f_t$ is the forget gate state at time $t$, $o_t$ is the output gate state at time $t$, $c_t$ represents the cell state at time $t$, and $h_t$ represents the hidden state at time $t$; $W_{xi}$ represents the weight matrix between the input gate and the input data; $W_{hi}$ represents the weight matrix between the input gate and the hidden layer; $W_{ci}$ represents the weight matrix between the input gate and the cell; $W_{xf}$ represents the weight matrix between the forget gate and the input data; $W_{hf}$ represents the weight matrix between the forget gate and the hidden layer; $W_{cf}$ represents the weight matrix between the forget gate and the cell; $W_{xc}$ represents the weight matrix between the cell and the input data; $W_{hc}$ represents the weight matrix between the cell and the hidden layer; $W_{xo}$ represents the weight matrix between the output gate and the input data; $W_{ho}$ represents the weight matrix between the output gate and the hidden layer; $W_{co}$ represents the weight matrix between the output gate and the cell; $x_t$ represents the input value at time $t$; $b_i$ represents the input gate bias term; $b_f$ represents the forget gate bias term; $b_c$ represents the cell state bias term; and $b_o$ represents the output gate bias term.
4. The attention mechanism-based multichannel LSTM neural network influenza epidemic prediction method of claim 1, 2 or 3, wherein the attention mechanism in step two can be expressed as:
$m_i = \tanh(W_{cm} c + W_{ym} y_i)$ (3)
$a_i \propto \exp(\langle w_m, m_i \rangle)$ (4)
$z = \sum_i a_i y_i$ (6)
wherein the attention layer takes $n$ inputs $y_1, \dots, y_n$ and a context sequence $c$, and outputs a vector $z$; $i = 1, \dots, n$; $W_{cm}$ is the context sequence weight matrix and $W_{ym}$ is the input vector weight matrix; the vector $z$ is the weighted arithmetic mean of the $y_i$ given the context sequence $c$; $m_i$ denotes the aggregation of $c$ and $y_i$; $\propto$ denotes proportionality; $w_m$ represents the weight matrix used in normalization; and $a_i$ represents the normalized result for $m_i$ given the context sequence $c$.
5. The attention mechanism-based multichannel LSTM neural network influenza epidemic prediction method of claim 4, wherein the MAPE evaluation in step three is specifically:
wherein MAPE is the mean absolute percentage error between the i-th true values and the i-th predicted values $p_i$.
6. The attention mechanism-based multichannel LSTM neural network influenza epidemic prediction method of claim 5, wherein the inverse standardization of the test output result in step six is specifically:
wherein the value obtained is the influenza epidemic prediction value, and $q$ is the test output result; $q_{\max}$ represents the maximum value in the test output results, and $q_{\min}$ represents the minimum value in the test output results.
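The numerical steps recited in claims 2, 5 and 6 can be sketched together as follows. This is a minimal illustration under assumptions, not the claimed implementation: the textbook forms of Min-Max standardization, its inverse, and the mean absolute percentage error are assumed; MAPE is returned as a fraction (consistent with values such as 0.086 in Table 4) rather than multiplied by 100; the inverse step reuses the minimum and maximum obtained on the original scale; and all function names and sample values are hypothetical.

```python
def min_max_fit(values):
    """Return (minimum, maximum) of the data used for standardization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("Min-Max standardization needs two distinct values")
    return lo, hi


def min_max_scale(values, lo, hi):
    """Map each x to y = (x - x_min) / (x_max - x_min)."""
    return [(x - lo) / (hi - lo) for x in values]


def min_max_invert(scaled, lo, hi):
    """Undo the scaling: map each q back to q * (max - min) + min."""
    return [q * (hi - lo) + lo for q in scaled]


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in actual):
        raise ValueError("MAPE is undefined when a true value is zero")
    return sum(abs((t - p) / t) for t, p in zip(actual, predicted)) / len(actual)


# Hypothetical ILI% values, used only to exercise the functions.
ili = [1.2, 2.4, 3.0, 1.8]
lo, hi = min_max_fit(ili)
scaled = min_max_scale(ili, lo, hi)          # values in [0, 1]
restored = min_max_invert(scaled, lo, hi)    # back on the ILI% scale
error = mape([2.0, 4.0], [1.8, 4.4])         # each prediction is off by 10%
```

In this sketch the model would be trained and tested on values such as `scaled`, and `min_max_invert` would be applied to the test output before computing `mape` against the true values.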
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910256008.7A CN110085327A (en) 2019-04-01 2019-04-01 Multichannel LSTM neural network Influenza epidemic situation prediction technique based on attention mechanism

Publications (1)

Publication Number Publication Date
CN110085327A true CN110085327A (en) 2019-08-02

Family

ID=67413942

Country Status (1)

Country Link
CN (1) CN110085327A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130097103A1 (en) * 2011-10-14 2013-04-18 International Business Machines Corporation Techniques for Generating Balanced and Class-Independent Training Data From Unlabeled Data Set
CN108648829A (en) * 2018-04-11 2018-10-12 平安科技(深圳)有限公司 Disease forecasting method and device, computer installation and readable storage medium storing program for executing
CN109431507A (en) * 2018-10-26 2019-03-08 平安科技(深圳)有限公司 Cough disease identification method and device based on deep learning
CN109493933A (en) * 2018-08-08 2019-03-19 浙江大学 A kind of prediction meanss of the adverse cardiac events based on attention mechanism

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553394A (en) * 2020-04-20 2020-08-18 中国长江三峡集团有限公司 Reservoir water level prediction method based on cyclic neural network and attention mechanism
CN111724897A (en) * 2020-06-12 2020-09-29 电子科技大学 Motion function data processing method and system
CN111724897B (en) * 2020-06-12 2022-07-01 电子科技大学 Motion function data processing method and system
CN111882157A (en) * 2020-06-24 2020-11-03 东莞理工学院 Demand prediction method and system based on deep space-time neural network and computer readable storage medium
CN111798991A (en) * 2020-07-09 2020-10-20 重庆邮电大学 LSTM-based method for predicting population situation of new coronary pneumonia epidemic situation
CN111798991B (en) * 2020-07-09 2022-09-02 重庆邮电大学 LSTM-based method for predicting population situation of new coronary pneumonia epidemic situation
CN113161004A (en) * 2020-07-15 2021-07-23 泰康保险集团股份有限公司 Epidemic situation prediction system and method
CN113161004B (en) * 2020-07-15 2023-11-10 泰康保险集团股份有限公司 Epidemic situation prediction system and method
CN111968755A (en) * 2020-08-21 2020-11-20 上海海洋大学 Epidemic situation prediction model based on LSTM deep learning network model
CN112201361A (en) * 2020-09-01 2021-01-08 浙江大学山东工业技术研究院 COVID-19 epidemic situation prediction method based on LSTM model
CN112164471B (en) * 2020-09-17 2022-05-24 吉林大学 New crown epidemic situation comprehensive evaluation method based on classification regression model
CN112164471A (en) * 2020-09-17 2021-01-01 吉林大学 New crown epidemic situation comprehensive evaluation method based on classification regression model
WO2021139336A1 (en) * 2020-09-28 2021-07-15 平安科技(深圳)有限公司 Epidemic prevention and control effect prediction method and apparatus, and server and storage medium
CN111933300A (en) * 2020-09-28 2020-11-13 平安科技(深圳)有限公司 Epidemic situation prevention and control effect prediction method, device, server and storage medium
CN111933300B (en) * 2020-09-28 2021-02-12 平安科技(深圳)有限公司 Epidemic situation prevention and control effect prediction method, device, server and storage medium
CN113380420A (en) * 2020-10-13 2021-09-10 深圳云天励飞技术股份有限公司 Epidemic situation prediction method and device, electronic equipment and storage medium
CN113380420B (en) * 2020-10-13 2023-10-17 深圳云天励飞技术股份有限公司 Epidemic situation prediction method and device, electronic equipment and storage medium
CN112582074B (en) * 2020-11-02 2022-10-18 吉林大学 Bi-LSTM and TF-IDF based new crown epidemic situation prediction and analysis method
CN112582074A (en) * 2020-11-02 2021-03-30 吉林大学 Bi-LSTM and TF-IDF based new crown epidemic situation prediction and analysis method
CN112946187A (en) * 2021-01-22 2021-06-11 西安科技大学 Refuge chamber real-time state monitoring method based on neural network
CN114983352A (en) * 2021-03-01 2022-09-02 浙江远图互联科技股份有限公司 Method and device for identifying new coronary pneumonia based on attention mechanism
CN113314231A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Infectious disease propagation prediction system and device integrating spatio-temporal information
CN113314231B (en) * 2021-05-28 2022-04-22 北京航空航天大学 Infectious disease propagation prediction system and device integrating spatio-temporal information
CN113434989A (en) * 2021-06-28 2021-09-24 山东大学 Pipe network leakage amount prediction method and system based on attention mechanism and LSTM
CN113744888A (en) * 2021-09-02 2021-12-03 深圳万海思数字医疗有限公司 Regional epidemic trend prediction early warning method and system
CN113744888B (en) * 2021-09-02 2023-09-22 深圳万海思数字医疗有限公司 Regional epidemic trend prediction and early warning method and system
CN115393678A (en) * 2022-08-01 2022-11-25 北京理工大学 Multi-modal data fusion decision-making method based on image type intermediate state
CN115393678B (en) * 2022-08-01 2024-04-02 北京理工大学 Multi-mode data fusion decision method based on image intermediate state
CN115631869A (en) * 2022-11-28 2023-01-20 北京理工大学 Construction method of infectious disease prediction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190802