CN115204509A

CN115204509A - Method for predicting number of inpatients in respiratory system

Info

Publication number: CN115204509A
Application number: CN202210893982.6A
Authority: CN
Inventors: 余双彬; 郑玲; 彭荷玲; 廖芳
Original assignee: Sichuan Academy Of Medical Sciences Sichuan Provincial People's Hospital
Current assignee: Sichuan Academy Of Medical Sciences Sichuan Provincial People's Hospital
Priority date: 2022-07-27
Filing date: 2022-07-27
Publication date: 2022-10-18

Abstract

The invention discloses a method for predicting the number of respiratory system inpatients, belonging to the technical field of medical data mining. The method comprises: obtaining the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index to build a data set; decomposing the data set into a plurality of initial training sets and a plurality of initial test sets; reconstructing these into a plurality of training sets and a plurality of test sets; inputting the training sets to train a space-time multi-feature self-attention fusion network; and inputting the test sets to test the trained network. When a predicted value does not meet expectations, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index for each day of the next preset period are obtained and the process is repeated until a predicted value that meets expectations is obtained. Data acquisition is simple and convenient, and the number of patients can be predicted accurately.

Description

Method for predicting number of inpatients in respiratory system

Technical Field

The invention relates to the technical field of medical data mining, in particular to a respiratory system inpatient quantity prediction method.

Background

How to improve the efficiency of medical resource allocation and effectively reduce needless waiting by inpatients is an urgent problem for medical managers. Scientifically analyzing and predicting the number of hospital inpatients, and analyzing changes and trends in inpatient flow in a timely and accurate manner, can give managers a sound basis for decisions such as allocating health resources rationally, scheduling medical personnel and optimizing bed allocation. This improves the work efficiency and management level of the hospital, raises patient satisfaction, and helps provide patients with timely medical and health services.

The number of respiratory system inpatients in a hospital is influenced by many factors. One aspect is the hospital itself, such as medical technology, medical service and geographical position; another is the patient's own choice of medical care, which depends on factors such as disease type, economic status and educational level. Hospital and patient factors are usually stable and change little over time, but the number of inpatients is also influenced by natural environmental factors. Some diseases are not directly caused by meteorological changes but tend to accompany particular seasonal and meteorological conditions. Research shows that air pollution has a synergistic effect on health and is closely related to hospitalization rate, morbidity, mortality and the number of hospitalizations. Air pollution shows clear seasonality, short-term fluctuation and long-term trends, and the choice between outpatient and emergency treatment reflects a combined consideration of hospital and patient factors. Therefore, predicting the number of respiratory system inpatients from air quality, seasonal factors and the number of patients to be treated is of great significance.

According to a search of the prior art, Morina et al. proposed a hospital emergency service model based on a second-order integer-valued autoregressive time series to predict the number of patients admitted per week due to influenza. Zhu Xiangpeng used a Bayesian model to predict outpatient flow in the intestinal department of a hospital in Shanghai. Wang et al. proposed a fuzzy min-max neural network based on rule extraction for patient prediction. However, these methods involve complex calculations and place high demands on computing power, the number of training samples and prediction time. They are also limited by relying on a single time-series feature, and they lag when predicting abrupt changes in patient numbers. Because changes in the number of patients are influenced by many complex factors, these methods fail to account for both the linear and nonlinear characteristics of patient numbers, fail to integrate multi-factor modeling with time series analysis, and neglect the correlations within time series data.

Disclosure of Invention

The invention aims to overcome the defects of existing patient quantity prediction methods which, because changes in the number of patients are influenced by many complex factors, cannot account for the linear and nonlinear characteristics of patient numbers, cannot integrate multi-factor modeling with time series analysis, and neglect the correlations within time series data. To this end, the invention provides a method for predicting the number of respiratory system inpatients.

In order to achieve the above object, the present invention provides the following technical solutions:

A method for predicting the number of respiratory system inpatients comprises the following steps:

S1: acquiring the number of respiratory system inpatients per day, the number of patients to be treated per day, the season, and the air quality index per day within a preset period;

S2: building a patient data set from the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;

S3: decomposing the data set into a plurality of initial training sets and a plurality of initial test sets, and reconstructing the initial training sets and the initial test sets to obtain a plurality of training sets and a plurality of test sets;

S4: constructing a space-time multi-feature self-attention fusion network, and inputting the training sets to train the space-time multi-feature self-attention fusion network;

S5: inputting the test sets into the trained space-time multi-feature self-attention fusion network, and predicting the number of respiratory system inpatients to obtain a predicted value;

S6: judging whether the predicted value meets expectations; if so, outputting the average of the predicted values; if not, acquiring the number of respiratory system inpatients per day, the number of patients to be treated per day, the season and the air quality index per day in the next preset period to update the data, and repeating steps S2 to S5 to perform online learning.

With this technical scheme, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index are first obtained to build a data set, so data acquisition is simple and convenient. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are then reconstructed into a plurality of training sets and a plurality of test sets. The training sets are input to train the space-time multi-feature self-attention fusion network, and after training is complete the test sets are input to test the trained network. When the predicted value does not meet expectations, the daily number of respiratory system inpatients, daily number of patients to be treated, season and daily air quality index within the next preset period are acquired to update the data and perform online learning of the network, until a predicted value that meets expectations is obtained. In this way the network adapts better to the current sequence of respiratory system inpatient data and predicts accurately, which addresses the poor prediction performance caused by the uncertainty of multiple influencing factors and improves the accuracy of respiratory system inpatient quantity prediction. In addition, the space-time multi-feature self-attention fusion network converts the medical data prediction problem into a data-driven supervised learning problem, ultimately achieving accurate prediction of the number of respiratory system inpatients.

As a preferred embodiment of the present invention, step S2 includes: arranging the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index into a 4-dimensional single vector for each day, and concatenating the 4-dimensional single vectors within the preset period into a data set [t₁, t₂, t₃, t₄], finally arranged as a vector matrix with n rows and 4 columns, wherein the first column is the real label of the data set and the last three columns are the characteristic factors of the data set;

n is the number of days of the preset period.
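The arrangement of step S2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `season_code` and `build_dataset`, the integer encoding of the season, and the toy values are all hypothetical and do not come from the source.

```python
from datetime import date, timedelta


def season_code(day: date) -> int:
    """Hypothetical encoding (an assumption, not from the source):
    0 = spring (Mar-May), 1 = summer, 2 = autumn, 3 = winter."""
    return ((day.month - 3) % 12) // 3


def build_dataset(start: date, num_hos, num_out, aqi):
    """Stack one [Num_hos, Num_out, Sea, Q] row per day into an n x 4 matrix."""
    if not (len(num_hos) == len(num_out) == len(aqi)):
        raise ValueError("all daily series must cover the same n days")
    rows = []
    for i, (h, o, q) in enumerate(zip(num_hos, num_out, aqi)):
        day = start + timedelta(days=i)
        rows.append([h, o, season_code(day), q])
    return rows


# Toy values for illustration only (not data from the source).
matrix = build_dataset(date(2013, 1, 1), [12, 15, 9], [80, 95, 70], [110.0, 145.0, 60.0])
labels = [row[0] for row in matrix]     # first column: real label
features = [row[1:] for row in matrix]  # last three columns: characteristic factors
```

Keeping the label in the first column mirrors the layout described in the text, so later steps can separate label and characteristic factors by column index alone.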

As a preferred aspect of the present invention, decomposing the data set into a plurality of initial training sets and a plurality of initial test sets in step S3 includes: performing VMD variational modal decomposition on the characteristic factors to obtain intrinsic mode components IMF₁, IMF₂, IMF₃, IMF₄, IMF₅, IMF₆, IMF₇, IMF₈ and a residual component Res; taking the daily number of respiratory system inpatients and the intrinsic mode components IMF₁, IMF₂, IMF₃ to form 4-dimensional training single vectors; concatenating the daily 4-dimensional training single vectors within the preset period into [T₁, T₂, T₃, T₄], finally forming a stationary feature matrix with n rows and 4 columns, wherein the first column is the real label of the stationary feature matrix and the last three columns IMF₁, IMF₂, IMF₃ are its characteristic factors; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to the time length of 4.

With this technical scheme, VMD variational modal decomposition effectively extracts the inherent linear characteristics of the data, reduces data complexity and coupling, removes more noise, and improves the prediction accuracy of the space-time multi-feature self-attention fusion network.

As a preferred embodiment of the present invention, reconstructing the decomposed initial training sets and initial test sets in step S3 to obtain a plurality of training sets and a plurality of test sets includes: using a set sliding window and a set step length, reconstructing the initial training set into 1 + (3n/4 − w)/s training sets of w rows and 4 columns, and the initial test set into (n/4 − w)/s test sets of w rows and 4 columns;

where w is the sliding window and s is the step size.

As a preferred embodiment of the present invention, the space-time multi-feature self-attention fusion network in step S4 includes: a spatial feature extraction structure, a time sequence feature extraction structure and a self-attention output structure;

the spatial feature extraction structure comprises two one-dimensional convolutional layers and one one-dimensional pooling layer; each of the two one-dimensional convolutional layers has 64 channels, a convolution kernel size of 1 × 1, a relu activation function and a step length of 1; the one-dimensional pooling layer has a pool size of 2 and a step length of 1;

the self-attention output structure comprises one self-attention layer, one flatten layer and two fully connected layers, wherein the self-attention layer consists of a query, a key and a value; the first fully connected layer has 100 neurons and a relu activation function, and the second fully connected layer has 1 neuron.

With this technical scheme, the space-time multi-feature self-attention fusion network uses one-dimensional convolutional layers to effectively extract the spatial features of each group of data and a bidirectional long short-term memory layer structure to effectively extract the time sequence features of the data. The self-attention structure captures long-term time sequence dependencies more strongly and reduces the computational complexity of the model, and the flatten layer converts the output of the self-attention mechanism into one-dimensional data, so that accurate quantity prediction is finally achieved.

As a preferred aspect of the present invention, the calculation of the self-attention layer includes: first computing the dot-product similarity between the query and the key to obtain a weight, then normalizing the weight through a Softmax normalization function, and finally performing a weighted summation of the normalized weight and the corresponding value to obtain the final Attention, with the formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

wherein Q, K and V are the matrices of query, key and value respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and d_k is the vector dimension of the query matrix.
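The three operations named in the text (dot-product similarity, Softmax normalization, weighted summation) can be illustrated with a small pure-Python sketch. It assumes the standard scaling of the scores by the square root of d_k, and it uses Q = K = V as stated in the text; the function names and the toy input are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable Softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of T time-step vectors, each of dimension d_k.
    The division by sqrt(d_k) is an assumption (standard practice).
    """
    d_k = len(x[0])
    out = []
    for q in x:
        # Similarity of this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in x]
        weights = softmax(scores)
        # Weighted sum of the values.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d_k)])
    return out


# Toy sequence of three 2-dimensional time steps (illustrative values only).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(features)
```

Because each output row is a convex combination of the input rows, the output has the same shape as the input, which is why a flatten layer is still needed before the fully connected layers.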

As a preferred embodiment of the present invention, step S4 further includes: after construction of the space-time multi-feature self-attention fusion network is completed, pairing each of the intrinsic mode components IMF₁, IMF₂ and IMF₃ in the training sets with the number of respiratory system inpatients to form three corresponding groups of data, and inputting the three groups of data into three space-time multi-feature self-attention fusion networks respectively for training, until all the training sets have been input.

By adopting the technical scheme, the comprehensive quantity prediction is carried out by training the three space-time multi-feature self-attention fusion networks, so that the reliability of the prediction result is ensured.

As a preferred embodiment of the present invention, in the training process of the space-time multi-feature self-attention fusion networks, the training weight parameters are initialized with Kaiming He initialization, the same training hyper-parameters are used for all networks, the learning rate is set to 0.001, an Adam optimizer is used to train for 100 batches, and the loss function is set to the mean square error:

$$\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N}\left\| \widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)} \right\|_{2}^{2}$$

wherein ‖·‖₂ is the Euclidean norm, N is the number of training data per batch, \widehat{Num}_{hos} is the network's predicted value of the number of inpatients, and Num_hos is the actual number of inpatients.
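As a small worked illustration of the mean square error described above, the sketch below computes the loss for one batch of scalar predictions, where the squared Euclidean norm reduces to a squared difference. The function name `mse_loss` and the batch values are hypothetical.

```python
def mse_loss(predicted, actual):
    """Mean square error over one batch of N scalar predictions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("batches must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Illustrative batch with N = 3 (values are not from the source).
batch_pred = [10.0, 12.0, 9.0]   # network predictions of Num_hos
batch_true = [11.0, 12.0, 7.0]   # actual Num_hos
loss = mse_loss(batch_pred, batch_true)  # (1 + 0 + 4) / 3
```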

As a preferred embodiment of the present invention, step S5 includes: inputting the intrinsic mode components IMF₁, IMF₂ and IMF₃ in the test sets into the three trained space-time multi-feature self-attention fusion networks respectively for prediction, until all the test sets have been input;

after the intrinsic mode components IMF₁, IMF₂ and IMF₃ have been input into the three corresponding trained space-time multi-feature self-attention fusion networks, prediction is completed and three corresponding groups of predicted values for respiratory system inpatients are output, wherein each predicted value comprises a predicted number and a confidence.

As a preferred embodiment of the present invention, step S6 includes: if the confidence is less than 0.9, the predicted value does not meet expectations; if the confidence is greater than or equal to 0.9, the predicted value meets expectations, and the average of the three groups of predicted values is calculated;

when steps S2 to S5 are repeated, the data update is performed on the fully connected layer.
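The acceptance rule of step S6 can be sketched as follows. The 0.9 threshold comes from the text; how the three confidences are combined is not stated, so this sketch assumes that all three networks must reach the threshold. The function name `decide` and the sample values are hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.9  # threshold stated in the text


def decide(predictions):
    """predictions: list of (predicted_count, confidence), one pair per network.

    Returns the average predicted count when every confidence meets the
    threshold (assumption: all three must pass), otherwise None to signal
    that data for the next preset period should be acquired and steps
    S2 to S5 repeated.
    """
    if all(conf >= CONFIDENCE_THRESHOLD for _, conf in predictions):
        return sum(count for count, _ in predictions) / len(predictions)
    return None


accepted = decide([(30.0, 0.95), (33.0, 0.90), (27.0, 0.92)])
rejected = decide([(30.0, 0.95), (33.0, 0.85), (27.0, 0.92)])
```

Returning `None` rather than a partial average keeps the "does not meet expectations" branch explicit for the caller that triggers online learning.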

Compared with the prior art, the invention has the following beneficial effects. First, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index are obtained to build a data set, so data acquisition is simple and convenient. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are reconstructed into a plurality of training sets and a plurality of test sets; the training sets are input to train the space-time multi-feature self-attention fusion network, and after training is complete the test sets are input to test the trained network. When the predicted value does not meet expectations, the daily number of respiratory system inpatients, daily number of patients to be treated, season and daily air quality index within the next preset period are acquired to update the data and perform online learning of the network, until a predicted value that meets expectations is obtained. The network thereby adapts better to the current sequence of respiratory system inpatient data and predicts accurately, which addresses the poor prediction performance caused by the uncertainty of multiple influencing factors and improves the accuracy of respiratory system inpatient quantity prediction. The network also converts the medical data prediction problem into a data-driven supervised learning problem, ultimately achieving accurate prediction of the number of respiratory system inpatients. Second, VMD variational modal decomposition effectively extracts the inherent linear characteristics of the data, reduces data complexity and coupling, removes more noise, and improves the prediction accuracy of the network. Third, the network uses one-dimensional convolutional layers to effectively extract the spatial features of each group of data and a bidirectional long short-term memory layer structure to effectively extract the time sequence features of the data; the self-attention structure captures long-term time sequence dependencies more strongly and reduces the computational complexity of the model, and the flatten layer converts the output of the self-attention mechanism into one-dimensional data, so that accurate quantity prediction is finally achieved. Fourth, the method is widely applicable: by modifying the input end or the output end of the space-time multi-feature self-attention fusion network accordingly, it can be extended to patient quantity prediction in many scenarios, and therefore has strong scene adaptability and extensibility.

Drawings

Fig. 1 is a flowchart of a method for predicting the number of hospitalized patients in a respiratory system according to embodiment 1 of the present invention;

Fig. 2 is a structural diagram of the space-time multi-feature self-attention fusion network of the method for predicting the number of hospitalized patients in a respiratory system according to embodiment 1 of the present invention.

Detailed Description

The present invention will be described in further detail with reference to test examples and specific embodiments. It should be understood that the scope of the above-described subject matter is not limited to the following examples, and any techniques implemented based on the disclosure of the present invention are within the scope of the present invention.

Example 1

A method for predicting the number of hospitalized patients in a respiratory system, as shown in fig. 1, comprising the steps of:

Step S2 includes: arranging the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index into a 4-dimensional single vector for each day, and concatenating the 4-dimensional single vectors within the preset period into a data set [t₁, t₂, t₃, t₄], finally arranged as a vector matrix with n rows and 4 columns, wherein the first column is the real label of the data set and the last three columns are the characteristic factors of the data set;

n is the number of days of the preset period.

As a preferred aspect of the present invention, decomposing the data set into a plurality of initial training sets and a plurality of initial test sets in step S3 includes: performing VMD variational modal decomposition on the characteristic factors to obtain intrinsic mode components IMF₁, IMF₂, IMF₃, IMF₄, IMF₅, IMF₆, IMF₇, IMF₈ and a residual component Res; taking the daily number of respiratory system inpatients and the intrinsic mode components IMF₁, IMF₂, IMF₃ to form 4-dimensional training single vectors; concatenating the daily 4-dimensional training single vectors within the preset period into [T₁, T₂, T₃, T₄], finally forming a stationary feature matrix with n rows and 4 columns, wherein the first column is the real label of the stationary feature matrix and the last three columns IMF₁, IMF₂, IMF₃ are its characteristic factors; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to the time length of 4.

Here the number of inpatients is denoted Num_hos, the number of patients to be treated is denoted Num_out, the season is denoted Sea, and the daily air quality index is denoted Q.

Reconstructing the decomposed initial training sets and initial test sets in step S3 to obtain a plurality of training sets and a plurality of test sets includes: using a set sliding window and a set step length, reconstructing the initial training set into 1 + (3n/4 − w)/s training sets of w rows and 4 columns, and the initial test set into (n/4 − w)/s test sets of w rows and 4 columns;

where w is the sliding window and s is the step size.

As shown in Fig. 2, the space-time multi-feature self-attention fusion network in step S4 includes: a spatial feature extraction structure, a time sequence feature extraction structure and a self-attention output structure;

the spatial feature extraction structure comprises two one-dimensional convolutional layers and one one-dimensional pooling layer; each of the two one-dimensional convolutional layers has 64 channels, a convolution kernel size of 1 × 1, a relu activation function and a step length of 1; the one-dimensional pooling layer has a pool size of 2 and a step length of 1;

the time sequence feature extraction structure comprises two bidirectional long short-term memory layers; the first bidirectional long short-term memory layer has 500 units and the second has 200 units;

The calculation of the self-attention layer includes: first computing the dot-product similarity between the query and the key to obtain a weight, then normalizing the weight through a Softmax normalization function, and finally performing a weighted summation of the normalized weight and the corresponding value to obtain the final Attention, with the formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

wherein Q, K and V are the matrices of query, key and value respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and d_k is the vector dimension of the query matrix.

Step S4 further includes: after construction of the space-time multi-feature self-attention fusion network is completed, pairing each of the intrinsic mode components IMF₁, IMF₂ and IMF₃ in the training sets with the number of respiratory system inpatients to form three corresponding groups of data, and inputting the three groups of data into three space-time multi-feature self-attention fusion networks respectively for training, until all the training sets have been input.

In the training process of the space-time multi-feature self-attention fusion networks, the training weight parameters are initialized with Kaiming He initialization, the same training hyper-parameters are used for all networks, the learning rate is set to 0.001, an Adam optimizer is used to train for 100 batches, and the loss function is set to the mean square error:

$$\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N}\left\| \widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)} \right\|_{2}^{2}$$

wherein ‖·‖₂ is the Euclidean norm, N is the number of training data per batch, \widehat{Num}_{hos} is the network's predicted value of the number of inpatients, and Num_hos is the actual number of inpatients.

Step S5 includes: inputting the intrinsic mode components IMF₁, IMF₂ and IMF₃ in the test sets into the three corresponding trained space-time multi-feature self-attention fusion networks for prediction, until all the test sets have been input;

after the intrinsic mode components IMF₁, IMF₂ and IMF₃ have been input into the three corresponding trained space-time multi-feature self-attention fusion networks, prediction is completed and three corresponding groups of predicted values for respiratory system inpatients are output, wherein each predicted value comprises a predicted number and a confidence.

Step S6 includes: if the confidence is less than 0.9, the predicted value does not meet expectations; if the confidence is greater than or equal to 0.9, the predicted value meets expectations, and the average of the three groups of predicted values is calculated;

when steps S2 to S5 are repeated, the data update is performed on the fully connected layer.

With this technical scheme, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index are first obtained to build a data set, so data acquisition is simple and convenient. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are then reconstructed into a plurality of training sets and a plurality of test sets. The training sets are input to train the space-time multi-feature self-attention fusion network, and after training is complete the test sets are input to test the trained network. When the predicted value does not meet expectations, the daily number of respiratory system inpatients, daily number of patients to be treated, season and daily air quality index within the next preset period are acquired to update the data and perform online learning of the network, until a predicted value that meets expectations is obtained. In this way the network adapts better to the current sequence of respiratory system inpatient data and predicts accurately, which addresses the poor prediction performance caused by the uncertainty of multiple influencing factors and improves the accuracy of respiratory system inpatient quantity prediction.

Example 2

This example is a specific example of example 1:

Step one: the daily number of respiratory system inpatients Num_hos from 2013 to 2018 is obtained from the medical record system of a hospital in a designated area; the daily number of patients to be treated for respiratory system conditions Num_out and the season Sea are obtained from the outpatient service system; and the daily air quality index Q from 2013 to 2018 is obtained from the Zhenqi Network China air quality online monitoring and analysis platform.

Step two: to represent the characteristic factors affecting respiratory system inpatients and the real label, the four types of data are arranged as {Num_hos, Num_out, Sea, Q} into 4-dimensional single vectors, and the daily 4-dimensional single vectors for the 6 years are concatenated into [t₁, t₂, t₃, t₄], forming a matrix of n rows × 4 columns with n = 2190, wherein the first column, the number of respiratory system inpatients Num_hos, is the real label of the data set and the last three columns [Num_out, Sea, Q] are the characteristic factors of the data set.

Step three: to obtain the stationary linear characteristics of the matrix, VMD variational modal decomposition is performed on the characteristic factors of the matrix, decomposing them into intrinsic mode components IMF₁, IMF₂, IMF₃, IMF₄, IMF₅, IMF₆, IMF₇, IMF₈ and a residual component Res. To reduce model complexity, the daily number of respiratory system inpatients and the intrinsic mode components IMF₁, IMF₂, IMF₃ are taken to form a 4-dimensional training single vector [Num_hos, IMF₁, IMF₂, IMF₃], and the 4-dimensional training single vectors are concatenated into [T₁, T₂, T₃, T₄], finally forming a stationary feature matrix of n rows × 4 columns, wherein the first column, the number of respiratory system inpatients, is the real label of the data set and the last three columns IMF₁, IMF₂, IMF₃ are its characteristic factors. The stationary feature matrix is divided into a plurality of initial training sets and a plurality of initial test sets according to the time length of 4.

Specifically, rows 1 to 1752 form the initial training set, and rows 1753 to 2190 form the initial test set.

Specifically, using a set sliding window w and a set step length s, the initial training set is reconstructed into 1 + (1752 − w)/s training sets of w rows × 4 columns, and the initial test set into 1 + (438 − w)/s test sets of w rows × 4 columns. The reconstruction is carried out as follows: training single vectors from the j-th row to the (j + w)-th row of the initial test set are intercepted at a sampling interval of step length s, where 1 ≤ j ≤ 437 − w, and arranged one after another in sequence. To make the data set diverse and rich, w = 6 and s = 1 are set here, giving 1747 training sets and 433 test sets respectively.
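The split and sliding-window reconstruction can be checked with a short Python sketch. It assumes that each window holds exactly w consecutive rows, which is the reading under which the counts agree with the 1 + (m − w)/s expression; the helper name `sliding_windows` and the placeholder matrix are hypothetical.

```python
def sliding_windows(rows, w, s):
    """Return every block of w consecutive rows, advancing the start by s rows."""
    return [rows[j:j + w] for j in range(0, len(rows) - w + 1, s)]


n, w, s = 2190, 6, 1
# Placeholder n x 4 matrix standing in for [Num_hos, IMF1, IMF2, IMF3] rows.
matrix = [[float(i), 0.0, 0.0, 0.0] for i in range(n)]

split = 1752  # rows 1..1752 for training, rows 1753..2190 for testing
initial_train, initial_test = matrix[:split], matrix[split:]

train_sets = sliding_windows(initial_train, w, s)
test_sets = sliding_windows(initial_test, w, s)
print(len(train_sets), len(test_sets))  # prints "1747 433"
```

With s = 1 consecutive windows overlap in w − 1 rows, which is how a few thousand daily rows yield a comparable number of training samples.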

Step four: a space-time multi-feature self-attention fusion network is constructed, consisting of a spatial feature extraction structure, a time sequence feature extraction structure and a self-attention output structure, and the training sets are input to train the space-time multi-feature self-attention fusion network.

Specifically, the spatial feature extraction structure includes 2 one-dimensional convolutional layers and 1 one-dimensional pooling layer. The first one-dimensional convolutional layer has 64 channels, a convolution kernel size of 1 × 1, a step length of 1 and a relu activation function; the second one-dimensional convolutional layer likewise has 64 channels, a convolution kernel size of 1 × 1, a step length of 1 and a relu activation function; the one-dimensional pooling layer has a pool size of 2 and a step length of 1.

The time sequence feature extraction structure comprises 2 bidirectional long short-term memory layers; the first has 500 units and the second has 200 units. The self-attention output structure contains 1 self-attention layer, 1 flatten layer and 2 fully connected layers.

The self-attention layer consists of a query, a key and a value, and is calculated as follows: first, the dot product of the query with each key is computed as a similarity to obtain the weights; then the weights are normalized by the Softmax normalization function, and the normalized weights are used to form a weighted sum of the corresponding values, giving the final Attention. Namely,

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where Q, K and V are the matrices of the query, key and value respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and d_k is the vector dimension of the query matrix. The flatten layer converts the output of the self-attention mechanism into one-dimensional data. The first fully connected layer has 100 neurons with a ReLU activation function, and the second fully connected layer has 1 neuron, which outputs 1 predicted value.
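The self-attention calculation described above can be illustrated with a short sketch. It assumes the scaled dot-product form, with Q = K = V set to a single input matrix of shape (time steps, d_k); the function name `self_attention` and the random input are illustrative only.

```python
import numpy as np


def self_attention(x: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention with Q = K = V = x, x of shape (steps, d_k)."""
    d_k = x.shape[-1]
    scores = x @ x.T / np.sqrt(d_k)                # dot-product similarity
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row maximum for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # Softmax normalization over the keys
    return weights @ x                             # weighted sum of the values


# Hypothetical input: 5 time steps, each a 400-dimensional feature vector.
features = np.random.default_rng(0).normal(size=(5, 400))
attended = self_attention(features)
print(attended.shape)  # (5, 400)
```

The output keeps the shape of the input, so the flatten layer that follows receives steps × d_k values per sample.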

After the construction of the space-time multi-feature self-attention fusion network is completed, the second-column data IMF₁, the third-column data IMF₂ and the fourth-column data IMF₃ of the training set are each combined with the first-column data Num_hos to form 3 groups of corresponding input data, yielding 1747 [IMF₁, Num_hos], 1747 [IMF₂, Num_hos] and 1747 [IMF₃, Num_hos], with the number of inpatients Num_hos as the real label. The 3 groups of data are respectively input into 3 space-time multi-feature self-attention fusion networks for training until all training sets have been input.

The 3 space-time multi-feature self-attention fusion networks are then trained with the training set in the above input mode. The training weight parameters are initialized by Kaiming He initialization, the same training hyper-parameters are adopted for all networks, the learning rate is set to 0.001, and 100 batches are trained with an Adam optimizer. The loss function is set as the mean square error, defined as

$$\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N}\left\| \widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)} \right\|_2^2$$

where ‖·‖₂ is the Euclidean norm, N is the number of training data per batch, \widehat{Num}_{hos} is the network predicted value of the number of inpatients, and Num_hos is the actual value of the number of inpatients. After training is completed, all 3 space-time multi-feature self-attention fusion network models have the ability to predict the number of respiratory system inpatients.
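The network structure and training settings described above can be sketched as follows. This is a hedged illustration rather than the implementation of the invention: the use of PyTorch, the class name `FusionNet`, the helper `train_step`, max pooling as the pooling type, an input of 2 features per time step with window w = 6, reading "number of units" as the hidden size per direction, and applying Kaiming initialization only to the convolutional and fully connected weights are all assumptions not stated in the text.

```python
import math

import torch
from torch import nn


class FusionNet(nn.Module):
    """Sketch of one space-time multi-feature self-attention fusion network."""

    def __init__(self, n_features: int = 2, window: int = 6):
        super().__init__()
        # Spatial feature extraction: two 1-D convolutions (64 channels, kernel 1,
        # stride 1, ReLU) followed by one 1-D pooling layer (size 2, stride 1).
        self.spatial = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=1, stride=1), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=1, stride=1), nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=1),  # pooling type is an assumption
        )
        # Time sequence feature extraction: two bidirectional LSTM layers.
        self.lstm1 = nn.LSTM(64, 500, batch_first=True, bidirectional=True)
        self.lstm2 = nn.LSTM(1000, 200, batch_first=True, bidirectional=True)
        # Output structure: flatten, then fully connected layers of 100 and 1 neurons.
        steps = window - 1  # pooling with size 2 and stride 1 removes one time step
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(steps * 400, 100), nn.ReLU(), nn.Linear(100, 1)
        )
        for m in self.modules():  # Kaiming (He) initialization
            if isinstance(m, (nn.Conv1d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
                nn.init.zeros_(m.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_features); Conv1d expects (batch, channels, steps).
        h = self.spatial(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm1(h)
        h, _ = self.lstm2(h)
        # Self-attention with Q = K = V = output of the second LSTM layer.
        scores = h @ h.transpose(1, 2) / math.sqrt(h.size(-1))
        h = torch.softmax(scores, dim=-1) @ h
        return self.head(h).squeeze(-1)


def train_step(model, optimizer, x, y):
    """Run one optimization step with the mean square error loss."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


model = FusionNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```

In this sketch one such model would be created for each of IMF₁, IMF₂ and IMF₃, and `train_step` would be called once per batch of windows.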

Step five: after the training of the 3 space-time multi-feature self-attention fusion networks is completed, the second-column data IMF₁, the third-column data IMF₂ and the fourth-column data IMF₃ of the test set are respectively input, one by one, into the correspondingly trained space-time multi-feature self-attention fusion networks for forward inference until the test set has been completely input.

For each of the intrinsic modal components IMF₁, IMF₂ and IMF₃, the corresponding network outputs a predicted number of respiratory system inpatients, N₁, N₂ and N₃, together with a corresponding confidence, P₁, P₂ and P₃.

Step six: if a prediction confidence P < 0.9, where P ∈ {P₁, P₂, P₃}, the predicted value does not meet the expected requirement. In this case, the daily number of respiratory system inpatients, the daily number of patients to be treated, the season and the daily air quality index in the next preset period are acquired, and steps two, three and four are repeated to update the data and perform online learning of the space-time multi-feature self-attention fusion network. When the network model parameters are fine-tuned during online learning, the parameters of the convolutional layers, the bidirectional long short-term memory layers and the self-attention layer are fixed, and only the parameters of the fully connected layers are updated, so that the network model adapts to the current respiratory system patient data sequence and can predict accurately. Otherwise, the prediction of the number of respiratory system inpatients is considered accurate, and the average of the 3 groups of predicted patient numbers, P_final = (P₁ + P₂ + P₃)/3, is calculated as the predicted number of respiratory system inpatients.
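The decision in step six can be sketched as below. The function name `fuse_predictions` and the example numbers are hypothetical. Two readings are assumed here and are not confirmed by the text: online learning is triggered when any one of the three confidences falls below 0.9, and the final value is the mean of the three predicted patient numbers N₁, N₂ and N₃.

```python
from statistics import mean

CONFIDENCE_THRESHOLD = 0.9


def fuse_predictions(counts, confidences, threshold=CONFIDENCE_THRESHOLD):
    """Return the averaged prediction, or None when online fine-tuning is required."""
    if any(p < threshold for p in confidences):
        # Assumed reading: a single low confidence is enough to trigger retraining.
        return None
    return mean(counts)


# Hypothetical outputs of the three networks.
print(fuse_predictions([120, 126, 123], [0.95, 0.93, 0.91]))  # 123
print(fuse_predictions([120, 126, 123], [0.95, 0.85, 0.91]))  # None
```

A return value of None stands for the branch in which new data are acquired and the fully connected layers are fine-tuned.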

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A method for predicting the number of hospitalized respiratory patients, comprising the steps of:

S1: acquiring the daily number of respiratory system inpatients, the daily number of patients to be treated, the season and the daily air quality index within a preset period;

S5: inputting the test set into the trained multi-feature self-attention fusion network, and predicting the number of respiratory system patients to obtain a predicted value;

2. The method as claimed in claim 1, wherein the step S2 comprises: arranging the daily number of respiratory system inpatients, the daily number of patients to be treated, the season and the air quality index into a 4-dimensional single vector [t₁, t₂, t₃, t₄] for each day, and connecting the 4-dimensional single vectors in the preset period in series, so that the data set is finally arranged into a vector matrix with n rows and 4 columns, wherein the first column is the real label of the data set, and the last three columns are the characteristic factors of the data set;

n is the number of days of the preset period.

3. The method of claim 2, wherein the step S3 of decomposing the data set into a plurality of initial training sets and a plurality of initial test sets comprises: performing VMD (variational modal decomposition) on the characteristic factors to obtain intrinsic modal components IMF₁, IMF₂, IMF₃, IMF₄, IMF₅, IMF₆, IMF₇, IMF₈ and a residual component Res; forming, for each day, a 4-dimensional training single vector [T₁, T₂, T₃, T₄] from the daily number of respiratory system inpatients and the intrinsic modal components IMF₁, IMF₂, IMF₃; connecting the 4-dimensional training single vectors within the preset period in series to finally form a stationary feature matrix with n rows and 4 columns, wherein the first column is the real label of the stationary feature matrix and the last three columns are IMF₁, IMF₂, IMF₃; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to the time length of 4.

4. The method as claimed in claim 3, wherein the step S3 of reconstructing the decomposed initial training set and initial test set to obtain a plurality of training sets and a plurality of test sets comprises: respectively reconstructing the initial training set and the initial test set into 1 + (3/4 n - w)/s training sets of w rows and 4 columns and (1/4 n - w)/s test sets of w rows and 4 columns through a set sliding window and a set step length;

where w is the sliding window and s is the step size.

5. The method as claimed in claim 4, wherein the space-time multi-feature self-attention fusion network in step S4 comprises: a spatial feature extraction structure, a time sequence feature extraction structure and a self-attention output structure;

6. The method as claimed in claim 5, wherein the calculation of the self-attention layer comprises: first computing the dot product of the query and the key as a similarity to obtain a weight, then normalizing the weight through a Softmax normalization function, and performing a weighted summation of the normalized weight and the corresponding value to obtain the final Attention, wherein the formula is as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

7. The method as claimed in claim 5, wherein the step S4 further comprises: after the construction of the space-time multi-feature self-attention fusion network is completed, forming three groups of corresponding data from the intrinsic modal components IMF₁, IMF₂, IMF₃ in the training set, each combined with the number of respiratory system inpatients, and respectively inputting the three groups of data into three space-time multi-feature self-attention fusion networks for training until all the training sets have been input.

8. The method for predicting the number of inpatients in a respiratory system according to claim 7, wherein in the training process of the space-time multi-feature self-attention fusion network, the training weight parameters are initialized by Kaiming He initialization, the same training hyper-parameters are adopted, the learning rate is set to 0.001, 100 batches are trained with an Adam optimizer, and the loss function is set as the mean square error, the loss function being:

$$\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N}\left\| \widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)} \right\|_2^2$$

wherein ‖·‖₂ is the Euclidean norm, N is the number of training data per batch, \widehat{Num}_{hos} is the network predicted value of the number of inpatients, and Num_hos is the actual value of the number of inpatients.

9. The method as claimed in claim 7, wherein the step S5 comprises: correspondingly inputting the intrinsic modal components IMF₁, IMF₂, IMF₃ in the test set into the three trained space-time multi-feature self-attention fusion networks for prediction until all the test sets have been input;

after the intrinsic modal components IMF₁, IMF₂, IMF₃ in the test set have been correspondingly input into the three trained space-time multi-feature self-attention fusion networks, the prediction is finished and three groups of predicted values of the number of respiratory system patients are output, wherein each predicted value comprises a predicted number and a confidence.

10. The method as claimed in claim 9, wherein the step S6 comprises: if the confidence is less than 0.9, the predicted value is not as expected; and if the confidence is greater than or equal to 0.9, calculating the average of the three sets of predicted values;