CN115204509A - Method for predicting number of inpatients in respiratory system - Google Patents
- Publication number
- CN115204509A (application number CN202210893982.6A)
- Authority
- CN
- China
- Prior art keywords
- imf
- training
- sets
- self
- attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/22—Social work or social welfare, e.g. community support activities or counselling services
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention discloses a method for predicting the number of respiratory system inpatients, belonging to the technical field of medical data mining. The method comprises: obtaining the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index to make a data set; decomposing the data set into a plurality of initial training sets and a plurality of initial test sets; reconstructing these into a plurality of training sets and a plurality of test sets; inputting the training sets to train a space-time multi-feature self-attention fusion network; inputting the test sets to test the trained network; and, when a predicted value does not meet expectation, obtaining the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index for the next preset period, so that a predicted value meeting expectation is finally obtained. Data acquisition is simple and convenient, and the number of patients can be accurately predicted.
Description
Technical Field
The invention relates to the technical field of medical data mining, in particular to a respiratory system inpatient quantity prediction method.
Background
How to improve the allocation efficiency of medical resources and effectively reduce unproductive waiting by inpatients is an urgent problem for medical managers. Scientifically analyzing and predicting the number of hospital inpatients, and analyzing changes and trends in inpatient flow in a timely and accurate manner, can give managers a scientific decision-making basis for allocating health resources rationally, deploying medical staff and optimizing the bed pool. This improves the working efficiency and management level of the hospital, raises patient satisfaction, and helps provide timely medical and health services to patients.
The number of respiratory system inpatients in a hospital is influenced by many factors. One aspect is the hospital itself, such as its medical technology, medical service and geographical location; another is the patient's own choice of medical care, which depends on factors such as disease type, economic status and educational level. Hospital and patient factors are usually stable and change little over time, but the number of inpatients is also influenced by natural environmental factors. Some diseases are not directly caused by meteorological changes but often accompany certain seasonal and meteorological conditions. Research shows that air pollution and health effects act synergistically, and that air pollution is closely related to the hospitalization rate, morbidity, mortality, number of hospitalizations and the like. Air pollution shows clear seasonality, short-term fluctuation and long-term trends, while the choice of outpatient or emergency treatment reflects a combined consideration of hospital and patient factors. Predicting the number of respiratory system inpatients from air quality, seasonal factors and the number of patients to be treated is therefore significant.
According to a search of the prior art, Morina et al. proposed a hospital emergency service model based on a second-order integer-valued autoregressive time series, which is used to predict the number of patients admitted per week due to influenza. Zhu Xiangpeng used a Bayesian model to predict outpatient flow in the enterology department of a hospital in Shanghai. Wang et al. proposed patient prediction with a fuzzy min-max neural network based on rule extraction. However, these methods are computationally complex and place high demands on computing power, the number of training samples and prediction time. They are also limited by relying on a single time-series feature, and they lag when predicting abrupt changes in patient numbers. Because the change in patient numbers is influenced by many complex factors, these methods fail to consider both the linear and nonlinear characteristics of patient numbers, fail to integrate multi-factor modeling with time series analysis, and do not consider correlations within the time series data.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, in which, because the change in patient numbers is influenced by various complex factors, existing patient-number prediction methods fail to consider both the linear and nonlinear characteristics of patient numbers, fail to integrate multi-factor modeling with time series analysis, and do not consider correlations within time series data. To this end, the invention provides a method for predicting the number of respiratory system inpatients.
In order to achieve the above object, the present invention provides the following technical solutions:
A method for predicting the number of respiratory system inpatients comprises the following steps:
S1: acquiring, for each day of a preset period, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;
S2: making a patient data set from the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;
S3: decomposing the data set into a plurality of initial training sets and a plurality of initial test sets, and reconstructing the initial training sets and the initial test sets to obtain a plurality of training sets and a plurality of test sets;
S4: constructing a space-time multi-feature self-attention fusion network, and inputting the training sets to train the space-time multi-feature self-attention fusion network;
S5: inputting the test sets into the trained space-time multi-feature self-attention fusion network, and predicting the number of respiratory system inpatients to obtain a predicted value;
S6: judging whether the predicted value meets expectation; if so, outputting the average of the predicted values; if not, acquiring the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index for the next preset period to update the data, and repeating steps S2 to S5 to perform online learning.
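The control flow of steps S1 to S6 can be sketched as a loop that extends the data period by period until a prediction is accepted. This is only an illustrative sketch: the callables `fetch_period`, `train_and_predict` and `meets_expectation` are hypothetical placeholders for S1, S2 to S5 and the S6 check, and the `max_periods` bound is an assumption that the source does not state.

```python
from typing import Callable, List, Sequence, Tuple

# One day of data: (inpatients, patients to be treated, season code, air quality index)
Record = Tuple[float, float, int, float]


def predict_with_online_learning(
    fetch_period: Callable[[int], List[Record]],
    train_and_predict: Callable[[Sequence[Record]], List[float]],
    meets_expectation: Callable[[List[float]], bool],
    max_periods: int = 5,  # assumed safety bound, not from the source
) -> float:
    """Repeat S1-S5 with data from each further period until S6 accepts."""
    records: List[Record] = []
    for period in range(max_periods):
        records.extend(fetch_period(period))      # S1, or the data update in S6
        predictions = train_and_predict(records)  # S2-S5 (hypothetical callable)
        if meets_expectation(predictions):        # S6 check
            return sum(predictions) / len(predictions)
    raise RuntimeError("no acceptable prediction within max_periods")
```

Passing the three stages in as functions keeps the loop independent of how the network is actually built and trained.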
With this technical scheme, a data set is first made from the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index, so data acquisition is simple and convenient. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are then reconstructed into a plurality of training sets and a plurality of test sets. The training sets are input to train the space-time multi-feature self-attention fusion network, and after training is completed the test sets are input to test the trained network. When the predicted value does not meet expectation, the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index for the next preset period are acquired to update the data and to perform online learning of the network, until a predicted value that meets expectation is finally obtained. The network can therefore adapt better to the current respiratory system inpatient data sequence and predict accurately, which addresses the poor prediction caused by the uncertainty of multiple influencing factors and improves the accuracy of the predicted number of respiratory system inpatients. The space-time multi-feature self-attention fusion network also converts the medical data prediction problem into a data-driven supervised learning problem, and finally achieves accurate prediction of the number of respiratory system inpatients.
As a preferred aspect of the present invention, step S2 includes: arranging each day's number of respiratory system inpatients, number of patients to be treated, season and air quality index into a 4-dimensional single vector, concatenating the 4-dimensional single vectors within the preset period into a data set [t_1, t_2, t_3, t_4], and finally arranging the data set into a vector matrix with n rows and 4 columns, wherein the first column is the true label of the data set and the last three columns are the characteristic factors of the data set;
n is the number of days in the preset period.
As a preferred aspect of the present invention, decomposing the data set into a plurality of initial training sets and a plurality of initial test sets in step S3 includes: performing variational mode decomposition (VMD) on the characteristic factors to obtain intrinsic mode components IMF_1, IMF_2, IMF_3, IMF_4, IMF_5, IMF_6, IMF_7, IMF_8 and a residual component Res; combining each day's number of respiratory system inpatients with the intrinsic mode components IMF_1, IMF_2 and IMF_3 into a 4-dimensional training single vector; concatenating the 4-dimensional training single vectors within the preset period into [T_1, T_2, T_3, T_4], finally forming a stationary feature matrix with n rows and 4 columns, wherein the first column is the true label of the stationary feature matrix and the last three columns are IMF_1, IMF_2 and IMF_3; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to a time length of 4.
With this technical scheme, VMD variational mode decomposition effectively captures the inherent linear characteristics of the data, reduces data complexity and coupling, removes more noise, and improves the prediction accuracy of the space-time multi-feature self-attention fusion network.
As a preferred aspect of the present invention, reconstructing the decomposed initial training sets and initial test sets in step S3 to obtain a plurality of training sets and a plurality of test sets includes: using a set sliding window and a set step size, reconstructing the initial training sets into 1 + (3n/4 - w)/s training sets, each of w rows and 4 columns, and the initial test sets into (n/4 - w)/s test sets, each of w rows and 4 columns;
where w is the sliding window and s is the step size.
As a preferred aspect of the present invention, the space-time multi-feature self-attention fusion network in step S4 includes: a spatial feature extraction structure, a time-sequence feature extraction structure and a self-attention output structure;
the spatial feature extraction structure comprises two one-dimensional convolutional layers and one one-dimensional pooling layer; each of the two one-dimensional convolutional layers has 64 channels, a convolution kernel size of 1 × 1, a relu activation function and a stride of 1; the one-dimensional pooling layer has a pool size of 2 and a stride of 1;
the self-attention output structure comprises a self-attention layer, a flatten layer and two fully connected layers, wherein the self-attention layer consists of query, key and value; the first fully connected layer has 100 neurons and a relu activation function, and the second fully connected layer has 1 neuron.
With this technical scheme, the space-time multi-feature self-attention fusion network uses one-dimensional convolutional layers to extract the spatial features of each group of data effectively, and a bidirectional long short-term memory layer structure to extract the time-sequence features of the data effectively. The self-attention structure captures long-term temporal dependencies more strongly and reduces the computational complexity of the model, and the flatten layer converts the output of the self-attention mechanism into one-dimensional data, so that accurate quantity prediction is finally achieved.
As a preferred aspect of the present invention, the calculation of the self-attention layer includes: first computing the dot product of the query and the key to obtain similarity weights, then normalizing the weights with the Softmax normalization function, and finally taking the weighted sum of the corresponding values with the normalized weights to obtain the final Attention, according to the formula:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
wherein Q, K and V are the query, key and value matrices respectively, Q = K = V is the output matrix of the time-sequence feature extraction structure, and d_k is the vector dimension of the query matrix.
As a preferred aspect of the present invention, step S4 further includes: after the space-time multi-feature self-attention fusion network has been constructed, pairing each of the intrinsic mode components IMF_1, IMF_2 and IMF_3 in the training sets with the number of respiratory system inpatients to form three corresponding groups of data, and inputting the three groups of data into three space-time multi-feature self-attention fusion networks respectively for training, until all the training sets have been input.
By adopting the technical scheme, the comprehensive quantity prediction is carried out by training the three space-time multi-feature self-attention fusion networks, so that the reliability of the prediction result is ensured.
As a preferred aspect of the present invention, during training of the space-time multi-feature self-attention fusion networks, the weight parameters are initialized with Kaiming He initialization, the same training hyper-parameters are used for every network, the learning rate is set to 0.001, the Adam optimizer is used to train for 100 batches, and the loss function is the mean square error:
$$L = \frac{1}{N}\sum_{i=1}^{N}\left\lVert \widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)} \right\rVert_2^{2}$$
wherein $\lVert\cdot\rVert_2$ is the Euclidean norm, N is the number of training data per batch, $\widehat{Num}_{hos}$ is the network's predicted number of inpatients, and $Num_{hos}$ is the actual number of inpatients.
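For scalar daily counts, the squared Euclidean norm in the mean-square-error loss reduces to a squared difference, so the loss over one batch can be sketched as follows. The function name `mse_loss` and the sample numbers are illustrative assumptions, not taken from the source.

```python
from typing import Sequence


def mse_loss(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of squared differences between predicted and actual inpatient counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)  # N: number of training data in the batch
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n


# Illustrative batch of two days: errors of 2 and 0 patients.
batch_loss = mse_loss([12.0, 15.0], [10.0, 15.0])
```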
As a preferred aspect of the present invention, step S5 includes: inputting the intrinsic mode components IMF_1, IMF_2 and IMF_3 in the test sets into the three trained space-time multi-feature self-attention fusion networks respectively for prediction, until all the test sets have been input;
after IMF_1, IMF_2 and IMF_3 have been input into the three corresponding trained space-time multi-feature self-attention fusion networks, prediction is completed and three corresponding groups of predicted values of respiratory system inpatients are output, each predicted value comprising the predicted number and a confidence.
As a preferred aspect of the present invention, step S6 includes: if the confidence is less than 0.9, the predicted value does not meet expectation; if the confidence is greater than or equal to 0.9, the predicted value meets expectation and the average of the three groups of predicted values is calculated;
when steps S2 to S5 are repeated, a data update needs to be performed on the fully connected layers.
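The acceptance rule of step S6 can be sketched as below. The source gives the 0.9 threshold and the averaging of the three groups of predicted values; requiring every one of the three networks to reach the threshold is an assumption made here, since the source does not say how the three confidences are combined. The name `accept_or_retrain` is hypothetical.

```python
from typing import List, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.9  # threshold stated in the source


def accept_or_retrain(results: List[Tuple[float, float]]) -> Optional[float]:
    """Take one (predicted_number, confidence) pair per network.

    Returns the average predicted number when the prediction is accepted,
    or None to signal that data should be updated and S2-S5 repeated.
    """
    if not results:
        raise ValueError("at least one prediction is required")
    # Assumption: all networks must reach the threshold for acceptance.
    if any(conf < CONFIDENCE_THRESHOLD for _, conf in results):
        return None
    return sum(number for number, _ in results) / len(results)
```

Returning `None` rather than raising keeps the retraining decision with the caller, which is where the data update for the next preset period would happen.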
Compared with the prior art, the invention has the following beneficial effects. First, a data set is made from the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index, so data acquisition is simple and convenient. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are reconstructed into a plurality of training sets and a plurality of test sets; the training sets are input to train the space-time multi-feature self-attention fusion network, and after training the test sets are input to test the trained network. When the predicted value does not meet expectation, the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index for the next preset period are acquired to update the data and to perform online learning of the network, until a predicted value that meets expectation is finally obtained. The network can therefore adapt better to the current respiratory system inpatient data sequence and predict accurately, which addresses the poor prediction caused by the uncertainty of multiple influencing factors and improves the accuracy of the predicted number of respiratory system inpatients. The network also converts the medical data prediction problem into a data-driven supervised learning problem. Second, VMD variational mode decomposition effectively captures the inherent linear characteristics of the data, reduces data complexity and coupling, removes more noise, and improves the prediction accuracy of the network. Third, the network uses one-dimensional convolutional layers to extract the spatial features of each group of data effectively and a bidirectional long short-term memory layer structure to extract the time-sequence features effectively; its self-attention structure captures long-term temporal dependencies more strongly and reduces the computational complexity of the model, and its flatten layer converts the output of the self-attention mechanism into one-dimensional data, so that accurate quantity prediction is finally achieved. Finally, the method is widely applicable: by modifying the input or output end of the space-time multi-feature self-attention fusion network accordingly, it can be extended to patient-number prediction in many scenarios, giving it strong scene adaptability and extensibility.
Drawings
Fig. 1 is a flowchart of the method for predicting the number of respiratory system inpatients according to embodiment 1 of the present invention;
Fig. 2 is a structural diagram of the space-time multi-feature self-attention fusion network of the method for predicting the number of respiratory system inpatients according to embodiment 1 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to test examples and specific embodiments. It should be understood that the scope of the above-described subject matter is not limited to the following examples, and any techniques implemented based on the disclosure of the present invention are within the scope of the present invention.
Example 1
A method for predicting the number of respiratory system inpatients, as shown in Fig. 1, comprises the following steps:
S1: acquiring, for each day of a preset period, the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;
S2: making a patient data set from the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;
S3: decomposing the data set into a plurality of initial training sets and a plurality of initial test sets, and reconstructing the initial training sets and the initial test sets to obtain a plurality of training sets and a plurality of test sets;
S4: constructing a space-time multi-feature self-attention fusion network, and inputting the training sets to train the space-time multi-feature self-attention fusion network;
S5: inputting the test sets into the trained space-time multi-feature self-attention fusion network, and predicting the number of respiratory system inpatients to obtain a predicted value;
S6: judging whether the predicted value meets expectation; if so, outputting the average of the predicted values; if not, acquiring the daily number of respiratory system inpatients, number of patients to be treated, season and air quality index for the next preset period to update the data, and repeating steps S2 to S5 to perform online learning.
Step S2 includes: arranging each day's number of respiratory system inpatients, number of patients to be treated, season and air quality index into a 4-dimensional single vector, concatenating the 4-dimensional single vectors within the preset period into a data set [t_1, t_2, t_3, t_4], and finally arranging the data set into a vector matrix with n rows and 4 columns, wherein the first column is the true label of the data set and the last three columns are the characteristic factors of the data set;
n is the number of days in the preset period.
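The n-row, 4-column matrix of step S2 can be sketched with plain lists as follows. The integer season encoding in `SEASON_CODE` and the sample values are assumptions for illustration; the source does not say how the season is represented numerically.

```python
from typing import List, Sequence, Tuple

# Assumed numeric encoding of the season (not specified in the source).
SEASON_CODE = {"spring": 0, "summer": 1, "autumn": 2, "winter": 3}


def build_dataset(
    inpatients: Sequence[float],
    to_be_treated: Sequence[float],
    seasons: Sequence[str],
    air_quality: Sequence[float],
) -> List[List[float]]:
    """Stack one 4-dimensional vector per day into an n x 4 matrix."""
    n = len(inpatients)
    if not (len(to_be_treated) == len(seasons) == len(air_quality) == n):
        raise ValueError("all daily series must cover the same n days")
    return [
        [float(inpatients[i]), float(to_be_treated[i]),
         float(SEASON_CODE[seasons[i]]), float(air_quality[i])]
        for i in range(n)
    ]


def split_label_features(matrix: List[List[float]]) -> Tuple[List[float], List[List[float]]]:
    """Column 0 is the true label; columns 1-3 are the characteristic factors."""
    return [row[0] for row in matrix], [row[1:] for row in matrix]


# Two illustrative winter days.
matrix = build_dataset([12, 15], [140, 155], ["winter", "winter"], [88, 120])
labels, features = split_label_features(matrix)
```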
Decomposing the data set into a plurality of initial training sets and a plurality of initial test sets in step S3 includes: performing variational mode decomposition (VMD) on the characteristic factors to obtain intrinsic mode components IMF_1, IMF_2, IMF_3, IMF_4, IMF_5, IMF_6, IMF_7, IMF_8 and a residual component Res; combining each day's number of respiratory system inpatients with the intrinsic mode components IMF_1, IMF_2 and IMF_3 into a 4-dimensional training single vector; concatenating the 4-dimensional training single vectors within the preset period into [T_1, T_2, T_3, T_4], finally forming a stationary feature matrix with n rows and 4 columns, wherein the first column is the true label of the stationary feature matrix and the last three columns are IMF_1, IMF_2 and IMF_3; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to a time length of 4.
The number of inpatients is denoted Num_hos, the number of patients to be treated is denoted Num_out, the season is denoted Sea, and the daily air quality index is denoted Q.
Reconstructing the decomposed initial training sets and initial test sets in step S3 to obtain a plurality of training sets and a plurality of test sets includes: using a set sliding window and a set step size, reconstructing the initial training sets into 1 + (3n/4 - w)/s training sets, each of w rows and 4 columns, and the initial test sets into (n/4 - w)/s test sets, each of w rows and 4 columns;
where w is the sliding window and s is the step size.
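A minimal sketch of the sliding-window reconstruction follows, reading the expressions above as counts of w-row windows taken from the first three quarters (training) and last quarter (test) of the n days. That reading, the helper names and the values n = 40, w = 6, s = 2 are assumptions. A standard sliding window yields 1 + (rows - w)/s windows, which matches the stated training count; for the test part it gives one more window than the source's expression, and the source does not explain the difference.

```python
from typing import List

Matrix = List[List[float]]


def chronological_split(matrix: Matrix, train_fraction: float = 0.75):
    """Keep time order: the earliest 3/4 of days train, the latest 1/4 test."""
    cut = int(len(matrix) * train_fraction)
    return matrix[:cut], matrix[cut:]


def sliding_windows(rows: Matrix, w: int, s: int) -> List[Matrix]:
    """Cut consecutive w-row windows, advancing s rows each time."""
    return [rows[start:start + w] for start in range(0, len(rows) - w + 1, s)]


# Illustrative n = 40 days with 4 columns; column 0 holds the day index.
demo = [[float(day), 0.0, 0.0, 0.0] for day in range(40)]
train_rows, test_rows = chronological_split(demo)
train_windows = sliding_windows(train_rows, w=6, s=2)  # 1 + (30 - 6) / 2 windows
test_windows = sliding_windows(test_rows, w=6, s=2)    # 1 + (10 - 6) / 2 windows
```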
As shown in Fig. 2, the space-time multi-feature self-attention fusion network in step S4 includes: a spatial feature extraction structure, a time-sequence feature extraction structure and a self-attention output structure;
the spatial feature extraction structure comprises two one-dimensional convolutional layers and one one-dimensional pooling layer; each of the two one-dimensional convolutional layers has 64 channels, a convolution kernel size of 1 × 1, a relu activation function and a stride of 1; the one-dimensional pooling layer has a pool size of 2 and a stride of 1;
the time-sequence feature extraction structure comprises two bidirectional long short-term memory layers; the first bidirectional long short-term memory layer has 500 units and the second has 200 units;
the self-attention output structure comprises a self-attention layer, a flatten layer and two fully connected layers, wherein the self-attention layer consists of query, key and value; the first fully connected layer has 100 neurons and a relu activation function, and the second fully connected layer has 1 neuron.
The calculation of the self-attention layer includes: first, computing the dot-product similarity between the query and the key to obtain a weight; then normalizing the weight with a Softmax normalization function; and finally computing the weighted sum of the normalized weight and the corresponding value to obtain the final Attention, according to the formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where Q, K and V are the query, key and value matrices respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and $d_k$ is the vector dimension of the query matrix.
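The three steps described above (dot-product similarity, Softmax normalization, weighted sum of the values) can be sketched in NumPy as follows. This is only an illustrative sketch: the input shape of 5 time steps by 400 features and the function names are assumptions made for the example, not values fixed by the source.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def self_attention(H):
    """Scaled dot-product self-attention with Q = K = V = H.

    H: array of shape (time_steps, d_k), standing in for the output of the
    time sequence feature extraction structure.
    """
    Q = K = V = H
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # dot-product similarity, scaled
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Hypothetical input: 5 time steps with 400 features each.
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 400))
attention, weights = self_attention(H)
```

The output has the same shape as the input, so the flatten layer that follows sees one attended feature vector per time step.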
The step S4 further includes: after the construction of the space-time multi-feature self-attention fusion network is completed, combining each of the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ in the training set with the number of respiratory system inpatients to form three groups of corresponding data, and inputting the three groups of data respectively into three space-time multi-feature self-attention fusion networks for training until all the training sets are input.
In the training process of the space-time multi-feature self-attention fusion networks, the training weight parameters are initialized with Kaiming (He) initialization, the same training hyperparameters are used for all networks, the learning rate is set to 0.001, 100 batches of training are performed with an Adam optimizer, and the loss function is set to the mean square error:

$$L = \frac{1}{N}\sum_{i=1}^{N}\left\|\widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)}\right\|_2^2$$

where $\|\cdot\|_2$ is the Euclidean norm, N is the number of training data per batch, $\widehat{Num}_{hos}$ is the network's predicted number of inpatients, and $Num_{hos}$ is the actual number of inpatients.
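As a small worked example of the mean square error described above, the following sketch averages the squared Euclidean norm of the residuals over one batch. The 1/N normalization and the sample numbers are assumptions chosen for illustration.

```python
import numpy as np

def mse_loss(predicted, actual):
    """Mean square error over one batch of N samples."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    n = predicted.shape[0]
    # Squared Euclidean norm of each residual, averaged over the batch.
    residual = (predicted - actual).reshape(n, -1)
    return float(np.sum(residual ** 2) / n)

# Hypothetical batch of N = 3 predicted and actual inpatient counts.
loss = mse_loss([10.0, 12.0, 9.0], [11.0, 12.0, 7.0])
```

Here the residuals are -1, 0 and 2, so the loss is (1 + 0 + 4) / 3.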
The step S5 includes: inputting the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ in the test set into the three corresponding trained space-time multi-feature self-attention fusion networks for prediction until all the test sets are input;
after the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ have been input into the three corresponding trained space-time multi-feature self-attention fusion networks, prediction is completed and three groups of predicted values of the number of respiratory system inpatients are output, wherein each predicted value comprises the predicted number and the confidence.
The step S6 includes: when the confidence is less than 0.9, the predicted value does not meet expectation; when the confidence is greater than or equal to 0.9, the predicted value meets expectation, and the average of the three groups of predicted values is calculated;
when the steps S2 to S5 are repeated, data updating is performed on the fully connected layers.
With the above technical scheme, the number of respiratory system inpatients, the number of patients to be diagnosed, the season and the air quality index are first obtained to build a data set; these data are simple and convenient to acquire. The data set is decomposed into a plurality of initial training sets and a plurality of initial test sets, which are then reconstructed into a plurality of training sets and a plurality of test sets. The training sets are input to train the space-time multi-feature self-attention fusion network, and after training is finished the test sets are input to test the trained network. When the predicted value does not meet expectation, the daily number of respiratory system inpatients, the daily number of patients to be diagnosed, the season and the daily air quality index in the next preset period are obtained, and data updating and online learning of the space-time multi-feature self-attention fusion network are performed until a predicted value that meets expectation is obtained. The network can thus better adapt to the current respiratory system inpatient data sequence and predict accurately, which addresses the poor prediction performance caused by multiple uncertain influencing factors and improves the accuracy of predicting the number of respiratory system inpatients.
Example 2
This example is a specific implementation of Example 1:
Step one: The daily number of respiratory system inpatients $Num_{hos}$ during 2013-2018 is obtained from the medical record system of a hospital in the designated area; the daily number of respiratory system patients to be treated $Num_{out}$ and the season Sea are obtained from the outpatient system; and the daily air quality index Q from 2013 to 2018 is obtained from the China air quality online monitoring and analysis platform.
Step two: to represent characteristic factors and real labels affecting respiratory hospitalized patients, the four types of data are expressed as { Num } hos ,Num out Sea, Q } into 4-dimensional single vectors, and concatenating the 4-dimensional single vectors for each day for 6 years into [ t ] 1 ,t 2 ,t 3 ,t 4 ]N =2190, which is a matrix of n rows by 4 columns, where the number Num of respiratory hospitalized patients is the first column hos For the true label of the dataset, the last three columns [ Num out ,Sea,Q]Is a characteristic factor of the data set.
Step three: in order to obtain the stable linear characteristics of the matrix, VMD variational modal decomposition is carried out on the special diagnosis factors of the matrix, and the special diagnosis factors are decomposed into intrinsic modal components IMF 1 、IMF 2 、IMF 3 、IMF 4 、IMF 5 、 IMF 6 、IMF 7 、IMF 8 And a residual component Res, while taking the number of respiratory system hospitalizations per day and the intrinsic modal component IMF in order to reduce model complexity 1 、IMF 2 、IMF 3 Form a 4-dimensional training single vector Num hos ,IMF 1 ,IMF 2 ,IMF 3 ]And serially connecting the 4-dimensional training single vectors into [ T ] 1 ,T 2 ,T 3 , T 4 ]Finally, an n-row x 4-column stationary feature matrix is formed, wherein the number of respiratory hospitalized patients in the first column is the true label of the data set, and the IMFs in the last three columns 1 、IMF 2 、IMF 3 And dividing the stationary characteristic matrix into a plurality of initial training sets and a plurality of initial testing sets according to the time length of 4.
Specifically, the 1 st line to the 1752 nd line are the initial training set, and the 1753 rd line to the 2190 th line are the initial test set.
Specifically, using a set sliding window w and a set step length s, the initial training set is reconstructed into 1 + (1752 − w)/s training sets of w rows × 4 columns, and the initial test set into 1 + (438 − w)/s test sets of w rows × 4 columns. The reconstruction proceeds as follows: at a sampling stride of s, the training single vectors from row j to row j + w − 1 of the initial training set or the initial test set are intercepted, where 1 ≤ j ≤ 1753 − w for the initial training set and 1 ≤ j ≤ 439 − w for the initial test set, and the resulting windows are arranged one after another in order. To make the data set diverse and rich, w = 6 and s = 1 are set here, resulting in 1747 training sets and 433 test sets, respectively.
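The split and the sliding-window reconstruction can be checked with a short NumPy sketch. The placeholder matrix and the helper name `sliding_windows` are hypothetical; only the row counts (2190, 1752, 438) and the settings w = 6, s = 1 come from the example.

```python
import numpy as np

def sliding_windows(matrix, w, s):
    """Cut a (rows x cols) matrix into windows of w consecutive rows.

    Window starts advance by s rows, so the number of windows is
    1 + (rows - w) // s.
    """
    rows = matrix.shape[0]
    starts = range(0, rows - w + 1, s)
    return np.stack([matrix[j:j + w] for j in starts])

# Placeholder stationary feature matrix: 2190 rows x 4 columns.
features = np.arange(2190 * 4, dtype=float).reshape(2190, 4)

initial_train = features[:1752]  # rows 1..1752 (1-based)
initial_test = features[1752:]   # rows 1753..2190 (1-based)

train_sets = sliding_windows(initial_train, w=6, s=1)
test_sets = sliding_windows(initial_test, w=6, s=1)
```

With these settings the sketch yields 1747 training windows and 433 test windows, each of 6 rows × 4 columns, matching the counts given in the text.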
Step four: constructing a space-time multi-feature self-attention fusion network, wherein the space-time multi-feature self-attention fusion network consists of a space feature extraction structure, a time sequence feature extraction structure and a self-attention output structure, and inputting the training set to train the space-time multi-feature self-attention fusion network;
Specifically, the spatial feature extraction structure includes 2 one-dimensional convolutional layers and 1 one-dimensional pooling layer. The first one-dimensional convolutional layer has 64 channels, a convolution kernel size of 1 × 1, a step length of 1 and a ReLU activation function; the second one-dimensional convolutional layer likewise has 64 channels, a convolution kernel size of 1 × 1, a step length of 1 and a ReLU activation function; the one-dimensional pooling layer has a size of 2 and a step length of 1.
The time sequence feature extraction structure comprises 2 bidirectional long-short time memory layers; the first has 500 units and the second has 200 units. The self-attention output structure contains 1 self-attention layer, 1 flatten layer and 2 fully connected layers.
The self-attention layer consists of query, key and value, and is calculated as follows: first, the dot-product similarity between the query and each key is computed to obtain a weight; then the weight is normalized with a Softmax normalization function, and the weighted sum of the normalized weight and the corresponding value gives the final Attention. Namely,

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where Q, K and V are the query, key and value matrices respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and $d_k$ is the vector dimension of the query matrix. The flatten layer converts the output of the self-attention mechanism into one-dimensional data. The first fully connected layer has 100 neurons and a ReLU activation function; the second fully connected layer has 1 neuron, representing the single predicted value that is output.
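To make the layer sizes concrete, the following sketch traces the tensor shape through the layers listed above for one window of w = 6 rows. Several points are assumptions not stated in the source: the pooling layer uses no padding, both bidirectional layers return the full sequence with forward and backward outputs concatenated, and each network receives 2 input features per time step. The function `trace_shapes` is a hypothetical helper, not part of any library.

```python
def trace_shapes(window, in_features=2):
    """Follow the (time_steps, features) shape through the layers."""
    shapes = [("input", (window, in_features))]
    steps, feats = window, in_features

    # Two Conv1D layers: 64 channels, kernel size 1, stride 1.
    for name in ("conv1d_1", "conv1d_2"):
        feats = 64
        shapes.append((name, (steps, feats)))

    # Pooling: size 2, stride 1, assuming no padding.
    steps = (steps - 2) // 1 + 1
    shapes.append(("pool1d", (steps, feats)))

    # Bidirectional LSTMs: forward and backward outputs concatenated (assumed).
    for name, units in (("bilstm_1", 500), ("bilstm_2", 200)):
        feats = 2 * units
        shapes.append((name, (steps, feats)))

    shapes.append(("self_attention", (steps, feats)))  # shape preserved
    shapes.append(("flatten", (steps * feats,)))
    shapes.append(("dense_1", (100,)))
    shapes.append(("dense_2", (1,)))
    return shapes

shapes = dict(trace_shapes(window=6))
```

Under these assumptions the flatten layer produces 5 × 400 = 2000 values, which the two fully connected layers reduce to 100 and then to the single predicted value.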
After the construction of the space-time multi-feature self-attention fusion network is completed, the second-column data $IMF_1$, the third-column data $IMF_2$ and the fourth-column data $IMF_3$ of the training set are each combined with the first-column data $Num_{hos}$ into 3 groups of corresponding input data, yielding 1747 $[IMF_1, Num_{hos}]$, 1747 $[IMF_2, Num_{hos}]$ and 1747 $[IMF_3, Num_{hos}]$, with the number of inpatients $Num_{hos}$ as the true label. The 3 groups of data are respectively input into 3 space-time multi-feature self-attention fusion networks for training until all training sets are input.
The 3 space-time multi-feature self-attention fusion networks are then trained with the training sets as input. The training weight parameters are initialized with Kaiming (He) initialization, the same training hyperparameters are used for all three networks, the learning rate is set to 0.001, and 100 batches of training are performed with an Adam optimizer. The loss function $L$ is set as the mean square error, defined as

$$L = \frac{1}{N}\sum_{i=1}^{N}\left\|\widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)}\right\|_2^2$$

where $\|\cdot\|_2$ is the Euclidean norm, N is the number of training data per batch, $\widehat{Num}_{hos}$ is the network's predicted number of inpatients, and $Num_{hos}$ is the actual number of inpatients. After training is completed, all 3 space-time multi-feature self-attention fusion network models are able to predict the number of respiratory system inpatients.
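The weight initialization named above can be illustrated with a minimal NumPy sketch of the normal variant of Kaiming (He) initialization, which draws weights with standard deviation sqrt(2 / fan_in). The source does not say whether the normal or uniform variant is used, so the choice here is an assumption, as are the layer dimensions.

```python
import numpy as np

def he_normal(fan_in, fan_out, rng):
    """Draw a (fan_in x fan_out) weight matrix with std sqrt(2 / fan_in)."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(loc=0.0, scale=std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
# Hypothetical layer: 2000 inputs feeding the 100-neuron fully connected layer.
weights = he_normal(2000, 100, rng)
```

Scaling the variance by the fan-in keeps the activation magnitude roughly stable across ReLU layers, which is the usual motivation for this scheme.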
Step five: after the training of the 3 space-time multi-feature self-attention fusion networks is completed, the second-line data IMF of the test set is respectively used 1 Third column data IMF 2 And fourth column data IMF 3 And inputting the correspondingly trained space-time multi-feature self-attention fusion network one by one to carry out forward reasoning until the test set is completely input.
For each of said natural modal components IMF 1 、IMF 2 、IMF 3 All can obtain the predicted number N of the respiratory system inpatients through the corresponding network output 1 、N 2 And N 3 And corresponding confidence degree P 1 、P 2 And P 3 。
Step six: if the confidence P is predicted<0.9,P∈{P 1 ,P 2 ,P 3 When the predicted values do not meet the expected requirements, acquiring the number of the respiratory system inpatients every day, the number of the patients to be hospitalized every day, seasons and the air quality index every day in the next preset period, repeating the second step, the second step and the fourth step, updating data and online learning of the space-time multi-feature self-attention fusion network, fixing the convolution layer, the two-way long-short-term memory layer and the self-attention layer parameters of the space-time multi-feature self-attention fusion network when the network model parameters are finely adjusted in online learning, and only fixing the parameters of the space-time multi-feature self-attention fusion network and only processing the space-time multi-feature self-attention fusion networkThe parameters of the full connection layer of the multi-feature self-attention fusion network are subjected to data updating, the purpose is to enable the network model to adapt to the current respiratory system patient data set sequence, namely accurate prediction can be achieved, otherwise, the respiratory system inpatient quantity prediction is considered to be accurate, and the 3 groups of average values P of the predicted patient quantity are obtained through calculation final =(P 1 +P 2 +P 3 ) And/3, the predicted number of hospitalized respiratory patients.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (10)
1. A method for predicting the number of hospitalized respiratory patients, comprising the steps of:
S1: acquiring the number of respiratory system inpatients per day in a preset period, and acquiring the number of patients to be treated per day, the season and the air quality index per day in the preset period;
S2: making a patient data set according to the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index;
S3: decomposing the data set into a plurality of initial training sets and a plurality of initial test sets, and reconstructing the initial training sets and the initial test sets to obtain a plurality of training sets and a plurality of test sets;
S4: constructing a space-time multi-feature self-attention fusion network, and inputting the training sets to train the space-time multi-feature self-attention fusion network;
S5: inputting the test sets into the trained multi-feature self-attention fusion network, and predicting the number of respiratory system patients to obtain a predicted value;
S6: judging whether the predicted value meets expectation; if so, outputting the average value of the predicted value; if not, acquiring the number of respiratory system inpatients per day, the number of patients to be diagnosed per day, the season and the air quality index per day in the next preset period to update the data, and repeating the steps S2-S5 to perform online learning.
2. The method as claimed in claim 1, wherein the step S2 comprises: arranging the number of respiratory system inpatients, the number of patients to be treated, the season and the air quality index of each day into 4-dimensional single vectors, and concatenating the 4-dimensional single vectors in the preset period into $[t_1, t_2, t_3, t_4]$, so that the data set is finally arranged into a vector matrix with n rows and 4 columns, wherein the first column is the true label of the data set, and the last three columns are the characteristic factors of the data set;
n is the number of days of the preset period.
3. The method of claim 2, wherein the step S3 of decomposing the data set into a plurality of initial training sets and a plurality of initial test sets comprises: performing variational modal decomposition (VMD) on the characteristic factors to obtain intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$, $IMF_4$, $IMF_5$, $IMF_6$, $IMF_7$, $IMF_8$ and a residual component Res; taking the daily number of respiratory system inpatients and the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ to form 4-dimensional training single vectors; concatenating the daily 4-dimensional training single vectors within the preset period into $[T_1, T_2, T_3, T_4]$, finally forming a stationary feature matrix with n rows and 4 columns, wherein the first column is the true label of the stationary feature matrix and the last three columns are $IMF_1$, $IMF_2$, $IMF_3$; and dividing the stationary feature matrix into a plurality of initial training sets and a plurality of initial test sets according to the time length of 4.
4. The method as claimed in claim 3, wherein the step S3 of reconstructing the decomposed initial training sets and initial test sets to obtain a plurality of training sets and a plurality of test sets comprises: using a set sliding window and a set step length, reconstructing the initial training set into $1 + (\tfrac{3}{4}n - w)/s$ training sets of w rows and 4 columns, and the initial test set into $(\tfrac{1}{4}n - w)/s$ test sets of w rows and 4 columns;
where w is the sliding window and s is the step size.
5. The method as claimed in claim 4, wherein the space-time multi-feature self-attention fusion network in step S4 comprises: a spatial feature extraction structure, a time sequence feature extraction structure and a self-attention output structure;
the spatial feature extraction structure comprises two one-dimensional convolutional layers and one one-dimensional pooling layer; each of the two one-dimensional convolutional layers has 64 channels, a convolution kernel size of 1 × 1, a ReLU activation function and a step length of 1; the one-dimensional pooling layer has a pooling size of 2 and a step length of 1;
the time sequence feature extraction structure comprises two bidirectional long-short time memory layers; the first bidirectional long-short time memory layer has 500 units and the second has 200 units;
the self-attention output structure comprises one self-attention layer, one flatten layer and two fully connected layers, wherein the self-attention layer consists of query, key and value; the first fully connected layer has 100 neurons and a ReLU activation function, and the second fully connected layer has 1 neuron.
6. The method as claimed in claim 5, wherein the calculation of the self-attention layer comprises: first, computing the dot-product similarity between the query and the key to obtain a weight; then normalizing the weight with a Softmax normalization function; and computing the weighted sum of the normalized weight and the corresponding value to obtain the final Attention, according to the formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where Q, K and V are the query, key and value matrices respectively, Q = K = V is the output matrix of the time sequence feature extraction structure, and $d_k$ is the vector dimension of the query matrix.
7. The method as claimed in claim 5, wherein the step S4 further comprises: after the construction of the space-time multi-feature self-attention fusion network is completed, combining each of the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ in the training set with the number of respiratory system inpatients to form three groups of corresponding data, and inputting the three groups of data respectively into three space-time multi-feature self-attention fusion networks for training until all the training sets are input.
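8. The method for predicting the number of inpatients in a respiratory system according to claim 7, wherein in the training process of the space-time multi-feature self-attention fusion networks, the training weight parameters are initialized with Kaiming (He) initialization, the same training hyperparameters are adopted, the learning rate is set to 0.001, 100 batches of training are performed with an Adam optimizer, the loss function is set as the mean square error, and the loss function is:

$$L = \frac{1}{N}\sum_{i=1}^{N}\left\|\widehat{Num}_{hos}^{(i)} - Num_{hos}^{(i)}\right\|_2^2$$

where $\|\cdot\|_2$ is the Euclidean norm, N is the number of training data per batch, $\widehat{Num}_{hos}$ is the network's predicted number of inpatients, and $Num_{hos}$ is the actual number of inpatients.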
9. The method as claimed in claim 7, wherein the step S5 comprises: inputting the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ in the test set into the three corresponding trained space-time multi-feature self-attention fusion networks for prediction until all the test sets are input;
after the intrinsic modal components $IMF_1$, $IMF_2$, $IMF_3$ in the test set have been input into the three corresponding trained space-time multi-feature self-attention fusion networks, prediction is completed and three groups of predicted values of the number of respiratory system patients are output, wherein each predicted value comprises the predicted number and the confidence.
10. The method as claimed in claim 9, wherein the step S6 includes: the predicted value does not meet expectation if the confidence is less than 0.9; the predicted value meets expectation if the confidence is greater than or equal to 0.9, in which case the average of the three sets of predicted values is calculated;
when the steps S2 to S5 are repeated, data updating is performed on the fully connected layers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210893982.6A CN115204509A (en) | 2022-07-27 | 2022-07-27 | Method for predicting number of inpatients in respiratory system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115204509A true CN115204509A (en) | 2022-10-18 |
Family
ID=83583925
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210893982.6A Withdrawn CN115204509A (en) | 2022-07-27 | 2022-07-27 | Method for predicting number of inpatients in respiratory system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115204509A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117079821A (en) * | 2023-10-12 | 2023-11-17 | 北京大学第三医院(北京大学第三临床医学院) | Patient hospitalization event prediction method |
CN117235487A (en) * | 2023-10-12 | 2023-12-15 | 北京大学第三医院(北京大学第三临床医学院) | Feature extraction method and system for predicting hospitalization event of asthma patient |
CN117079821B (en) * | 2023-10-12 | 2023-12-19 | 北京大学第三医院(北京大学第三临床医学院) | Patient hospitalization event prediction method |
CN117235487B (en) * | 2023-10-12 | 2024-03-12 | 北京大学第三医院(北京大学第三临床医学院) | Feature extraction method and system for predicting hospitalization event of asthma patient |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20221018 |