CN115114979A - Infectious disease propagation state prediction method and device based on clustering - Google Patents

Infectious disease propagation state prediction method and device based on clustering

Publication number
CN115114979A
Authority
CN
China
Prior art keywords
state
infectious disease
rate
clustering
markov
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210720633.4A
Other languages
Chinese (zh)
Inventor
许丹
郭云霄
谢欣嘉
赵润豪
盖顺
朱成岚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202210720633.4A
Publication of CN115114979A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Landscapes

  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application relates to a cluster-based infectious disease propagation state prediction method and a cluster-based infectious disease propagation state prediction device, which are characterized in that relevant data of infectious disease prevalence conditions in a preset time period of a target area are obtained in real time, a definite diagnosis rate index, a death rate index and a recovery rate index are constructed, a Markov state space is constructed according to the three indexes, the state space is divided according to a method of minimizing deviation square sum to obtain a preset number of state subspaces, a Markov chain is constructed according to the state subspaces to obtain a probability transition matrix, and the propagation state of infectious diseases is predicted according to the probability transition matrix. The invention comprehensively considers the diagnosis rate index, the death rate index and the recovery rate index of the infectious disease, analyzes the development of the infectious disease by adopting a clustering method according to the characteristics of the indexes, constructs and analyzes the stationarity of a probability transfer matrix, and accurately predicts the state development of the epidemic situation of the infectious disease.

Description

Infectious disease propagation state prediction method and device based on clustering
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for predicting an infectious disease propagation status based on clustering.
Background
The prevalence of an infectious disease is difficult to assess quantitatively. Over time, viruses mutate and governments adjust their policies, so epidemic data do not satisfy the independent and identically distributed (i.i.d.) assumption, and conventional statistical methods are often ineffective. For prediction, the conventional SIR infectious disease model uses only initial historical data, so its prediction error grows over time. To overcome these shortcomings, many researchers have applied Markov chain theory to predict the incidence of infectious diseases, establishing mathematical models and quantitatively analyzing the incidence of various diseases. Because of its no-aftereffect (memoryless) property, a Markov chain model depends little on historical data and places only loose requirements on the i.i.d. condition, so its accuracy is high. However, the Markov chain method has limitations. First, the model conventionally divides the state space using manually preset intervals, which is subjective and biases the result. Second, the traditional Markov chain model focuses on transitions in a state space of the number of infected persons; a single index can hardly evaluate the epidemic situation effectively, and as the number of indexes increases, the division of the state space becomes even more subjective.
In addition, the traditional Markov chain model falls short in adjusting the transition probability matrix, and its prediction accuracy is strongly affected by objective factors.
Building a mathematical model with Markov chain theory to predict the incidence of infectious diseases aims to retain the no-aftereffect advantage of the traditional Markov chain method and to relax the i.i.d. assumption on the data. If, in addition, evaluation indexes and a method for dividing the epidemic state space under multiple indexes can be established, drawing on related analysis methods to make full use of historical data, then attributes such as the spread speed and risk level of the epidemic and the influence of policies and major events can be fully mined and the epidemic situation evaluated further.
Disclosure of Invention
In view of the above, there is a need to provide a method and an apparatus for predicting an infectious disease transmission status based on clustering, which take evaluation of an infectious disease prevalence situation as a starting point.
A cluster-based infectious disease transmission status prediction method, the method comprising:
acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include the rate of diagnosis, mortality, and recovery;
constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data;
clustering the infectious disease data based on a method of minimizing deviation sum of squares to further divide the Markov state space to obtain a plurality of state subspaces;
and constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
In one embodiment, clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to partition the Markov state space into a plurality of state subspaces, comprises:
clustering the infectious disease data based on the method of minimizing the sum of squared deviations to obtain an initial clustering result, wherein the sum of squared deviations of the i-th cluster in the initial clustering result is:
$$L_i = \sum_{k=1}^{n_i} (x_{k,i} - \bar{x}_i)^T (x_{k,i} - \bar{x}_i)$$
wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ is the vector of the diagnosis rate, mortality rate and recovery rate at time k of the infectious disease data assigned to the i-th cluster, $n_i$ is the total number of time steps in the i-th cluster, and
$$\bar{x}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{k,i}$$
is the center of the i-th cluster;
pre-merging the clusters in the initial clustering result pairwise, and calculating the increment of the sum of squared deviations inside each pre-merged cluster relative to the sums of squared deviations inside the two corresponding clusters before pre-merging:
$$\Delta L = L_{i \cup j} - (L_i + L_j)$$
where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations inside the new cluster obtained by pre-merging clusters i and j, $L_i$ is the sum of squared deviations inside cluster i before pre-merging, and $L_j$ is the sum of squared deviations inside cluster j before pre-merging;
and taking the pre-merged cluster corresponding to the minimum increment, together with the clusters not involved in that merge, as the initial Markov state space of the next round of clustering, repeating until a preset number of clusters is obtained; the preset number of clusters form the final clustering, and each cluster of the final clustering is one of the state subspaces obtained by partitioning the Markov state space.
In one embodiment, constructing a Markov chain from the plurality of state subspaces to obtain a state transition matrix comprises:
constructing a Markov chain from the plurality of state subspaces to obtain the following state transition matrix:
$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$
where m is the number of state subspaces and $P_{ij}$ is the probability of a one-step transition from state subspace $S_i$ to state subspace $S_j$.
The diagnosis rate, mortality rate and recovery rate are calculated as follows:
the diagnosis rate is calculated from the number of confirmed cases in the preset time period of the target area and the total population of the target area:
$$R_c = \frac{N_c}{N_t}$$
wherein $R_c$ is the diagnosis rate, $N_c$ is the number of confirmed cases in the preset time period, and $N_t$ is the total population of the target area;
the mortality rate is calculated from the number of deaths in the preset time period of the target area and the cumulative number of confirmed cases of the infectious disease in the target area:
$$R_d = \frac{N_d(t)}{\sum_{\tau \le t} N_c(\tau)}$$
wherein $R_d$ is the mortality rate, $N_d(t)$ is the number of deaths in the preset time period, and $\sum_{\tau \le t} N_c(\tau)$ is the cumulative number of confirmed cases of the infectious disease in the target area;
the recovery rate is calculated from the number of recovered persons in the preset time period of the target area and the cumulative number of confirmed cases of the infectious disease in the target area:
$$R_r = \frac{N_r(t)}{\sum_{\tau \le t} N_c(\tau)}$$
wherein $R_r$ is the recovery rate and $N_r(t)$ is the number of recovered persons in the preset time period.
A cluster-based infectious disease transmission status prediction apparatus, the apparatus comprising:
an acquisition module, configured to acquire an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include the diagnosis rate, mortality rate and recovery rate;
the construction module is used for constructing a Markov state space according to the diagnosis rate, the death rate and the recovery rate of the infectious disease data;
the partitioning module is used for clustering the infectious disease data based on a method of minimizing deviation sum of squares so as to partition the Markov state space to obtain a plurality of state subspaces;
and the prediction module is used for constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include rate of diagnosis, mortality, and recovery;
constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data;
clustering the infectious disease data based on a method of minimizing deviation sum of squares to further divide the Markov state space to obtain a plurality of state subspaces;
and constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include rate of diagnosis, mortality, and recovery;
constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data;
clustering the infectious disease data based on a method of minimizing deviation sum of squares to further divide the Markov state space to obtain a plurality of state subspaces;
and constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
According to the infectious disease propagation state prediction method and device based on clustering, relevant data of infectious disease prevalence conditions in a preset time period of a target area are obtained in real time, a definite diagnosis rate index, a mortality rate index and a recovery rate index are constructed, a Markov state space is constructed according to the three indexes, the state space is divided according to a method of minimizing deviation square sum, a preset number of state subspaces are obtained, a Markov chain is constructed according to the state subspaces to obtain a probability transition matrix, and the propagation state of infectious diseases is predicted according to the probability transition matrix. The invention comprehensively considers the diagnosis rate index, the death rate index and the recovery rate index of the infectious disease, analyzes the development of the infectious disease by adopting a clustering method according to the characteristics of the indexes, constructs and analyzes the property (mainly the stability) of a probability transfer matrix, and accurately predicts the state development of the epidemic situation of the infectious disease.
Drawings
FIG. 1 is a schematic flow chart illustrating a method for cluster-based infectious disease transmission status prediction in one embodiment;
FIG. 2 is a diagram illustrating a method for uniformly partitioning a state space according to an embodiment;
FIG. 3 is a diagram illustrating the construction and partitioning of a Markov state space for region A, according to one embodiment;
FIG. 4 is a result of the Markov state space construction and partitioning of the B region under one embodiment;
FIG. 5 is a block diagram of an apparatus for predicting an infectious disease propagation status based on clustering according to an embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided a cluster-based infectious disease transmission status prediction method, comprising the steps of:
Step 102: acquire an infectious disease data set of a target area.
The infectious disease data set comprises infectious disease data at a plurality of different moments, and the attributes of each item of infectious disease data include the diagnosis rate, mortality rate and recovery rate.
The data set in this embodiment is dynamic streaming data acquired in real time. Exploiting the characteristics of streaming data avoids the effect of epidemic data not satisfying the independent and identically distributed assumption, and the epidemic evaluation is performed on a local-region basis only.
Step 104: construct a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data.
The occurrence, progression and cure of every disease are affected by many factors external and internal to the human body; these factors have complex relationships that are difficult to explain with structural causal models. This interdependence within the data is the most important and useful feature for the study. Building a dynamic model from the way infectious disease data change over time is therefore an effective approach.
This embodiment constructs three indexes ($R_c$, $R_d$, $R_r$) for measuring the development of an infectious disease epidemic. The three indexes characterize the state of the epidemic from different angles, so the constructed Markov state space carries rich information. For example, when a virus is highly infectious but easily cured, $R_c$ is large and $R_r$ is also large; when a virus is less infectious but more lethal, $R_c$ is small and $R_d$ is relatively large. Combinations of index values at different scales can thus be used to measure different epidemic states.
Step 106: cluster the infectious disease data based on the method of minimizing the sum of squared deviations, so as to partition the Markov state space into a plurality of state subspaces.
The prevalence of an infectious disease is difficult to assess quantitatively. Over time, viruses mutate and governments adjust their policies, so epidemic data do not satisfy the independent and identically distributed assumption, and traditional statistical methods are often ineffective. A streaming-data clustering method is therefore adopted that places sampling results from the same distribution in the same cluster, yielding a plurality of state subspaces corresponding to the clusters.
In the uniform state-space partitioning method, each index is split into a high case and a low case, and combining the two cases of the three indexes gives eight states in total, as shown in fig. 2. In practice, however, the epidemic states of a given target area at different times are unevenly distributed and relatively concentrated, so a more reasonable partitioning method that better matches reality is needed: based on the features ($R_c$, $R_d$, $R_r$) of the historical data, the data are divided into K clusters so as to minimize the differences among the data samples within each cluster. The sum of squared deviations reflects the dispersion of the data well, so the sum of squared deviations of all data in a cluster is a good measure of the similarity of the data attributes within that cluster.
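As a point of comparison, the uniform eight-state partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_state` and the per-index cut-off values in `thresholds` are assumptions, since the source does not say how the boundary between "high" and "low" is chosen.

```python
from typing import Sequence


def uniform_state(rates: Sequence[float], thresholds: Sequence[float]) -> int:
    """Map (R_c, R_d, R_r) to one of 8 states using a high/low split per index.

    `thresholds` holds one hypothetical cut-off per index; the source does not
    specify these values.
    """
    if len(rates) != 3 or len(thresholds) != 3:
        raise ValueError("expected three rates and three thresholds")
    state = 0
    for rate, cut in zip(rates, thresholds):
        # Append one bit per index: 1 means "high", 0 means "low".
        state = (state << 1) | (1 if rate >= cut else 0)
    return state  # integer in 0..7, bit order (R_c, R_d, R_r)
```

Because every index is cut at a fixed threshold regardless of where the observations actually lie, many of the eight cells may stay empty while others are crowded, which is the motivation given above for a data-driven partition.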
Step 108: construct a Markov chain from the plurality of state subspaces to obtain a state transition matrix, and predict the propagation state of the infectious disease according to the stationarity of the state transition matrix.
The state subspaces are correlated in time and space, so a Markov chain can be constructed from them and the state transition matrix calculated. Analyzing the stationarity of the state transition matrix yields the state subspace toward which the three indexes of the epidemic gradually stabilize, which gives the prediction of the propagation state of the infectious disease.
The infectious disease propagation state prediction method based on clustering is characterized in that relevant data of infectious disease prevalence conditions in a preset time period of a target area are obtained in real time, a definite diagnosis rate index, a death rate index and a recovery rate index are constructed, a Markov state space is constructed according to the three indexes, the state space is divided according to a method of minimizing deviation square sum to obtain a preset number of state subspaces, a Markov chain is constructed according to the state subspaces to obtain a probability transition matrix, and the propagation state of infectious diseases is predicted according to the probability transition matrix. The invention comprehensively considers the diagnosis rate index, the death rate index and the recovery rate index of the infectious disease, analyzes the development of the infectious disease by adopting a clustering method according to the characteristics of the indexes, constructs and analyzes the property (mainly the stability) of a probability transfer matrix, and accurately predicts the state development of the epidemic situation of the infectious disease.
In one embodiment, the diagnosis rate, mortality rate and recovery rate are calculated as follows:
the diagnosis rate is calculated from the number of confirmed cases in the preset time period of the target area and the total population of the target area:
$$R_c = \frac{N_c}{N_t}$$
wherein $R_c$ is the diagnosis rate, $N_c$ is the number of confirmed cases in the preset time period, and $N_t$ is the total population of the target area;
the mortality rate is calculated from the number of deaths in the preset time period of the target area and the cumulative number of confirmed cases of the infectious disease in the target area:
$$R_d = \frac{N_d(t)}{\sum_{\tau \le t} N_c(\tau)}$$
wherein $R_d$ is the mortality rate, $N_d(t)$ is the number of deaths in the preset time period, and $\sum_{\tau \le t} N_c(\tau)$ is the cumulative number of confirmed cases of the infectious disease in the target area;
the recovery rate is calculated from the number of recovered persons in the preset time period of the target area and the cumulative number of confirmed cases of the infectious disease in the target area:
$$R_r = \frac{N_r(t)}{\sum_{\tau \le t} N_c(\tau)}$$
wherein $R_r$ is the recovery rate and $N_r(t)$ is the number of recovered persons in the preset time period.
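The three indexes can be computed from per-period counts roughly as follows. This is a minimal sketch under stated assumptions: the function name `compute_rates` is hypothetical, the cumulative number of confirmed cases is taken to include the current period, and a rate is set to 0.0 while no case has been confirmed yet. None of these details is specified in the source.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def compute_rates(
    confirmed: Sequence[int],
    deaths: Sequence[int],
    recovered: Sequence[int],
    population: int,
) -> List[Tuple[float, float, float]]:
    """Return one (R_c, R_d, R_r) tuple per time period.

    confirmed, deaths, recovered: counts observed in each period.
    population: total population of the target area (N_t).
    """
    if not (len(confirmed) == len(deaths) == len(recovered)):
        raise ValueError("all series must have the same length")
    if population <= 0:
        raise ValueError("population must be positive")
    rates = []
    # accumulate() yields the running total of confirmed cases up to and
    # including the current period (an assumption about the denominator).
    for n_c, n_d, n_r, cum_c in zip(confirmed, deaths, recovered, accumulate(confirmed)):
        r_c = n_c / population
        r_d = n_d / cum_c if cum_c > 0 else 0.0  # assumed convention when no cases yet
        r_r = n_r / cum_c if cum_c > 0 else 0.0
        rates.append((r_c, r_d, r_r))
    return rates
```

Each returned tuple is one point $[R_c, R_d, R_r]$ in the three-dimensional Markov state space.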
In one embodiment, clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to partition the Markov state space into a plurality of state subspaces, comprises:
clustering the infectious disease data based on the method of minimizing the sum of squared deviations to obtain an initial clustering result, wherein the sum of squared deviations of the i-th cluster in the initial clustering result is:
$$L_i = \sum_{k=1}^{n_i} (x_{k,i} - \bar{x}_i)^T (x_{k,i} - \bar{x}_i)$$
wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ is the vector of the diagnosis rate, mortality rate and recovery rate at time k of the infectious disease data assigned to the i-th cluster, $n_i$ is the total number of time steps in the i-th cluster, and
$$\bar{x}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{k,i}$$
is the center of the i-th cluster;
pre-merging the clusters in the initial clustering result pairwise, and calculating the increment of the sum of squared deviations inside each pre-merged cluster relative to the sums of squared deviations inside the two corresponding clusters before pre-merging:
$$\Delta L = L_{i \cup j} - (L_i + L_j)$$
where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations inside the new cluster obtained by pre-merging clusters i and j, $L_i$ is the sum of squared deviations inside cluster i before pre-merging, and $L_j$ is the sum of squared deviations inside cluster j before pre-merging;
and taking the pre-merged cluster corresponding to the minimum increment, together with the clusters not involved in that merge, as the initial Markov state space of the next round of clustering, repeating until a preset number of clusters is obtained; the preset number of clusters form the final clustering, and each cluster of the final clustering is one of the state subspaces obtained by partitioning the Markov state space.
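The merging procedure above resembles agglomerative clustering with a minimum-variance (Ward-style) criterion. The sketch below is one possible reading, not the patented implementation: the function names `sse` and `partition_states` are hypothetical, every data point is assumed to start as its own cluster (the source does not say how the initial clustering is formed), and the brute-force search over pairs is chosen for clarity rather than speed.

```python
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def sse(cluster: List[Point]) -> float:
    """Sum of squared deviations of the points in a cluster from its center."""
    dim = len(cluster[0])
    center = [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]
    return sum((p[d] - center[d]) ** 2 for p in cluster for d in range(dim))


def partition_states(points: List[Point], k: int) -> List[List[Point]]:
    """Greedily merge clusters until k remain, minimizing the SSE increment."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    clusters = [[p] for p in points]  # assumption: start from singleton clusters
    while len(clusters) > k:
        best = None
        # Pre-merge every pair and record the increment
        # delta = L(i u j) - (L(i) + L(j)).
        for i, j in combinations(range(len(clusters)), 2):
            merged = clusters[i] + clusters[j]
            delta = sse(merged) - (sse(clusters[i]) + sse(clusters[j]))
            if best is None or delta < best[0]:
                best = (delta, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        # Keep the untouched clusters and add the merged one for the next round.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters
```

Each returned cluster plays the role of one state subspace; the cluster index assigned to each time step then gives the state sequence from which a Markov chain can be built.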
For example, epidemic data for region A over a certain time period are selected to construct and partition the Markov state space. The result, shown in fig. 3, is 5 clusters, i.e., 5 state subspaces: state subspace 1 is characterized by high $R_r$ and low $R_d$; state subspace 2 by medium $R_r$, high $R_c$ and low $R_d$; state subspace 3 by low $R_r$ and low $R_d$; state subspace 4 by low $R_r$ and high $R_d$; and state subspace 5 by low $R_r$ and medium $R_d$. Epidemic data for region B over a certain time period are likewise selected to construct and partition a Markov state space, with the result shown in fig. 4. The state subspaces obtained from the epidemic data of different regions are distributed differently, which matches reality: different regions adopt different control strategies, and their epidemics develop along different trends.
In one embodiment, constructing a Markov chain from the plurality of state subspaces to obtain a state transition matrix comprises:
constructing a Markov chain from the plurality of state subspaces to obtain the following state transition matrix:
$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$
where m is the number of state subspaces and $P_{ij}$ is the probability of a one-step transition from state subspace $S_i$ to state subspace $S_j$. The constructed state space can be divided into different state subspaces. When regional administrators adopt different prevention and control strategies, the epidemic state moves through the state space differently. The Markov state transition matrix therefore changes from period to period, and the influence of different prevention and control strategies can be assessed by analyzing how the transition matrix changes across periods.
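One common way to obtain such a matrix from data is to count observed one-step transitions in the sequence of state-subspace labels and normalize each row. The source does not state its estimation procedure, so the following is a sketch under assumptions: the function name `transition_matrix` is hypothetical, states are labeled 0 to m-1, and a state that is never left in the observed sequence is given a uniform row.

```python
from typing import List, Sequence


def transition_matrix(states: Sequence[int], m: int) -> List[List[float]]:
    """Estimate one-step transition probabilities P[i][j] from a state sequence."""
    if any(s < 0 or s >= m for s in states):
        raise ValueError("state labels must lie in 0..m-1")
    counts = [[0] * m for _ in range(m)]
    # Count each observed transition from the state at step t to the state at t+1.
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: no observed exit from this state, so use a uniform row.
            matrix.append([1.0 / m] * m)
        else:
            matrix.append([c / total for c in row])
    return matrix
```

Estimating one matrix per policy period and comparing the matrices is one way to carry out the period-by-period analysis described above.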
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
A specific example is provided here to illustrate the method in detail:
the prevention and control strategy of area A is as follows:
February 23: access to areas with more severe epidemics is restricted.
March 16: a social distancing policy is adopted, and more than 50 people are prohibited from gathering for 26 days to 7 months.
Data from February 23 to March 30 are therefore selected for calculation. Because the incubation period of the infection is 14 days, March 30 is chosen rather than March 16: even though a new control strategy was released on March 16, the following 14 days still show the transmission characteristics of the previous strategy. As shown in fig. 4, a Markov chain is obtained, with the corresponding state transition matrix:
Figure RE-GDA0003753124290000111
Applying the stationarity test for Markov chains shows that this Markov chain is stationary. Therefore, under this control strategy, the state of the epidemic spread remains stable. The stationary distribution is obtained as:
Figure RE-GDA0003753124290000112
This means that under this strategy the epidemic state gradually approaches state 2, with a higher $R_c$ and a low $R_r$, indicating that if the strategy were continued the spread of the epidemic would not improve, so the strategy needs to be adjusted.
As can be seen from the final stationary distribution, the isolation control strategy adopted on March 16 helps control the epidemic. Under the stationary distribution, the initial strategy (February 23) leads to a transition from state 1 to state 2.
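For a chain that does converge, the stationary distribution can be approximated by repeatedly applying the transition matrix to an initial distribution. The sketch below illustrates this with power iteration; the function name `stationary_distribution` and the 2-state matrix in the usage note are hypothetical, because the numerical matrices of this example are available in the source only as images.

```python
from typing import List, Sequence


def stationary_distribution(
    P: Sequence[Sequence[float]], tol: float = 1e-12, max_iter: int = 10000
) -> List[float]:
    """Approximate pi with pi = pi P by power iteration from a uniform start."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(max_iter):
        # One step of the chain: new_pi[j] = sum_i pi[i] * P[i][j].
        nxt = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    # Periodic or reducible chains may not converge from a given start.
    raise RuntimeError("power iteration did not converge")
```

For the hypothetical matrix `[[0.9, 0.1], [0.5, 0.5]]`, the balance condition $0.1\,\pi_1 = 0.5\,\pi_2$ gives $\pi = (5/6, 1/6)$, which the iteration reproduces. The state with the largest stationary probability is the one the chain spends most time in over the long run.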
Next, data from March 30 to May 26 are selected for calculation, giving the probability transition matrix:
Figure RE-GDA0003753124290000113
This Markov chain is also stationary. Under this strategy, the state of the epidemic spread likewise remains stable, giving:
Figure RE-GDA0003753124290000121
The measures taken on March 16 reduced the probability of transitioning from state 1 to state 2 to 0.82 and brought the probability of transitioning to state 5 to 0.18. State 5 represents high $R_r$ and low $R_d$ and $R_c$, i.e., a low infection rate, low mortality rate and high cure rate. The measures taken on March 16 therefore had a positive effect on controlling the spread of the epidemic.
Finally, data from May 26 to July 4 are selected, giving the probability transition matrix:
Figure RE-GDA0003753124290000122
This Markov chain is also stationary, and we obtain:
Figure RE-GDA0003753124290000123
If this continues, the epidemic in area A will either be in state 1, with high levels of $R_c$ and $R_d$ and a low level of $R_r$, that is, a state of complete loss of control with a high mortality rate, high diagnosis rate, and low cure rate; or, with a probability of 59%, be in a state in which $R_c$ and $R_r$ are low, where the spread of the disease cannot be effectively controlled. This indicates that in the context of epidemic spread, paradoxical movement will exacerbate the epidemic.
In one embodiment, as shown in fig. 5, there is provided a cluster-based infectious disease transmission status prediction apparatus including an acquisition module, a construction module, a division module, and a prediction module, wherein:
the acquisition module is used for acquiring an infectious disease data set of a target area, the infectious disease data set comprising infectious disease data at a plurality of different moments; attributes of each piece of infectious disease data include the diagnosis rate, mortality rate, and recovery rate;
the construction module is used for constructing a Markov state space according to the diagnosis rate, the death rate and the recovery rate of the infectious disease data;
the division module is used for clustering the infectious disease data based on a method of minimizing deviation sum of squares so as to divide the Markov state space to obtain a plurality of state subspaces;
and the prediction module is used for constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
For specific limitations of the cluster-based infectious disease transmission status prediction apparatus, see the above limitations on the cluster-based infectious disease transmission status prediction method, which are not repeated here. The modules in the apparatus may be wholly or partially implemented by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the related data of the epidemic situation of the infectious diseases. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a cluster-based infectious disease transmission status prediction method.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, any combination of these technical features should be considered to be within the scope of the present disclosure as long as it involves no contradiction.
The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (7)

1. An infectious disease transmission state prediction method based on clustering is characterized by comprising the following steps:
acquiring an infectious disease data set of a target area; the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include rate of diagnosis, mortality, and recovery;
constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data;
clustering the infectious disease data based on a method of minimizing deviation sum of squares to further divide the Markov state space to obtain a plurality of state subspaces;
and constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.
2. The method of claim 1, wherein clustering the infectious disease data based on the method of minimizing the deviation sum of squares, so as to divide the Markov state space into a plurality of state subspaces, comprises:
clustering the infectious disease data based on a method of minimizing deviation sum of squares to obtain an initial clustering result; wherein the sum of squared deviations of the ith cluster in the initial clustering result is:
Figure RE-FDA0003753124280000011
wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ denotes the diagnosis rate, mortality rate, and recovery rate at time $k$ of the infectious disease data assigned to the $i$-th cluster, $n_i$ represents the total number of time steps in the $i$-th cluster, and
Figure RE-FDA0003753124280000012
represents the center of the ith cluster;
and performing pairwise pre-combination on each cluster in the initial clustering result, and calculating the increment of the deviation square sum inside the pre-combined clusters relative to the deviation square sum inside the two corresponding clusters before pre-combination:
$\Delta L = L_{i \cup j} - (L_i + L_j)$
where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations within the new cluster obtained by pre-merging clusters $i$ and $j$, $L_i$ is the sum of squared deviations within cluster $i$ before pre-merging, and $L_j$ is the sum of squared deviations within cluster $j$ before pre-merging;
and taking the pre-merged cluster corresponding to the minimum increment, together with the clusters not involved in that merge, as the initial Markov state space of the next clustering pass, and repeating until a preset number of clusters is obtained; taking the preset number of clusters as the clusters of the final clustering, and taking each cluster of the final clustering as one of the state subspaces obtained by dividing the Markov state space.
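By way of a non-limiting sketch, the merging procedure recited above can be written as a greedy loop that evaluates the increment of the sum of squared deviations for every pair of clusters and merges the pair with the smallest increment. The sketch assumes that the initial clustering result consists of one cluster per data point and that the sum of squared deviations is taken about the cluster mean; the function names and the sample values of (R_c, R_d, R_r) are hypothetical.

```python
from itertools import combinations


def sum_sq_dev(points):
    """Sum of squared deviations of the points from their centroid (L_i)."""
    n = len(points)
    dim = len(points[0])
    center = [sum(p[d] for p in points) / n for d in range(dim)]
    return sum((p[d] - center[d]) ** 2 for p in points for d in range(dim))


def partition_states(samples, n_states):
    """Merge clusters by minimum increment dL until n_states clusters remain."""
    clusters = [[s] for s in samples]  # assumption: start from singletons
    while len(clusters) > n_states:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            merged = clusters[i] + clusters[j]
            # dL = L_{i u j} - (L_i + L_j)
            delta = sum_sq_dev(merged) - (
                sum_sq_dev(clusters[i]) + sum_sq_dev(clusters[j])
            )
            if best is None or delta < best[0]:
                best = (delta, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        # Keep the unmerged clusters and add the newly merged one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical (R_c, R_d, R_r) samples, for illustration only.
samples = [
    (0.010, 0.020, 0.30),
    (0.011, 0.021, 0.31),
    (0.050, 0.080, 0.10),
    (0.052, 0.079, 0.11),
    (0.002, 0.005, 0.60),
]
clusters = partition_states(samples, 3)
for c in clusters:
    print(c)
```

Each returned cluster plays the role of one state subspace. The pairwise search is recomputed on every pass for clarity; it is adequate for a few hundred daily records but is not optimized.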
3. The method of claim 1, wherein constructing a Markov chain from the plurality of state subspaces results in a state transition matrix comprising:
constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix as follows:
Figure RE-FDA0003753124280000021
where $m$ is the number of state subspaces, and $P_{ij}$ is the one-step transition probability from state subspace $S_i$ to state subspace $S_j$.
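As a non-limiting illustration, one common way to obtain the one-step transition probabilities is to count observed transitions between consecutive time steps and normalize each row. The claim does not state the estimator, so the frequency estimate below is an assumption; the function name `transition_matrix` and the label sequence are hypothetical.

```python
def transition_matrix(labels, m):
    """Estimate P[i][j] as the fraction of steps from state i that go to state j."""
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    P = []
    for row in counts:
        total = sum(row)
        # A state with no observed outgoing step is left as an all-zero row.
        P.append([c / total for c in row] if total else [0.0] * m)
    return P


# Hypothetical sequence of state-subspace indices over consecutive days.
labels = [0, 0, 1, 1, 1, 2, 1, 0]
P = transition_matrix(labels, 3)
for row in P:
    print(row)
```

Here each label is the index of the state subspace to which the data of one time step was assigned, so `labels` would in practice come from the clustering step.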
4. The method of claim 1, wherein the rate of diagnosis, mortality, and recovery are calculated by:
the confirmed diagnosis rate is calculated according to the confirmed diagnosis number in the preset time period of the target area and the total population of the target area as follows:
Figure RE-FDA0003753124280000022
wherein $R_c$ is the diagnosis rate, $N_c$ is the number of confirmed cases within the preset time period, and $N_t$ is the total population of the target area;
the death rate is calculated according to the death number in the preset time period of the target area and the accumulated confirmed diagnosis number of the infectious diseases in the target area as follows:
Figure RE-FDA0003753124280000031
wherein $R_d$ is the mortality rate, $N_d(t)$ is the number of deaths within the preset time period, and
Figure RE-FDA0003753124280000032
is the cumulative number of confirmed cases of the infectious disease in the target area;
the recovery rate is calculated according to the number of cured persons in the preset time period of the target area and the cumulative number of confirmed cases of the infectious disease in the target area as follows:
Figure RE-FDA0003753124280000033
wherein $R_r$ is the recovery rate, and $N_r(t)$ is the number of cured persons within the preset time period.
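As a non-limiting illustration of the three indices, the sketch below assumes that each rate is a simple ratio of the quantities named in the claim: confirmed cases in the period over total population, and deaths or cured persons in the period over cumulative confirmed cases. The exact formulas appear only in the figures, so this form, the function name `epidemic_rates`, and the sample numbers are assumptions.

```python
def epidemic_rates(n_confirmed, n_deaths, n_cured, cumulative_confirmed, population):
    """Return (R_c, R_d, R_r) for one time period under the ratio assumption."""
    if population <= 0 or cumulative_confirmed <= 0:
        raise ValueError("population and cumulative_confirmed must be positive")
    r_c = n_confirmed / population        # diagnosis rate
    r_d = n_deaths / cumulative_confirmed  # mortality rate
    r_r = n_cured / cumulative_confirmed   # recovery rate
    return r_c, r_d, r_r


# Hypothetical counts for one period, for illustration only.
rates = epidemic_rates(
    n_confirmed=200,
    n_deaths=5,
    n_cured=50,
    cumulative_confirmed=10_000,
    population=1_000_000,
)
print(rates)
```

The resulting triple is one point of the Markov state space, so computing it for each time period yields the data set that is subsequently clustered.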
5. An apparatus for predicting infectious disease transmission status based on clustering, the apparatus comprising:
an acquisition module, used for acquiring an infectious disease data set of a target area, the infectious disease data set comprising infectious disease data at a plurality of different moments; attributes of each piece of infectious disease data include the diagnosis rate, mortality rate, and recovery rate;
the construction module is used for constructing a Markov state space according to the diagnosis rate, the death rate and the recovery rate of the infectious disease data;
the partitioning module is used for clustering the infectious disease data based on a method of minimizing deviation sum of squares so as to partition the Markov state space to obtain a plurality of state subspaces;
and the prediction module is used for constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix and predicting the transmission state of the infectious diseases according to the stationarity of the state transition matrix.
6. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 4 when executing the computer program.
7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
CN202210720633.4A 2022-06-23 2022-06-23 Infectious disease propagation state prediction method and device based on clustering Pending CN115114979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210720633.4A CN115114979A (en) 2022-06-23 2022-06-23 Infectious disease propagation state prediction method and device based on clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210720633.4A CN115114979A (en) 2022-06-23 2022-06-23 Infectious disease propagation state prediction method and device based on clustering

Publications (1)

Publication Number Publication Date
CN115114979A true CN115114979A (en) 2022-09-27

Family

ID=83328426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210720633.4A Pending CN115114979A (en) 2022-06-23 2022-06-23 Infectious disease propagation state prediction method and device based on clustering

Country Status (1)

Country Link
CN (1) CN115114979A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168847A (en) * 2023-04-26 2023-05-26 南京邮电大学 Infectious disease prediction method based on optimized next generation reserve pool calculation
CN116168847B (en) * 2023-04-26 2023-08-11 南京邮电大学 Infectious disease prediction method based on optimized next generation reserve pool calculation
CN116703528A (en) * 2023-07-31 2023-09-05 山东资略信息技术有限公司 Medical sales management system and management method thereof
CN116703528B (en) * 2023-07-31 2023-11-17 山东资略信息技术有限公司 Medical sales management system and management method thereof
CN117457231A (en) * 2023-10-27 2024-01-26 中山大学 Virus propagation risk calculation method and device based on Markov chain model
CN117457231B (en) * 2023-10-27 2024-06-11 中山大学 Virus propagation risk calculation method and device based on Markov chain model
CN117690601A (en) * 2024-02-02 2024-03-12 江西省胸科医院(江西省第三人民医院) Tuberculosis epidemic trend prediction system based on big data analysis
CN117690601B (en) * 2024-02-02 2024-05-24 江西省胸科医院(江西省第三人民医院) Tuberculosis epidemic trend prediction system based on big data analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination