CN115114979A

CN115114979A - Infectious disease propagation state prediction method and device based on clustering

Info

Publication number: CN115114979A
Application number: CN202210720633.4A
Authority: CN
Inventors: 许丹; 郭云霄; 谢欣嘉; 赵润豪; 盖顺; 朱成岚
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2022-06-23
Filing date: 2022-06-23
Publication date: 2022-09-27

Abstract

The application relates to a clustering-based method and device for predicting the propagation state of an infectious disease. Data on the prevalence of the infectious disease in a target area over a preset time period are obtained in real time, and a diagnosis rate index, a mortality rate index and a recovery rate index are constructed. A Markov state space is constructed from the three indexes and divided by minimizing the sum of squared deviations to obtain a preset number of state subspaces. A Markov chain is constructed from the state subspaces to obtain a probability transition matrix, and the propagation state of the infectious disease is predicted from that matrix. The invention jointly considers the diagnosis rate, mortality rate and recovery rate indexes of the infectious disease, analyzes the development of the infectious disease with a clustering method suited to the characteristics of these indexes, constructs the probability transition matrix and analyzes its stationarity, and accurately predicts how the epidemic state of the infectious disease will develop.

Description

Infectious disease propagation state prediction method and device based on clustering

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for predicting an infectious disease propagation status based on clustering.

Background

Infectious disease prevalence is difficult to assess quantitatively. Over time, viruses mutate and governments adopt corresponding policies, so epidemic data and the way they are collected do not satisfy the independent and identically distributed (i.i.d.) assumption, and conventional statistical methods are therefore often difficult to apply. In terms of prediction, the conventional SIR infectious disease model uses only initial historical data, and its prediction error grows with time. To overcome these defects, many scholars have applied Markov chain theory to predict infectious disease incidence and have built mathematical models that analyze the incidence of various infectious diseases quantitatively. Because a Markov chain has no aftereffect (it is memoryless), such a model depends little on historical data and places only loose requirements on the i.i.d. condition, so its accuracy is high. However, the Markov chain method has limitations. First, the model divides the state space using manually preset intervals, which is subjective and biases the result. Second, the traditional Markov chain model focuses on transitions in the state space of the number of infected persons; a single index can hardly evaluate the epidemic situation effectively, and as the number of indexes increases, dividing the state space becomes even more subjective. In addition, the traditional Markov chain model is also deficient in adjusting the transition probability matrix and in prediction accuracy, which is strongly affected by objective factors.

Building a mathematical model with Markov chain theory to predict infectious disease incidence aims to retain the no-aftereffect advantage of the traditional Markov chain method, relax the i.i.d. assumption on the data, and make full use of historical data through related analysis methods. If evaluation indexes and a method for dividing the epidemic state space under multiple indexes can be established, and attributes such as the spread speed, the degree of risk, and the influence of policies and major events can be fully mined, the epidemic situation can be evaluated further.

Disclosure of Invention

In view of the above, there is a need to provide a method and an apparatus for predicting an infectious disease transmission status based on clustering, which take evaluation of an infectious disease prevalence situation as a starting point.

A cluster-based infectious disease transmission status prediction method, the method comprising:

acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include the rate of diagnosis, mortality, and recovery;

constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data;

clustering the infectious disease data based on a method of minimizing deviation sum of squares to further divide the Markov state space to obtain a plurality of state subspaces;

and constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.

In one embodiment, clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to divide the Markov state space into a plurality of state subspaces, comprises:

clustering the infectious disease data based on the method of minimizing the sum of squared deviations to obtain an initial clustering result, wherein the sum of squared deviations of the i-th cluster in the initial clustering result is:

$$L_i = \sum_{k=1}^{n_i} (x_{k,i} - \bar{x}_i)^T (x_{k,i} - \bar{x}_i)$$

wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ denotes the diagnosis rate, mortality rate and recovery rate at time $k$ of the infectious disease data grouped into the i-th cluster, $n_i$ represents the total number of time steps in the i-th cluster, and

$$\bar{x}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{k,i}$$

represents the center of the i-th cluster;

and performing pairwise pre-merging on the clusters in the initial clustering result, and calculating the increment of the sum of squared deviations within the pre-merged cluster relative to the sums of squared deviations within the two corresponding clusters before pre-merging:

$$\Delta L = L_{i \cup j} - (L_i + L_j)$$

where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations within the new cluster obtained by pre-merging clusters $i$ and $j$, $L_i$ is the sum of squared deviations within cluster $i$ before pre-merging, and $L_j$ is the sum of squared deviations within cluster $j$ before pre-merging;

and merging the pair of clusters whose pre-merging yields the minimum increment, and taking the resulting merged cluster together with the remaining clusters (those whose pre-merging did not yield the minimum increment) as the initial Markov state space of the next clustering round, until a preset number of clusters is obtained; taking the preset number of clusters as the final clusters, and taking each final cluster as one of the state subspaces obtained by dividing the Markov state space.
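The merging procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `ssd` and `merge_to_k`, the sample values, and the choice to start with every sample as its own cluster are assumptions, since the text does not say how the initial clustering result is produced.

```python
import numpy as np


def ssd(points):
    """Sum of squared deviations of one cluster from its own center."""
    center = points.mean(axis=0)
    return float(((points - center) ** 2).sum())


def merge_to_k(clusters, k):
    """Repeatedly merge the pair with the smallest increment until k clusters remain."""
    clusters = [np.asarray(c, dtype=float) for c in clusters]
    while len(clusters) > k:
        best = None  # (increment, i, j) of the best pre-merge found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pre_merged = np.vstack([clusters[i], clusters[j]])
                delta = ssd(pre_merged) - (ssd(clusters[i]) + ssd(clusters[j]))
                if best is None or delta < best[0]:
                    best = (delta, i, j)
        _, i, j = best
        merged = np.vstack([clusters[i], clusters[j]])
        # Keep every cluster that was not merged, then add the merged one.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical (R_c, R_d, R_r) samples; each row starts as its own cluster.
samples = np.array([
    [0.010, 0.020, 0.10],
    [0.011, 0.021, 0.11],
    [0.050, 0.010, 0.60],
    [0.052, 0.012, 0.62],
    [0.030, 0.080, 0.05],
])
initial = [row.reshape(1, -1) for row in samples]
states = merge_to_k(initial, k=3)
```

Each array in `states` then plays the role of one state subspace. The exhaustive pairwise search is quadratic in the number of clusters per round, which is acceptable for a sketch but would likely need a more efficient update rule for long data streams.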

In one embodiment, constructing a Markov chain from the plurality of state subspaces to obtain a state transition matrix comprises:

constructing a Markov chain from the plurality of state subspaces to obtain the following state transition matrix:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$

where $m$ is the number of state subspaces and $P_{ij}$ is the probability of transitioning from state subspace $S_i$ to state subspace $S_j$ in one step.

The diagnosis rate, the mortality rate and the recovery rate are calculated as follows:

the diagnosis rate is calculated from the number of confirmed cases in the target area within the preset time period and the total population of the target area as follows:

$$R_c = \frac{N_c}{N_t}$$

wherein $R_c$ is the diagnosis rate, $N_c$ is the number of confirmed cases within the preset time period, and $N_t$ is the total population of the target area;

the mortality rate is calculated from the number of deaths in the target area within the preset time period and the cumulative number of confirmed cases of the infectious disease in the target area as follows:

$$R_d = \frac{N_d(t)}{\sum_{\tau \le t} N_c(\tau)}$$

wherein $R_d$ is the mortality rate, $N_d(t)$ is the number of deaths within the preset time period, and $\sum_{\tau \le t} N_c(\tau)$ is the cumulative number of confirmed cases of the infectious disease in the target area;

the recovery rate is calculated from the number of recovered persons in the target area within the preset time period and the cumulative number of confirmed cases of the infectious disease in the target area as follows:

$$R_r = \frac{N_r(t)}{\sum_{\tau \le t} N_c(\tau)}$$

wherein $R_r$ is the recovery rate and $N_r(t)$ is the number of recovered persons within the preset time period.
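As a minimal sketch of the three indexes, assuming the denominators described in this document (total population for the diagnosis rate, cumulative confirmed cases for the mortality and recovery rates), the calculation could look like this. The function name `epidemic_rates`, its parameter names and the sample numbers are hypothetical.

```python
def epidemic_rates(n_confirmed, n_deaths, n_recovered, cum_confirmed, population):
    """Return (R_c, R_d, R_r) for one preset time period.

    n_confirmed, n_deaths, n_recovered: counts within the period.
    cum_confirmed: cumulative confirmed cases in the target area (assumed > 0).
    population: total population of the target area (assumed > 0).
    """
    if population <= 0 or cum_confirmed <= 0:
        raise ValueError("population and cum_confirmed must be positive")
    r_c = n_confirmed / population      # diagnosis rate
    r_d = n_deaths / cum_confirmed      # mortality rate
    r_r = n_recovered / cum_confirmed   # recovery rate
    return r_c, r_d, r_r


# Hypothetical figures for a single period.
rates = epidemic_rates(
    n_confirmed=120,
    n_deaths=3,
    n_recovered=45,
    cum_confirmed=1500,
    population=1_000_000,
)
```

Evaluating the function once per period gives the sequence of three-dimensional points from which the Markov state space is built.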

A cluster-based infectious disease transmission status prediction apparatus, the apparatus comprising:

the acquisition module is used for acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments, and the attributes of each item of infectious disease data include the diagnosis rate, mortality rate and recovery rate;

the construction module is used for constructing a Markov state space according to the diagnosis rate, the death rate and the recovery rate of the infectious disease data;

the partitioning module is used for clustering the infectious disease data based on a method of minimizing deviation sum of squares so as to partition the Markov state space to obtain a plurality of state subspaces;

and the prediction module is used for constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix and predicting the propagation state of the infectious diseases according to the stationarity of the state transition matrix.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

acquiring an infectious disease data set of a target area, wherein the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include rate of diagnosis, mortality, and recovery;

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:

According to the above clustering-based infectious disease propagation state prediction method and device, data on the prevalence of the infectious disease in a target area over a preset time period are obtained in real time, and a diagnosis rate index, a mortality rate index and a recovery rate index are constructed. A Markov state space is constructed from the three indexes and divided by minimizing the sum of squared deviations to obtain a preset number of state subspaces. A Markov chain is constructed from the state subspaces to obtain a probability transition matrix, and the propagation state of the infectious disease is predicted from that matrix. The invention jointly considers the diagnosis rate, mortality rate and recovery rate indexes of the infectious disease, analyzes the development of the infectious disease with a clustering method suited to the characteristics of these indexes, constructs the probability transition matrix and analyzes its properties (mainly its stationarity), and accurately predicts how the epidemic state of the infectious disease will develop.

Drawings

FIG. 1 is a schematic flow chart illustrating a method for cluster-based infectious disease transmission status prediction in one embodiment;

FIG. 2 is a diagram illustrating a method for uniformly partitioning a state space according to an embodiment;

FIG. 3 is a diagram illustrating the construction and partitioning of a Markov state space for region A, according to one embodiment;

FIG. 4 is a result of the Markov state space construction and partitioning of the B region under one embodiment;

FIG. 5 is a block diagram of an apparatus for predicting an infectious disease propagation status based on clustering according to an embodiment;

FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, there is provided a cluster-based infectious disease transmission status prediction method, comprising the steps of:

Step 102, acquiring an infectious disease data set of a target area.

The infectious disease data set comprises infectious disease data at a plurality of different moments, and the attributes of each item of infectious disease data include the diagnosis rate, mortality rate and recovery rate.

The data set in this embodiment is dynamic stream data acquired in real time. Exploiting the characteristics of stream data avoids the problem that epidemic data do not satisfy the independent and identically distributed assumption, because the epidemic situation is evaluated only on a local basis.

Step 104, constructing a Markov state space according to the diagnosis rate, mortality rate and recovery rate of the infectious disease data.

The occurrence, progression and cure of every disease are affected by many factors external and internal to the human body. These factors are related in complex ways that structural causal models can hardly explain. This interdependence between data is the most important and useful feature for the study. Building a dynamic model based on how the infectious disease data change over time is therefore an effective approach.

This embodiment constructs three indexes ($R_c$, $R_d$, $R_r$) for measuring the development of the epidemic. The three indexes characterize the epidemic state from different angles, so the constructed Markov state space contains rich information. For example, when the virus is highly infectious but easily cured, $R_c$ is large and $R_r$ is also large; when the virus is less infectious but more lethal, $R_c$ is small and $R_d$ is relatively large. Different combinations of index magnitudes can thus be used to measure different epidemic states of infectious diseases.

Step 106, clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to divide the Markov state space into a plurality of state subspaces.

Infectious disease prevalence is difficult to assess quantitatively. Over time, viruses mutate and governments adopt corresponding policies, so epidemic data do not satisfy the independent and identically distributed assumption and traditional statistical methods are often difficult to apply. A streaming data clustering method is therefore adopted so that samples from the same distribution are placed in the same cluster, which yields a plurality of state subspaces corresponding to the plurality of clusters.

In the method of uniformly dividing the state space, each index is divided into two cases, high and low, and combining the two cases of the three indexes gives eight states in total, as shown in fig. 2. In practice, however, the epidemic states of the same target area at different times are unevenly distributed and relatively concentrated, so a more reasonable state space division method that better matches reality is required: based on the features ($R_c$, $R_d$, $R_r$) of the historical data, the data are divided into K clusters so as to minimize the differences between the data samples within each cluster. The sum of squared deviations describes the dispersion of the data well, so the sum of squared deviations of all data in a cluster is a good measure of how similar the data attributes in that cluster are.
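For comparison, the uniform eight-state division can be sketched as below. This illustrates only the baseline that the clustering approach is meant to replace; the function name `uniform_state`, the state numbering and the threshold values are hypothetical, since the text does not say where the high/low boundary lies.

```python
def uniform_state(r_c, r_d, r_r, thresholds):
    """Map one (R_c, R_d, R_r) sample to one of eight states, numbered 0 to 7.

    Each index is split into low/high at its threshold, and the three
    high/low flags are packed into a 3-bit integer.
    """
    t_c, t_d, t_r = thresholds
    high_c = int(r_c >= t_c)
    high_d = int(r_d >= t_d)
    high_r = int(r_r >= t_r)
    return 4 * high_c + 2 * high_d + high_r


# Hypothetical thresholds and sample, chosen only for illustration.
thresholds = (0.02, 0.03, 0.30)
state = uniform_state(0.05, 0.01, 0.60, thresholds)  # high R_c, low R_d, high R_r
```

Because the boundaries are fixed in advance, concentrated data can fall almost entirely into one or two of the eight cells, which is the weakness the clustering-based division addresses.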

Step 108, constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix, and predicting the propagation state of the infectious disease according to the stationarity of the state transition matrix.

The plurality of state subspaces are correlated in time and space, so a Markov chain can be constructed from them according to this correlation and a state transition matrix can then be calculated. By analyzing the stationarity of the state transition matrix, the state subspace toward which the three epidemic indexes gradually stabilize is obtained, and the propagation state of the infectious disease is thereby predicted.
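One way to find the state toward which the chain settles is to compute the stationary distribution of the transition matrix. The text does not specify the stationarity test it uses, so the power iteration below is an assumed, standard technique; the function name `stationary_distribution` and the 2-state matrix are hypothetical.

```python
import numpy as np


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Iterate pi <- pi P from a uniform start until pi stops changing."""
    P = np.asarray(P, dtype=float)
    if not np.allclose(P.sum(axis=1), 1.0):
        raise ValueError("each row of P must sum to 1")
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    raise RuntimeError("no convergence; the chain may be periodic or reducible")


# Hypothetical 2-state transition matrix.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_distribution(P)
most_likely_state = int(np.argmax(pi))
```

The entry of `pi` with the largest value indicates the state subspace in which the epidemic is expected to spend most of its time if the current conditions persist.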

In the above clustering-based infectious disease propagation state prediction method, data on the prevalence of the infectious disease in a target area over a preset time period are obtained in real time, and a diagnosis rate index, a mortality rate index and a recovery rate index are constructed. A Markov state space is constructed from the three indexes and divided by minimizing the sum of squared deviations to obtain a preset number of state subspaces. A Markov chain is constructed from the state subspaces to obtain a probability transition matrix, and the propagation state of the infectious disease is predicted from that matrix. The invention jointly considers the diagnosis rate, mortality rate and recovery rate indexes of the infectious disease, analyzes the development of the infectious disease with a clustering method suited to the characteristics of these indexes, constructs the probability transition matrix and analyzes its properties (mainly its stationarity), and accurately predicts how the epidemic state of the infectious disease will develop.

In one embodiment, the diagnosis rate, the mortality rate and the recovery rate are calculated as follows:

the mortality rate is calculated from the number of deaths in the target area within the preset time period and the cumulative number of confirmed cases of the infectious disease in the target area as follows:

$$R_d = \frac{N_d(t)}{\sum_{\tau \le t} N_c(\tau)}$$

wherein $R_d$ is the mortality rate, $N_d(t)$ is the number of deaths within the preset time period, and $\sum_{\tau \le t} N_c(\tau)$ is the cumulative number of confirmed cases of the infectious disease in the target area;

In one embodiment, clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to divide the Markov state space into a plurality of state subspaces, comprises:

clustering the infectious disease data based on the method of minimizing the sum of squared deviations to obtain an initial clustering result, wherein the sum of squared deviations of the i-th cluster in the initial clustering result is:

$$L_i = \sum_{k=1}^{n_i} (x_{k,i} - \bar{x}_i)^T (x_{k,i} - \bar{x}_i)$$

wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ denotes the diagnosis rate, mortality rate and recovery rate at time $k$ of the infectious disease data grouped into the i-th cluster, $n_i$ represents the total number of time steps in the i-th cluster, and

$$\bar{x}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{k,i}$$

represents the center of the i-th cluster;

and performing pairwise pre-merging on the clusters in the initial clustering result, and calculating the increment of the sum of squared deviations within the pre-merged cluster relative to the sums of squared deviations within the two corresponding clusters before pre-merging:

$$\Delta L = L_{i \cup j} - (L_i + L_j)$$

where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations within the new cluster obtained by pre-merging clusters $i$ and $j$, $L_i$ is the sum of squared deviations within cluster $i$ before pre-merging, and $L_j$ is the sum of squared deviations within cluster $j$ before pre-merging;
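The increment need not be computed by re-summing the merged cluster. Writing $\bar{x}_i$ and $\bar{x}_j$ for the centers of clusters $i$ and $j$ and $n_i$, $n_j$ for their sizes, a short derivation (a standard identity, not stated in the original text) gives a closed form that depends only on the sizes and centers:

```latex
\begin{aligned}
\bar{x}_{i \cup j} &= \frac{n_i \bar{x}_i + n_j \bar{x}_j}{n_i + n_j}, \\
L_{i \cup j} &= L_i + L_j
  + n_i \lVert \bar{x}_i - \bar{x}_{i \cup j} \rVert^2
  + n_j \lVert \bar{x}_j - \bar{x}_{i \cup j} \rVert^2, \\
\bar{x}_i - \bar{x}_{i \cup j} &= \frac{n_j}{n_i + n_j} (\bar{x}_i - \bar{x}_j), \qquad
\bar{x}_j - \bar{x}_{i \cup j} = -\frac{n_i}{n_i + n_j} (\bar{x}_i - \bar{x}_j), \\
\Delta L &= L_{i \cup j} - (L_i + L_j)
  = \frac{n_i n_j}{n_i + n_j} \lVert \bar{x}_i - \bar{x}_j \rVert^2 .
\end{aligned}
```

For two single-sample clusters this reduces to half the squared distance between the two samples, so the first merge always joins the two closest samples.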

For example, epidemic data for region A over a certain time period are selected to construct and divide the Markov state space. The result, shown in fig. 3, is 5 clusters, i.e., 5 state subspaces: state subspace 1 is characterized by high $R_r$ and low $R_d$; state subspace 2 by medium $R_r$, high $R_c$ and low $R_d$; state subspace 3 by low $R_r$ and low $R_d$; state subspace 4 by low $R_r$ and high $R_d$; and state subspace 5 by low $R_r$ and medium $R_d$. Epidemic data for region B over a certain time period are likewise selected to construct and divide a Markov state space, and the result is shown in fig. 4. The state subspaces obtained from the epidemic data of different regions are distributed differently, which matches reality: different regions adopt different control strategies for the epidemic, and the corresponding development trends of the epidemic differ.

In one embodiment, constructing a Markov chain from the plurality of state subspaces to obtain a state transition matrix comprises:

constructing a Markov chain from the plurality of state subspaces to obtain the following state transition matrix:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$

where $m$ is the number of state subspaces and $P_{ij}$ is the probability of transitioning from state subspace $S_i$ to state subspace $S_j$ in one step. The constructed state space may be divided into different state subspaces. Different prevention and control strategies adopted by regional administrators will cause the epidemic state to move across the whole state space. Therefore, the Markov state transition matrix changes between periods, and the influence of different prevention and control strategies can be understood by analyzing how the transition matrix of the Markov chain changes over different periods.
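The text does not state how the entries $P_{ij}$ are estimated. A common choice, assumed here, is the relative frequency of observed one-step transitions in the sequence of cluster labels. In the sketch below the function name `transition_matrix` and the label sequence are hypothetical, and states $S_1, \dots, S_m$ are indexed from 0.

```python
import numpy as np


def transition_matrix(labels, m):
    """Estimate P[i, j] as (# of i -> j transitions) / (# of transitions leaving i)."""
    counts = np.zeros((m, m))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that are never left stay all-zero instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical sequence of state labels over consecutive time steps.
labels = [0, 0, 1, 1, 1, 0, 2, 2, 1]
P = transition_matrix(labels, m=3)
```

Estimating one matrix per policy period and comparing the matrices is one way to carry out the period-by-period analysis described above.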

It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

A specific example is provided herein to illustrate the method in detail:

The prevention and control strategy of area A is as follows:

February 23: access to areas with more severe epidemics is restricted.

March 16: a social distancing policy is adopted, and more than 50 people are prohibited from gathering for 26 days to 7 months.

Therefore, we selected data from February 23 to March 30 for calculation. Considering that the incubation period of the infection is 14 days, we chose March 30 instead of March 16, because even though a new control strategy was released on March 16, the following 14 days still show the transmission characteristics of the previous strategy. As shown in fig. 4, a Markov chain is obtained, and the corresponding state transition matrix is obtained:

Applying the stationarity test for Markov chains shows that this Markov chain is stationary. Therefore, under this control strategy, the state of epidemic spread will remain stable. It is possible to obtain:

This means that under this strategy the epidemic state will gradually approach state 2, in which $R_c$ is higher and $R_r$ is low, indicating that if the strategy is continued, the spread of the epidemic will not improve and the strategy needs to be adjusted.

As can be seen from the final stationary distribution, the isolation control strategy adopted on March 16 helps control the epidemic. In the stationary distribution, the initial strategy (February 23) leads to a transition from state 1 to state 2.

Then, data from March 30 to May 26 are selected for calculation to obtain a probability transition matrix:

This Markov chain is also stationary. Under this strategy, the state of epidemic spread will also remain stable, resulting in:

The measures taken on March 16 reduced the probability of transitioning from state 1 to state 2 to 0.82, with the probability of transitioning to state 5 being 0.18. State 5 represents high $R_r$ and low $R_d$ and $R_c$, i.e., a low infection rate, low mortality rate and high recovery rate. Therefore, the measures taken on March 16 have a positive effect on controlling the spread of the epidemic.

Finally, data from May 26 to July 4 are selected to obtain a probability transition matrix:

This Markov chain is also stationary, and we get:

If this continues, the epidemic in area A will be in state 1: high levels of $R_c$ and $R_d$ together with a low level of $R_r$, a state of complete loss of control with a high mortality rate, high diagnosis rate and low recovery rate; or, with a probability of 59%, in a state in which $R_c$ and $R_r$ are both low, in which the spread of the disease cannot be controlled. This indicates that, while the epidemic is spreading, paradoxical movement will exacerbate the epidemic.

In one embodiment, as shown in fig. 5, there is provided a cluster-based infectious disease transmission status prediction apparatus including: the system comprises an acquisition module, an index construction module and a division module, wherein:

the division module is used for clustering the infectious disease data based on a method of minimizing deviation sum of squares so as to divide the Markov state space to obtain a plurality of state subspaces;

For specific limitations of the device for predicting infectious disease transmission status based on clustering, see the above limitations on the method for predicting infectious disease transmission status based on clustering, which are not described herein again. The modules in the cluster-based infectious disease propagation state prediction apparatus may be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the related data of the epidemic situation of the infectious diseases. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a cluster-based infectious disease transmission status prediction method.

Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.

In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. An infectious disease transmission state prediction method based on clustering is characterized by comprising the following steps:

acquiring an infectious disease data set of a target area; the infectious disease data set comprises infectious disease data at a plurality of different moments; attributes of individual infectious disease data include rate of diagnosis, mortality, and recovery;

2. The method of claim 1, wherein clustering the infectious disease data based on the method of minimizing the sum of squared deviations, so as to divide the Markov state space into a plurality of state subspaces, comprises:

wherein $x_{k,i} = [R_{k,c}, R_{k,d}, R_{k,r}]^T$ denotes the diagnosis rate, mortality rate and recovery rate at time $k$ of the infectious disease data grouped into the i-th cluster, $n_i$ represents the total number of time steps in the i-th cluster, and $\bar{x}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_{k,i}$ represents the center of the i-th cluster;

$$\Delta L = L_{i \cup j} - (L_i + L_j)$$

where $\Delta L$ is the increment of the sum of squared deviations, $L_{i \cup j}$ is the sum of squared deviations within the new cluster obtained by pre-merging clusters $i$ and $j$, $L_i$ is the sum of squared deviations within cluster $i$ before pre-merging, and $L_j$ is the sum of squared deviations within cluster $j$ before pre-merging;

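3. The method of claim 1, wherein constructing a Markov chain from the plurality of state subspaces to obtain a state transition matrix comprises:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$

where $m$ is the number of state subspaces and $P_{ij}$ is the probability of transitioning from state subspace $S_i$ to state subspace $S_j$ in one step.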

4. The method of claim 1, wherein the rate of diagnosis, mortality, and recovery are calculated by:

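calculating the recovery rate from the number of recovered persons in the target area within a preset time period and the cumulative number of confirmed cases of the infectious disease in the target area as follows:

$$R_r = \frac{N_r(t)}{\sum_{\tau \le t} N_c(\tau)}$$

wherein $R_r$ is the recovery rate, $N_r(t)$ is the number of recovered persons within the preset time period, and $\sum_{\tau \le t} N_c(\tau)$ is the cumulative number of confirmed cases of the infectious disease in the target area.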

5. An apparatus for predicting infectious disease transmission status based on clustering, the apparatus comprising:

and the prediction module is used for constructing a Markov chain according to the plurality of state subspaces to obtain a state transition matrix and predicting the transmission state of the infectious diseases according to the stationarity of the state transition matrix.

6. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 4 when executing the computer program.

7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.